Example
In this simple demonstration, you will see how to calculate ZBLMIp (Z score of the corrected MIp using BLOSUM62 pseudo frequencies) for a Pfam MSA from the Julia REPL or using a MIToS script in the system command line.
MIToS in the Julia REPL
If you load the Pfam module from MIToS, you will get access to a set of functions that work with Pfam MSAs. In this case, we are going to use it to download a Stockholm MSA from the Pfam website and read it into Julia.
using MIToS.Pfam
pfam_file = downloadpfam("PF10660")
msa = read(pfam_file, Stockholm, generatemapping=true, useidcoordinates=true)
AnnotatedMultipleSequenceAlignment with 1001 annotations : 487×64 Named Array{MIToS.MSA.Residue,2}
Seq ╲ Col                │   32   35   36   37   38  …  103  104  105  106  107
─────────────────────────┼────────────────────────────────────────────────────
A0A1L8HM45_XENLA/102-167 │    M    E    S    L    A  …    K    K    K    Q    Q
A0A2I2Y8P5_GORGO/9-41    │    -    -    -    -    -       K    D    H    R    N
W5UKX1_ICTPU/1-66        │    -    E    T    I    S       K    R    K    K    -
A0A182JXL2_9DIPT/1-64    │    M    Q    L    L    S       E    A    R    -    -
F6TSD5_XENTR/1-66        │    M    E    S    I    A       K    K    K    Q    Q
A0A158NWR3_ATTCE/11-73   │    M    E    P    I    A       R    -    -    -    -
A0A1J1J3N7_9DIPT/1-64    │    M    E    L    I    S       A    S    -    -    -
A0A2I3LDM8_PAPAN/42-76   │    -    -    -    -    -       K    K    K    Q    Q
A0A091D4B4_FUKDA/1-66    │    M    E    S    V    A       K    K    K    Q    Q
⋮                             ⋮    ⋮    ⋮    ⋮    ⋮  ⋱    ⋮    ⋮    ⋮    ⋮    ⋮
G3H4L8_CRIGR/2-34        │    -    -    -    -    -       K    E    N    R    -
G3SVU7_LOXAF/8-41        │    -    -    -    -    -       K    D    H    R    -
A0A1S3IS31_LINUN/1-64    │    M    E    T    V    S       S    K    K    K    -
A0A484E3E9_BRELC/1-56    │    M    D    A    I    R       -    -    -    -    -
A0A194RSG1_PAPMA/1-63    │    M    Y    F    V    S       -    -    -    -    -
A0A218UPM6_9PASE/1-66    │    -    E    T    L    A       K    K    K    Q    Q
H0VMN3_CAVPO/8-41        │    -    -    -    -    -       K    D    H    R    N
A0A2K5YPF8_MANLE/3-41    │    -    -    -    -    -       K    D    H    R    N
A0A1A6GRS4_NEOLE/34-94   │    -    -    -    -    -  …    K    D    N    R    -
Generation of sequence and column mappings
The keyword argument generatemapping of read lets you generate sequence and column mappings for the MSA. The column mapping relates each column of the MSA object to its column number in the file. If useidcoordinates is true, the sequence mappings use the start and end coordinates in the sequence ids to number each residue in the sequence.
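To illustrate the idea behind a sequence mapping built from identifier coordinates, here is a minimal sketch in plain Julia. It does not use MIToS: the function sketch_sequence_mapping and the short aligned string are hypothetical, and assigning 0 to gaps is a convention chosen for this sketch.

```julia
# Hypothetical helper (not part of MIToS): number the residues of an aligned
# sequence using the start coordinate embedded in an identifier such as
# "SEQ_ID/9-41". Gaps ('-') get 0 here because they have no sequence position.
function sketch_sequence_mapping(seq_id::AbstractString, aligned::AbstractString)
    m = match(r"/(\d+)-(\d+)$", seq_id)
    m === nothing && error("identifier has no /start-end coordinates: $seq_id")
    position = parse(Int, m.captures[1]) - 1
    mapping = Int[]
    for residue in aligned
        if residue == '-'
            push!(mapping, 0)
        else
            position += 1
            push!(mapping, position)
        end
    end
    return mapping
end

# Toy aligned fragment (made up for illustration, not read from PF10660).
mapping = sketch_sequence_mapping("A0A2I2Y8P5_GORGO/9-41", "--KD-HRN")
println(mapping)  # [0, 0, 9, 10, 0, 11, 12, 13]
```

The first residue takes the start coordinate from the identifier (9), and each later residue takes the next number, so aligned positions can be traced back to positions in the original sequence.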
You can plot this MSA and other MIToS’ objects using the Plots package. The installation of Plots is described in the Installation section of this site:
using Plots
gr()
plot(msa)
The Information module of MIToS has functions to calculate information-theoretic measures, such as Entropy and Mutual Information (MI), on an MSA. In this example, we will estimate covariation between columns of the MSA with a corrected MI that uses the BLOSUM62 matrix to calculate pseudo frequencies (BLMI).
using MIToS.Information
ZBLMIp, BLMIp = BLMI(msa)
ZBLMIp # shows ZBLMIp scores
63×63 Named PairwiseListMatrices.PairwiseListMatrix{Float64,false,Array{Float64,1}}
Col1 ╲ Col2 │          35          36  …         106         107
────────────┼──────────────────────────────────────────────────────────
35          │         NaN   0.0563188  …   -0.100055    -0.23808
36          │   0.0563188         NaN       0.155435   -0.464417
37          │  0.00851406  0.00603174      -0.262926   -0.287343
38          │   0.0583976  0.00799689      -0.163582   -0.458561
39          │  -0.0814246    0.123884     -0.0256342   -0.505752
40          │   0.0681652   0.0598754     -0.0854044   -0.366635
41          │  0.00506371  -0.0598257      0.0281494     -0.2211
42          │    0.165036  -0.0603626       0.095943   -0.274653
43          │ -0.00736844  -0.0556297     -0.0824943   -0.411009
⋮                       ⋮           ⋮  ⋱           ⋮           ⋮
99          │  -0.0721807   0.0460013     -0.0297737    0.153822
100         │    0.143868    0.104842      -0.104186   -0.336944
101         │   0.0214702    0.150564      -0.214496    0.252194
102         │  -0.0838159    -0.11169       0.070556    0.124214
103         │   0.0533966   0.0834249       -0.06889   -0.267319
104         │  -0.0662117  -0.0165336     -0.0653821   -0.102652
105         │    0.174843    0.296709      -0.219076    0.139308
106         │   -0.100055    0.155435            NaN    0.116601
107         │    -0.23808   -0.464417  …    0.116601         NaN
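To make the quantity underlying these scores more concrete, the following sketch computes plain MI between two columns in Julia without MIToS. It is a simplification and not the BLMI implementation: it uses raw observed frequencies, with no BLOSUM62 pseudo frequencies, no correction and no Z score. The function name sketch_mutual_information and the toy columns are assumptions made for illustration.

```julia
# Plain mutual information (in bits) between two alignment columns, estimated
# from raw observed frequencies. Sketch only: no pseudo frequencies, no
# correction and no Z score are applied, unlike BLMI.
function sketch_mutual_information(col_a::AbstractVector, col_b::AbstractVector)
    length(col_a) == length(col_b) || error("columns must have the same length")
    n = length(col_a)
    joint = Dict{Tuple{eltype(col_a),eltype(col_b)},Int}()
    count_a = Dict{eltype(col_a),Int}()
    count_b = Dict{eltype(col_b),Int}()
    # Count single-column and paired residue occurrences.
    for (a, b) in zip(col_a, col_b)
        joint[(a, b)] = get(joint, (a, b), 0) + 1
        count_a[a] = get(count_a, a, 0) + 1
        count_b[b] = get(count_b, b, 0) + 1
    end
    # MI = sum over observed pairs of p(a,b) * log2(p(a,b) / (p(a) * p(b))).
    mi = 0.0
    for ((a, b), n_ab) in joint
        p_ab = n_ab / n
        mi += p_ab * log2(p_ab / ((count_a[a] / n) * (count_b[b] / n)))
    end
    return mi
end

# Toy columns (made up): the first pair covaries perfectly, the second does not.
covarying = sketch_mutual_information(collect("AACC"), collect("DDEE"))    # 1.0
independent = sketch_mutual_information(collect("AACC"), collect("DEDE"))  # 0.0
```

Two columns that always change together reach the maximum MI for their residue distribution, while columns whose residues combine independently give zero.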
Once the Plots package is installed and loaded, you can use its capabilities to visualize these results:
heatmap(ZBLMIp, yflip=true, c=:grays)
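Besides plotting, a common next step is to list the column pairs with the highest scores. The sketch below shows one way to do that on an ordinary Julia matrix with a NaN diagonal. The helper sketch_top_pairs, the toy matrix and its labels are hypothetical and are not part of MIToS or taken from the PF10660 result.

```julia
# Hypothetical helper (not part of MIToS): list the n highest-scoring column
# pairs from a symmetric score matrix, skipping NaN entries.
function sketch_top_pairs(scores::AbstractMatrix{<:Real}, labels::AbstractVector, n::Integer)
    ranked = Tuple{eltype(labels),eltype(labels),Float64}[]
    for j in 2:size(scores, 2), i in 1:(j - 1)   # upper triangle only
        value = scores[i, j]
        isnan(value) || push!(ranked, (labels[i], labels[j], value))
    end
    sort!(ranked, by = last, rev = true)          # highest score first
    return ranked[1:min(n, length(ranked))]
end

# Toy 3×3 matrix with made-up values.
toy_scores = [NaN 0.5 -0.2;
              0.5 NaN 1.3;
              -0.2 1.3 NaN]
toy_labels = [35, 36, 37]
top = sketch_top_pairs(toy_scores, toy_labels, 2)
println(top)
```

Only the upper triangle is visited because the matrix is symmetric, so each pair is reported once.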
MIToS in the system command line
Calculating ZBLMIp from the system shell is easy with the MIToS script BLMI.jl. This script reads an MSA file and writes a file with the same base name as the input but with the .BLMI.csv extension.
BLMI.jl PF14972.stockholm.gz