PDB

MIToS.PDBModule

The module PDB defines types and methods to work with protein structures inside Julia. It is useful to link structural and sequential information, and needed for measure the predictive performance at protein contact prediction of mutual information scores.

Features

  • Read and parse PDF and PDBML files
  • Calculate distance and contacts between atoms or residues
  • Determine interaction between residues
using MIToS.PDB

Contents

Types

MIToS.PDB.PDBAtomType

A PDBAtom object contains the information from a PDB atom, without information of the residue. It has the following fields that you can access at any moment for query purposes:

- `coordinates` : x,y,z coordinates, e.g. `Coordinates(109.641,73.162,42.7)`.
- `atom` : Atom name, e.g. `"CA"`.
- `element` : Element type of the atom, e.g. `"C"`.
- `occupancy` : A float number with the occupancy, e.g. `1.0`.
- `B` : B factor as a string, e.g. `"23.60"`.
MIToS.PDB.PDBFileType

PDBFile <: FileFormat

Protein Data Bank (PDB) format. It provides a standard representation for macromolecular structure data derived from X-ray diffraction and NMR studies.

MIToS.PDB.PDBMLType

PDBML <: FileFormat

Protein Data Bank Markup Language (PDBML), a representation of PDB data in XML format.

MIToS.PDB.PDBResidueType

A PDBResidue object contains all the information about a PDB residue. It has the following fields that you can access at any moment for query purposes:

- `id` : A `PDBResidueIdentifier` object.
- `atoms` : A vector of `PDBAtom`s.
MIToS.PDB.PDBResidueIdentifierType

A PDBResidueIdentifier object contains the information needed to identity PDB residues. It has the following fields that you can access at any moment for query purposes:

- `PDBe_number` : It's only used when a PDBML is readed (PDBe number as a string).
- `number` : PDB residue number, it includes insertion codes, e.g. `"34A"`.
- `name` : Three letter residue name in PDB, e.g. `"LYS"`.
- `group` : It can be `"ATOM"` or `"HETATM"`.
- `model` : The model number as a string, e.g. `"1"`.
- `chain` : The chain as a string, e.g. `"A"`.

Constants

MIToS.PDB.covalentradiusConstant

Covalent radius in Å of each element from the Additional file 1 of PICCOLO [1]. Hydrogen was updated using the value on Table 2 from Cordero et. al. [2].

  1. Bickerton, G. R., Higueruelo, A. P., & Blundell, T. L. (2011).

Comprehensive, atomic-level characterization of structurally characterized protein-protein interactions: the PICCOLO database. BMC bioinformatics, 12(1), 313.

  1. Cordero, B., Gómez, V., Platero-Prats, A. E., Revés, M.,

Echeverría, J., Cremades, E., ... & Alvarez, S. (2008). Covalent radii revisited. Dalton Transactions, (21), 2832-2838.

MIToS.PDB.vanderwaalsradiusConstant

van der Waals radius in Å from the Additional file 1 of Bickerton et. al. 2011

  • Bickerton, G. R., Higueruelo, A. P., & Blundell, T. L. (2011).

Comprehensive, atomic-level characterization of structurally characterized protein-protein interactions: the PICCOLO database. BMC bioinformatics, 12(1), 313.

Macros

MIToS.PDB.@atomsMacro

@atoms ... model ... chain ... group ... residue ... atom ...

These return a vector of PDBAtoms with the selected subset of atoms. You can use the type All to avoid filtering that option.

MIToS.PDB.@residuesMacro

@residues ... model ... chain ... group ... residue ...

These return a new vector with the selected subset of residues from a list of residues. You can use the type All to avoid filtering that option.

MIToS.PDB.@residuesdictMacro

@residuesdict ... model ... chain ... group ... residue ...

These return a dictionary (using PDB residue numbers as keys) with the selected subset of residues. You can use the type All to avoid filtering that option.

Methods and functions

Base.angleMethod

angle(a::Coordinates, b::Coordinates, c::Coordinates)

Angle (in degrees) at b between a-b and b-c

Base.anyMethod

any(f::Function, a::PDBResidue, b::PDBResidue, criteria::Function)

Test if the function f is true for any pair of atoms between the residues a and b. This function only test atoms that returns true for the fuction criteria.

Base.anyMethod

any(f::Function, a::PDBResidue, b::PDBResidue)

Test if the function f is true for any pair of atoms between the residues a and b

Base.parseMethod

parse(pdbml, ::Type{PDBML}; chain=All, model=All, group=All, atomname=All, onlyheavy=false, label=true, occupancyfilter=false)

Reads a LightXML.XMLDocument representing a pdb file. Returns a list of PDBResidues (view MIToS.PDB.PDBResidues). Setting chain, model, group, atomname and onlyheavy values can be used to select of a subset of all residues. If not set, all residues are returned. If the keyword argument label (default: true) is false,the auth_ attributes will be use instead of the label_ attributes for chain, atom and residue name fields. The auth_ attributes are alternatives provided by an author in order to match the identification/values used in the publication that describes the structure. If the keyword argument occupancyfilter (default: false) is true, only the atoms with the best occupancy are returned.

Base.parseMethod

parse(io, ::Type{PDBFile}; chain=All, model=All, group=All, atomname=All, onlyheavy=false, occupancyfilter=false)

Reads a text file of a PDB entry. Returns a list of PDBResidue (view MIToS.PDB.PDBResidues). Setting chain, model, group, atomname and onlyheavy values can be used to select of a subset of all residues. Group can be "ATOM" or "HETATM". If not set, all residues are returned. If the keyword argument occupancyfilter (default: false) is true, only the atoms with the best occupancy are returned.

Base.printFunction

print(io, res, format::Type{PDBFile}) print(res, format::Type{PDBFile})

Print a PDBResidue or a vector of PDBResidues in PDB format.

MIToS.PDB.CAmatrixMethod

Returns a matrix with the x, y and z coordinates of the Cα with best occupancy for each PDBResidue of the ATOM group. If a residue doesn't have a Cα, its Cα coordinates are NaNs.

MIToS.PDB.aromaticMethod

There's an aromatic interaction if centriods are at 6.0 Å or less.

MIToS.PDB.atomsFunction

atoms(residue_list, model, chain, group, residue, atom)

These return a vector of PDBAtoms with the selected subset of atoms. You can use the type All (default value of the positional arguments) to avoid filtering a that level.

MIToS.PDB.bestoccupancyMethod

Takes a Vector of PDBAtoms and returns a Vector of the PDBAtoms with best occupancy.

MIToS.PDB.center!Method

center!(A::AbstractMatrix{Float64})

Takes a set of points A as an NxD matrix (N: number of points, D: dimension). Translates A in place so that its centroid is at the origin of coordinates

MIToS.PDB.centeredcoordinatesFunction

Returns a Matrix{Float64} with the centered coordinates of all the atoms in residues. An optional positional argument CA (default: true) defines if only Cα carbons should be used to center the matrix.

MIToS.PDB.centeredresiduesFunction

Returns a new Vector{PDBResidue} with the PDBResidues having centered coordinates. An optional positional argument CA (default: true) defines if only Cα carbons should be used to center the matrix.

MIToS.PDB.change_coordinatesFunction

change_coordinates(residue::PDBResidue, coordinates::AbstractMatrix{Float64}, offset::Int=1)

Returns a new PDBResidues with (x,y,z) from a coordinates AbstractMatrix{Float64} You can give an offset indicating in wich matrix row starts the (x,y,z) coordinates of the residue.

MIToS.PDB.change_coordinatesMethod

change_coordinates(residues::AbstractVector{PDBResidue}, coordinates::AbstractMatrix{Float64})

Returns a new Vector{PDBResidues} with (x,y,z) from a coordinates Matrix{Float64}

MIToS.PDB.change_coordinatesMethod

change_coordinates(atom::PDBAtom, coordinates::Coordinates)

Returns a new PDBAtom but with a new coordinates

MIToS.PDB.contactMethod

contact(a::Coordinates, b::Coordinates, limit::AbstractFloat)

It returns true if the distance is less or equal to the limit. It doesn't call sqrt because it does squared_distance(a,b) <= limit^2.

MIToS.PDB.contactMethod

contact(A::PDBResidue, B::PDBResidue, limit::AbstractFloat; criteria::String="All")

Returns true if the residues A and B are at contact distance (limit). The available distance criteria are: Heavy, All, CA, CB (CA for GLY)

MIToS.PDB.contactMethod

contact(residues::Vector{PDBResidue}, limit::AbstractFloat; criteria::String="All")

If contact takes a Vector{PDBResidue}, It returns a matrix with all the pairwise comparisons (contact map).

MIToS.PDB.covalentMethod

Returns true if the distance between atoms is less than the sum of the covalentradius of each atom.

MIToS.PDB.distanceMethod

distance(residues::Vector{PDBResidue}; criteria::String="All")

If distance takes a Vector{PDBResidue} returns a PairwiseListMatrix{Float64, false} with all the pairwise comparisons (distance matrix).

MIToS.PDB.downloadpdbMethod

It downloads a gzipped PDB file from PDB database. It requires a four character pdbcode. Its default format is PDBML (PDB XML) and It uses the baseurl "http://www.rcsb.org/pdb/files/". filename is the path/name of the output file. This function calls MIToS.Utils.download_file that calls download from the HTTP.jl package. You can use keyword arguments from HTTP.request. Use the headers keyword argument to pass a Dict{String, String} with the header information.

MIToS.PDB.findatomsMethod

findatoms(res::PDBResidue, atom::String)

Returns a index vector of the atoms with the given atom name.

MIToS.PDB.findheavyMethod

Returns a list with the index of the heavy atoms (all atoms except hydrogen) in the PDBResidue

MIToS.PDB.getCAMethod

Returns the Cα with best occupancy in the PDBResidue. If the PDBResidue has no Cα, missing is returned.

MIToS.PDB.getpdbdescriptionMethod

Access general information about a PDB entry (e.g., Header information) using the GraphQL interface of the PDB database. It parses the JSON answer into a Dict.

MIToS.PDB.hydrogenbondMethod

This function only works if there are hydrogens in the structure. The criteria for a hydrogen bond are:

  • d(Ai, Aj) < 3.9Å
  • d(Ah, Aacc) < 2.5Å
  • θ(Adon, Ah, Aacc) > 90°
  • θ(Adon, Aacc, Aacc-antecedent) > 90°
  • θ(Ah, Aacc, Aacc-antecedent) > 90°

Where Ah is the donated hydrogen atom, Adon is the hydrogen bond donor atom, Aacc is the hydrogen bond acceptor atom and Aacc-antecednt is the atom antecedent to the hydrogen bond acceptor atom.

MIToS.PDB.hydrophobicMethod

There's an hydrophobic interaction if two hydrophobic atoms are at 5.0 Å or less.

MIToS.PDB.ionicMethod

There's an ionic interaction if a cationic and an anionic atoms are at 6.0 Å or less.

MIToS.PDB.isanionicMethod

Returns true if the atom, e.g. ("GLU","CD"), is an anionic atom in the residue.

MIToS.PDB.isaromaticMethod

Returns true if the atom, e.g. ("HIS","CG"), is an aromatic atom in the residue.

MIToS.PDB.iscationicMethod

Returns true if the atom, e.g. ("ARG","NE"), is a cationic atom in the residue.

MIToS.PDB.isresidueMethod

It tests if the PDB residue has the indicated model, chain, group and residue number.

MIToS.PDB.kabschMethod

kabsch(A::AbstractMatrix{Float64}, B::AbstractMatrix{Float64})

This function takes two sets of points, A (refrence) and B as NxD matrices, where D is the dimension and N is the number of points. Assumes that the centroids of A and B are at the origin of coordinates. You can call center! on each matrix before calling kabsch to center the matrices in the (0.0, 0.0, 0.0). Rotates B so that rmsd(A,B) is minimized. Returns the rotation matrix. You should do B * RotationMatrix to get the rotated B.

MIToS.PDB.mean_coordinatesMethod

Calculates the average/mean position of each atom in a set of structure. The function takes a vector (AbstractVector) of vectors (AbstractVector{PDBResidue}) or matrices (AbstractMatrix{Float64}) as first argument. As second (optional) argument this function can take an AbstractVector{Float64} of matrix/structure weights to return a weighted mean. When a AbstractVector{PDBResidue} is used, if the keyword argument calpha is false the RMSF is calculated for all the atoms. By default only alpha carbons are used (default: calpha=true).

MIToS.PDB.picationMethod

There's a Π-Cation interaction if a cationic and an aromatic atoms are at 6.0 Å or less

MIToS.PDB.proximitymeanMethod

proximitymean calculates the proximity mean/average for each residue as the average score (from a scores list) of all the residues within a certain physical distance to a given amino acid. The score of that residue is not included in the mean unless you set include to true. The default values are 6.05 for the distance threshold/limit and "Heavy" for the criteria keyword argument. This function allows to calculate pMI (proximity mutual information) and pC (proximity conservation) as in Buslje et. al. 2010.

Buslje, Cristina Marino, Elin Teppa, Tomas Di Doménico, José María Delfino, and Morten Nielsen. Networks of high mutual information define the structural proximity of catalytic sites: implications for catalytic residue identification. PLoS Comput Biol 6, no. 11 (2010): e1000978.

MIToS.PDB.residuepairsmatrixMethod

It creates a NamedArray containing a PairwiseListMatrix where each element (column, row) is identified with a PDBResidue from the input vector. You can indicate the value type of the matrix (default to Float64), if the list should have the diagonal values (default to Val{false}) and the diagonal values (default to NaN).

MIToS.PDB.residuesMethod

residues(residue_list, model, chain, group, residue)

These return a new vector with the selected subset of residues from a list of residues. You can use the type All (default value) to avoid filtering a that level.

MIToS.PDB.residuesdictMethod

residuesdict(residue_list, model, chain, group, residue)

These return a dictionary (using PDB residue numbers as keys) with the selected subset of residues. You can use the type All (default value) to avoid filtering a that level.

MIToS.PDB.rmsdMethod

rmsd(A::AbstractMatrix{Float64}, B::AbstractMatrix{Float64})

Return RMSD between two sets of points A and B, given as NxD matrices (N: number of points, D: dimension).

MIToS.PDB.rmsdMethod

rmsd(A::AbstractVector{PDBResidue}, B::AbstractVector{PDBResidue}; superimposed::Bool=false)

Returns the Cα RMSD value between two PDB structures: A and B. If the structures are already superimposed between them, use superimposed=true to avoid a new superimposition (superimposed is false by default).

MIToS.PDB.rmsfMethod

Calculates the RMSF (Root Mean-Square-Fluctuation) between an atom and its average position in a set of structures. The function takes a vector (AbstractVector) of vectors (AbstractVector{PDBResidue}) or matrices (AbstractMatrix{Float64}) as first argument. As second (optional) argument this function can take an AbstractVector{Float64} of matrix/structure weights to return the root weighted mean-square-fluctuation around the weighted mean structure. When a Vector{PDBResidue} is used, if the keyword argument calpha is false the RMSF is calculated for all the atoms. By default only alpha carbons are used (default: calpha=true).

MIToS.PDB.selectbestoccupancyMethod

Takes a PDBResidue and a Vector of atom indices. Returns the index value of the Vector with maximum occupancy.

MIToS.PDB.squared_distanceMethod

squared_distance(A::PDBResidue, B::PDBResidue; criteria::String="All")

Returns the squared distance between the residues A and B. The available criteria are: Heavy, All, CA, CB (CA for GLY)

MIToS.PDB.superimposeFunction
Asuper, Bsuper, RMSD = superimpose(A, B, matches=nothing)

This function takes A::AbstractVector{PDBResidue} (reference) and B::AbstractVector{PDBResidue}. Translates A and B to the origin of coordinates, and rotates B so that rmsd(A,B) is minimized with the Kabsch algorithm (using only their α carbons). Returns the rotated and translated versions of A and B, and the RMSD value.

Optionally provide matches which iterates over matched index pairs in A and B, e.g., matches = [(3, 5), (4, 6), ...]. The alignment will be constructed using just the matching residues.

MIToS.PDB.vanderwaalsMethod

Test if two atoms or residues are in van der Waals contact using: distance(a,b) <= 0.5 + vanderwaalsradius[a] + vanderwaalsradius[b]. It returns distance <= 0.5 if the atoms aren't in vanderwaalsradius.

MIToS.PDB.vanderwaalsclashMethod

Returns true if the distance between the atoms is less than the sum of the vanderwaalsradius of the atoms. If the atoms aren't on the list (i.e. OXT), the vanderwaalsradius of the element is used. If there is not data in the dict, distance 0.0 is used.