Representing DNA sequences as regular tetrahedra (Simplex)


This package has a single public function, biosimplex, that takes a BioSequence and returns its Simplex representation. The Simplex representation is a 3D encoding of the sequence in which each base is represented as a unit vector pointing from the center of a regular tetrahedron to one of its four vertices (Silverman & Linsker, 1986; Coward, 1997). The result is a 3×N matrix with one column per base.


BioSimplex is a Julia Language package. To install BioSimplex, open Julia's interactive session (known as the REPL), press the ] key to enter package mode, then type the following command:

pkg> add BioSimplex


using BioSequences, BioSimplex

# Create a BioSequence
seq = dna"ATCG"

# Convert the BioSequence to a Simplex representation
biosimplex(seq)

3×4 Matrix{Float64}:
 0.0   0.942809  -0.471405  -0.471405
 0.0   0.0        0.816497  -0.816497
 1.0  -0.333333  -0.333333  -0.333333
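The columns of the output above can be checked against the geometry of a regular tetrahedron. The following is a minimal, standalone sketch that does not call BioSimplex: the names TETRA and simplex_sketch are hypothetical, and the base-to-vertex assignment is an assumption read off the example output (columns A, T, C, G), not taken from the package source.

```julia
using LinearAlgebra

# Vertices of a regular tetrahedron inscribed in the unit sphere.
# Assumption: the base-to-vertex assignment is inferred from the example
# output above, not from the BioSimplex source.
const TETRA = Dict(
    'A' => [0.0, 0.0, 1.0],
    'T' => [2 * sqrt(2) / 3, 0.0, -1 / 3],
    'C' => [-sqrt(2) / 3, sqrt(6) / 3, -1 / 3],
    'G' => [-sqrt(2) / 3, -sqrt(6) / 3, -1 / 3],
)

# Hypothetical helper: stack one vertex per base into a 3×N matrix.
simplex_sketch(s::AbstractString) = reduce(hcat, [TETRA[c] for c in s])

M = simplex_sketch("ATCG")

# Every column is a unit vector.
@assert all(norm(col) ≈ 1 for col in eachcol(M))
# Any two distinct vertices have dot product -1/3, so all bases are equidistant.
@assert all(dot(M[:, i], M[:, j]) ≈ -1 / 3 for i in 1:4 for j in i+1:4)
# The four vertices sum to the zero vector.
@assert norm(sum(eachcol(M))) < 1e-12
```

Because every pair of vertices is separated by the same angle, this encoding treats the four bases symmetrically, unlike an integer encoding such as A=1, C=2, G=3, T=4, which would impose an artificial ordering.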


The Simplex representation provides a numerical encoding of sequences that can be used as input to machine learning models.


Coward, E. (1997). Equivalence of two Fourier methods for biological sequences. Journal of Mathematical Biology, 36(1), 64–70.

Silverman, B. D., & Linsker, R. (1986). A measure of DNA periodicity. Journal of Theoretical Biology, 118(3), 295–300.