BinBencherBackend

BinBencherBackend.jl is a package for efficient benchmarking and interactive exploration of a set of bins against a reference.

Installation

  • Install Julia, preferably using juliaup: https://github.com/JuliaLang/juliaup
  • Launch Julia: julia
  • Press ] to enter package mode. You can exit package mode with backspace.
  • In package mode, type add https://github.com/jakobnissen/BinBencherBackend.jl to download and install the benchmarking software.
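
For scripted or non-interactive setups, the package-mode step above can probably be replaced by a call to Julia's Pkg API. This is a minimal sketch, assuming network access and the same repository URL as above:

```julia
# Non-interactive equivalent of typing `add <url>` in package mode.
using Pkg
Pkg.add(url="https://github.com/jakobnissen/BinBencherBackend.jl")
```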

Quickstart

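using BinBencherBackend
# Load the reference from its JSON file
ref = Reference("files/ref.json")
# Load the bins and benchmark them against the reference
bins = Binning("files/clusters.tsv", ref)
# Print the resulting benchmark matrix
print_matrix(bins)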

Concepts

  • A Sequence is a sequence (e.g. a contig) clustered by the binner.
  • A Genome is a target genome that should be reconstructed by the binner. It can be a virus, organism, plasmid, etc. Every Genome has several Sources and one parent Clade.
  • A Flag marks the certainty about a boolean attribute of a genome, like "is this a virus?".
  • Sources are the sequences that Genomes are composed of. These are typically the reference genome sequences originally obtained by assembly of a purified genome (e.g. a clonal colony). Sequences map to zero or more Sources at particular spans, i.e. locations.
  • A Clade contains one or more Genomes or Clades. Clades containing genomes are rank 1, and clades containing rank N clades are rank N+1 clades. All genomes descend from a chain of exactly N ranks of clades, where N > 0.
  • A Bin is a set of Sequences created by the binner. Every bin is benchmarked against all genomes and clades in the reference.
  • A Reference is composed of:
    • The genomes, a set of Genomes, each with a set of Sources and Flags.
    • The taxmaps, a full set of Clades that encompasses every Genome at N ranks (where N > 0).
    • The sequences, a list of Sequences, each with zero or more mappings to Sources.
  • A Binning is a set of Bins benchmarked against a Reference.
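
The rank rule for Clades can be easier to follow with a toy model. The sketch below is illustrative only: ToyClade, ToyGenome, child_clade and lineage are hypothetical names invented for this example and are not part of BinBencherBackend's API. It assumes a reference with N = 3 ranks.

```julia
# Toy model of the Clade/Genome hierarchy described above.
# All names are hypothetical and do not mirror the package's internals.

struct ToyClade
    name::String
    rank::Int                         # rank 1 clades directly contain genomes
    parent::Union{ToyClade, Nothing}  # `nothing` for the top clade
end

struct ToyGenome
    name::String
    parent::ToyClade                  # every genome has exactly one parent clade
end

# A clade of rank N sits directly under a parent clade of rank N + 1.
function child_clade(name::String, parent::ToyClade)
    parent.rank > 1 || error("a rank 1 clade can only contain genomes")
    return ToyClade(name, parent.rank - 1, parent)
end

# Walk from a genome up to the top clade, collecting clade names.
function lineage(genome::ToyGenome)
    names = String[]
    clade = genome.parent
    while clade !== nothing
        push!(names, clade.name)
        clade = clade.parent
    end
    return names
end

family  = ToyClade("Enterobacteriaceae", 3, nothing)
genus   = child_clade("Escherichia", family)       # rank 2
species = child_clade("Escherichia coli", genus)   # rank 1
genome  = ToyGenome("E. coli K-12", species)

println(lineage(genome))
```

In this sketch the genome descends from exactly three clades, one per rank, which matches the "chain of exactly N ranks" rule with N = 3.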

See the Reference in the left sidebar.