Base.copy!Method
copy!(::FASTA.Record, ::FASTQ.Record)

Copy the content of the FASTQ record into the FASTA record.

FASTX.descriptionFunction
description(record::Record)::AbstractString

Get the description of record. The description is the entire header line, including any trailing whitespace, but without the leading > (FASTA) or @ (FASTQ) symbol. Returns an AbstractString view into the record. If the record is overwritten, the string data will be corrupted.

See also: identifier, sequence

Examples

julia> record = parse(FASTA.Record, ">ident_here some descr \nTAGA");

julia> description(record)
"ident_here some descr "
FASTX.identifierFunction
identifier(record::Record)::AbstractString

Get the sequence identifier of record. The identifier is the description before any whitespace. If the identifier is missing, return an empty string. Returns an AbstractString view into the record. If the record is overwritten, the string data will be corrupted.

See also: description, sequence
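
The rule "the identifier is the description before any whitespace" can be sketched in plain Julia. This is only an illustration of the rule as stated here, not FASTX's implementation; `identifier_of` is a hypothetical helper name.

```julia
# Hypothetical helper (not part of FASTX): take everything before the
# first whitespace character of a description string.
function identifier_of(desc::AbstractString)
    stop = findfirst(isspace, desc)           # index of first whitespace, or nothing
    stop === nothing && return desc           # no whitespace: the whole description
    return desc[1:prevind(desc, stop)]        # may be empty if desc starts with whitespace
end

identifier_of("ident_here some descr ")  # "ident_here"
identifier_of("")                        # "" (missing identifier)
```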

Examples

julia> record = parse(FASTA.Record, ">ident_here some descr \nTAGA");

julia> identifier(record)
"ident_here"
FASTX.seqsizeFunction
seqsize(::Record)::Int

Get the number of bytes in the sequence of a Record. Note that in the presence of non-ASCII characters, this may differ from length(sequence(record)).

See also: sequence

Examples

julia> seqsize(parse(FASTA.Record, ">hdr\nKRRLPW\nYHS"))
9

julia> seqsize(parse(FASTA.Record, ">hdr\nαβγδϵ"))
10
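
The difference between byte count and character count can be reproduced with Base Julia alone. The sketch below uses only `ncodeunits` and `length` on a plain String; it illustrates why the two numbers differ for non-ASCII data and makes no claim about how seqsize is implemented.

```julia
seq = "αβγδϵ"

nbytes = ncodeunits(seq)  # UTF-8 bytes: each of these Greek letters takes 2 bytes
nchars = length(seq)      # number of characters

println((nbytes, nchars))  # (10, 5)
```
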
FASTX.sequenceFunction
sequence([::Type{S}], record::Record, [part::UnitRange{Int}])::S

Get the sequence of record.

S can be a subtype of BioSequences.BioSequence, AbstractString, or String. If elided, S defaults to an AbstractString subtype. If the part argument is given, only the specified part of the sequence is returned.

See also: identifier, description

Examples

julia> record = parse(FASTQ.Record, "@read1\nTAGA\n+\n;;]]");

julia> sequence(record)
"TAGA"

julia> sequence(LongDNA{2}, record)
4nt DNA Sequence:
TAGA
FASTX.FASTAModule
FASTA

Module under FASTX with code related to FASTA files.

FASTX.FASTA.IndexType
Index(src::Union{IO, AbstractString})

FASTA index object, which allows constant-time seeking of FASTA files by name. The index is assumed to be in FAI format.

Notable methods:

  • Index(::Union{IO, AbstractString}): Read FAI file from IO or file at path
  • write(::IO, ::Index): Write index in FAI format
  • faidx(::IO)::Index: Index FASTA file
  • seekrecord(::Reader, ::AbstractString): Go to position of seq
  • extract(::Reader, ::AbstractString): Extract part of sequence

Note that the FAI specs are stricter than FASTX.jl's definition of FASTA, such that some valid FASTA records may not be indexable. See the specs at: http://www.htslib.org/doc/faidx.html

See also: FASTA.Reader
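
To make the FAI columns concrete, here is a minimal sketch of how a FAI line (NAME, LENGTH, OFFSET, LINEBASES, LINEWIDTH, as described in the htslib faidx documentation) maps a sequence position to a byte offset. `FaiEntry`, `parse_fai_line` and `byte_offset` are hypothetical names introduced for illustration; this is not FASTX's Index type or its internal logic. The data is the same FASTA text and index line used in this entry's example.

```julia
# Hypothetical stand-in for one line of a FAI file (not FASTX's Index).
struct FaiEntry
    name::String
    length::Int     # number of bases in the sequence
    offset::Int     # zero-based byte offset of the first base
    linebases::Int  # bases per line
    linewidth::Int  # bytes per line, including the line ending
end

function parse_fai_line(line::AbstractString)
    f = split(line, '\t')
    return FaiEntry(String(f[1]), parse(Int, f[2]), parse(Int, f[3]),
                    parse(Int, f[4]), parse(Int, f[5]))
end

# Zero-based byte offset of the 1-based sequence position `pos`.
function byte_offset(e::FaiEntry, pos::Integer)
    1 <= pos <= e.length || throw(ArgumentError("position out of range"))
    full_lines, within = divrem(pos - 1, e.linebases)
    return e.offset + full_lines * e.linewidth + within
end

fasta = ">A\nG\n>seqname\nACGTAC\r\nTTG"
entry = parse_fai_line("seqname\t9\t14\t6\t8")
bytes = codeunits(fasta)

# +1 because Julia arrays are 1-based while FAI offsets are zero-based.
first_base   = Char(bytes[byte_offset(entry, 1) + 1])  # 'A'
seventh_base = Char(bytes[byte_offset(entry, 7) + 1])  # 'T', first base after "\r\n"
```

Because the line width (8) counts the two-byte "\r\n" line ending while the line bases (6) do not, the seventh base is found 8 bytes after the first, which is what allows seeking without scanning the file.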

Examples

julia> src = IOBuffer("seqname\t9\t14\t6\t8\nA\t1\t3\t1\t2");

julia> fna = IOBuffer(">A\nG\n>seqname\nACGTAC\r\nTTG");

julia> rdr = FASTA.Reader(fna; index=src);

julia> seekrecord(rdr, "seqname");

julia> sequence(String, first(rdr))
"ACGTACTTG"
FASTX.FASTA.ReaderType
FASTA.Reader(input::IO; index=nothing, copy::Bool=true)

Create a buffered data reader of the FASTA file format. The reader is a BioGenerics.IO.AbstractReader, a stateful iterator of FASTA.Record. Readers take ownership of the underlying IO: mutating or closing the underlying IO other than through the reader is undefined behaviour. Closing the Reader also closes the underlying IO.

See more examples in the FASTX documentation.

See also: FASTA.Record, FASTA.Writer

Arguments

  • input: data source
  • index: Optional random access index (currently the FAI format is supported). index can be nothing, a FASTA.Index, an IO from which an index will be parsed, or an AbstractString, which is treated as the path to a FAI file.
  • copy::Bool: If true, iterating returns a fresh copy of each record instead of the same Record object. Set to false for improved performance, but be wary that iterating then mutates the record that was previously returned.

Examples

julia> rdr = FASTAReader(IOBuffer(">header\nTAG\n>another\nAGA"));

julia> records = collect(rdr); close(rdr);

julia> foreach(println, map(identifier, records))
header
another

julia> foreach(println, map(sequence, records))
TAG
AGA
FASTX.FASTA.RecordType
FASTA.Record

Mutable struct representing a FASTA record as parsed from a FASTA file. The content of the record can be queried with the following functions: identifier, description, sequence.

FASTA records are un-typed, i.e. they are agnostic to what kind of data they contain.

See also: FASTA.Reader, FASTA.Writer

Examples

julia> rec = parse(FASTARecord, ">some header\nTAqA\nCC");

julia> identifier(rec)
"some"

julia> description(rec)
"some header"

julia> sequence(rec)
"TAqACC"

julia> typeof(description(rec)) == typeof(sequence(rec)) <: AbstractString
true
FASTX.FASTA.RecordMethod
FASTA.Record(description::AbstractString, sequence)

Create a FASTA record object from description and sequence.

FASTX.FASTA.WriterType
FASTA.Writer(output::IO; width=70)

Create a data writer of the FASTA file format. The writer is a BioGenerics.IO.AbstractWriter. Writers take ownership of the underlying IO: mutating or closing the underlying IO other than through the writer is undefined behaviour. Closing the writer also closes the underlying IO.

See more examples in the FASTX documentation.

See also: FASTA.Record, FASTA.Reader

Arguments

  • output: Data sink to write to
  • width: Wrapping width of sequence characters. If < 1, no wrapping.

Examples

julia> FASTA.Writer(open("some_file.fna", "w")) do writer
    write(writer, record) # a FASTA.Record
end
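
The effect of the width argument can be illustrated without FASTX. The sketch below wraps a sequence string into lines of at most width characters and skips wrapping when width is less than 1. `wrap_sequence` is a hypothetical helper that assumes ASCII input; it is not how FASTA.Writer is implemented.

```julia
# Hypothetical helper (not part of FASTX); assumes an ASCII sequence so that
# byte indices and character indices coincide.
function wrap_sequence(seq::AbstractString, width::Integer)
    width < 1 && return [String(seq)]          # width < 1 means no wrapping
    n = ncodeunits(seq)
    return [String(seq[i:min(i + width - 1, n)]) for i in 1:width:n]
end

foreach(println, wrap_sequence("ACGTACGTAC", 4))
# ACGT
# ACGT
# AC
```
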
FASTX.FASTA.extractFunction
extract(reader::Reader, name::AbstractString, range::Union{Nothing, UnitRange})

Extract a subsequence given by index range from the sequence named in a Reader with an index. Returns a String. If range is nothing (the default value), return the entire sequence.

FASTX.FASTA.faidxMethod
faidx(fnapath::AbstractString, [idxpath::AbstractString], check=true)

Index the FASTA file at fnapath and write the index to idxpath. If idxpath is not given, it defaults to fnapath * ".fai". If check is true, throw an error if the output file already exists.

See also: Index

FASTX.FASTA.faidxMethod
faidx(io::IO)::Index

Build a FASTA.Index by indexing the FASTA data read from io.

See also: Index

Examples

julia> ind = faidx(IOBuffer(">ab\nTA\nT\n>x y\nGAG\nGA"))
Index:
  ab	3	4	2	3
  x	5	14	3	4
FASTX.FASTA.index!Function
index!(r::FASTA.Reader, ind::Union{Nothing, Index, IO, AbstractString})

Set the index of r, and return r. If ind isa Union{Nothing, Index}, directly set the index to ind. If ind isa IO, parse the index from the FAI-formatted IO first. If ind isa AbstractString, treat it as the path to a FAI file to parse.

See also: Index, FASTA.Reader

FASTX.FASTA.seekrecordFunction
seekrecord(reader::FASTAReader, i::Union{AbstractString, Integer})

Seek Reader to the i'th record. The next iterated record will be the i'th record. i can be the identifier of a sequence, or the 1-based record number in the Index.

The Reader needs to be indexed for this to work.

FASTX.FASTA.validate_fastaMethod
validate_fasta(io::IO) >: Nothing

Check if io is a valid FASTA file. Return nothing if it is, and an instance of another type if not.

Examples

julia> validate_fasta(IOBuffer(">a bc\nTAG\nTA")) === nothing
true

julia> validate_fasta(IOBuffer(">a bc\nT>G\nTA")) === nothing
false
FASTX.FASTQModule
FASTQ

Module under FASTX with code related to FASTQ files.

FASTX.FASTQ.QualityEncodingType
QualityEncoding(range::StepRange{Char}, offset::Integer)

FASTQ quality encoding scheme. QualityEncoding objects are used to interpret the quality scores of FASTQ records. range is a range of allowed ASCII chars in the encoding, e.g. '!':'~' for the most common encoding scheme. The offset is the ASCII offset, i.e. a character with ASCII value x encodes the value x - offset.

See also: quality_scores

Examples

julia> read = parse(FASTQ.Record, "@hdr\nAGA\n+\nabc");

julia> qe = QualityEncoding('a':'z', 16); # hypothetical encoding

julia> collect(quality_scores(read, qe)) == [Int8(i) - 16 for i in "abc"]
true
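
The relationship "a character with ASCII value x encodes the value x - offset" can be written out in plain Julia. `decode_quality` below is a hypothetical helper for illustration only; it is not the QualityEncoding or quality_scores implementation, and it returns a vector rather than an iterator.

```julia
# Hypothetical helper (not part of FASTX): decode an ASCII quality string
# given an allowed character range and an offset.
function decode_quality(qual::AbstractString, allowed::AbstractRange{Char}, offset::Integer)
    scores = Int8[]
    for c in qual
        c in allowed || throw(ArgumentError("character $(repr(c)) is outside the allowed range"))
        push!(scores, Int8(Int(c) - offset))   # score = ASCII value - offset
    end
    return scores
end

# The common encoding: characters '!' to '~' with offset 33.
show(decode_quality("jk;]", '!':'~', 33))  # Int8[73, 74, 26, 60]
```
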
FASTX.FASTQ.ReaderType
FASTQ.Reader(input::IO; copy::Bool=true)

Create a buffered data reader of the FASTQ file format. The reader is a BioGenerics.IO.AbstractReader, a stateful iterator of FASTQ.Record. Readers take ownership of the underlying IO: mutating or closing the underlying IO other than through the reader is undefined behaviour. Closing the Reader also closes the underlying IO.

See more examples in the FASTX documentation.

See also: FASTQ.Record, FASTQ.Writer

Arguments

  • input: data source
  • copy::Bool: If true, iterating returns a fresh copy of each record instead of the same Record object. Set to false for improved performance, but be wary that iterating then mutates the record that was previously returned.

Examples

julia> rdr = FASTQReader(IOBuffer("@readname\nGGCC\n+\njk;]"));

julia> record = first(rdr); close(rdr);

julia> identifier(record)
"readname"

julia> sequence(record)
"GGCC"

julia> show(collect(quality_scores(record))) # phred 33 encoding by default
Int8[73, 74, 26, 60]
FASTX.FASTQ.RecordType
FASTQ.Record

Mutable struct representing a FASTQ record as parsed from a FASTQ file. The content of the record can be queried with the following functions: identifier, description, sequence, quality.

FASTQ records are un-typed, i.e. they are agnostic to what kind of data they contain.

See also: FASTQ.Reader, FASTQ.Writer

Examples

julia> rec = parse(FASTQRecord, "@ill r1\nGGC\n+\njjk");

julia> identifier(rec)
"ill"

julia> description(rec)
"ill r1"

julia> sequence(rec)
"GGC"

julia> show(collect(quality_scores(rec)))
Int8[73, 73, 74]

julia> typeof(description(rec)) == typeof(sequence(rec)) <: AbstractString
true
FASTX.FASTQ.RecordMethod
FASTQ.Record(description, sequence, quality; offset=33)

Create a FASTQ record from description, sequence and quality. Arguments:

  • description::AbstractString
  • sequence::Union{AbstractString, BioSequence}
  • quality::Union{AbstractString, Vector{<:Number}}
  • Keyword argument offset (if quality isa Vector): PHRED offset
FASTX.FASTQ.WriterType
FASTQ.Writer(output::IO; quality_header::Union{Nothing, Bool}=nothing)

Create a data writer of the FASTQ file format. The writer is a BioGenerics.IO.AbstractWriter. Writers take ownership of the underlying IO: mutating or closing the underlying IO other than through the writer is undefined behaviour. Closing the writer also closes the underlying IO.

See more examples in the FASTX documentation.

See also: FASTQ.Record, FASTQ.Reader

Arguments

  • output: Data sink to write to
  • quality_header: Whether to print second header on the + line. If nothing (default), check the individual Record objects for whether they contain a second header.

Examples

julia> FASTQ.Writer(open("some_file.fq", "w")) do writer
    write(writer, record) # a FASTQ.Record
end
FASTX.FASTQ.qualityFunction
quality([T::Type{String, StringView}], record::FASTQ.Record, [part::UnitRange])

Get the ASCII quality of record at positions part as type T. If not passed, T defaults to StringView. If not passed, part defaults to the entire quality string.

Examples

julia> rec = parse(FASTQ.Record, "@hdr\nUAGUCU\n+\nCCDFFG");

julia> qual = quality(rec)
"CCDFFG"

julia> qual isa AbstractString
true
FASTX.FASTQ.quality_header!Method
quality_header!(record::Record, x::Bool)

Set whether the record repeats its header on the quality comment line, i.e. the line with +.

Examples

julia> record = parse(FASTQ.Record, "@A B\nT\n+\nJ");

julia> string(record)
"@A B\nT\n+\nJ"

julia> quality_header!(record, true);

julia> string(record)
"@A B\nT\n+A B\nJ"
FASTX.FASTQ.quality_scoresFunction
quality_scores(record::FASTQ.Record, [encoding::QualityEncoding], [part::UnitRange])

Get an iterator of PHRED base quality scores of record at positions part. This iterator is corrupted if the record is mutated. By default, part is the whole sequence. By default, the encoding is PHRED33 Sanger encoding, but another encoding may be specified with a QualityEncoding object.

FASTX.FASTQ.quality_scoresFunction
quality_scores(record::Record, encoding_name::Symbol, [part::UnitRange])::Vector{UInt8}

Get an iterator of base quality of the slice part of record's quality.

The encoding_name can be either :sanger, :solexa, :illumina13, :illumina15, or :illumina18.
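
As a rough guide to what these names mean, the sketch below lists the ASCII offsets conventionally associated with each encoding name and decodes a quality string by subtraction. The table reflects common FASTQ conventions and is an assumption here, not values read from FASTX; `CONVENTIONAL_OFFSETS` and `decode_by_name` are hypothetical names. Note also that Solexa scores use a different scale from PHRED scores even though they share the offset of 64.

```julia
# Conventional ASCII offsets per encoding name (assumed from common FASTQ
# conventions, not taken from the FASTX source).
const CONVENTIONAL_OFFSETS = Dict(
    :sanger     => 33,
    :solexa     => 64,
    :illumina13 => 64,
    :illumina15 => 64,
    :illumina18 => 33,
)

# Hypothetical helper: subtract the offset for the named encoding.
decode_by_name(qual::AbstractString, name::Symbol) =
    [Int(c) - CONVENTIONAL_OFFSETS[name] for c in qual]

decode_by_name("III", :sanger)      # [40, 40, 40]
decode_by_name("hhh", :illumina13)  # [40, 40, 40]
```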

FASTX.FASTQ.validate_fastqMethod
validate_fastq(io::IO) >: Nothing

Check if io is a valid FASTQ file. Return nothing if it is, and an instance of another type if not.

Examples

julia> validate_fastq(IOBuffer("@i1 r1\nuuag\n+\nHJKI")) === nothing
true

julia> validate_fastq(IOBuffer("@i1 r1\nu;ag\n+\nHJKI")) === nothing
false