BGEN.one_255th
— Constant
first_dosage_fast!(data, p, d, idx, layout)
Dosage retrieval for the 8-bit biallelic case, with no floating-point operations.
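One way to avoid floating-point arithmetic per sample is a precomputed lookup table. The sketch below assumes the BGEN layout-2 convention for unphased, diploid, biallelic data at 8-bit depth, where each sample stores two bytes holding the scaled probabilities of the first two genotypes (AA, AB). The names `dosage_table` and `first_allele_dosage_lookup` are hypothetical, and this is not necessarily how BGEN.jl implements `first_dosage_fast!`; missingness and ploidy bytes are ignored.

```julia
# Hypothetical sketch: table-based first-allele dosage for 8-bit probabilities.
# Assumes two bytes per sample: 255 * P(AA) and 255 * P(AB).

# Dosage = (2 * b_AA + b_AB) / 255; the integer numerator lies in 0:510.
dosage_table = Float32[i / 255 for i in 0:510]

function first_allele_dosage_lookup(bytes::Vector{UInt8}, table::Vector{Float32})
    n = div(length(bytes), 2)
    dose = Vector{Float32}(undef, n)
    for i in 1:n
        b_aa = Int(bytes[2i - 1])
        b_ab = Int(bytes[2i])
        # Integer arithmetic only; the division happened when the table was built.
        dose[i] = table[2 * b_aa + b_ab + 1]
    end
    return dose
end

lookup_dose = first_allele_dosage_lookup(UInt8[255, 0, 0, 255, 0, 0], dosage_table)
```

Here the three samples are a certain AA, a certain AB, and a certain BB, so the first-allele dosages come out as 2, 1, and 0.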
BGEN.Bgen
— Method
Bgen(path; sample_path=nothing, delay_parsing=false)
Read in the Bgen file information: header and list of samples. Variants and genotypes are read separately.
- path: path to the ".bgen" file.
- sample_path: path to the ".sample" file, if applicable.
- idx_path: path to the ".bgi" file, defaults to path * ".bgi".
BGEN.BgenVariant
— Method
BgenVariant(b::Bgen, offset::Integer)
BgenVariant(io, offset, compression, layout, expected_n)
Parse information of a single variant beginning from offset.
BGEN.BgenVariantIteratorFromOffsets
— Type
BgenVariantIteratorFromOffsets(b::Bgen, offsets::Vector{UInt})
BgenVariant iterator that iterates over a vector of offsets.
BGEN.BgenVariantIteratorFromStart
— Type
BgenVariantIteratorFromStart(b::Bgen)
Variant iterator that iterates from the beginning of the Bgen file.
BGEN.Genotypes
— Method
Genotypes{T}(p::Preamble, d::Vector{UInt8}) where T <: AbstractFloat
Create a Genotypes struct from the preamble and decompressed data string.
BGEN.chroms
— Method
chroms(vi)
chroms(bgen; offsets=nothing)
Get chromosome list of all variants.
Arguments:
- vi: a collection of Variants
- bgen: Bgen object
- offsets: offset of each variant to be returned
BGEN.clear!
— Method
destroy_genotypes!(v::BgenVariant)
Destroy any parsed genotype information.
BGEN.clear!
— Method
clear!(g::Genotypes)
clear!(v::BgenVariant)
Clears the cached decompressed byte representation, probabilities, and dose. If a BgenVariant is given, it removes the corresponding .genotypes altogether.
BGEN.clear_decompressed!
— Method
clear_decompressed!(g::Genotypes)
Clears cached decompressed byte representation.
BGEN.decompress
— Method
decompress(io, v, h; decompressed=nothing)
Decompress the compressed byte string for genotypes.
BGEN.find_minor_allele
— Method
find_minor_allele(data, p)
Find the minor allele index. Returns 1 (first) or 2 (second).
BGEN.first_dosage_phased!
— Method
first_dosage_phased!(data, p, d, idx, layout)
Dosage computation for phased genotypes.
BGEN.first_dosage_slow!
— Method
first_dosage_slow!(data, p, d, idx, layout)
Dosage computation for the general case.
BGEN.hardcall!
— Method
hardcall!(c::AbstractArray{I}, d::AbstractArray{T}; threshold=0.1) where {I, T}
Hard genotype calls for dosages. d is the dosage vector, and c is filled with the hard-called genotypes with values 0, 1, 2, or 9 (for missing). threshold determines the maximum distance between the hard call and the dosage. threshold must be in [0, 0.5).
BGEN.hardcall
— Method
hardcall(d::AbstractArray{T}; threshold=0.1) where {I, T}
Hard genotype calls for dosages. d is the dosage vector; the returned UInt8 vector is filled with the hard-called genotypes with values 0, 1, 2, or 9 (for missing). threshold determines the maximum distance between the hard call and the dosage. threshold must be in [0, 0.5).
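The thresholding rule described above can be sketched in plain Julia. This is an illustration under stated assumptions, not BGEN.jl's implementation: `hardcall_sketch` is a hypothetical name, missing dosages are assumed to be encoded as NaN, and a dosage exactly at the threshold distance is assumed to be callable.

```julia
# Hypothetical sketch of thresholded hard calls; not BGEN.jl's implementation.
function hardcall_sketch(d::AbstractVector{<:AbstractFloat}; threshold=0.1)
    0 <= threshold < 0.5 || throw(ArgumentError("threshold must be in [0, 0.5)"))
    c = fill(0x09, length(d))        # start with every call marked missing (9)
    for (i, x) in enumerate(d)
        isnan(x) && continue         # assumed missing encoding: NaN stays 9
        g = round(x)                 # nearest candidate genotype
        if 0 <= g <= 2 && abs(x - g) <= threshold
            c[i] = UInt8(g)          # close enough to 0, 1, or 2
        end
    end
    return c
end

calls = hardcall_sketch([0.05, 0.95, 1.5, 2.0, NaN])
```

With the default threshold, 0.05 and 0.95 are called as 0 and 1, 1.5 is too far from any genotype and becomes 9, and the NaN entry stays 9.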
BGEN.hwe
— Method
hwe(b::Bgen, v::BgenVariant; T=Float32, decompressed=nothing)
hwe(p::Preamble, d::Vector{UInt8}, idx::Vector{<:Integer}, layout::UInt8,
rmask::Union{Nothing, Vector{UInt16}})
Hardy-Weinberg equilibrium test for diploid biallelic case
BGEN.hwe
— Method
hwe(n00, n01, n11)
Hardy-Weinberg equilibrium test. n00 and n11 are the counts of the two homozygotes, and n01 is the count of heterozygotes. Output is the p-value of type Float64.
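As an illustration of what such a test computes from the three counts, here is a sketch of Pearson's chi-square statistic for Hardy-Weinberg equilibrium. It is an assumption that a chi-square test is the relevant one; the source does not say which test `hwe` uses. `hwe_chisq_statistic` is a hypothetical name, and the sketch stops at the statistic: turning it into a p-value needs the upper tail of a chi-square distribution with one degree of freedom, for example `ccdf(Chisq(1), ts)` from Distributions.jl.

```julia
# Hypothetical sketch: Pearson chi-square statistic for HWE from genotype counts.
function hwe_chisq_statistic(n00::Real, n01::Real, n11::Real)
    n = n00 + n01 + n11
    n == 0 && return 0.0
    p0 = (2 * n00 + n01) / (2 * n)       # frequency of allele 0
    (p0 <= 0 || p0 >= 1) && return 0.0   # monomorphic: nothing to test
    p1 = 1 - p0
    # Expected genotype counts under Hardy-Weinberg proportions.
    e00 = n * p0^2
    e01 = 2 * n * p0 * p1
    e11 = n * p1^2
    return (n00 - e00)^2 / e00 + (n01 - e01)^2 / e01 + (n11 - e11)^2 / e11
end

ts_balanced = hwe_chisq_statistic(25, 50, 25)   # counts match HWE exactly
ts_no_hets = hwe_chisq_statistic(50, 0, 50)     # no heterozygotes at all
```

Counts in exact Hardy-Weinberg proportions give a statistic of zero, while a sample with no heterozygotes gives a large one.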
BGEN.info_score
— Method
info_score(b::Bgen, v::BgenVariant; T=Float32, decompressed=nothing)
info_score(p::Preamble, d::Vector{UInt8}, idx::Vector{<:Integer}, layout::UInt8,
rmask::Union{Nothing, Vector{UInt16}})
Information score of the variant.
BGEN.minor_allele_dosage!
— Method
minor_allele_dosage!(b::Bgen, v::BgenVariant; T=Float32,
mean_impute=false, clear_decompressed=false)
Given a Bgen struct and a BgenVariant, compute minor allele dosage. The result is stored inside v.genotypes.dose, which can be cleared using clear!(v).
- T: type for the results
- mean_impute: impute missing values with the mean of nonmissing values
- clear_decompressed: clears decompressed byte string after execution if set to true
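The effect of mean imputation on a dosage vector can be sketched as follows. This is a standalone illustration, assuming missing dosages are encoded as NaN; `mean_impute_sketch!` is a hypothetical helper, not part of BGEN.jl.

```julia
# Hypothetical sketch: replace NaN dosages with the mean of the non-missing ones.
function mean_impute_sketch!(dose::AbstractVector{T}) where {T<:AbstractFloat}
    total = zero(T)
    n = 0
    for x in dose
        if !isnan(x)
            total += x
            n += 1
        end
    end
    n == 0 && return dose            # everything missing: nothing to impute from
    m = total / n
    for i in eachindex(dose)
        if isnan(dose[i])
            dose[i] = m
        end
    end
    return dose
end

imputed = mean_impute_sketch!(Float32[0.0, 2.0, NaN, 1.0])
```

The three observed dosages average to 1, so the missing entry is filled with 1.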
BGEN.minor_certain
— Function
minor_certain(freq, n_checked, z)
Check if minor allele is certain.
- freq: frequency of minor or major allele
- n_checked: number of individuals checked so far
- z: cutoff of "z" value, defaults to 5.0
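A plausible reading of this check is a normal-approximation interval around the running frequency estimate: the minor allele is "certain" once the interval no longer contains 0.5. The sketch below encodes that reading as an assumption; `minor_certain_sketch` is a hypothetical name and the exact rule used by BGEN.jl may differ.

```julia
# Hypothetical sketch: is an allele frequency estimate confidently away from 0.5?
function minor_certain_sketch(freq::Real, n_checked::Integer, z::Real=5.0)
    # Half-width of the interval: z standard errors of a binomial proportion.
    delta = z * sqrt(freq * (1 - freq) / n_checked)
    # Certain when the whole interval lies on one side of 0.5.
    return freq + delta < 0.5 || freq - delta > 0.5
end

clearly_minor = minor_certain_sketch(0.1, 1000)   # rare allele, many samples
too_close = minor_certain_sketch(0.45, 100)       # near 0.5, few samples
```

A check like this would let a scan over samples stop early once the minor allele is no longer in doubt.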
BGEN.offset_first_variant
— Method
offset_first_variant(x)
Returns the offset of the first variant.
BGEN.parse_layout1!
— Method
parse_layout1!(data, p, d, startidx)
Parse probabilities from layout 1.
BGEN.parse_layout2!
— Method
parse_layout2!(data, p, d, startidx)
Parse probabilities from layout 2.
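To give a feel for what parsing layout-2 probabilities involves, the sketch below decodes the simplest case: unphased, diploid, biallelic data at 8-bit depth. In that case the BGEN format stores two bytes per sample, the probabilities of the first two genotypes scaled by 255, and the third probability is the remainder. `decode_probs_8bit` is a hypothetical name; the sketch ignores the ploidy and missingness bytes and other bit depths, so it is not a substitute for `parse_layout2!`.

```julia
# Hypothetical sketch: decode 8-bit layout-2 genotype probabilities
# for unphased, diploid, biallelic samples (two stored bytes per sample).
function decode_probs_8bit(bytes::Vector{UInt8})
    n = div(length(bytes), 2)
    probs = Matrix{Float32}(undef, 3, n)   # rows: P(AA), P(AB), P(BB)
    for i in 1:n
        p_aa = bytes[2i - 1] / 255f0
        p_ab = bytes[2i] / 255f0
        probs[1, i] = p_aa
        probs[2, i] = p_ab
        probs[3, i] = 1 - p_aa - p_ab      # last probability is implied
    end
    return probs
end

probs = decode_probs_8bit(UInt8[255, 0, 0, 255, 0, 0])
```

Each column of `probs` holds one sample; in this input the three samples are certain AA, AB, and BB calls respectively.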
BGEN.parse_ploidy
— Method
parse_ploidy(ploidy, d, idx, n_samples)
Parse ploidy part of the preamble.
BGEN.parse_preamble
— Method
parse_preamble(d, idx, h, v)
Parse preamble of genotypes.
BGEN.parse_variants
— Method
parse_variants(b::Bgen; offsets=offsets)
Parse variants of the file.
BGEN.positions
— Method
positions(vi)
positions(bgen; offsets=nothing)
Get base pair positions of all variants.
Arguments:
- vi: a collection of Variants
- bgen: Bgen object
- offsets: offset of each variant to be returned
BGEN.probabilities!
— Method
probabilities!(b::Bgen, v::BgenVariant; T=Float32, clear_decompressed=false)
Given a Bgen struct and a BgenVariant, compute probabilities. The result is stored inside v.genotypes.probs, which can be cleared using clear!(v).
- T: type for the results
- clear_decompressed: clears decompressed byte string after execution if set to true
BGEN.rsids
— Method
rsids(vi)
rsids(b; offsets=nothing, from_bgen_start=false)
Get rsid list of all variants.
Arguments:
- vi: a collection of Variants
- bgen: Bgen object
- offsets: offset of each variant to be returned
BGEN.second_dosage!
— Method
second_dosage!(data, p)
Switch first allele dosage data to second allele dosage.
BGEN.select_region
— Method
select_region(bgen, chrom; start=nothing, stop=nothing)
Select variants from a region. Returns variant start offsets on the file. Returns a BgenVariantIteratorFromOffsets object.
BGEN.variant_by_index
— Function
variant_by_index(bgen, n)
Get the n-th variant (1-based).
BGEN.variant_by_pos
— Method
variant_by_pos(bgen, pos)
Get the variant of bgen at the given pos, as recorded in the index file.
BGEN.variant_by_rsid
— Method
variant_by_rsid(bgen, rsid)
Find a variant by rsid.
Base.Iterators.filter
— Function
filter(dest::AbstractString, b::Bgen, variant_mask::BitVector, sample_mask::BitVector;
dest_sample=dest[1:end-5] * ".sample",
sample_path=nothing, sample_names=b.samples,
offsets=nothing, from_bgen_start=false)
Filters the input Bgen instance b based on variant_mask and sample_mask. The result is saved in the new bgen file dest. Sample information is stored in dest_sample. sample_path is the path of the .sample file for the input BGEN file, and sample_names stores the sample names in the BGEN file. offsets and from_bgen_start are arguments for the iterator function of b.
Only layout 2 is supported, and probability bit depths must always be a multiple of 8. The output is always compressed with ZSTD. The sample names are stored in a separate .sample file, not in the output .bgen file.
Base.Iterators.filter
— Method
BGEN.filter(itr; min_maf=NaN, min_hwe_pval=NaN, min_success_rate_per_variant=NaN,
cmask=trues(n_variants(itr.b)), rmask=trues(n_variants(itr.b)))
"Filtered" iterator for variants based on min_maf, min_hwe_pval, min_success_rate_per_variant, cmask, and rmask.
GeneticVariantBase.iterator
— Method
iterator(b::Bgen; offsets=nothing, from_bgen_start=nothing)
Retrieve a variant iterator for b.
- If offsets is provided, or .bgen.bgi is provided and from_bgen_start is false, it returns a VariantIteratorFromOffsets, iterating over the list of offsets.
- Otherwise, it returns a VariantIteratorFromStart, iterating from the start of the bgen file to the end of it sequentially.
GeneticVariantBase.maf
— Method
maf(b::Bgen, v::BgenVariant; T=Float32, decompressed=nothing)
maf(p::Preamble, d::Vector{UInt8}, idx::Vector{<:Integer}, layout::UInt8,
rmask::Union{Nothing, Vector{UInt16}})
Minor-allele frequency for diploid biallelic case
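For the diploid biallelic case, the minor-allele frequency follows directly from per-sample dosages of one allele: the allele frequency is the mean dosage divided by two, and the MAF is the smaller of that and its complement. The sketch below illustrates this on a plain dosage vector; `maf_from_dosage` is a hypothetical helper that assumes missing values are NaN, whereas the `maf` methods above work from the Bgen structures and raw bytes.

```julia
# Hypothetical sketch: minor-allele frequency from diploid allele dosages.
function maf_from_dosage(d::AbstractVector{<:AbstractFloat})
    total = 0.0
    n = 0
    for x in d
        isnan(x) && continue         # skip missing dosages (assumed NaN)
        total += x
        n += 1
    end
    n == 0 && return NaN             # no data: frequency undefined
    f = total / (2 * n)              # frequency of the counted allele
    return min(f, 1 - f)             # report whichever allele is rarer
end

maf_value = maf_from_dosage([2.0, 2.0, 1.0, NaN])
```

Three observed samples carry five copies of the counted allele out of six, so that allele is the major one and the MAF is 1/6.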