model

In this section, we list the available models to be fit with either mcmc or vi. To see examples on how to implement these models, please check the examples tab.

Single-dataset single-environment

BarBay.model.fitness_normalFunction

fitness_normal(R̲̲::Matrix{Int64}, n̲ₜ::Vector{Int64}, n_neutral::Int, n_bc::Int; kwargs...)

Defines a model to estimate fitness effects in a competitive fitness experiment across growth-dilution cycles.

Model summary

  • Prior on population mean fitness at time t, π(sₜ)

sₜ ~ Normal(params=s_pop_prior)

  • Prior on population log mean fitness associated error at time t, π(logσₜ)

logσₜ ~ Normal(params=logσ_pop_prior)

  • Prior on non-neutral relative fitness for barcoe m, π(s⁽ᵐ⁾)

s⁽ᵐ⁾ ~ Normal(params=s_bc_prior)

  • Prior on non-neutral log relative fitness associated error for barcode m π(logσ⁽ᵐ⁾)

logσ⁽ᵐ⁾ ~ Normal(params=logσ_bc_prior)

  • Prior on log Poisson distribtion parameters for barcode m at time t , π(logλₜᵣ⁽ᵐ⁾)

logλₜ⁽ᵐ⁾ ~ Normal(params=logλ_prior)

  • Probability of total number of barcodes read given the Poisson distribution parameters at time t π(nₜ | logλ̲ₜ)

nₜ ~ Poisson(∑ₘ exp(λₜ⁽ᵐ⁾))

  • Barcode j frequency at time t (deterministic relationship from the Poisson parameters)

fₜ⁽ʲ⁾ = λₜ⁽ʲ⁾ / ∑ₖ λₜ⁽ᵏ⁾

  • log frequency ratio for barcode j at time t (deterministic relationship from barcode frequencies)

logγₜ⁽ʲ⁾ = log(f₍ₜ₊₁₎⁽ʲ⁾ / fₜ⁽ʲ⁾)

  • Probability of number of reads at time t for all barcodes given the total number of reads and the barcode frequencies π(r̲ₜ | nₜ, f̲ₜ)

r̲ₜ ~ Multinomial(nₜ, f̲ₜ)

  • Probability of neutral barcodes frequency ratio for barcode n at time t π(logγₜ⁽ⁿ⁾| sₜ, σₜ)

logγₜ⁽ⁿ⁾ ~ Normal(μ = -sₜ, σ = exp(logσₜ))

  • Probability of non-neutral barcodes frequency ratio for barcode m at time t π(logγₜ⁽ᵐ⁾| s⁽ᵐ⁾, σ⁽ᵐ⁾, sₜ)

logγₜ⁽ᵐ⁾ ~ Normal(μ = s⁽ᵐ⁾ - sₜ, σ = exp(logσ⁽ᵐ⁾))

Arguments

  • R̲̲::Matrix{Int64}: T × B matrix–split into a vector of vectors for computational efficiency–where T is the number of time points in the data set and B is the number of barcodes. Each column represents the barcode count trajectory for a single lineage.
  • n̲ₜ::Vector{Int64}: Vector with the total number of barcode counts for each time point. NOTE: This vector must be equivalent to computing vec(sum(R̲̲, dims=2)).
  • n_neutral::Int: Number of neutral lineages in dataset.
  • n_bc::Int: Number of mutant lineages in dataset.

Optional Keyword Arguments

  • s_pop_prior::VecOrMat{Float64}=[0.0, 2.0]: Vector or Matrix with the corresponding parameters (Vector: s_pop_prior[1] = mean, s_pop_prior[2] = standard deviation, Matrix: s_pop_prior[:, 1] = mean, s_pop_prior[:, 2] = standard deviation) for a Normal prior on the population mean fitness values. If typeof(s_pop_prior) <: Matrix, there should be as many rows in the matrix as pairs of adjacent time points in dataset.
  • logσ_pop_prior::VecOrMat{Float64}=[0.0, 1.0]: Vector or Matrix with the corresponding parameters (Vector: logσ_pop_prior[1] = mean, logσ_pop_prior[2] = standard deviation, Matrix: σ_pop_prior[:, 1] = mean, logσ_pop_prior[:, 2] = standard deviation) for a Normal prior on the population mean fitness log-error utilized in the log-likelihood function. If typeof(logσ_pop_prior) <: Matrix, there should be as many rows in the matrix as pairs of adjacent time points in dataset.
  • s_bc_prior::VecOrMat{Float64}=[0.0, 2.0]: Vector or Matrix with the corresponding parameters (Vector: s_bc_prior[1] = mean, s_bc_prior[2] = standard deviation, Matrix: s_bc_prior[:, 1] = mean, s_bc_prior[:, 2] = standard deviation) for a Normal prior on the mutant fitness values. If typeof(s_bc_prior) <: Matrix, there should be as many rows in the matrix as mutant lineages × number of unique environments in the dataset.
  • logσ_bc_prior::VecOrMat{Float64}=[0.0, 1.0]: Vector or Matrix with the corresponding parameters (Vector: logσ_bc_prior[1] = mean, logσ_bc_prior[2] = standard deviation, Matrix: logσ_bc_prior[:, 1] = mean, logσ_bc_prior[:, 2] = standard deviation) for a Normal prior on the mutant fitness log-error utilized in the log-likelihood function. If typeof(logσ_bc_prior) <: Matrix, there should be as many rows in the matrix as mutant lineages × number of unique environments in the dataset.
  • logλ_prior::VecOrMat{Float64}=[3.0, 3.0]: Vector or Matrix with the corresponding parameters (Vector: logλ_prior[1] = mean, logλ_prior[2] = standard deviation, Matrix: logλ_prior[:, 1] = mean, logλ_prior[:, 2] = standard deviation) for a Normal prior on the log of the λ parameter in the Poisson distribution. The λ parameter can be interpreted as the mean number of barcode counts since we assume any barcode count n⁽ᵇ⁾ ~ Poisson(λ⁽ᵇ⁾). If typeof(λ_prior) <: Matrix, there should be as many rows in the matrix as number of barcodes × number of time points in the dataset.

Latent Variables

  • Population mean fitness per timepoint.
  • Mutant fitness effects.
  • λ dispersion parameters per barcode and timepoint.

Notes

  • Models fitness effects as normally distributed.
  • Utilizes a Poisson observation model for barcode counts.
  • Setting informative priors is recommended for stable convergence.

Single-dataset multi-environment

BarBay.model.multienv_fitness_normalFunction

multienv_fitness_normal(R̲̲::Matrix{Int64}, n̲ₜ::Vector{Int64}, n_neutral::Int, n_bc::Int; kwargs...)

Defines a model to estimate fitness effects in a competitive fitness experiment with different environments across growth-dilution cycles.

Model summary

  • Prior on population mean fitness at time t, π(sₜ)

sₜ ~ Normal(params=s_pop_prior)

  • Prior on population log mean fitness associated error at time t, π(logσₜ)

logσₜ ~ Normal(params=logσ_pop_prior)

  • Prior on non-neutral relative fitness for barcode m in environment i π(sᵢ⁽ᵐ⁾)

sᵢ⁽ᵐ⁾ ~ Normal(params=s_bc_prior)

  • Prior on non-neutral log relative fitness associated error for barcode m in environment i π(logσᵢ⁽ᵐ⁾)

logσᵢ⁽ᵐ⁾ ~ Normal(params=logσ_bc_prior)

  • Prior on log Poisson distribtion parameters for barcode m at time t , π(logλₜᵣ⁽ᵐ⁾)

logλₜ⁽ᵐ⁾ ~ Normal(params=logλ_prior)

  • Probability of total number of barcodes read given the Poisson distribution parameters at time t π(nₜ | logλ̲ₜ)

nₜ ~ Poisson(∑ₘ exp(λₜ⁽ᵐ⁾))

  • Barcode j frequency at time t (deterministic relationship from the Poisson parameters)

fₜ⁽ʲ⁾ = λₜ⁽ʲ⁾ / ∑ₖ λₜ⁽ᵏ⁾

  • log frequency ratio for barcode j at time t (deterministic relationship from barcode frequencies)

logγₜ⁽ʲ⁾ = log(f₍ₜ₊₁₎⁽ʲ⁾ / fₜ⁽ʲ⁾)

  • Probability of number of reads at time t for all barcodes given the total number of reads and the barcode frequencies π(r̲ₜ | nₜ, f̲ₜ)

r̲ₜ ~ Multinomial(nₜ, f̲ₜ)

  • Probability of neutral barcodes frequency ratio for barcode n at time t π(logγₜ⁽ⁿ⁾| sₜ, logσₜ)

logγₜ⁽ⁿ⁾ ~ Normal(μ = -sₜ, σ = exp(logσₜ))

  • Probability of non-neutral barcodes frequency ratios π(logγₜ⁽ᵐ⁾| s⁽ᵐ⁾, logσ⁽ᵐ⁾, sₜ). Note: This is done grouping by corresponding environment such that if time t is associated with environment i, sᵢ⁽ᵐ⁾ is used as the fitness value.

logγₜ⁽ᵐ⁾ ~ Normal(μ = sᵢ⁽ᵐ⁾ - sₜ, σ = exp(σ⁽ᵐ⁾))

Arguments

  • R̲̲::Matrix{Int64}:: T × B matrix–split into a vector of vectors for computational efficiency–where T is the number of time points in the data set and B is the number of barcodes. Each column represents the barcode count trajectory for a single lineage.
  • n̲ₜ::Vector{Int64}: Vector with the total number of barcode counts for each time point. NOTE: This vector must be equivalent to computing vec(sum(R̲̲, dims=2)).
  • n_neutral::Int: Number of neutral lineages in dataset.
  • n_bc::Int: Number of mutant lineages in dataset.

Keyword Arguments

  • envs::Vector{<:Any}: List of environments for each time point in dataset. NOTE: The length must be equal to that of n̲t to have one environment per time point.

Optional Keyword Arguments

  • s_pop_prior::VecOrMat{Float64}=[0.0, 2.0]: Vector or Matrix with the corresponding parameters (Vector: s_pop_prior[1] = mean, s_pop_prior[2] = standard deviation, Matrix: s_pop_prior[:, 1] = mean, s_pop_prior[:, 2] = standard deviation) for a Normal prior on the population mean fitness values. If typeof(s_pop_prior) <: Matrix, there should be as many rows in the matrix as pairs of adjacent time points in dataset.
  • logσ_pop_prior::VecOrMat{Float64}=[0.0, 1.0]: Vector or Matrix with the corresponding parameters (Vector: logσ_pop_prior[1] = mean, logσ_pop_prior[2] = standard deviation, Matrix: σ_pop_prior[:, 1] = mean, logσ_pop_prior[:, 2] = standard deviation) for a Normal prior on the population mean fitness log-error utilized in the log-likelihood function. If typeof(logσ_pop_prior) <: Matrix, there should be as many rows in the matrix as pairs of adjacent time points in dataset.
  • s_bc_prior::VecOrMat{Float64}=[0.0, 2.0]: Vector or Matrix with the corresponding parameters (Vector: s_bc_prior[1] = mean, s_bc_prior[2] = standard deviation, Matrix: s_bc_prior[:, 1] = mean, s_bc_prior[:, 2] = standard deviation) for a Normal prior on the mutant fitness values. If typeof(s_bc_prior) <: Matrix, there should be as many rows in the matrix as mutant lineages × number of unique environments in the dataset.
  • logσ_bc_prior::VecOrMat{Float64}=[0.0, 1.0]: Vector or Matrix with the corresponding parameters (Vector: logσ_bc_prior[1] = mean, logσ_bc_prior[2] = standard deviation, Matrix: logσ_bc_prior[:, 1] = mean, logσ_bc_prior[:, 2] = standard deviation) for a Normal prior on the mutant fitness log-error utilized in the log-likelihood function. If typeof(logσ_bc_prior) <: Matrix, there should be as many rows in the matrix as mutant lineages × number of unique environments in the dataset.
  • logλ_prior::VecOrMat{Float64}=[3.0, 3.0]: Vector or Matrix with the corresponding parameters (Vector: logλ_prior[1] = mean, logλ_prior[2] = standard deviation, Matrix: logλ_prior[:, 1] = mean, logλ_prior[:, 2] = standard deviation) for a Normal prior on the log of the λ parameter in the Poisson distribution. The λ parameter can be interpreted as the mean number of barcode counts since we assume any barcode count n⁽ᵇ⁾ ~ Poisson(λ⁽ᵇ⁾). If typeof(λ_prior) <: Matrix, there should be as many rows in the matrix as number of barcodes × number of time points in the dataset.

Latent Variables

  • Population mean fitness per timepoint
  • Mutant fitness effects per environment
  • λ dispersion parameters per barcode and timepoint

Notes

  • Models fitness effects as normally distributed.
  • Utilizes a Poisson observation model for barcode counts.
  • Can estimate time-varying and environment-specific fitness effects.
  • Setting informative priors is recommended for stable convergence.

Single-dataset hierarchical model on genotypes

BarBay.model.genotype_fitness_normalFunction

genotype_fitness_normal(R̲̲::Vector{Matrix{Int64}}, n̲ₜ::Vector{Vector{Int64}}, n_neutral::Int, n_bc::Int; kwargs...)

Defines a hierarchical model to estimate fitness effects in a competitive fitness experiment where multiple barcodes belong to a specific "genotype." This means that different barcodes are grouped together through a fitness hyperparameter where each barcode samples from the distribution of this hyperparameter.

Model summary

  • Prior on population mean fitness at time t, π(sₜ)

sₜ ~ Normal(params=s_pop_prior)

  • Prior on population log mean fitness associated error π(logσₜ)

logσₜ ~ Normal(params=logσ_pop_prior)

  • Prior on non-neutral relative hyper-fitness for genotype i π(θᵢ)

θᵢ ~ Normal(params=s_bc_prior)

  • prior on non-centered samples that allow local fitness to vary in the positive and negative direction for genotype i π(θ̃ᵢ⁽ᵐ⁾). Note, this is a standard normal with mean zero and standard deviation one.

θ̃ᵢ⁽ᵐ⁾ ~ Normal(μ = 0, σ = 1)

  • prior on log deviations of local fitness of barcode m from hyper-fitness for genotype i π(logτᵢ⁽ᵐ⁾)

logτᵢ⁽ᵐ⁾ ~ Normal(params=logτ_prior)

  • local relative fitness for non-neutral barcode m with genotype i (deterministic relationship from hyper-priors)

sᵢ⁽ᵐ⁾ = θᵢ + θ̃ᵢ⁽ᵐ⁾ * exp(logτᵢ⁽ᵐ⁾)

  • Prior on non-neutral log relative fitness associated error for non-neutral barcode m with genotype i, π(logσᵢ⁽ᵐ⁾)

logσᵢ⁽ᵐ⁾ ~ Normal(params=logσ_bc_prior)

  • Prior on log Poisson distribtion parameters for barcode m at time t in replicate r, π(logλₜᵣ⁽ᵐ⁾)

logλₜᵣ⁽ᵐ⁾ ~ Normal(params=logλ_prior)

  • Probability of total number of barcodes read given the Poisson distribution parameters π(nₜ | logλ̲ₜ)

nₜ ~ Poisson(∑ₖ exp(λₜ⁽ᵏ⁾))

  • Barcode j frequency at time t (deterministic relationship from the Poisson parameters)

fₜ⁽ʲ⁾ = λₜ⁽ʲ⁾ / ∑ₖ λₜ⁽ᵏ⁾

    • log frequency ratio for barcode j at time t (deterministic relationship from barcode frequencies)

logγₜ⁽ʲ⁾ = log(fₜ₊₁⁽ʲ⁾ / fₜ⁽ʲ⁾)

  • Probability of number of reads at time t for all barcodes given the total number of reads and the barcode frequencies π(r̲ₜ | nₜ, f̲ₜ)

r̲ₜ ~ Multinomial(nₜ, f̲ₜ)

  • Probability of neutral barcodes n frequency ratio at time t π(logγₜ⁽ⁿ⁾| sₜ, logσₜ)

logγₜ⁽ⁿ⁾ ~ Normal(μ = -sₜ, σ = exp(logσₜ))

  • Probability of non-neutral barcode m with genotype i frequency ratio at time t π(logγₜ⁽ᵐ⁾| sᵢ⁽ᵐ⁾, logσᵢ⁽ᵐ⁾, sₜ)

logγₜ⁽ᵐ⁾ ~ Normal(μ = sᵢ⁽ᵐ⁾ - sₜ, σ = exp(logσᵢ⁽ᵐ⁾))

Arguments

  • R̲̲::Vector{Matrix{Int64}}:: Length R vector wth T × B matrices where T is the number of time points in the data set, B is the number of barcodes, and R is the number of experimental replicates. For each matrix in the vector, each column represents the barcode count trajectory for a single lineage.
  • n̲ₜ::Vector{Vector{Int64}}: Vector of vectors with the total number of barcode counts for each time point on each replicate. NOTE: This vector must be equivalent to computing vec.(sum.(R̲̲, dims=2)).
  • n_neutral::Int: Number of neutral lineages in dataset.
  • n_bc::Int: Number of mutant lineages in dataset.

Keyword Arguments

  • genotypes::Vector{Vector{<:Any}}: Vector with the list of genotypes for each non-neutral barcode.

Optional Keyword Arguments

  • s_pop_prior::VecOrMat{Float64}=[0.0, 2.0]: Vector or Matrix with the corresponding parameters (Vector: s_pop_prior[1] = mean, s_pop_prior[2] = standard deviation, Matrix: s_pop_prior[:, 1] = mean, s_pop_prior[:, 2] = standard deviation) for a Normal prior on the population mean fitness values. If typeof(s_pop_prior) <: Matrix, there should be as many rows in the matrix as pairs of time adjacent time points in dataset.
  • logσ_pop_prior::VecOrMat{Float64}=[0.0, 1.0]: Vector or Matrix with the corresponding parameters (Vector: logσ_pop_prior[1] = mean, logσ_pop_prior[2] = standard deviation, Matrix: logσ_pop_prior[:, 1] = mean, logσ_pop_prior[:, 2] = standard deviation) for a Normal prior on the population mean fitness error utilized in the log-likelihood function. If typeof(logσ_pop_prior) <: Matrix, there should be as many rows in the matrix as pairs of time adjacent time points × number of replicates in dataset.
  • s_bc_prior::VecOrMat{Float64}=[0.0, 2.0]: Vector or Matrix with the corresponding parameters (Vector: s_bc_prior[1] = mean, s_bc_prior[2] = standard deviation, Matrix: s_bc_prior[:, 1] = mean, s_bc_prior[:, 2] = standard deviation) for a Normal prior on the mutant fitness values. If typeof(s_bc_prior) <: Matrix, there should be as many rows in the matrix as number of mutant lineages × number of replicates in the dataset.
  • logσ_bc_prior::VecOrMat{Float64}=[0.0, 1.0]: Vector or Matrix with the corresponding parameters (Vector: logσ_bc_prior[1] = mean, logσ_bc_prior[2] = standard deviation, Matrix: logσ_bc_prior[:, 1] = mean, logσ_bc_prior[:, 2] = standard deviation) for a Normal prior on the mutant fitness error utilized in the log-likelihood function. If typeof(logσ_bc_prior) <: Matrix, there should be as many rows in the matrix as mutant lineages × number of replicates in the dataset.
  • logλ_prior::VecOrMat{Float64}=[3.0, 3.0]: Vector or Matrix with the corresponding parameters (Vector: logλ_prior[1] = mean, logλ_prior[2] = standard deviation, Matrix: logλ_prior[:, 1] = mean, logλ_prior[:, 2] = standard deviation) for a Normal prior on the λ parameter in the Poisson distribution. The λ parameter can be interpreted as the mean number of barcode counts since we assume any barcode count n⁽ᵇ⁾ ~ Poisson(λ⁽ᵇ⁾). If typeof(logλ_prior) <: Matrix, there should be as many rows in the matrix as number of barcodes × number of time points × number of replicates in the dataset.

Latent Variables

  • Population mean fitness per timepoint.
  • Genotype hyper-fitness effects.
  • Non-centered samples for each of the non-neutral barcodes.
  • Deviations from the hyper parameter value for each non-neutral barcode.
  • λ dispersion parameters per barcode and timepoint.

Notes

  • Models hyper-fitness effects as normally distributed.
  • Models fitness effects as normally distributed.
  • Utilizes a Poisson observation model for barcode counts.
  • Setting informative priors is recommended for stable convergence.

Multi-replicate single-environment hierarchical model for experimental replicates

BarBay.model.replicate_fitness_normalFunction

replicate_fitness_normal(R̲̲::Array{Int64,3}, n̲ₜ::Matrix{Int64}, n_neutral::Int, n_bc::Int; kwargs...)

Defines a hierarchical model to estimate fitness effects in a competitive fitness experiment across growth-dilution cycles over multiple experimental replicates.

Model summary

  • Prior on population mean fitness at time t in replicate r, π(sₜᵣ)

sₜᵣ ~ Normal(params=s_pop_prior)

  • Prior on population log mean fitness associated error at time t in replicate r, π(logσₜ)

logσₜᵣ ~ Normal(params=logσ_pop_prior)

  • Prior on non-neutral relative hyper-fitness for barcode m π(θ⁽ᵐ⁾)

θ⁽ᵐ⁾ ~ Normal(params=s_bc_prior)

  • Prior on non-centered samples that allow local fitness to vary in the positive and negative direction for barcode m in experimental replicate r π(θ̃ᵣ⁽ᵐ⁾). Note, this is a standard normal with mean zero and standard deviation one.

θ̃ᵣ⁽ᵐ⁾ ~ Normal(μ = 0, σ = 1)

  • Prior on log deviations of local fitness from hyper-fitness for barcode m in replicate r π(logτᵣ⁽ᵐ⁾)

logτᵣ⁽ᵐ⁾ ~ Normal(params=logτ_prior)

  • Local relative fitness for non-neutral barcode m in replicate r (deterministic relationship from hyper-priors)

sᵣ⁽ᵐ⁾ = θ⁽ᵐ⁾ + θ̃ᵣ⁽ᵐ⁾ * exp(logτᵣ⁽ᵐ⁾)

  • Prior on non-neutral log relative fitness associated error for non-neutral barcode m in replcate r, π(logσᵣ⁽ᵐ⁾)

logσᵣ⁽ᵐ⁾ ~ Normal(params=logσ_bc_prior)

  • Prior on log Poisson distribtion parameters for barcode m at time t in replicate r, π(logλₜᵣ⁽ᵐ⁾)

logλₜᵣ⁽ᵐ⁾ ~ Normal(params=logλ_prior)

  • Probability of total number of barcodes read given the Poisson distribution parameters at time t in replicate r π(nₜᵣ | logλ̲ₜᵣ)

nₜᵣ ~ Poisson(∑ₘ exp(λₜᵣ⁽ᵐ⁾))

  • Barcode j frequency at time t in replciate r (deterministic relationship from the Poisson parameters)

fₜᵣ⁽ʲ⁾ = λₜᵣ⁽ʲ⁾ / ∑ₖ λₜᵣ⁽ᵏ⁾

  • Log frequency ratio for barcode j at time t in replicate r (deterministic relationship from barcode frequencies)

logγₜᵣ⁽ʲ⁾ = log(f₍ₜ₊₁₎ᵣ⁽ʲ⁾ / fₜᵣ⁽ʲ⁾)

  • Probability of number of reads at time t for all barcodes in replicate r given the total number of reads and the barcode frequencies π(r̲ₜᵣ | nₜᵣ, f̲ₜᵣ)

r̲ₜᵣ ~ Multinomial(nₜᵣ, f̲ₜᵣ)

  • Probability of neutral barcodes n frequency ratio at time t in replicate r, π(logγₜᵣ⁽ⁿ⁾| sₜᵣ, logσₜᵣ)

logγₜᵣ⁽ⁿ⁾ ~ Normal(μ = -sₜᵣ, σ = exp(logσₜᵣ))

  • Probability of non-neutral barcode m frequency ratio at time t in replicate r π(logγₜᵣ⁽ᵐ⁾| sᵣ⁽ᵐ⁾, logσᵣ⁽ᵐ⁾, sₜᵣ)

logγₜ⁽ᵐ⁾ ~ Normal(μ = sᵣ⁽ᵐ⁾ - sₜᵣ, σ = exp(logσᵣ⁽ᵐ⁾))

Arguments

  • R̲̲::Array{Int64, 3}:: T × B × R where T is the number of time points in the data set, B is the number of barcodes, and R is the number of experimental replicates. For each slice in the R axis, each column represents the barcode count trajectory for a single lineage.
  • n̲ₜ::Matrix{Int64}: Matrix with the total number of barcode counts for each time point on each replicate. NOTE: This matrix must be equivalent to computing vec(sum(R̲̲, dims=2)).
  • n_neutral::Int: Number of neutral lineages in dataset.
  • n_bc::Int: Number of mutant lineages in dataset.

Optional Keyword Arguments

  • s_pop_prior::VecOrMat{Float64}=[0.0, 2.0]: Vector or Matrix with the corresponding parameters (Vector: s_pop_prior[1] = mean, s_pop_prior[2] = standard deviation, Matrix: s_pop_prior[:, 1] = mean, s_pop_prior[:, 2] = standard deviation) for a Normal prior on the population mean fitness values. If typeof(s_pop_prior) <: Matrix, there should be as many rows in the matrix as pairs of time adjacent time points in dataset.
  • logσ_pop_prior::VecOrMat{Float64}=[0.0, 1.0]: Vector or Matrix with the corresponding parameters (Vector: logσ_pop_prior[1] = mean, logσ_pop_prior[2] = standard deviation, Matrix: logσ_pop_prior[:, 1] = mean, logσ_pop_prior[:, 2] = standard deviation) for a Normal prior on the population mean fitness error utilized in the log-likelihood function. If typeof(logσ_pop_prior) <: Matrix, there should be as many rows in the matrix as pairs of time adjacent time points × number of replicates in dataset.
  • s_bc_prior::VecOrMat{Float64}=[0.0, 2.0]: Vector or Matrix with the corresponding parameters (Vector: s_bc_prior[1] = mean, s_bc_prior[2] = standard deviation, Matrix: s_bc_prior[:, 1] = mean, s_bc_prior[:, 2] = standard deviation) for a Normal prior on the mutant fitness values. If typeof(s_bc_prior) <: Matrix, there should be as many rows in the matrix as number of mutant lineages × number of replicates in the dataset.
  • logσ_bc_prior::VecOrMat{Float64}=[0.0, 1.0]: Vector or Matrix with the corresponding parameters (Vector: s_bc_prior[1] = mean, s_bc_prior[2] = standard deviation, Matrix: s_bc_prior[:, 1] = mean, s_bc_prior[:, 2] = standard deviation) for a Normal prior on the mutant fitness error utilized in the log-likelihood function. If typeof(logσ_bc_prior) <: Matrix, there should be as many rows in the matrix as mutant lineages × number of replicates in the dataset.
  • logλ_prior::VecOrMat{Float64}=[3.0, 3.0]: Vector or Matrix with the corresponding parameters (Vector: logλ_prior[1] = mean, logλ_prior[2] = standard deviation, Matrix: logλ_prior[:, 1] = mean, logλ_prior[:, 2] = standard deviation) for a Normal prior on the λ parameter in the Poisson distribution. The λ parameter can be interpreted as the mean number of barcode counts since we assume any barcode count n⁽ᵇ⁾ ~ Poisson(λ⁽ᵇ⁾). If typeof(logλ_prior) <: Matrix, there should be as many rows in the matrix as number of barcodes × number of time points × number of replicates in the dataset.

Latent Variables

  • Population mean fitness per timepoint.
  • Mutant hyper-fitness effects.
  • Mutant fitness effects per experimental replicate.
  • λ dispersion parameters per barcode and timepoint.

Notes

  • Models hyper-fitness effects as normally distributed.
  • Models fitness effects as normally distributed.
  • Utilizes a Poisson observation model for barcode counts.
  • Setting informative priors is recommended for stable convergence.

replicatefitnessnormal(R̲̲::Vector{Matrix{Int64}}, n̲ₜ::Vector{Vector{Int64}}, nneutral::Int, nbc::Int; kwargs...)

Defines a hierarchical model to estimate fitness effects in a competitive fitness experiment across growth-dilution cycles over multiple experimental replicates.

Arguments

  • R̲̲::Vector{Matrix{Int64}}:: Length R vector wth T × B matrices where T is the number of time points in the data set, B is the number of barcodes, and R is the number of experimental replicates. For each matrix in the vector, each column represents the barcode count trajectory for a single lineage.
  • n̲ₜ::Vector{Vector{Int64}}: Vector of vectors with the total number of barcode counts for each time point on each replicate. NOTE: This vector must be equivalent to computing vec.(sum.(R̲̲, dims=2)).
  • n_neutral::Int: Number of neutral lineages in dataset.
  • n_bc::Int: Number of mutant lineages in dataset.

Optional Keyword Arguments

  • s_pop_prior::VecOrMat{Float64}=[0.0, 2.0]: Vector or Matrix with the corresponding parameters (Vector: s_pop_prior[1] = mean, s_pop_prior[2] = standard deviation, Matrix: s_pop_prior[:, 1] = mean, s_pop_prior[:, 2] = standard deviation) for a Normal prior on the population mean fitness values. If typeof(s_pop_prior) <: Matrix, there should be as many rows in the matrix as pairs of time adjacent time points in dataset.
  • logσ_pop_prior::VecOrMat{Float64}=[0.0, 1.0]: Vector or Matrix with the corresponding parameters (Vector: logσ_pop_prior[1] = mean, logσ_pop_prior[2] = standard deviation, Matrix: logσ_pop_prior[:, 1] = mean, logσ_pop_prior[:, 2] = standard deviation) for a Normal prior on the population mean fitness error utilized in the log-likelihood function. If typeof(logσ_pop_prior) <: Matrix, there should be as many rows in the matrix as pairs of time adjacent time points × number of replicates in dataset.
  • s_bc_prior::VecOrMat{Float64}=[0.0, 2.0]: Vector or Matrix with the corresponding parameters (Vector: s_bc_prior[1] = mean, s_bc_prior[2] = standard deviation, Matrix: s_bc_prior[:, 1] = mean, s_bc_prior[:, 2] = standard deviation) for a Normal prior on the mutant fitness values. If typeof(s_bc_prior) <: Matrix, there should be as many rows in the matrix as number of mutant lineages × number of replicates in the dataset.
  • logσ_bc_prior::VecOrMat{Float64}=[0.0, 1.0]: Vector or Matrix with the corresponding parameters (Vector: s_bc_prior[1] = mean, s_bc_prior[2] = standard deviation, Matrix: s_bc_prior[:, 1] = mean, s_bc_prior[:, 2] = standard deviation) for a Normal prior on the mutant fitness error utilized in the log-likelihood function. If typeof(logσ_bc_prior) <: Matrix, there should be as many rows in the matrix as mutant lineages × number of replicates in the dataset.
  • logλ_prior::VecOrMat{Float64}=[3.0, 3.0]: Vector or Matrix with the corresponding parameters (Vector: logλ_prior[1] = mean, logλ_prior[2] = standard deviation, Matrix: logλ_prior[:, 1] = mean, logλ_prior[:, 2] = standard deviation) for a Normal prior on the λ parameter in the Poisson distribution. The λ parameter can be interpreted as the mean number of barcode counts since we assume any barcode count n⁽ᵇ⁾ ~ Poisson(λ⁽ᵇ⁾). If typeof(logλ_prior) <: Matrix, there should be as many rows in the matrix as number of barcodes × number of time points × number of replicates in the dataset.

Latent Variables

  • Population mean fitness per timepoint.
  • Mutant hyper-fitness effects.
  • Mutant fitness effects per experimental replicate.
  • λ dispersion parameters per barcode and timepoint.

Notes

  • Models hyper-fitness effects as normally distributed.
  • Models fitness effects as normally distributed.
  • Utilizes a Poisson observation model for barcode counts.
  • Setting informative priors is recommended for stable convergence.

Multi-replicate multi-environment hierarchical model for experimental replicates

BarBay.model.multienv_replicate_fitness_normalFunction

multienv_replicate_fitness_normal(R̲̲::Matrix{Int64}, n̲ₜ::Vector{Int64}, n_neutral::Int, n_bc::Int; kwargs...)

Defines a hierarchical model to estimate fitness effects in a competitive fitness experiment with different environments across growth-dilution cycles over multiple experimental replicates.

Model summary

  • Prior on population mean fitness at time t in replicate r, π(sₜᵣ)

sₜᵣ ~ Normal(params=s_pop_prior)

  • Prior on population log mean fitness associated error at time t in replicate r, π(logσₜ)

logσₜᵣ ~ Normal(params=logσ_pop_prior)

  • Prior on non-neutral relative hyper-fitness for barcode m in environment i π(θᵢ⁽ᵐ⁾)

θᵢ⁽ᵐ⁾ ~ Normal(params=s_bc_prior)

  • Prior on non-centered samples that allow local fitness to vary in the positive and negative direction for barcode m in experimental replicate r in environment i π(θ̃ᵣᵢ⁽ᵐ⁾). Note, this is a standard normal with mean zero and standard deviation one.

θ̃ᵣᵢ⁽ᵐ⁾ ~ Normal(μ = 0, σ = 1)

  • Prior on log deviations of local fitness from hyper-fitness for barcode m in replicate r in environment i π(logτᵣᵢ⁽ᵐ⁾)

logτᵣᵢ⁽ᵐ⁾ ~ Normal(params=logτ_prior)

  • Local relative fitness for non-neutral barcode m in replicate r in environment i (deterministic relationship from hyper-priors)

sᵣᵢ⁽ᵐ⁾ = θᵢ⁽ᵐ⁾ + θ̃ᵣᵢ⁽ᵐ⁾ * exp(logτᵣᵢ⁽ᵐ⁾)

  • Prior on non-neutral log relative fitness associated error for non-neutral barcode m in replcate r in environment i, π(logσᵣᵢ⁽ᵐ⁾)

logσᵣᵢ⁽ᵐ⁾ ~ Normal(params=logσ_bc_prior)

  • Prior on log Poisson distribtion parameters for barcode m at time t in replicate r, π(logλₜᵣ⁽ᵐ⁾)

logλₜᵣ⁽ᵐ⁾ ~ Normal(params=logλ_prior)

  • Probability of total number of barcodes read given the Poisson distribution parameters at time t in replicate r π(nₜᵣ | logλ̲ₜᵣ)

nₜᵣ ~ Poisson(∑ₘ exp(λₜᵣ⁽ᵐ⁾))

  • Barcode j frequency at time t in replicate r (deterministic relationship from the Poisson parameters)

fₜᵣ⁽ʲ⁾ = λₜᵣ⁽ʲ⁾ / ∑ₖ λₜᵣ⁽ᵏ⁾

  • Log frequency ratio for barcode j at time t in replicate r (deterministic relationship from barcode frequencies)

logγₜᵣ⁽ʲ⁾ = log(f₍ₜ₊₁₎ᵣ⁽ʲ⁾ / fₜᵣ⁽ʲ⁾)

  • Probability of number of reads at time t for all barcodes in replicate r given the total number of reads and the barcode frequencies π(r̲ₜᵣ | nₜᵣ, f̲ₜᵣ)

r̲ₜᵣ ~ Multinomial(nₜᵣ, f̲ₜᵣ)

  • Probability of neutral barcodes n frequency ratio at time t in replicate r, π(logγₜᵣ⁽ⁿ⁾| sₜᵣ, logσₜᵣ)

logγₜᵣ⁽ⁿ⁾ ~ Normal(μ = -sₜᵣ, σ = exp(logσₜᵣ))

  • Probability of non-neutral barcodes frequency ratio for barcode m in replicate r π(logγₜᵣ⁽ᵐ⁾| sᵣ⁽ᵐ⁾, logσᵣ⁽ᵐ⁾, sₜᵣ). Note: This is done grouping by corresponding environment such that if time t is associated with environment i, sᵣᵢ⁽ᵐ⁾ is used as the fitness value.

logγₜᵣ⁽ᵐ⁾ ~ Normal(μ = sᵣᵢ⁽ᵐ⁾ - sₜᵣ, σ = exp(logσᵣᵢ⁽ᵐ⁾))

Arguments

  • R̲̲::Array{Int64, 3}:: T × B × R where T is the number of time points in the data set, B is the number of barcodes, and R is the number of experimental replicates. For each slice in the R axis, each column represents the barcode count trajectory for a single lineage.
  • n̲ₜ::Matrix{Int64}: Matrix with the total number of barcode counts for each time point on each replicate. NOTE: This matrix must be equivalent to computing vec(sum(R̲̲, dims=2)).
  • n_neutral::Int: Number of neutral lineages in dataset.
  • n_bc::Int: Number of mutant lineages in dataset.

Keyword Arguments

  • envs::Vector{<:Any}: List of environments for each time point in dataset. NOTE: The length must be equal to that of the number of rows in n̲t to have one environment per time point.

Optional Keyword Arguments

  • s_pop_prior::VecOrMat{Float64}=[0.0, 2.0]: Vector or Matrix with the corresponding parameters (Vector: s_pop_prior[1] = mean, s_pop_prior[2] = standard deviation, Matrix: s_pop_prior[:, 1] = mean, s_pop_prior[:, 2] = standard deviation) for a Normal prior on the population mean fitness values. If typeof(s_pop_prior) <: Matrix, there should be as many rows in the matrix as pairs of time adjacent time points in dataset.
  • logσ_pop_prior::VecOrMat{Float64}=[0.0, 1.0]: Vector or Matrix with the corresponding parameters (Vector: logσ_pop_prior[1] = mean, logσ_pop_prior[2] = standard deviation, Matrix: logσ_pop_prior[:, 1] = mean, logσ_pop_prior[:, 2] = standard deviation) for a Normal prior on the population mean fitness error utilized in the log-likelihood function. If typeof(logσ_pop_prior) <: Matrix, there should be as many rows in the matrix as pairs of time adjacent time points × number of replicates in dataset.
  • s_bc_prior::VecOrMat{Float64}=[0.0, 2.0]: Vector or Matrix with the corresponding parameters (Vector: s_bc_prior[1] = mean, s_bc_prior[2] = standard deviation, Matrix: s_bc_prior[:, 1] = mean, s_bc_prior[:, 2] = standard deviation) for a Normal prior on the mutant fitness values. If typeof(s_bc_prior) <: Matrix, there should be as many rows in the matrix as number of mutant lineages × number of replicates in the dataset.
  • logσ_bc_prior::VecOrMat{Float64}=[0.0, 1.0]: Vector or Matrix with the corresponding parameters (Vector: logσ_bc_prior[1] = mean, logσ_bc_prior[2] = standard deviation, Matrix: logσ_bc_prior[:, 1] = mean, logσ_bc_prior[:, 2] = standard deviation) for a Normal prior on the mutant fitness error utilized in the log-likelihood function. If typeof(logσ_bc_prior) <: Matrix, there should be as many rows in the matrix as mutant lineages × number of replicates in the dataset.
  • logλ_prior::VecOrMat{Float64}=[3.0, 3.0]: Vector or Matrix with the corresponding parameters (Vector: logλ_prior[1] = mean, logλ_prior[2] = standard deviation, Matrix: logλ_prior[:, 1] = mean, logλ_prior[:, 2] = standard deviation) for a Normal prior on the λ parameter in the Poisson distribution. The λ parameter can be interpreted as the mean number of barcode counts since we assume any barcode count n⁽ᵇ⁾ ~ Poisson(λ⁽ᵇ⁾). If typeof(logλ_prior) <: Matrix, there should be as many rows in the matrix as number of barcodes × number of time points × number of replicates in the dataset.

Latent Variables

  • Population mean fitness per timepoint.
  • Mutant hyper-fitness effects per environment.
  • Non-centered samples for each of the experimental replicates.
  • Deviations from the hyper parameter value for each experimental replicate.
  • λ dispersion parameters per barcode and timepoint.

Notes

  • All barcodes must be included in all replicates.
  • Models hyper-fitness effects as normally distributed.
  • Models fitness effects as normally distributed.
  • Utilizes a Poisson observation model for barcode counts.
  • Can estimate time-varying and environment-specific fitness effects.
  • Setting informative priors is recommended for stable convergence.

multienvreplicatefitnessnormal(R̲̲::Vector{Matrix{Int64}}, n̲ₜ::Vector{Vector{Int64}}, nneutral::Int, n_bc::Int; kwargs...)

Defines a hierarchical model to estimate fitness effects in a competitive fitness experiment with different environments across growth-dilution cycles over multiple experimental replicates.

Arguments

  • R̲̲::Vector{Matrix{Int64}}:: Length R vector wth T × B matrices where T is the number of time points in the data set, B is the number of barcodes, and R is the number of experimental replicates. For each matrix in the vector, each column represents the barcode count trajectory for a single lineage.
  • n̲ₜ::Vector{Vector{Int64}}: Vector of vectors with the total number of barcode counts for each time point on each replicate. NOTE: This vector must be equivalent to computing vec.(sum.(R̲̲, dims=2)).
  • n_neutral::Int: Number of neutral lineages in dataset.
  • n_bc::Int: Number of mutant lineages in dataset.

Keyword Arguments

  • envs::Vector{Vector{<:Any}}: Length R vector with the list of environments for each time point in each replicate. NOTE: The length must be equal to that of the number of rows in n̲ₜ to have one environment per time point.

Optional Keyword Arguments

  • s_pop_prior::VecOrMat{Float64}=[0.0, 2.0]: Vector or Matrix with the corresponding parameters (Vector: s_pop_prior[1] = mean, s_pop_prior[2] = standard deviation, Matrix: s_pop_prior[:, 1] = mean, s_pop_prior[:, 2] = standard deviation) for a Normal prior on the population mean fitness values. If typeof(s_pop_prior) <: Matrix, there should be as many rows in the matrix as pairs of time adjacent time points in dataset.
  • logσ_pop_prior::VecOrMat{Float64}=[0.0, 1.0]: Vector or Matrix with the corresponding parameters (Vector: logσ_pop_prior[1] = mean, logσ_pop_prior[2] = standard deviation, Matrix: logσ_pop_prior[:, 1] = mean, logσ_pop_prior[:, 2] = standard deviation) for a Normal prior on the population mean fitness error utilized in the log-likelihood function. If typeof(logσ_pop_prior) <: Matrix, there should be as many rows in the matrix as pairs of time adjacent time points × number of replicates in dataset.
  • s_bc_prior::VecOrMat{Float64}=[0.0, 2.0]: Vector or Matrix with the corresponding parameters (Vector: s_bc_prior[1] = mean, s_bc_prior[2] = standard deviation, Matrix: s_bc_prior[:, 1] = mean, s_bc_prior[:, 2] = standard deviation) for a Normal prior on the mutant fitness values. If typeof(s_bc_prior) <: Matrix, there should be as many rows in the matrix as number of mutant lineages × number of replicates in the dataset.
  • logσ_bc_prior::VecOrMat{Float64}=[0.0, 1.0]: Vector or Matrix with the corresponding parameters (Vector: logσ_bc_prior[1] = mean, logσ_bc_prior[2] = standard deviation, Matrix: logσ_bc_prior[:, 1] = mean, logσ_bc_prior[:, 2] = standard deviation) for a Normal prior on the mutant fitness error utilized in the log-likelihood function. If typeof(logσ_bc_prior) <: Matrix, there should be as many rows in the matrix as mutant lineages × number of replicates in the dataset.
  • logλ_prior::VecOrMat{Float64}=[3.0, 3.0]: Vector or Matrix with the corresponding parameters (Vector: logλ_prior[1] = mean, logλ_prior[2] = standard deviation, Matrix: logλ_prior[:, 1] = mean, logλ_prior[:, 2] = standard deviation) for a Normal prior on the λ parameter in the Poisson distribution. The λ parameter can be interpreted as the mean number of barcode counts since we assume any barcode count n⁽ᵇ⁾ ~ Poisson(λ⁽ᵇ⁾). If typeof(logλ_prior) <: Matrix, there should be as many rows in the matrix as number of barcodes × number of time points × number of replicates in the dataset.

Latent Variables

  • Population mean fitness per timepoint.
  • Mutant hyper-fitness effects per environment.
  • Non-centered samples for each of the experimental replicates.
  • Deviations from the hyper parameter value for each experimental replicate.
  • λ dispersion parameters per barcode and timepoint.

Notes

  • All barcodes must be included in all replicates.
  • All environments must appear at least once on each replicate.
  • Models hyper-fitness effects as normally distributed.
  • Models fitness effects as normally distributed.
  • Utilizes a Poisson observation model for barcode counts.
  • Can estimate time-varying and environment-specific fitness effects.
  • Setting informative priors is recommended for stable convergence.