AnovaBase.jl

Models

AnovaBase.AnovaModelType
abstract type AnovaModel{M, N} end

An abstract supertype of all models for ANOVA.

AnovaBase.FullModelType
FullModel{M, N} <: AnovaModel{M, N}

A wrapper of a regression model for conducting ANOVA.

  • M is a type of regression model.
  • N is the number of predictors.

Fields

  • model: a regression model.
  • pred_id: the indices of the terms included in ANOVA. The source iterable can be obtained by predictors(model). This value may depend on type for certain models, e.g. type 1 ANOVA for a gamma regression model with inverse link.
  • type: type of ANOVA, either 1, 2 or 3.

Constructor

FullModel(model::RegressionModel, type::Int, null::Bool, test_intercept::Bool)
  • model: a regression model.
  • type: type of ANOVA, either 1, 2 or 3.
  • null: whether y ~ 0 is allowed.
  • test_intercept: whether intercept is going to be tested.
AnovaBase.NestedModelsType
NestedModels{M, N} <: AnovaModel{M, N}

A wrapper of nested models of the same type for conducting ANOVA.

  • M is a type of regression model.
  • N is the number of models.

Fields

  • model: a tuple of models.

Constructors

NestedModels(model::Vararg{M, N}) where {M, N}
NestedModels(model::NTuple{N, M}) where {M, N}
AnovaBase.MixedAovModelsType
MixedAovModels{M, N} <: AnovaModel{M, N}

A wrapper of nested models of multiple types for conducting ANOVA.

  • M is a union type of regression models.
  • N is the number of models.

Fields

  • model: a tuple of models.

Constructors

MixedAovModels{M}(model...) where M 
MixedAovModels{M}(model::T) where {M, T <: Tuple}
AnovaBase.MultiAovModelsType
const MultiAovModels{M, N} = Union{NestedModels{M, N}, MixedAovModels{M, N}} where {M, N}

Wrappers of multiple models.
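
The relationship between the wrapper types can be illustrated with a minimal sketch. The `Sketch*` types and `nmodels` below are hypothetical stand-ins that mirror the documented signatures; they are not AnovaBase's source. The point is only that a method written against the union alias accepts both wrappers and can recover `N` from the type parameters.

```julia
# Hypothetical stand-ins mirroring the documented hierarchy (not AnovaBase's code).
abstract type SketchAnovaModel{M, N} end

# Nested models that all share one model type M.
struct SketchNested{M, N} <: SketchAnovaModel{M, N}
    model::NTuple{N, M}
end
SketchNested(model::Vararg{M, N}) where {M, N} = SketchNested{M, N}(model)

# Nested models of several types; M is given explicitly, typically as a Union.
struct SketchMixed{M, N} <: SketchAnovaModel{M, N}
    model::Tuple
end
SketchMixed{M}(model...) where M = SketchMixed{M, length(model)}(model)

# Union alias: one method signature covers both wrappers.
const SketchMulti{M, N} = Union{SketchNested{M, N}, SketchMixed{M, N}}

# Number of wrapped models, read off the type parameter N.
nmodels(::SketchMulti{M, N}) where {M, N} = N

nested = SketchNested(1.0, 2.0, 3.0)              # Float64 values stand in for fitted models
mixed = SketchMixed{Union{Int, Float64}}(1, 2.0)
```

Here `nmodels(nested)` and `nmodels(mixed)` go through the same method, which is the convenience a union alias such as MultiAovModels provides.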

AnovaBase.nestedmodelsMethod
nestedmodels(<model>; <keyword arguments>)
nestedmodels(<model type>, formula, data; <keyword arguments>)

Create nested models (NestedModels) from a model, or from a model type, formula and data.

ANOVA

AnovaBase.AnovaResultType
AnovaResult{M, T, N}

Object returned by anova.

  • M is a subtype of AnovaModel: NestedModels, MixedAovModels, or FullModel.
  • T is a subtype of GoodnessOfFit; either FTest or LRT.
  • N is the length of parameters.

Fields

  • anovamodel: NestedModels, MixedAovModels, or FullModel.
  • dof: degrees of freedom of models or predictors.
  • deviance: deviance(s) for calculating test statistics. See deviance for more details.
  • teststat: value(s) of test statistics.
  • pval: p-value(s) of test statistics.
  • otherstat: a NamedTuple containing extra statistics.

Constructor

AnovaResult(
        anovamodel::M,
        ::Type{T},
        dof::NTuple{N, Int},
        deviance::NTuple{N, Float64},
        teststat::NTuple{N, Float64},
        pval::NTuple{N, Float64},
        otherstat::NamedTuple
) where {N, M <: AnovaModel{<: RegressionModel, N}, T <: GoodnessOfFit}
AnovaBase.anovaMethod
anova(Test::Type{<: GoodnessOfFit}, <anovamodel>; <keyword arguments>)
anova(<models>...; test::Type{<: GoodnessOfFit}, <keyword arguments>)
anova(Test::Type{<: GoodnessOfFit}, <model>; <keyword arguments>)
anova(Test::Type{<: GoodnessOfFit}, <models>...; <keyword arguments>)

Analysis of variance.

Return AnovaResult{M, Test, N}. See AnovaResult for details.

  • anovamodel: an AnovaModel.
  • models: RegressionModel(s). If multiple models are provided, they should be nested and fitted with the same data, and the last one should be the most complex.
  • Test: test statistic for goodness of fit. Available tests are LikelihoodRatioTest (LRT) and FTest.

Attributes

AnovaBase.anova_typeMethod
anova_type(aov::AnovaResult)
anova_type(model::MultiAovModels)
anova_type(model::FullModel)

Type of anova, either 1, 2 or 3.

StatsAPI.devianceMethod
deviance(aov::AnovaResult)

Return the stored deviance. The value represents different statistics for different models and tests. It may be deviance, Δdeviance, -2loglikelihood or other measures of model performance.

StatsAPI.dofMethod
dof(aov::AnovaResult)

Degrees of freedom of each model or predictor.

StatsAPI.nobsMethod
nobs(aov::AnovaResult)
nobs(aov::AnovaResult{<: MultiAovModels})

Number of observations.

Goodness of fit

AnovaBase.FTestType
struct FTest <: GoodnessOfFit end

Type indicating that ANOVA is conducted by F-test. It can be passed as the first argument or as the keyword argument test.

AnovaBase.LikelihoodRatioTestType
struct LikelihoodRatioTest <: GoodnessOfFit end
const LRT = LikelihoodRatioTest

Type indicating that ANOVA is conducted by likelihood-ratio test. It can be passed as the first argument or as the keyword argument test.

AnovaBase.canonicalgoodnessoffitFunction
canonicalgoodnessoffit(::FixDispDist) = LRT
canonicalgoodnessoffit(::UnivariateDistribution) = FTest

const FixDispDist = Union{Bernoulli, Binomial, Poisson}

Return LRT if the distribution has a fixed dispersion; otherwise, FTest.
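
The rule above is a plain two-method dispatch. The sketch below reproduces the pattern with hypothetical stand-in types (`SketchPoisson`, `SketchNormal`, and so on) so that it runs without any packages; the documented function dispatches on distribution types such as Bernoulli, Binomial and Poisson instead.

```julia
# Hypothetical stand-ins for distribution and test types (not the real ones).
abstract type SketchDistribution end
struct SketchPoisson <: SketchDistribution end    # dispersion is fixed
struct SketchBinomial <: SketchDistribution end   # dispersion is fixed
struct SketchNormal <: SketchDistribution end     # dispersion is estimated

struct SketchFTest end
struct SketchLRT end

# Distributions whose dispersion is fixed, analogous to FixDispDist.
const SketchFixDisp = Union{SketchPoisson, SketchBinomial}

# The more specific method wins for fixed-dispersion distributions.
sketch_canonical(::SketchFixDisp) = SketchLRT
sketch_canonical(::SketchDistribution) = SketchFTest
```

With these definitions, `sketch_canonical(SketchPoisson())` selects the likelihood-ratio test, while `sketch_canonical(SketchNormal())` falls through to the F-test.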

Other interface

AnovaBase.ftest_nestedFunction
ftest_nested(models::MultiAovModels{M, N}, df, dfr, dev, σ²) where {M <: RegressionModel, N}

Calculate F-statistics and p-values based on given parameters.

  • models: nested models
  • df: degrees of freedom of each model
  • dfr: degrees of freedom of residuals of each model
  • dev: deviance of each model, i.e. unit deviance
  • σ²: squared dispersion of each model

The F-statistic is (devᵢ - devᵢ₋₁) / (dfᵢ₋₁ - dfᵢ) / σ² for the ith predictor.
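
As a worked example of the formula above, the following sketch computes only the statistic, not the p-value. The name `fstat_sketch` and the input numbers are made up for illustration, and a single scalar σ² is assumed for simplicity.

```julia
# Sketch of the documented formula; `fstat_sketch` is a hypothetical helper.
# df and dev hold one entry per nested model, simplest model first.
# Assumption: σ² is one scalar dispersion estimate shared by all comparisons.
function fstat_sketch(df::NTuple{N, Int}, dev::NTuple{N, Float64}, σ²::Float64) where N
    ntuple(N - 1) do k
        i = k + 1  # compare model i with model i - 1
        (dev[i] - dev[i - 1]) / (df[i - 1] - df[i]) / σ²
    end
end

# Made-up numbers: deviance drops as degrees of freedom grow.
fstats = fstat_sketch((2, 3, 5), (100.0, 60.0, 50.0), 2.0)
# (60 - 100) / (2 - 3) / 2 = 20.0 and (50 - 60) / (3 - 5) / 2 = 2.5
```

Both the numerator and the denominator are negative when a more complex model lowers the deviance, so the resulting statistic is positive.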

AnovaBase.lrt_nestedFunction
lrt_nested(models::MultiAovModels{M, N}, df, dev, σ²) where {M <: RegressionModel, N}

Calculate likelihood ratio and p-values based on given parameters.

  • models: nested models
  • df: degrees of freedom of each model
  • dev: deviance of each model, i.e. unit deviance
  • σ²: squared dispersion of each model

The likelihood ratio of the ith predictor is LRᵢ = (devᵢ - devᵢ₋₁) / σ².

If dev is alternatively -2loglikelihood, σ² should be set to 1.

StatsAPI.dof_residualMethod
dof_residual(aov::AnovaResult)    
dof_residual(aov::AnovaResult{<: MultiAovModels})

Degrees of freedom of residuals.

By default, it applies dof_residual to models in aov.anovamodel.

AnovaBase.predictorsMethod
predictors(model::RegressionModel)
predictors(anovamodel::FullModel)

Return a tuple of Terms which are predictors of the model or anovamodel.

By default, it returns formula(model).rhs.terms; if the formula has special structures, this function should be overloaded.

AnovaBase.anovatableMethod
anovatable(aov::AnovaResult{<: FullModel, Test}; rownames = prednames(aov))
anovatable(aov::AnovaResult{<: MultiAovModels, Test}; rownames = string.(1:N))
anovatable(aov::AnovaResult{<: MultiAovModels, FTest, N}; rownames = string.(1:N)) where N
anovatable(aov::AnovaResult{<: MultiAovModels, LRT, N}; rownames = string.(1:N)) where N

Return a table with coefficients and related statistics of ANOVA.

When displaying aov in the REPL, rownames will be prednames(aov) for FullModel and string.(1:N) for MultiAovModels.

For MultiAovModels, there are two default methods for FTest and LRT; one can also define new methods dispatching on ::AnovaResult{NestedModels{M}} or ::AnovaResult{MixedAovModels{M}} where M is a model type.

For FullModel, no default method is implemented.

The returned AnovaTable object implements the Tables.jl interface, and can be converted e.g. to a DataFrame via using DataFrames; DataFrame(anovatable(aov)).

Developer utility

AnovaBase.dof_asgnFunction
dof_asgn(v::Vector{Int})

Calculate the degrees of freedom of each predictor. The argument assign can be obtained by StatsModels.asgn(f::FormulaTerm). For a given trm::RegressionModel, it is the same as trm.mm.assign.

The index of the output matches the values in the original assign. If any index value is not in assign, the default is 0.

Examples

julia> dof_asgn([1, 2, 2, 3, 3, 3])
3-element Vector{Int64}:
 1
 2
 3

julia> dof_asgn([2, 2, 3, 3, 3])
3-element Vector{Int64}:
 0
 2
 3
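
The two examples above are consistent with simply counting how often each index occurs in assign. A minimal sketch of that reading, under the assumption that assign is non-empty and contains only positive integers (`dof_asgn_sketch` is a hypothetical name, not the package's implementation):

```julia
# Count occurrences of each term index; indices that never occur keep the default 0.
# Assumption: `assign` is non-empty and all entries are positive.
function dof_asgn_sketch(assign::Vector{Int})
    dofs = zeros(Int, maximum(assign))
    for id in assign
        dofs[id] += 1
    end
    return dofs
end

a = dof_asgn_sketch([1, 2, 2, 3, 3, 3])
b = dof_asgn_sketch([2, 2, 3, 3, 3])
```

In the second call index 1 is absent from the input, so its slot stays at 0, matching the documented output.
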
AnovaBase.prednamesFunction
prednames(<term>)

Return the name(s) of predictor(s). Return value is either a String, an iterable of Strings or nothing.

Examples

julia> iris = dataset("datasets", "iris");

julia> f = formula(lm(@formula(log(SepalLength) ~ SepalWidth + PetalLength * PetalWidth), iris))
FormulaTerm
Response:
  (SepalLength)->log(SepalLength)
Predictors:
  1
  SepalWidth(continuous)
  PetalLength(continuous)
  PetalWidth(continuous)
  PetalLength(continuous) & PetalWidth(continuous)

julia> prednames(f)
["(Intercept)", "SepalWidth", "PetalLength", "PetalWidth", "PetalLength & PetalWidth"]

julia> prednames(InterceptTerm{false}())

prednames(aov::AnovaResult)
prednames(anovamodel::FullModel) 
prednames(anovamodel::MultiAovModels)
prednames(<model>)

Return the names of predictors as a vector of strings. When there are multiple models, the return value is nothing.

AnovaBase.any_not_aliased_with_1Function
any_not_aliased_with_1(<terms>)

Return true if there are any terms not aliased with the intercept, e.g. ContinuousTerm or FunctionTerm.

Terms without schema are considered aliased with the intercept.

AnovaBase.gettermsFunction
getterms(<term>)

Return the symbol of term(s) as a vector of Expr or Symbol.

Examples

julia> iris = dataset("datasets", "iris");

julia> f = formula(lm(@formula(log(SepalLength) ~ Species + PetalLength * PetalWidth), iris))
FormulaTerm
Response:
  (SepalLength)->log(SepalLength)
Predictors:
  1
  Species(DummyCoding:3→2)
  PetalLength(continuous)
  PetalWidth(continuous)
  PetalLength(continuous) & PetalWidth(continuous)

julia> getterms(f)
(Expr[:(log(SepalLength))], [:Species, :PetalLength, :PetalWidth])

julia> getterms(InterceptTerm{true}())
Symbol[]
AnovaBase.isinteractFunction
isinteract(m::MatrixTerm, id1::Int, id2::Int)
isinteract(f::TupleTerm, id1::Int, id2::Int)

Determine if f[id2] is an interaction term of f[id1] and other terms.

Examples

julia> iris = dataset("datasets", "iris");

julia> f = formula(lm(@formula(log(SepalLength) ~ Species + PetalLength * PetalWidth), iris))
FormulaTerm
Response:
  (SepalLength)->log(SepalLength)
Predictors:
  1
  Species(DummyCoding:3→2)
  PetalLength(continuous)
  PetalWidth(continuous)
  PetalLength(continuous) & PetalWidth(continuous)

julia> isinteract(f.rhs, 1, 2)
true

julia> isinteract(f.rhs, 3, 4)
false

julia> isinteract(f.rhs, 4, 5)
true
AnovaBase.select_super_interactionFunction
select_super_interaction(m::MatrixTerm, id::Int)
select_super_interaction(f::TupleTerm, id::Int)
select_sub_interaction(m::MatrixTerm, id::Int)
select_sub_interaction(f::TupleTerm, id::Int)
select_not_super_interaction(m::MatrixTerm, id::Int)
select_not_super_interaction(f::TupleTerm, id::Int)
select_not_sub_interaction(m::MatrixTerm, id::Int)
select_not_sub_interaction(f::TupleTerm, id::Int)

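Return a set of indices of f. In the order of the signatures above:

  1. select_super_interaction: the returned terms are interaction terms of f[id] and other terms.

  2. select_sub_interaction: f[id] is an interaction term of the returned terms and other terms.

  3. select_not_super_interaction: the returned terms are not interaction terms of f[id] and other terms.

  4. select_not_sub_interaction: f[id] is not an interaction term of the returned terms and other terms.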

Examples

julia> iris = dataset("datasets", "iris");

julia> f = formula(lm(@formula(log(SepalLength) ~ Species + PetalLength * PetalWidth), iris))
FormulaTerm
Response:
  (SepalLength)->log(SepalLength)
Predictors:
  1
  Species(DummyCoding:3→2)
  PetalLength(continuous)
  PetalWidth(continuous)
  PetalLength(continuous) & PetalWidth(continuous)

julia> select_super_interaction(f.rhs, 3)
Set{Int64} with 2 elements:
  5
  3

julia> select_sub_interaction(f.rhs, 3)
Set{Int64} with 2 elements:
  3
  1

julia> select_not_super_interaction(f.rhs, 3)
Set{Int64} with 3 elements:
  4
  2
  1

julia> select_not_sub_interaction(f.rhs, 3)
Set{Int64} with 3 elements:
  5
  4
  2
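
One way to read these results is as subset relations between the variables each term involves. The sketch below is a mental model only, not the package's implementation: it assumes each term can be represented by its set of variables, with the intercept as the empty set, and all `*_sketch` names are hypothetical. Under that assumption it reproduces the outputs shown for isinteract and the four select_* functions.

```julia
# Each term is modelled as the set of variables it involves (assumption).
terms = [
    Set{Symbol}(),                      # 1: intercept
    Set([:Species]),                    # 2: Species
    Set([:PetalLength]),                # 3: PetalLength
    Set([:PetalWidth]),                 # 4: PetalWidth
    Set([:PetalLength, :PetalWidth]),   # 5: PetalLength & PetalWidth
]

# terms[id2] counts as an interaction of terms[id1] when it contains all of its variables.
isinteract_sketch(terms, id1, id2) = terms[id1] ⊆ terms[id2]

# Indices of terms that contain terms[id], and of terms contained in terms[id].
super_sketch(terms, id) = Set(j for j in eachindex(terms) if isinteract_sketch(terms, id, j))
sub_sketch(terms, id) = Set(j for j in eachindex(terms) if isinteract_sketch(terms, j, id))

# Complements of the two sets above.
not_super_sketch(terms, id) = setdiff(Set(eachindex(terms)), super_sketch(terms, id))
not_sub_sketch(terms, id) = setdiff(Set(eachindex(terms)), sub_sketch(terms, id))
```

Note that under this reading a term is both a super- and a sub-interaction of itself, which is why index 3 appears in the first two documented results.
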
AnovaBase.subformulaFunction
subformula(f::FormulaTerm, id; kwargs...)
subformula(lhs::AbstractTerm, rhs::MatrixTerm, id::Int; reschema::Bool = false)
subformula(lhs::AbstractTerm, rhs::MatrixTerm, id; reschema::Bool = false)
subformula(lhs::AbstractTerm, rhs::NTuple{N, AbstractTerm}, id::Int; rhs_id::Int = 1, reschema::Bool = false)

Create a formula from the existing lhs and rhs (or rhs[rhs_id]), either truncated to 1:id when id is an integer, or with the terms in the collection id excluded. When id is 0, all terms in rhs (or rhs[rhs_id]) will be removed.

If reschema is true, all terms' schema will be removed.

Examples

julia> iris = dataset("datasets", "iris");

julia> f = formula(lm(@formula(log(SepalLength) ~ Species + PetalLength * PetalWidth), iris))
FormulaTerm
Response:
  (SepalLength)->log(SepalLength)
Predictors:
  1
  Species(DummyCoding:3→2)
  PetalLength(continuous)
  PetalWidth(continuous)
  PetalLength(continuous) & PetalWidth(continuous)

julia> subformula(f, 2)
FormulaTerm
Response:
  (SepalLength)->log(SepalLength)
Predictors:
  1
  Species(DummyCoding:3→2)

julia> subformula(f, [3, 5]; reschema = true)
FormulaTerm
Response:
  (SepalLength)->log(SepalLength)
Predictors:
  1
  Species(DummyCoding:3→2)
  PetalWidth(unknown)

julia> f = formula(fit(LinearMixedModel, @formula(SepalLength ~ SepalWidth + (SepalWidth|Species)), iris))
FormulaTerm
Response:
  SepalLength(continuous)
Predictors:
  1
  SepalWidth(continuous)
  (1 + SepalWidth | Species)

julia> subformula(f, 0)
FormulaTerm
Response:
  SepalLength(continuous)
Predictors:
  0
  (1 + SepalWidth | Species)
AnovaBase.clear_schemaFunction
clear_schema(<terms with schema>) = <terms without schema>

Clear any applied schema on terms.

AnovaBase.extract_contrastsFunction
extract_contrasts(f::FormulaTerm)

Extract a dictionary of contrasts. The keys are symbols of term; the values are contrasts (AbstractContrasts).

AnovaBase._diffFunction
_diff(t::NTuple)

Return a tuple of differences between adjacent elements of a tuple (later - former).

AnovaBase._diffnFunction
_diffn(t::NTuple)

Return a tuple of differences between adjacent elements of a tuple (former - later).
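
The two sign conventions can be made concrete with a small sketch. `diff_sketch` and `diffn_sketch` are hypothetical re-implementations written from the descriptions above, assuming a tuple with at least one element; they are not the package's code.

```julia
# later - former: each element minus the one before it.
diff_sketch(t::NTuple{N, T}) where {N, T} = ntuple(i -> t[i + 1] - t[i], N - 1)

# former - later: each element minus the one after it.
diffn_sketch(t::NTuple{N, T}) where {N, T} = ntuple(i -> t[i] - t[i + 1], N - 1)

d = diff_sketch((1, 3, 6))    # (3 - 1, 6 - 3)
dn = diffn_sketch((1, 3, 6))  # (1 - 3, 3 - 6)
```

The two results differ only in sign, and both are one element shorter than the input.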

AnovaBase.AnovaTableType
AnovaTable

A table with coefficients and related statistics of ANOVA. It is mostly modified from StatsBase.CoefTable.

Fields

  • cols: values of each statistic.
  • colnms: names of statistics.
  • rownms: names of each row.
  • pvalcol: the index of the column representing p-values.
  • teststatcol: the index of the column representing test statistics.

Constructor

AnovaTable(cols::Vector, colnms::Vector, rownms::Vector, pvalcol::Int = 0, teststatcol::Int = 0)
AnovaTable(mat::Matrix, colnms::Vector, rownms::Vector, pvalcol::Int = 0, teststatcol::Int = 0)
AnovaBase.testnameFunction
testname(::Type{FTest}) = "F test"
testname(::Type{LRT}) = "Likelihood-ratio test"

Name of the test.