AnovaBase.jl

Models

AnovaBase.AnovaModel — Type

abstract type AnovaModel{M, N} end

An abstract supertype of all models for ANOVA.

AnovaBase.FullModel — Type

FullModel{M, N} <: AnovaModel{M, N}

A wrapper of a regression model for conducting ANOVA.

M is a type of regression model.
N is the number of predictors.

Fields

model: a regression model.
pred_id: the indices of the terms included in ANOVA. The source iterable can be obtained by predictors(model). This value may depend on type for certain models, e.g. type 1 ANOVA for a gamma regression model with inverse link.
type: type of ANOVA, either 1, 2 or 3.

Constructor

FullModel(model::RegressionModel, type::Int, null::Bool, test_intercept::Bool)

model: a regression model.
type: type of ANOVA, either 1, 2 or 3.
null: whether y ~ 0 is allowed.
test_intercept: whether the intercept is tested.

AnovaBase.NestedModels — Type

NestedModels{M, N} <: AnovaModel{M, N}

A wrapper of nested models of the same type for conducting ANOVA.

M is a type of regression model.
N is the number of models.

Fields

model: a tuple of models.

Constructors

NestedModels(model::Vararg{M, N}) where {M, N}
NestedModels(model::NTuple{N, M}) where {M, N}
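
The two constructors accept the models either as separate arguments or as a single tuple. The pattern can be sketched in plain Julia with a hypothetical stand-in type (ModelBundle is not part of AnovaBase, and plain numbers stand in for fitted models):

```julia
# Hypothetical stand-in for NestedModels: wraps N objects of one type M.
struct ModelBundle{M, N}
    model::NTuple{N, M}
end

# Splatted form; the tuple form is the default constructor generated by Julia.
ModelBundle(model::Vararg{M, N}) where {M, N} = ModelBundle{M, N}(model)

a = ModelBundle(1.0, 2.0, 3.0)    # objects passed one by one
b = ModelBundle((1.0, 2.0, 3.0))  # objects passed as a tuple

println(a.model == b.model)                # true
println(a isa ModelBundle{Float64, 3})     # true
```

Both calls produce the same wrapper, with N inferred from the number of objects.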

AnovaBase.MixedAovModels — Type

MixedAovModels{M, N} <: AnovaModel{M, N}

A wrapper of nested models of multiple types for conducting ANOVA.

M is a union type of regression models.
N is the number of models.

Fields

model: a tuple of models.

Constructors

MixedAovModels{M}(model...) where M 
MixedAovModels{M}(model::T) where {M, T <: Tuple}

AnovaBase.MultiAovModels — Type

const MultiAovModels{M, N} = Union{NestedModels{M, N}, MixedAovModels{M, N}} where {M, N}

Wrappers of multiple models.

AnovaBase.nestedmodels — Method

nestedmodels(<model>; <keyword arguments>)
nestedmodels(<model type>, formula, data; <keyword arguments>)

Create NestedModels from a model, or from a model type, formula and data.

ANOVA

AnovaBase.AnovaResult — Type

AnovaResult{M, T, N}

The object returned by anova.

M is NestedModels, MixedAovModels, or FullModel.
T is a subtype of GoodnessOfFit; either FTest or LRT.
N is the length of the dof, deviance, teststat and pval tuples.

Fields

anovamodel: NestedModels, MixedAovModels, or FullModel.
dof: degrees of freedom of models or predictors.
deviance: deviance(s) for calculating test statistics. See deviance for more details.
teststat: value(s) of test statistics.
pval: p-value(s) of test statistics.
otherstat: a NamedTuple containing extra statistics.

Constructor

AnovaResult(
        anovamodel::M,
        ::Type{T},
        dof::NTuple{N, Int},
        deviance::NTuple{N, Float64},
        teststat::NTuple{N, Float64},
        pval::NTuple{N, Float64},
        otherstat::NamedTuple
) where {N, M <: AnovaModel{<: RegressionModel, N}, T <: GoodnessOfFit}

AnovaBase.anova — Method

anova(Test::Type{<: GoodnessOfFit}, <anovamodel>; <keyword arguments>)
anova(<models>...; test::Type{<: GoodnessOfFit}, <keyword arguments>)
anova(Test::Type{<: GoodnessOfFit}, <model>; <keyword arguments>)
anova(Test::Type{<: GoodnessOfFit}, <models>...; <keyword arguments>)

Analysis of variance.

Return AnovaResult{M, Test, N}. See AnovaResult for details.

anovamodel: an AnovaModel.
models: RegressionModel(s). If multiple models are provided, they should be nested and fitted with the same data, and the last one should be the most complex.
Test: the goodness-of-fit test. Available tests are LikelihoodRatioTest (LRT) and FTest.

Goodness of fit

AnovaBase.GoodnessOfFit — Type

abstract type GoodnessOfFit end

An abstract supertype of goodness-of-fit tests.

AnovaBase.FTest — Type

struct FTest <: GoodnessOfFit end

Type indicating that ANOVA is conducted by F-test. It can be passed as the first argument or as the keyword argument test.

AnovaBase.LikelihoodRatioTest — Type

struct LikelihoodRatioTest <: GoodnessOfFit end
const LRT = LikelihoodRatioTest

Type indicating that ANOVA is conducted by likelihood-ratio test. It can be passed as the first argument or as the keyword argument test.

AnovaBase.canonicalgoodnessoffit — Function

canonicalgoodnessoffit(::FixDispDist) = LRT
canonicalgoodnessoffit(::UnivariateDistribution) = FTest

const FixDispDist = Union{Bernoulli, Binomial, Poisson}

Return LRT if the distribution has a fixed dispersion; otherwise, FTest.
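
The selection rule relies on dispatch: the method for the union of fixed-dispersion distributions is more specific than the generic one. A minimal sketch with hypothetical stand-in types (none of the Toy names exist in AnovaBase or Distributions.jl), so that it runs without any package:

```julia
# Stand-ins for the test types.
abstract type ToyGoodnessOfFit end
struct ToyFTest <: ToyGoodnessOfFit end
struct ToyLRT <: ToyGoodnessOfFit end

# Stand-ins for distributions.
abstract type ToyDistribution end
struct ToyPoisson <: ToyDistribution end
struct ToyBinomial <: ToyDistribution end
struct ToyNormal <: ToyDistribution end

# Distributions whose dispersion is fixed.
const ToyFixDisp = Union{ToyPoisson, ToyBinomial}

# The union method is more specific, so it wins for fixed-dispersion types.
pick_test(::ToyFixDisp) = ToyLRT
pick_test(::ToyDistribution) = ToyFTest

println(pick_test(ToyPoisson()) == ToyLRT)   # true
println(pick_test(ToyNormal()) == ToyFTest)  # true
```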

Other interface

AnovaBase.ftest_nested — Function

ftest_nested(models::MultiAovModels{M, N}, df, dfr, dev, σ²) where {M <: RegressionModel, N}

Calculate F-statistics and p-values based on given parameters.

models: nested models
df: degrees of freedom of each model
dfr: residual degrees of freedom of each model
dev: deviance of each model, i.e. unit deviance
σ²: squared dispersion of each model

The F-statistic is (devᵢ - devᵢ₋₁) / (dfᵢ₋₁ - dfᵢ) / σ² for the ith predictor.
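
A small numeric sketch of this formula in plain Julia, using made-up values; it does not call ftest_nested. A single σ² is assumed for simplicity, and p-values are omitted because they require an F distribution (e.g. from Distributions.jl).

```julia
# Hypothetical values for three nested models, ordered from simplest to most complex.
df  = (2, 3, 5)            # degrees of freedom of each model
dev = (120.0, 90.0, 70.0)  # deviance of each model
σ²  = 2.0                  # squared dispersion (one shared value, an assumption)

# Fᵢ = (devᵢ - devᵢ₋₁) / (dfᵢ₋₁ - dfᵢ) / σ² for i = 2, …, N
fstat = ntuple(length(df) - 1) do k
    i = k + 1
    (dev[i] - dev[i - 1]) / (df[i - 1] - df[i]) / σ²
end

println(fstat)  # (15.0, 5.0)
```

Both the numerator and the denominator are negative when the deviance decreases as the degrees of freedom grow, so the statistic is positive.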

AnovaBase.lrt_nested — Function

lrt_nested(models::MultiAovModels{M, N}, df, dev, σ²) where {M <: RegressionModel, N}

Calculate likelihood ratios and p-values based on given parameters.

models: nested models
df: degrees of freedom of each model
dev: deviance of each model, i.e. unit deviance
σ²: squared dispersion of each model

The likelihood ratio of the ith predictor is LRᵢ = (devᵢ₋₁ - devᵢ) / σ².

If dev is instead -2 × log-likelihood, σ² should be set to 1.
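
A small numeric sketch of the likelihood-ratio statistic in plain Julia, using made-up log-likelihoods; it does not call lrt_nested. Deviances are taken as -2 × log-likelihood, so σ² is 1, and each statistic is written as the drop in deviance from model i-1 to model i, which is non-negative when models are ordered from simplest to most complex. P-values are omitted because they require a χ² distribution (e.g. from Distributions.jl).

```julia
# Hypothetical log-likelihoods, simplest model first.
loglik = (-60.0, -45.0, -35.0)
dev = -2 .* loglik   # (120.0, 90.0, 70.0)
σ² = 1.0             # dev is -2 × log-likelihood, so no rescaling

# Drop in deviance between consecutive models, scaled by σ².
lr = ntuple(k -> (dev[k] - dev[k + 1]) / σ², length(dev) - 1)

println(lr)  # (30.0, 20.0)
```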

StatsAPI.dof_residual — Method

dof_residual(aov::AnovaResult)    
dof_residual(aov::AnovaResult{<: MultiAovModels})

Degrees of freedom of residuals.

By default, it applies dof_residual to models in aov.anovamodel.

AnovaBase.predictors — Method

predictors(model::RegressionModel)
predictors(anovamodel::FullModel)

Return a tuple of Terms which are predictors of the model or anovamodel.

By default, it returns formula(model).rhs.terms; if the formula has special structures, this function should be overloaded.

AnovaBase.anovatable — Method

anovatable(aov::AnovaResult{<: FullModel, Test}; rownames = prednames(aov))
anovatable(aov::AnovaResult{<: MultiAovModels, Test}; rownames = string.(1:N))
anovatable(aov::AnovaResult{<: MultiAovModels, FTest, N}; rownames = string.(1:N)) where N
anovatable(aov::AnovaResult{<: MultiAovModels, LRT, N}; rownames = string.(1:N)) where N

Return a table with coefficients and related statistics of ANOVA.

When displaying aov in the REPL, rownames will be prednames(aov) for FullModel and string.(1:N) for MultiAovModels.

For MultiAovModels, there are two default methods for FTest and LRT; one can also define new methods dispatching on ::AnovaResult{NestedModels{M}} or ::AnovaResult{MixedAovModels{M}} where M is a model type.

For FullModel, no default method is implemented.

The returned AnovaTable object implements the Tables.jl interface, and can be converted e.g. to a DataFrame via using DataFrames; DataFrame(anovatable(aov)).

Developer utility

AnovaBase.dof_asgn — Function

dof_asgn(v::Vector{Int})

Calculate the degrees of freedom of each predictor from the assign vector v. assign can be obtained by StatsModels.asgn(f::FormulaTerm). For a given trm::RegressionModel, it is the same as trm.mm.assign.

The index of the output matches the values in the original assign. If an index value is not in assign, its degrees of freedom default to 0.

Examples

julia> dof_asgn([1, 2, 2, 3, 3, 3])
3-element Vector{Int64}:
 1
 2
 3

julia> dof_asgn([2, 2, 3, 3, 3])
3-element Vector{Int64}:
 0
 2
 3
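
The behaviour shown above amounts to counting how often each term index occurs. A minimal sketch in plain Julia, not the AnovaBase implementation (count_assign is a hypothetical name, and an empty input is not handled):

```julia
# Hypothetical stand-in for dof_asgn: count occurrences of each term index.
function count_assign(v::Vector{Int})
    dof = zeros(Int, maximum(v))  # indices that never occur keep the default 0
    for i in v
        dof[i] += 1               # one degree of freedom per assigned column
    end
    return dof
end

println(count_assign([1, 2, 2, 3, 3, 3]))  # [1, 2, 3]
println(count_assign([2, 2, 3, 3, 3]))     # [0, 2, 3]
```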

AnovaBase.prednames — Function

prednames(<term>)

Return the name(s) of predictor(s). Return value is either a String, an iterable of Strings or nothing.

Examples

julia> iris = dataset("datasets", "iris");

julia> f = formula(lm(@formula(log(SepalLength) ~ SepalWidth + PetalLength * PetalWidth), iris))
FormulaTerm
Response:
  (SepalLength)->log(SepalLength)
Predictors:
  1
  SepalWidth(continuous)
  PetalLength(continuous)
  PetalWidth(continuous)
  PetalLength(continuous) & PetalWidth(continuous)

julia> prednames(f)
["(Intercept)", "SepalWidth", "PetalLength", "PetalWidth", "PetalLength & PetalWidth"]

julia> prednames(InterceptTerm{false}())

prednames(aov::AnovaResult)
prednames(anovamodel::FullModel) 
prednames(anovamodel::MultiAovModels)
prednames(<model>)

Return the names of predictors as a vector of strings. When there are multiple models, the return value is nothing.

AnovaBase.any_not_aliased_with_1 — Function

any_not_aliased_with_1(<terms>)

Return true if there are any terms not aliased with the intercept, e.g. ContinuousTerm or FunctionTerm.

Terms without schema are considered aliased with the intercept.

AnovaBase.getterms — Function

getterms(<term>)

Return the symbol of term(s) as a vector of Expr or Symbol.

Examples

julia> iris = dataset("datasets", "iris");

julia> f = formula(lm(@formula(log(SepalLength) ~ Species + PetalLength * PetalWidth), iris))
FormulaTerm
Response:
  (SepalLength)->log(SepalLength)
Predictors:
  1
  Species(DummyCoding:3→2)
  PetalLength(continuous)
  PetalWidth(continuous)
  PetalLength(continuous) & PetalWidth(continuous)

julia> getterms(f)
(Expr[:(log(SepalLength))], [:Species, :PetalLength, :PetalWidth])

julia> getterms(InterceptTerm{true}())
Symbol[]

AnovaBase.isinteract — Function

isinteract(m::MatrixTerm, id1::Int, id2::Int)
isinteract(f::TupleTerm, id1::Int, id2::Int)

Determine if f[id2] is an interaction term of f[id1] and other terms.

Examples

julia> iris = dataset("datasets", "iris");

julia> f = formula(lm(@formula(log(SepalLength) ~ Species + PetalLength * PetalWidth), iris))
FormulaTerm
Response:
  (SepalLength)->log(SepalLength)
Predictors:
  1
  Species(DummyCoding:3→2)
  PetalLength(continuous)
  PetalWidth(continuous)
  PetalLength(continuous) & PetalWidth(continuous)

julia> isinteract(f.rhs, 1, 2)
true

julia> isinteract(f.rhs, 3, 4)
false

julia> isinteract(f.rhs, 4, 5)
true

AnovaBase.select_super_interaction — Function

select_super_interaction(m::MatrixTerm, id::Int)
select_super_interaction(f::TupleTerm, id::Int)
select_sub_interaction(m::MatrixTerm, id::Int)
select_sub_interaction(f::TupleTerm, id::Int)
select_not_super_interaction(m::MatrixTerm, id::Int)
select_not_super_interaction(f::TupleTerm, id::Int)
select_not_sub_interaction(m::MatrixTerm, id::Int)
select_not_sub_interaction(f::TupleTerm, id::Int)

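Return a set of indices of f. For each function, the returned set satisfies:

select_super_interaction: the returned terms are interaction terms of f[id] and other terms.
select_sub_interaction: f[id] is an interaction term of the returned terms and other terms.
select_not_super_interaction: the returned terms are not interaction terms of f[id] and other terms.
select_not_sub_interaction: f[id] is not an interaction term of the returned terms and other terms.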

Examples

julia> iris = dataset("datasets", "iris");

julia> f = formula(lm(@formula(log(SepalLength) ~ Species + PetalLength * PetalWidth), iris))
FormulaTerm
Response:
  (SepalLength)->log(SepalLength)
Predictors:
  1
  Species(DummyCoding:3→2)
  PetalLength(continuous)
  PetalWidth(continuous)
  PetalLength(continuous) & PetalWidth(continuous)

julia> select_super_interaction(f.rhs, 3)
Set{Int64} with 2 elements:
  5
  3

julia> select_sub_interaction(f.rhs, 3)
Set{Int64} with 2 elements:
  3
  1

julia> select_not_super_interaction(f.rhs, 3)
Set{Int64} with 3 elements:
  4
  2
  1

julia> select_not_sub_interaction(f.rhs, 3)
Set{Int64} with 3 elements:
  5
  4
  2
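
The index sets in the examples above can be reproduced with a simple subset rule. The following is a minimal sketch in plain Julia, not the AnovaBase implementation: it assumes each predictor can be summarized by the set of variables it involves (the intercept being the empty set), and that a term counts as an interaction of another when it contains all of that term's variables. All names are hypothetical.

```julia
# Predictors 1..5 of the example formula, each as the set of variables it involves.
terms = [
    Set{Symbol}(),                     # 1: intercept
    Set([:Species]),                   # 2
    Set([:PetalLength]),               # 3
    Set([:PetalWidth]),                # 4
    Set([:PetalLength, :PetalWidth]),  # 5: PetalLength & PetalWidth
]

# Assumed rule: terms[id2] is an interaction of terms[id1] and other terms
# when it contains every variable of terms[id1].
contains_term(id1, id2) = issubset(terms[id1], terms[id2])

super_ids(id)     = Set(i for i in eachindex(terms) if contains_term(id, i))
sub_ids(id)       = Set(i for i in eachindex(terms) if contains_term(i, id))
not_super_ids(id) = Set(i for i in eachindex(terms) if !contains_term(id, i))
not_sub_ids(id)   = Set(i for i in eachindex(terms) if !contains_term(i, id))

println(super_ids(3) == Set([3, 5]))         # true
println(sub_ids(3) == Set([1, 3]))           # true
println(not_super_ids(3) == Set([1, 2, 4]))  # true
println(not_sub_ids(3) == Set([2, 4, 5]))    # true
```

Under the same rule, contains_term(1, 2) and contains_term(4, 5) are true and contains_term(3, 4) is false, which agrees with the isinteract examples.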

AnovaBase.select_sub_interaction — Function

select_sub_interaction(m::MatrixTerm, id::Int)
select_sub_interaction(f::TupleTerm, id::Int)

This function shares its docstring with select_super_interaction; see that entry for the description and examples.

AnovaBase.select_not_super_interaction — Function

select_not_super_interaction(m::MatrixTerm, id::Int)
select_not_super_interaction(f::TupleTerm, id::Int)

This function shares its docstring with select_super_interaction; see that entry for the description and examples.

AnovaBase.select_not_sub_interaction — Function

select_not_sub_interaction(m::MatrixTerm, id::Int)
select_not_sub_interaction(f::TupleTerm, id::Int)

This function shares its docstring with select_super_interaction; see that entry for the description and examples.

AnovaBase.subformula — Function

subformula(f::FormulaTerm, id; kwargs...)
subformula(lhs::AbstractTerm, rhs::MatrixTerm, id::Int; reschema::Bool = false)
subformula(lhs::AbstractTerm, rhs::MatrixTerm, id; reschema::Bool = false)
subformula(lhs::AbstractTerm, rhs::NTuple{N, AbstractTerm}, id::Int; rhs_id::Int = 1, reschema::Bool = false)

Create a formula from the existing lhs and rhs (or rhs[rhs_id]), either truncated to 1:id or with the terms indexed by the collection id excluded. When id is 0, all terms in rhs (or rhs[rhs_id]) will be removed.

If reschema is true, the schema of all terms will be removed.

Examples

julia> iris = dataset("datasets", "iris");

julia> f = formula(lm(@formula(log(SepalLength) ~ Species + PetalLength * PetalWidth), iris))
FormulaTerm
Response:
  (SepalLength)->log(SepalLength)
Predictors:
  1
  Species(DummyCoding:3→2)
  PetalLength(continuous)
  PetalWidth(continuous)
  PetalLength(continuous) & PetalWidth(continuous)

julia> subformula(f, 2)
FormulaTerm
Response:
  (SepalLength)->log(SepalLength)
Predictors:
  1
  Species(DummyCoding:3→2)

julia> subformula(f, [3, 5]; reschema = true)
FormulaTerm
Response:
  (SepalLength)->log(SepalLength)
Predictors:
  1
  Species(DummyCoding:3→2)
  PetalWidth(unknown)

julia> f = formula(fit(LinearMixedModel, @formula(SepalLength ~ SepalWidth + (SepalWidth|Species)), iris))
FormulaTerm
Response:
  SepalLength(continuous)
Predictors:
  1
  SepalWidth(continuous)
  (1 + SepalWidth | Species)

julia> subformula(f, 0)
FormulaTerm
Response:
  SepalLength(continuous)
Predictors:
  0
  (1 + SepalWidth | Species)

AnovaBase.clear_schema — Function

clear_schema(<terms with schema>) = <terms without schema>

Clear any applied schema on terms.

AnovaBase.extract_contrasts — Function

extract_contrasts(f::FormulaTerm)

Extract a dictionary of contrasts. The keys are the symbols of the terms; the values are contrasts (AbstractContrasts).

AnovaBase._diff — Function

_diff(t::NTuple)

Return a tuple of differences between adjacent elements of a tuple (later - former).

AnovaBase._diffn — Function

_diffn(t::NTuple)

Return a tuple of differences between adjacent elements of a tuple (former - later).
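
The two sign conventions can be sketched in plain Julia as follows. These are hypothetical re-implementations for illustration, not the package's own code:

```julia
# later - former, as described for _diff
later_minus_former(t::NTuple{N, T}) where {N, T} = ntuple(i -> t[i + 1] - t[i], N - 1)

# former - later, as described for _diffn
former_minus_later(t::NTuple{N, T}) where {N, T} = ntuple(i -> t[i] - t[i + 1], N - 1)

println(later_minus_former((1, 4, 9)))  # (3, 5)
println(former_minus_later((1, 4, 9)))  # (-3, -5)
```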

AnovaBase.AnovaTable — Type

AnovaTable

A table with coefficients and related statistics of ANOVA. It is mostly modified from StatsBase.CoefTable.

Fields

cols: values of each statistic.
colnms: names of the statistics.
rownms: names of each row.
pvalcol: the index of the column representing p-values.
teststatcol: the index of the column representing test statistics.

Constructor

AnovaTable(cols::Vector, colnms::Vector, rownms::Vector, pvalcol::Int = 0, teststatcol::Int = 0)
AnovaTable(mat::Matrix, colnms::Vector, rownms::Vector, pvalcol::Int = 0, teststatcol::Int = 0)

AnovaBase.testname — Function

testname(::Type{FTest}) = "F test"
testname(::Type{LRT}) = "Likelihood-ratio test"

Return the name of the test.
