General Interface
Understanding the interface
CRRao exports the `fit` function, which is used to train all types of models supported by the package. As of now, the function supports the following signatures.
fit(formula, data, modelClass)
fit(formula, data, modelClass, link)
fit(formula, data, modelClass, prior)
fit(formula, data, modelClass, link, prior)
Not all model classes support every signature. The parameters have the following meanings.
- `formula` must be a formula of type `StatsModels.FormulaTerm`. Every formula has an LHS and an RHS: the LHS represents the response variable, and the RHS represents the independent variables.
- `data` must be a `DataFrame`. It holds the dataset on which the model is trained.
- `modelClass` represents the type of statistical model to be used. Currently, CRRao supports four regression models, and the type of `modelClass` must be one of the following: `LinearRegression`, `LogisticRegression`, `NegBinomRegression`, `PoissonRegression`.
- `link` selects the link function for model classes that support one (like Logistic Regression). Currently four link functions are supported, and the type of `link` must be one of the following: `Logit`, `Probit`, `Cloglog`, `Cauchit`.
- `prior` selects the prior for Bayesian models, which CRRao also supports; it is specified while calling `fit`. Currently CRRao supports six different kinds of priors, and the type of the `prior` parameter must be one of the following: `Prior_Gauss`, `Prior_Ridge`, `Prior_Laplace`, `Prior_Cauchy`, `Prior_TDist`, `Prior_HorseShoe`.
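As a rough illustration of how the signatures map onto calls, the sketch below fits three models. It assumes that CRRao, RDatasets and StatsModels are installed, that the `mtcars` dataset from RDatasets is an acceptable stand-in for real data, and that each model class, link and prior type can be constructed with no arguments; treat the dataset and the chosen columns as assumptions rather than recommended usage.

```julia
using CRRao, RDatasets, StatsModels

# Example data (an assumption for this sketch): mtcars from RDatasets.
mtcars = dataset("datasets", "mtcars")

# fit(formula, data, modelClass): linear regression without a prior.
linear_model = fit(@formula(MPG ~ HP + WT + Gear), mtcars, LinearRegression())

# fit(formula, data, modelClass, link): logistic regression with a Logit link.
# AM is a 0/1 column in mtcars, used here only as a convenient binary response.
logistic_model = fit(@formula(AM ~ HP + WT), mtcars, LogisticRegression(), Logit())

# fit(formula, data, modelClass, prior): Bayesian linear regression with a Ridge prior.
ridge_model = fit(@formula(MPG ~ HP + WT + Gear), mtcars, LinearRegression(), Prior_Ridge())
```

The call with a prior fits a Bayesian model, so it may take noticeably longer than the other two.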
Model Classes and Data Models
CRRao.LinearRegression — Type
LinearRegression
Type representing the Linear Regression model class.
\[y =\alpha + X \beta+ \varepsilon,\]
where
\[\varepsilon \sim N(0,\sigma^2),\]
- $y$ is the response vector of size $n$,
- $X$ is the matrix of predictor variables of size $n \times p$,
- $n$ is the sample size, and $p$ is the number of predictors,
- $\alpha$ is the intercept of the model,
- $\beta$ is the vector of regression coefficients of the model, and
- $\sigma$ is the standard deviation of the noise $\varepsilon$.
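To make the data model concrete, the following sketch simulates $y = \alpha + X\beta + \varepsilon$ using only Julia's standard library and then recovers the coefficients by ordinary least squares. The parameter values, the seed and the variable names are hypothetical choices for illustration; nothing here calls CRRao.

```julia
using Random

rng = MersenneTwister(42)     # fixed seed so the simulation is repeatable
n, p = 1000, 3                # sample size and number of predictors
α = 1.5                       # intercept (hypothetical value)
β = [2.0, -1.0, 0.5]          # regression coefficients (hypothetical values)
σ = 0.5                       # standard deviation of the noise

X = randn(rng, n, p)          # n × p matrix of predictors
ε = σ .* randn(rng, n)        # ε ~ N(0, σ²)
y = α .+ X * β .+ ε           # response vector of size n

# Least squares on the design matrix [1 X] should approximately recover (α, β).
estimate = hcat(ones(n), X) \ y
println(estimate)
```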
CRRao.LogisticRegression — Type
LogisticRegression
Type representing the Logistic Regression model class.
\[y_i \sim Bernoulli(p_i), \]
where $i=1,2,\cdots,n, 0 < p_i < 1$,
- $\mathbb{E}(y_i)=p_i$,
- $\mathbb{P}(y_i=1) = p_i$ and $\mathbb{P}(y_i=0) = 1-p_i$, such that
\[\mathbb{E}(y_i)= p_i =g(\alpha +\mathbf{x}_i^T\beta),\]
- $g(.)$ is the link-function,
- $y_i$ is the $i^{th}$ element of the response vector $y$,
- $\mathbf{x}_i=(x_{i1},x_{i2},\cdots,x_{ip})$ is the $i^{th}$ row of the design matrix of size $n \times p$,
- $\alpha$ is the intercept of the model, and
- $\beta$ is the vector of regression coefficients of the model.
CRRao.NegBinomRegression — Type
NegBinomRegression
Type representing the Negative Binomial Regression model class.
\[y_i \sim NegativeBinomial(\mu_i,\phi), i=1,2,\cdots,n\]
where
\[\mu_i = \exp(\alpha +\mathbf{x}_i^T\beta),\]
- $y_i$ is the $i^{th}$ element of the response vector $y$,
- $\mathbf{x}_i=(x_{i1},x_{i2},\cdots,x_{ip})$ is the $i^{th}$ row of the design matrix of size $n \times p$,
- $\alpha$ is the intercept of the model, and
- $\beta$ is the vector of regression coefficients of the model.
CRRao.PoissonRegression — Type
PoissonRegression
Type representing the Poisson Regression model class.
\[y_i \sim Poisson(\lambda_i), i=1,2,\cdots,n\]
where
\[\lambda_i = \exp(\alpha +\mathbf{x}_i^T\beta),\]
- $y_i$ is the $i^{th}$ element of the response vector $y$,
- $\mathbf{x}_i=(x_{i1},x_{i2},\cdots,x_{ip})$ is the $i^{th}$ row of the design matrix of size $n \times p$,
- $\alpha$ is the intercept of the model, and
- $\beta$ is the vector of regression coefficients of the model.
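A short worked step may help with interpreting the coefficients under the exponential mean used by the Poisson and Negative Binomial models. Increasing the $j^{th}$ predictor $x_{ij}$ by one unit, with the other predictors held fixed, multiplies the mean by $\exp(\beta_j)$:

\[\frac{\exp(\alpha +\mathbf{x}_i^T\beta + \beta_j)}{\exp(\alpha +\mathbf{x}_i^T\beta)} = \exp(\beta_j).\]

For instance, a hypothetical coefficient $\beta_j = 0.1$ gives $\exp(0.1) \approx 1.105$, that is, roughly a 10.5% increase in the expected count.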
Link Functions
CRRao.CRRaoLink — Type
CRRaoLink
Abstract type representing link functions which are used to dispatch to appropriate calls.
CRRao.Logit — Type
Logit <: CRRaoLink
A type representing the Logit link function, which is defined by the formula
\[z\mapsto \dfrac{1}{1 + \exp(-z)}\]
CRRao.Probit — Type
Probit <: CRRaoLink
A type representing the Probit link function, which is defined by the formula
\[z\mapsto \mathbb{P}[Z\le z]\]
where $Z\sim \text{Normal}(0, 1)$.
CRRao.Cloglog — Type
Cloglog <: CRRaoLink
A type representing the Cloglog link function, which is defined by the formula
\[z\mapsto 1 - \exp(-\exp(z))\]
CRRao.Cauchit — Type
Cauchit <: CRRaoLink
A type representing the Cauchit link function, which is defined by the formula
\[z\mapsto \dfrac{1}{2} + \dfrac{\text{atan}(z)}{\pi}\]
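The formulas above can be evaluated directly. The sketch below writes three of them as plain Julia functions to show that each maps a real number into the interval $(0, 1)$. The function names are hypothetical and are not part of the CRRao API. The Probit map is left out because it needs the standard normal CDF, which Julia's standard library does not provide; one option would be `cdf(Normal(), z)` from Distributions.jl.

```julia
# Hypothetical helper names for illustration; these are not CRRao functions.
logit_map(z)   = 1 / (1 + exp(-z))        # Logit formula
cloglog_map(z) = 1 - exp(-exp(z))         # Cloglog formula
cauchit_map(z) = 1 / 2 + atan(z) / π      # Cauchit formula

# Evaluate each map at a few points on the real line.
for z in (-2.0, 0.0, 2.0)
    println((z, logit_map(z), cloglog_map(z), cauchit_map(z)))
end
```

The Logit and Cauchit maps are symmetric around $z = 0$, where both return $0.5$, while the Cloglog map is asymmetric.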
Prior Distributions
CRRao.Prior_Gauss — Type
Prior_Gauss
Type representing the Gaussian Prior. Users specify the prior mean and standard deviation for $\alpha$ and $\beta$ in the linear regression model.
Prior model
\[\sigma \sim InverseGamma(a_0,b_0),\]
\[\alpha | \sigma,v \sim Normal(\alpha_0,\sigma_{\alpha_0}),\]
\[\beta | \sigma,v \sim Normal_p(\beta_0,\sigma_{\beta_0}),\]
Likelihood or data model
\[\mu_i= \alpha + \mathbf{x}_i^T\beta\]
\[y_i \sim N(\mu_i,\sigma),\]
Note: $N()$ is the Gaussian distribution of $y_i$, where
- $\mathbf{E}(y_i)=g(\mu_i)$, and
- $Var(y_i)=\sigma^2$.
CRRao.Prior_Ridge — Type
Prior_Ridge
Type representing the Ridge Prior.
Prior model
\[v \sim InverseGamma(h,h),\]
\[\sigma \sim InverseGamma(a_0,b_0),\]
\[\alpha | \sigma,v \sim Normal(0,v*\sigma),\]
\[\beta | \sigma,v \sim Normal_p(0,v*\sigma),\]
Likelihood or data model
\[\mu_i= \alpha + \mathbf{x}_i^T\beta\]
\[y_i \sim D(\mu_i,\sigma),\]
Note: $D()$ is the appropriate distribution of $y_i$ based on the `modelClass`, where
- $\mathbf{E}(y_i)=g(\mu_i)$, and
- $Var(y_i)=\sigma^2$.
CRRao.Prior_Laplace — Type
Prior_Laplace
Type representing the Laplace Prior.
Prior model
\[v \sim InverseGamma(h,h),\]
\[\sigma \sim InverseGamma(a_0,b_0),\]
\[\alpha | \sigma,v \sim Laplace(0,v*\sigma),\]
\[\beta | \sigma,v \sim Laplace(0,v*\sigma),\]
Likelihood or data model
\[\mu_i= \alpha + \mathbf{x}_i^T\beta\]
\[y_i \sim D(\mu_i,\sigma),\]
Note: $D()$ is the appropriate distribution of $y_i$ based on the `modelClass`, where
- $\mathbf{E}(y_i)=g(\mu_i)$, and
- $Var(y_i)=\sigma^2$.
CRRao.Prior_Cauchy — Type
Prior_Cauchy
Type representing the Cauchy Prior.
Prior model
\[\sigma \sim HalfCauchy(0,1),\]
\[\alpha | \sigma \sim Cauchy(0,\sigma),\]
\[\beta | \sigma \sim Cauchy(0,v*\sigma),\]
Likelihood or data model
\[\mu_i= \alpha + \mathbf{x}_i^T\beta\]
\[y_i \sim D(\mu_i,\sigma),\]
Note: $D()$ is the appropriate distribution of $y_i$ based on the `modelClass`, where
- $\mathbf{E}(y_i)=g(\mu_i)$, and
- $Var(y_i)=\sigma^2$.
CRRao.Prior_TDist — Type
Prior_TDist
Type representing the T-Distributed Prior.
Prior model
\[v \sim InverseGamma(h,h),\]
\[\sigma \sim InverseGamma(a_0,b_0),\]
\[\alpha | \sigma,v \sim \sigma t(v),\]
\[\beta | \sigma,v \sim \sigma t(v),\]
Likelihood or data model
\[\mu_i= \alpha + \mathbf{x}_i^T\beta\]
\[y_i \sim D(\mu_i,\sigma),\]
Note: $D()$ is the appropriate distribution of $y_i$ based on the `modelClass`, where
- $\mathbf{E}(y_i)=g(\mu_i)$,
- $Var(y_i)=\sigma^2$, and
- $t(v)$ is the $t$ distribution with $v$ degrees of freedom.
CRRao.Prior_HorseShoe — Type
Prior_HorseShoe
Type representing the HorseShoe Prior.
Prior model
\[\tau \sim HalfCauchy(0,1),\]
\[\lambda_j \sim HalfCauchy(0,1), j=1,2,\cdots,p\]
\[\sigma \sim HalfCauchy(0,1),\]
\[\alpha | \sigma,\tau \sim N(0,\tau *\sigma),\]
\[\beta_j | \sigma,\lambda_j ,\tau \sim Normal(0,\lambda_j *\tau *\sigma),\]
Likelihood or data model
\[\mu_i= \alpha + \mathbf{x}_i^T\beta\]
\[y_i \sim D(\mu_i,\sigma), i=1,2,\cdots,n\]
Note: $D()$ is the appropriate distribution of $y_i$ based on the `modelClass`, where
- $\mathbf{E}(y_i)=g(\mu_i)$,
- $Var(y_i)=\sigma^2$, and
- $\beta=(\beta_1,\beta_2,\cdots,\beta_p)$.
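To give a feel for the HorseShoe prior model above, the sketch below draws one set of parameters from it using only Julia's standard library. It is not how CRRao samples internally. It assumes the second argument of $Normal(\cdot,\cdot)$ is a standard deviation, and it draws $HalfCauchy(0,1)$ values as the absolute value of a standard Cauchy draw obtained by inverse-CDF sampling. The function names are hypothetical.

```julia
using Random

# One HalfCauchy(0, 1) draw: |tan(π(U - 1/2))| with U ~ Uniform(0, 1).
half_cauchy(rng) = abs(tan(π * (rand(rng) - 0.5)))

# Draw (τ, λ, σ, α, β) once from the HorseShoe prior for p predictors.
function draw_horseshoe(rng, p)
    τ = half_cauchy(rng)                   # global shrinkage scale
    λ = [half_cauchy(rng) for _ in 1:p]    # local shrinkage scales λ_j
    σ = half_cauchy(rng)                   # noise scale
    α = (τ * σ) * randn(rng)               # α | σ, τ ~ N(0, τσ)
    β = (λ .* τ .* σ) .* randn(rng, p)     # β_j | σ, λ_j, τ ~ Normal(0, λ_j τ σ)
    return (; τ, λ, σ, α, β)
end

draw = draw_horseshoe(MersenneTwister(7), 5)
println(draw.β)
```

Because each $\beta_j$ has its own scale $\lambda_j$, repeated draws tend to produce a few large coefficients alongside many that are close to zero.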
Setting Random Number Generators
CRRao.set_rng — Function
set_rng(rng)
Set the random number generator. This is useful if you want to work with reproducible results. `rng` must be a random number generator.
Example
using StableRNGs
CRRao.set_rng(StableRNG(1234))
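The reason a fixed generator helps with reproducibility can be seen without CRRao. The sketch below uses `MersenneTwister` from Julia's `Random` standard library as a stand-in for `StableRNG`; the seeds are arbitrary.

```julia
using Random

# Two generators created with the same seed produce the same stream of numbers.
a = rand(MersenneTwister(1234), 3)
b = rand(MersenneTwister(1234), 3)

# A generator created with a different seed produces a different stream.
c = rand(MersenneTwister(4321), 3)

println(a == b)   # same seed, same draws
println(a == c)
```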