Empirikos.CressieSeheultModule
CressieSeheult

A household survey asked participants to complete question forms, which were then collected and put into batches for coding. A quality control programme was implemented to check on coding accuracy for one question. The table shows the numbers of errors after sampling 42 coded questionnaires from each of 91 batches.

This dataset is from the following reference:

Cressie, Noel, and Allan Seheult. "Empirical Bayes estimation in sampling inspection." Biometrika 72, no. 2 (1985): 451-458.
Empirikos.AshPriorsConstant
AshPriors

Empirical Bayes priors that are used in the simulations of:

Stephens, M., 2017. False discovery rates: a new deal. Biostatistics, 18(2), pp.275-294.

Empirikos.EfronPriorsConstant
EfronPriors

Empirical Bayes priors that are used in the simulations of:

Efron, B., 2016. Empirical Bayes deconvolution estimates. Biometrika, 103(1), pp.1-20.

Empirikos.IWPriorsConstant
IWPriors

Empirical Bayes priors that are used in the simulations of:

Ignatiadis, N. and Wager, S., 2019. Bias-aware confidence intervals for empirical Bayes analysis. arXiv preprint arXiv:1902.02774.

Empirikos.MarronWandGaussianMixturesConstant
MarronWandGaussianMixtures

Flexible Gaussian Mixture distributions described in

Marron, J. Steve, and Matt P. Wand. Exact mean integrated squared error. The Annals of Statistics (1992): 712-736.

Empirikos.AMARIType
AMARI(convexclass::Empirikos.ConvexPriorClass,
      flocalization::Empirikos.FLocalization,
      solver,
      plugin_G = KolmogorovSmirnovMinimumDistance(convexclass, solver))

Affine Minimax Anderson-Rubin intervals for empirical Bayes estimands. Here flocalization is a pilot Empirikos.FLocalization, convexclass is an Empirikos.ConvexPriorClass, and solver is a JuMP.jl compatible solver. plugin_G is an Empirikos.EBayesMethod used as an initial estimate of the marginal distribution of the i.i.d. samples $Z$.

References

@ignatiadis2022confidence

Empirikos.BinomialSampleType
BinomialSample(Z, n)

An observed sample $Z$ drawn from a Binomial distribution with n trials.

\[Z \sim \text{Binomial}(n, p)\]

$p$ is assumed unknown. The type above is used when the sample $Z$ is to be used for estimation or inference of $p$.

julia> BinomialSample(2, 10)          # 2 out of 10 trials successful
ℬ𝒾𝓃(2; p, n=10)
Empirikos.ChiSquaredFLocalizationType
ChiSquaredFLocalization(α) <: FLocalization

The $\chi^2$ F-localization at confidence level $1-\alpha$ for a discrete random variable taking values in $\{0,\dotsc, N\}$. It is equal to:

\[f: \sum_{x=0}^N \frac{(n \hat{f}_n(x) - n f(x))^2}{n f(x)} \leq \chi^2_{N,1-\alpha},\]

where $\chi^2_{N,1-\alpha}$ is the $1-\alpha$ quantile of the Chi-squared distribution with $N$ degrees of freedom, $n$ is the sample size, $\hat{f}_n(x)$ is the proportion of samples equal to $x$ and $f(x)$ is the population pmf.
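
As a rough illustration of the statistic in the display above, the following plain-Julia sketch evaluates it for a candidate pmf. It does not call Empirikos; the function name `chisq_statistic`, the data and the candidate pmf are all hypothetical. For $N = 2$ the quantile has the closed form $-2\log(\alpha)$, which lets the sketch check membership without extra packages.

```julia
# Pearson chi-square statistic for samples taking values in {0, ..., N}.
# `f` is a candidate pmf given as a vector of length N + 1 (hypothetical helper).
function chisq_statistic(samples::AbstractVector{<:Integer}, f::AbstractVector{<:Real})
    n = length(samples)
    N = length(f) - 1
    # Proportion of samples equal to x, for x = 0, ..., N
    fhat = [count(==(x), samples) / n for x in 0:N]
    return sum((n * fhat[x + 1] - n * f[x + 1])^2 / (n * f[x + 1]) for x in 0:N)
end

samples = [0, 1, 1, 2, 2, 2, 1, 0, 1, 2]   # hypothetical data, n = 10
f = [0.25, 0.5, 0.25]                       # hypothetical candidate pmf on {0, 1, 2}
stat = chisq_statistic(samples, f)

# With N = 2 degrees of freedom the 1 - α quantile equals -2 log(α).
α = 0.05
in_localization = stat <= -2 * log(α)
println((stat, in_localization))
```

A candidate pmf belongs to the localization exactly when its statistic does not exceed the quantile.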

Empirikos.DeLaValleePoussinKernelType
DeLaValleePoussinKernel(h) <: InfiniteOrderKernel

Implements the DeLaValleePoussinKernel with bandwidth h to be used for kernel density estimation through the KernelDensity.jl package. The De La Vallée-Poussin kernel is defined as follows:

\[K_V(x) = \frac{\cos(x)-\cos(2x)}{\pi x^2}\]

Its use case is similar to the SincKernel; however, it has the advantage of being integrable (in the Lebesgue sense) and having bounded total variation. Its Fourier transform is the following:

\[K^*_V(t) = \begin{cases} 1, & \text{ if } |t|\leq 1 \\ 0, &\text{ if } |t| \geq 2 \\ 2-|t|,& \text{ if } |t| \in [1,2] \end{cases}\]
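
The two formulas above can be cross-checked numerically. The sketch below is plain Julia with hypothetical helper names (`dlvp_kernel`, `dlvp_fourier`), not the Empirikos implementation. It uses the Fourier inversion identity $K_V(0) = \frac{1}{2\pi}\int K^*_V(t)\,dt$, which should give $3/(2\pi)$.

```julia
# De La Vallée-Poussin kernel; near x = 0 we return the limit 3 / (2π)
# to avoid catastrophic cancellation in cos(x) - cos(2x).
function dlvp_kernel(x::Real)
    abs(x) < 1e-4 && return 3 / (2π)
    return (cos(x) - cos(2x)) / (π * x^2)
end

# Fourier transform: 1 on [-1, 1], linear decay for 1 ≤ |t| ≤ 2, 0 beyond.
dlvp_fourier(t::Real) = clamp(2 - abs(t), 0.0, 1.0)

# Midpoint rule for (1 / 2π) ∫ K*(t) dt over the support [-2, 2].
h = 1e-4
ts = (-2 + h / 2):h:(2 - h / 2)
inverse_at_zero = sum(dlvp_fourier, ts) * h / (2π)
println((dlvp_kernel(0.0), inverse_at_zero))
```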

Empirikos.DeltaTunerType
DeltaTuner

Abstract type used to represent ways of picking $\delta$ at which to solve the modulus problem, cf. Manuscript. Different choices of $\delta$ correspond to different bias-variance tradeoffs, each of which is Pareto-optimal.

Empirikos.DiscretePriorClassType
DiscretePriorClass(support) <: Empirikos.ConvexPriorClass

Type representing the family of all discrete distributions supported on a subset of support, i.e., it represents all DiscreteNonParametric distributions with support = support and probs taking values on the probability simplex.

Note that DiscretePriorClass(support)(probs) == DiscreteNonParametric(support, probs).

Examples

julia> gcal = DiscretePriorClass([0,0.5,1.0])
DiscretePriorClass | support = [0.0, 0.5, 1.0]

julia> gcal([0.2,0.2,0.6])
DiscreteNonParametric{Float64, Float64, Vector{Float64}, Vector{Float64}}(support=[0.0, 0.5, 1.0], p=[0.2, 0.2, 0.6])
Empirikos.DvoretzkyKieferWolfowitzType
DvoretzkyKieferWolfowitz(;α = 0.05, max_constraints = 1000) <: FLocalization

The Dvoretzky-Kiefer-Wolfowitz band (based on the Kolmogorov-Smirnov distance) at confidence level 1-α that bounds the distance of the true distribution function to the ECDF $\widehat{F}_n$ based on $n$ samples. The constant of the band is the sharp constant derived by Massart:

\[F \text{ distribution}: \sup_{t \in \mathbb R}\lvert F(t) - \widehat{F}_n(t) \rvert \leq \sqrt{\log(2/\alpha)/(2n)}\]

The supremum above is enforced discretely at no more than max_constraints points.
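
To give a sense of scale for the band, here is a minimal plain-Julia sketch of the half-width $\sqrt{\log(2/\alpha)/(2n)}$ and the resulting pointwise bounds on $F(t)$. The helper names and the sample are hypothetical, and the sketch does not go through the Empirikos types.

```julia
# Half-width of the DKW band with Massart's constant.
dkw_halfwidth(n::Integer; α::Real = 0.05) = sqrt(log(2 / α) / (2n))

# Empirical CDF of `xs` evaluated at `t`.
ecdf_at(xs, t) = count(<=(t), xs) / length(xs)

xs = [0.3, -1.2, 0.8, 2.1, -0.4, 1.5, 0.0, -0.7]   # hypothetical sample, n = 8
ε = dkw_halfwidth(length(xs))

# Bounds on F(0.5), clipped to [0, 1] since F is a distribution function.
lower = max(ecdf_at(xs, 0.5) - ε, 0.0)
upper = min(ecdf_at(xs, 0.5) + ε, 1.0)
println((ε, lower, upper))
```

With only eight samples the band is very wide (the half-width is close to 0.48); it shrinks at rate $1/\sqrt{n}$.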

Empirikos.EBayesSampleType
EBayesSample{T}

Abstract type representing empirical Bayes samples with realizations of type T.

Empirikos.EBayesTargetType

Abstract type that describes Empirical Bayes estimands (which we want to estimate or conduct inference for).

Empirikos.EmpiricalPartiallyBayesTTestType
EmpiricalPartiallyBayesTTest(; multiple_test = BenjaminiHochberg(), α = 0.05, prior = DiscretePriorClass(), solver = Hypatia.Optimizer, discretize_marginal = false, prior_grid_size = 300, lower_quantile = 0.01)

Performs empirical partially Bayes multiple testing.

Fields

  • multiple_test: Multiple testing procedure from MultipleTesting.jl (default: BenjaminiHochberg()).
  • α: Significance level (default: 0.05).
  • prior: Prior distribution. Default: DiscretePriorClass(). Alternatives include Empirikos.Limma() or a distribution from Distributions.jl. Note: Other fields are ignored if using these alternatives.
  • solver: Optimization solver (default: Hypatia.Optimizer). Not used with alternative prior choices.
  • discretize_marginal: If true, discretizes marginal distribution (default: false). Not used with alternative prior choices.
  • prior_grid_size: Grid size for prior distribution (default: 300). Not used with alternative prior choices.
  • lower_quantile: Lower quantile for sample variances (default: 0.01).

References

@ignatiadis2023empirical

Empirikos.FLocalizationIntervalType
FLocalizationInterval(flocalization::Empirikos.FLocalization,
                      convexclass::Empirikos.ConvexPriorClass,
                      solver,
                      n_bisection = 100)

Method for computing frequentist confidence intervals for empirical Bayes estimands. Here flocalization is an Empirikos.FLocalization, convexclass is an Empirikos.ConvexPriorClass, and solver is a JuMP.jl compatible solver.

n_bisection is relevant only for combinations of target, flocalization and convexclass for which the Charnes-Cooper transformation is not applicable/implemented. Instead, a quasi-convex optimization problem is solved by bisection and increasing n_bisection increases accuracy (at the cost of more computation).

Empirikos.FittedFLocalizationType

Abstract type representing a fitted F-Localization (i.e., wherein the F-localization has already been determined by data).

Empirikos.FittedInfinityNormDensityBandType
FittedInfinityNormDensityBand

The result of running

StatsBase.fit(opt::InfinityNormDensityBand, Zs)

Here opt is an instance of InfinityNormDensityBand and Zs is a vector of AbstractNormalSamples distributed according to a density $f$.

Fields:

  • a_min,a_max, kernel: These are the same as the fields in opt::InfinityNormDensityBand.
  • C∞: The half-width of the L∞ band.
  • fitted_kde: The fitted KernelDensity object.
Empirikos.FlatTopKernelType
FlatTopKernel(h) <: InfiniteOrderKernel

Implements the FlatTopKernel with bandwidth h to be used for kernel density estimation through the KernelDensity.jl package. The flat-top kernel is defined as follows:

\[K(x) = \frac{\sin^2(1.1x/2)-\sin^2(x/2)}{\pi x^2/ 20}.\]

Its use case is similar to the SincKernel; however, it has the advantage of being integrable (in the Lebesgue sense) and having bounded total variation. Its Fourier transform is the following:

\[K^*(t) = \begin{cases} 1, & \text{ if } |t|\leq 1 \\ 0, &\text{ if } |t| \geq 1.1 \\ 11-10|t|,& \text{ if } |t| \in [1,1.1] \end{cases}\]

julia> Empirikos.FlatTopKernel(0.1)
FlatTopKernel | bandwidth = 0.1
Empirikos.FoldedNormalSampleType
FoldedNormalSample(Z,σ)

An observed sample $Z$ equal to the absolute value of a draw from a Normal distribution with known variance $\sigma^2 > 0$.

\[Z = |Y|, Y\sim \mathcal{N}(\mu, \sigma^2)\]

$\mu$ is assumed unknown. The type above is used when the sample $Z$ is to be used for estimation or inference of $\mu$.

Empirikos.GaussianScaleMixtureClassType
GaussianScaleMixtureClass(σs) <: Empirikos.ConvexPriorClass

Type representing the family of mixtures of Gaussians with mean 0 and standard deviations equal to σs. GaussianScaleMixtureClass(σs) represents the same class of distributions as MixturePriorClass(Normal.(0, σs)).

julia> gcal = GaussianScaleMixtureClass([1.0,2.0])
GaussianScaleMixtureClass | σs = [1.0, 2.0]

julia> gcal([0.2,0.8])
MixtureModel{Normal{Float64}}(K = 2)
components[1] (prior = 0.2000): Normal{Float64}(μ=0.0, σ=1.0)
components[2] (prior = 0.8000): Normal{Float64}(μ=0.0, σ=2.0)
Empirikos.HalfCIWidthType
HalfCIWidth(n::Integer, α::Float64) <: DeltaTuner

A DeltaTuner that chooses the δ ≧ δ_min that optimizes the worst-case confidence interval width. Here n is the sample size used for estimation.

Empirikos.InfinityNormDensityBandType
InfinityNormDensityBand(;a_min,
                         a_max,
                         kernel  =  Empirikos.FlatTopKernel(),
                         bootstrap = :Multinomial,
                         nboot = 1000,
                         α = 0.05,
                         rng = Random.MersenneTwister(1)
                    )  <: FLocalization

This struct contains hyperparameters that will be used for constructing a neighborhood of the marginal density. The steps of the method (and corresponding hyperparameter meanings) are as follows

  • First a kernel density estimate $\bar{f}$ with kernel is fit to the data.
  • Second, a bootstrap (options: :Multinomial or Poisson) with nboot bootstrap replicates will be used to estimate $c_n$, such that:

\[\liminf_{n \to \infty}\mathbb{P}\left[\sup_{x \in [a_{\text{min}} , a_{\text{max}}]} | \bar{f}(x) - f(x)| \leq c_n\right] \geq 1-\alpha\]

Note that the bound is valid from a_min to a_max. α is the nominal level and finally rng sets the seed for the bootstrap samples.

Empirikos.KolmogorovSmirnovMinimumDistanceType
KolmogorovSmirnovMinimumDistance(convexclass, solver) <: Empirikos.EBayesMethod

Given $n$ i.i.d. samples from the empirical Bayes problem with prior $G$ known to lie in the convexclass $\mathcal{G}$, estimate $G$ as follows:

\[\widehat{G}_n \in \operatorname{argmin}_{G \in \mathcal{G}}\{\sup_{t \in \mathbb R}\lvert F_G(t) - \widehat{F}_n(t)\rvert\},\]

where $\widehat{F}_n$ is the ECDF of the samples. The optimization is conducted by a JuMP compatible solver.

Empirikos.LinearEBayesTargetType
LinearEBayesTarget <: EBayesTarget

Abstract type that describes Empirical Bayes estimands that are linear functionals of the prior G.

Empirikos.MarginalDensityType
MarginalDensity(Z::EBayesSample) <: LinearEBayesTarget

Example call

MarginalDensity(StandardNormalSample(2.0))

Description

Describes the marginal density evaluated at $Z=z$ (e.g. $Z=2$ in the example above). In the example above the sample is drawn from the hierarchical model

\[\mu \sim G, Z \sim \mathcal{N}(\mu,1)\]

In other words, letting $\varphi$ denote the standard normal pdf,

\[L(G) = \varphi \star dG(z)\]

Note that 2.0 has to be wrapped inside StandardNormalSample(2.0) since this target depends not only on G and the location, but also on the likelihood.

Empirikos.MixturePriorClassType
MixturePriorClass(components) <: Empirikos.ConvexPriorClass

Type representing the family of all mixture distributions with mixing components equal to components, i.e., it represents all MixtureModel distributions with components = components and probs taking values on the probability simplex.

Note that MixturePriorClass(components)(probs) == MixtureModel(components, probs).

Examples

julia> gcal = MixturePriorClass([Normal(0,1), Normal(0,2)])
MixturePriorClass (K = 2)
Normal{Float64}(μ=0.0, σ=1.0)
Normal{Float64}(μ=0.0, σ=2.0)

julia> gcal([0.2,0.8])
MixtureModel{Normal{Float64}}(K = 2)
components[1] (prior = 0.2000): Normal{Float64}(μ=0.0, σ=1.0)
components[2] (prior = 0.8000): Normal{Float64}(μ=0.0, σ=2.0)
Empirikos.NPMLEType
NPMLE(convexclass, solver) <: Empirikos.EBayesMethod

Given $n$ independent samples $Z_i$ from the empirical Bayes problem with prior $G$ known to lie in the convexclass $\mathcal{G}$, estimate $G$ by Nonparametric Maximum Likelihood (NPMLE)

\[\widehat{G}_n \in \operatorname{argmax}_{G \in \mathcal{G}}\left\{\sum_{i=1}^n \log( f_{i,G}(Z_i)) \right\},\]

where $f_{i,G}(z) = \int p_i(z \mid \mu) dG(\mu)$ is the marginal density of the $i$-th sample. The optimization is conducted by a JuMP compatible solver.

Empirikos.NonCentralHypergeometricSampleType
NonCentralHypergeometricSample

Empirical Bayes sample type used to represent a 2×2 contingency table drawn from Fisher's noncentral hypergeometric distribution conditionally on table margins. The goal is to conduct inference for the log odds ratio θ.

More concretely, suppose we observe the following contingency table.

|           | Outcome 1 | Outcome 2 |     |
|:---------:|:---------:|:---------:|:---:|
| Stratum 1 | Z₁        | X₁        | n₁  |
| Stratum 2 | Z₂        | X₂        | n₂  |
|           | Z₁pZ₂     | .         | .   |

This table can be turned into an empirical Bayes sample through either of the following calls:

NonCentralHypergeometricSample(Z₁, n₁, n₂, Z₁pZ₂)
NonCentralHypergeometricSample(Z₁, X₁, Z₂, X₂; margin_entries=false)

The likelihood of the above as a function of the log odds ratio θ is given by:

\[\frac{\binom{n_1}{Z_1}\binom{n_2}{Z_2} \exp(\theta Z_1)}{\sum_{t}\binom{n_1}{t}\binom{n_2}{Z_1pZ_2 - t}\exp(\theta t)}.\]
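
The likelihood above can be evaluated directly in plain Julia. This is a sketch with a hypothetical function name (`nchg_likelihood`) and hypothetical table entries, not the Empirikos implementation. The sum runs over the values of $t$ compatible with the margins, and at $\theta = 0$ the expression reduces to the central hypergeometric pmf, which gives a simple sanity check.

```julia
# Likelihood of the 2×2 table as a function of the log odds ratio θ,
# conditional on the margins n1, n2 and Z1pZ2 = Z1 + Z2.
function nchg_likelihood(θ::Real, Z1::Integer, n1::Integer, n2::Integer, Z1pZ2::Integer)
    Z2 = Z1pZ2 - Z1
    # Values of t (candidate Z1) compatible with the table margins
    tmin, tmax = max(0, Z1pZ2 - n2), min(n1, Z1pZ2)
    num = binomial(n1, Z1) * binomial(n2, Z2) * exp(θ * Z1)
    den = sum(binomial(n1, t) * binomial(n2, Z1pZ2 - t) * exp(θ * t) for t in tmin:tmax)
    return num / den
end

# Hypothetical table: Z1 = 3, n1 = 10, n2 = 12, Z1 + Z2 = 7.
lik0 = nchg_likelihood(0.0, 3, 10, 12, 7)     # central hypergeometric case
total = sum(nchg_likelihood(0.7, z, 10, 12, 7) for z in 0:7)
println((lik0, total))
```

For any fixed θ the likelihood sums to one over the admissible values of Z₁, as the second quantity confirms.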

Empirikos.NormalChiSquareSampleType
NormalChiSquareSample(Z, S², ν)

This type represents a tuple $(Z, S^2)$ consisting of the following two measurements:

  • Z, a Gaussian measurement $Z \sim \mathcal{N}(\mu, \sigma^2)$ centered around $\mu$ with variance $\sigma^2$,
  • S², an independent unbiased measurement $S^2$ of $\sigma^2$ whose law is the scaled $\chi^2$ distribution with ν ($\nu \geq 1$) degrees of freedom:

\[(Z, S^2) \, \sim \, \mathcal{N}(\mu, \sigma^2) \otimes \frac{\sigma^2}{\nu} \chi^2_{\nu}.\]

Here $\sigma^2 > 0$ and $\mu \in \mathbb R$ are assumed unknown. $(Z, S^2)$ is to be used for estimation or inference of $\mu$ and $\sigma^2$.

Empirikos.NormalSampleType
NormalSample(Z,σ)

An observed sample $Z$ drawn from a Normal distribution with known variance $\sigma^2 > 0$.

\[Z \sim \mathcal{N}(\mu, \sigma^2)\]

$\mu$ is assumed unknown. The type above is used when the sample $Z$ is to be used for estimation or inference of $\mu$.

julia> NormalSample(0.5, 1.0)          #Z=0.5, σ=1
N(0.5; μ, σ=1.0)
Empirikos.PoissonSampleType
PoissonSample(Z, E)

An observed sample $Z$ drawn from a Poisson distribution,

\[Z \sim \text{Poisson}(\mu \cdot E).\]

The multiplying intensity $E$ is assumed to be known (and equal to 1.0 by default), while $\mu$ is assumed unknown. The type above is used when the sample $Z$ is to be used for estimation or inference of $\mu$.

julia> PoissonSample(3)
𝒫ℴ𝒾(3; μ)
julia> PoissonSample(3, 1.5)
𝒫ℴ𝒾(3; μ⋅1.5)
Empirikos.PosteriorDensityType
PosteriorDensity(Z::EBayesSample, μ) <: AbstractPosteriorTarget

Type representing the posterior density given Z at $\mu$, i.e.,

\[p_G(\mu \mid Z_i = z)\]

Empirikos.PosteriorMeanType
PosteriorMean(Z::EBayesSample) <: AbstractPosteriorTarget

Type representing the posterior mean, i.e.,

\[E_G[\mu_i \mid Z_i = z]\]

Empirikos.PosteriorProbabilityType
PosteriorProbability(Z::EBayesSample, s) <: AbstractPosteriorTarget

Type representing the posterior probability, i.e.,

\[\mathbb{P}_G[\mu_i \in s \mid Z_i = z]\]

Empirikos.PosteriorSecondMomentType
PosteriorSecondMoment(Z::EBayesSample) <: AbstractPosteriorTarget

Type representing the second moment of the posterior centered around c, i.e.,

\[E_G[(\mu_i-c)^2 \mid Z_i = z]\]

Empirikos.PosteriorVarianceType
PosteriorVariance(Z::EBayesSample) <: AbstractPosteriorTarget

Type representing the posterior variance, i.e.,

\[V_G[\mu_i \mid Z_i = z]\]

Empirikos.PriorDensityType
PriorDensity(z::Float64) <: LinearEBayesTarget

Example call

julia> PriorDensity(2.0)
PriorDensity{Float64}(2.0)

Description

This is the evaluation functional of the density of $G$ at z, i.e., $L(G) = G'(z) = g(z)$ or in Julia code L(G) = pdf(G, z).

Empirikos.RMSEType
RMSE(n::Integer) <: DeltaTuner

A DeltaTuner that optimizes the worst-case (root) mean squared error. Here n is the sample size used for estimation.

Empirikos.ScaledChiSquareSampleType
ScaledChiSquareSample(Z, ν)

An observed sample $Z$ drawn from a scaled chi-square distribution with unknown scale $\sigma^2 > 0$.

\[Z \sim \frac{\sigma^2}{\nu}\chi^2_{\nu}\]

$\sigma^2$ is assumed unknown. The type above is used when the sample $Z$ is to be used for estimation or inference of $\sigma^2$.

Empirikos.SincKernelType
SincKernel(h) <: InfiniteOrderKernel

Implements the SincKernel with bandwidth h to be used for kernel density estimation through the KernelDensity.jl package. The sinc kernel is defined as follows:

\[K_{\text{sinc}}(x) = \frac{\sin(x)}{\pi x}\]

It is not typically used for kernel density estimation, because this kernel is not a density itself. However, it is particularly well suited to deconvolution problems and estimation of very smooth densities because its Fourier transform is the following:

\[K^*_{\text{sinc}}(t) = \mathbf 1( t \in [-1,1])\]

Empirikos.StandardNormalSampleType
StandardNormalSample(Z)

An observed sample $Z$ drawn from a Normal distribution with known variance $\sigma^2 =1$.

\[Z \sim \mathcal{N}(\mu, 1)\]

$\mu$ is assumed unknown. The type above is used when the sample $Z$ is to be used for estimation or inference of $\mu$.

julia> StandardNormalSample(0.5)          #Z=0.5
N(0.5; μ, σ=1.0)
Empirikos.SymmetricDiscretePriorClassType
SymmetricDiscretePriorClass(support) <: Empirikos.ConvexPriorClass

Type representing the family of all symmetric discrete distributions supported on a subset of support ∪ -support, i.e., it represents all DiscreteNonParametric distributions with support = [support;-support] and probs taking values on the probability simplex (so that components with the same magnitude but opposite sign have the same probability). support should include the nonnegative support points only.

Empirikos.TruncatedPoissonSampleType
TruncatedPoissonSample(Z, E)

An observed sample $Z$ drawn from a truncated Poisson distribution,

\[Z \sim \text{Poisson}(\mu \cdot E) \mid Z \geq 1.\]

The multiplying intensity $E$ is assumed to be known (and equal to 1.0 by default), while $\mu$ is assumed unknown. The type above is used when the sample $Z$ is to be used for estimation or inference of $\mu$.

TruncatedPoissonSample(3)
TruncatedPoissonSample(3, 1.5)
Base.denominatorMethod
Base.denominator(target::AbstractPosteriorTarget)

Suppose a posterior target $\theta_G(z)$, such as the posterior mean, can be written as:

\[\theta_G(z) = \frac{ a_G(z)}{f_G(z)} = \frac{ \int h(\mu)dG(\mu)}{\int p(z \mid \mu)dG(\mu)}.\]

For example, for the posterior mean $h(\mu) = \mu \cdot p(z \mid \mu)$. Then Base.denominator returns the linear functional representing $G \mapsto f_G(z)$ (i.e., typically the marginal density). Also see Base.numerator(::AbstractPosteriorTarget).

Base.numeratorMethod
Base.numerator(target::AbstractPosteriorTarget)

Suppose a posterior target $\theta_G(z)$, such as the posterior mean, can be written as:

\[\theta_G(z) = \frac{ a_G(z)}{f_G(z)} = \frac{ \int h(\mu)dG(\mu)}{\int p(z \mid \mu)dG(\mu)}.\]

For example, for the posterior mean $h(\mu) = \mu \cdot p(z \mid \mu)$. Then Base.numerator returns the linear functional representing $G \mapsto a_G(z)$.
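
To make the ratio representation concrete, the following plain-Julia sketch computes the numerator and denominator functionals for the posterior mean under a discrete prior and a $\mathcal{N}(\mu, 1)$ likelihood. The function names and the two-point prior are hypothetical and the code does not call Empirikos. Both functionals are linear in the prior weights, which is what makes the ratio form convenient.

```julia
# Standard normal likelihood p(z | μ) for Z ~ N(μ, 1).
normal_lik(z, μ) = exp(-(z - μ)^2 / 2) / sqrt(2π)

# Numerator a_G(z) = ∫ μ p(z | μ) dG(μ) for a discrete prior G.
numerator_functional(support, probs, z) =
    sum(μ * normal_lik(z, μ) * p for (μ, p) in zip(support, probs))

# Denominator f_G(z) = ∫ p(z | μ) dG(μ), i.e. the marginal density at z.
denominator_functional(support, probs, z) =
    sum(normal_lik(z, μ) * p for (μ, p) in zip(support, probs))

support, probs = [-1.0, 1.0], [0.5, 0.5]   # hypothetical two-point prior
z = 0.8
posterior_mean = numerator_functional(support, probs, z) /
                 denominator_functional(support, probs, z)
println(posterior_mean)
```

For this symmetric two-point prior the posterior mean has the closed form $\tanh(z)$, which the ratio reproduces.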

Distributions.ccdfMethod
ccdf(prior::Distribution, Z::EBayesSample)

Given a prior $G$ and EBayesSample $Z$, evaluate the complementary CDF of the marginal distribution of $Z$ at response(Z).

Distributions.cdfMethod
cdf(prior::Distribution, Z::EBayesSample)

Given a prior $G$ and EBayesSample $Z$, evaluate the CDF of the marginal distribution of $Z$ at response(Z).

Distributions.cfMethod
cf(::LinearEBayesTarget, t)

The characteristic function of $L(\cdot)$, a LinearEBayesTarget, which we define as follows:

For $L(\cdot)$ which may be written as $L(G) = \int \psi(\mu)dG(\mu)$ (for a measurable function $\psi$), this returns the Fourier transform of $\psi$ evaluated at t, i.e., $\psi^*(t) = \int \exp(it x)\psi(x)dx$. Note that $\psi^*(t)$ is such that for distributions $G$ with density $g$ (and $g^*$ the Fourier Transform of $g$) the following holds:

\[L(G) = \frac{1}{2\pi}\int g^*(\mu)\psi^*(\mu) d\mu\]

Distributions.pdfMethod
pdf(prior::Distribution, Z::EBayesSample)

Given a prior $G$ and EBayesSample $Z$, compute the marginal density of Z.

Examples

julia> Z = StandardNormalSample(1.0)
N(1.0; μ, σ=1.0)
julia> prior = Normal(2.0, sqrt(3))
Normal{Float64}(μ=2.0, σ=1.7320508075688772)
julia> pdf(prior, Z)
0.17603266338214976
julia> pdf(Normal(2.0, 2.0), 1.0)
0.17603266338214976
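
The agreement of the last two outputs reflects the fact that a $\mathcal{N}(2, 3)$ prior convolved with $\mathcal{N}(0, 1)$ noise gives a $\mathcal{N}(2, 4)$ marginal. As a sketch, the same number can be recovered in plain Julia by integrating $\int p(z \mid \mu) g(\mu) d\mu$ numerically. The helper names, grid width and step size below are arbitrary choices, not part of Empirikos.

```julia
normal_pdf(x, μ, σ) = exp(-((x - μ) / σ)^2 / 2) / (σ * sqrt(2π))

# Marginal density f_G(z) = ∫ p(z | μ) g(μ) dμ, approximated by a Riemann sum
# over a grid centered at the prior mean.
function marginal_density(z; prior_mean = 2.0, prior_sd = sqrt(3), h = 1e-3)
    grid = (prior_mean - 15):h:(prior_mean + 15)
    return h * sum(normal_pdf(z, μ, 1.0) * normal_pdf(μ, prior_mean, prior_sd) for μ in grid)
end

numeric = marginal_density(1.0)
closed_form = normal_pdf(1.0, 2.0, 2.0)   # N(2, 3 + 1) has standard deviation 2
println((numeric, closed_form))
```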
Empirikos.likelihood_distributionFunction
likelihood_distribution(Z::EBayesSample, μ::Number)

Returns the distribution $p(\cdot \mid \mu)$ of $Z \mid \mu$ (the return type being a Distributions.jl Distribution).

Examples

julia> likelihood_distribution(StandardNormalSample(1.0), 2.0)
Normal{Float64}(μ=2.0, σ=1.0)
Empirikos.marginalizeFunction
marginalize(Z::EBayesSample, prior::Distribution)

Given a prior distribution $G$ and EBayesSample $Z$, return the marginal distribution of $Z$. Works for EBayesSample{Missing}, i.e., no realization is needed.

Examples

julia> marginalize(StandardNormalSample(1.0), Normal(2.0, sqrt(3)))
Normal{Float64}(μ=2.0, σ=1.9999999999999998)

StatsAPI.confintMethod
StatsBase.confint(method::AMARI,
                  target::Empirikos.EBayesTarget,
                  Zs;
                  level=0.95)

Form a confidence interval for the Empirikos.EBayesTarget target with coverage given by level, based on the samples Zs, using the AMARI method.

StatsAPI.fitMethod
fit(test::EmpiricalPartiallyBayesMultipleTest, Zs::AbstractArray{<:NormalChiSquareSample})

Fit the empirical partially Bayes multiple testing model.

Arguments

  • test: An EmpiricalPartiallyBayesMultipleTest object.
  • Zs: An array of NormalChiSquareSample objects.

Returns

A named tuple containing the following fields:

  • method: The EmpiricalPartiallyBayesMultipleTest object.
  • prior: The estimated prior distribution.
  • pvalue: An array of empirical partially Bayes p-values.
  • cutoff: The cutoff value (such that all hypotheses with pvalue ≤ cutoff are rejected).
  • adjp: An array of adjusted p-values.
  • rj_idx: An array of rejection indicators.
  • total_rejections: The total number of rejections.
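
To illustrate how adjp, rj_idx and total_rejections relate under a Benjamini-Hochberg style procedure, here is a generic plain-Julia sketch. It is an assumption-laden illustration: `bh_adjust` is a hypothetical helper, the p-values are made up, and this is neither the MultipleTesting.jl code nor the Empirikos fitting routine.

```julia
# Benjamini-Hochberg adjusted p-values (generic textbook sketch).
function bh_adjust(pvalues::AbstractVector{<:Real})
    n = length(pvalues)
    order = sortperm(pvalues)
    adjusted = similar(pvalues, Float64)
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity of the adjustment.
    for rank in n:-1:1
        idx = order[rank]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    end
    return adjusted
end

pvalues = [0.001, 0.02, 0.03, 0.5]   # hypothetical p-values
adjp = bh_adjust(pvalues)
rj_idx = adjp .<= 0.05                # rejection indicators at α = 0.05
total_rejections = sum(rj_idx)
println((adjp, rj_idx, total_rejections))
```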
StatsAPI.responseMethod
response(Z::EBayesSample{T})

Returns the concrete realization of Z as type T, thus dropping the information about the likelihood.

Examples

julia> response(StandardNormalSample(1.0))
1.0
Empirikos.NeighborhoodsModule
Neighborhoods

The impact of Neighborhoods: Moving to opportunity

The reference for this dataset is the following:

Raj Chetty and Nathaniel Hendren. The impacts of neighborhoods on intergenerational mobility II: County-level estimates. The Quarterly Journal of Economics, 133(3):1163–1228, 2018.

Empirikos.ThyrionModule
Thyrion

LA ROYALE BELGE

A statistic covering vehicles in the category 'Tourism and Business' and
belonging to the 2 lower classes of the rate, observed all during an entire year,
gave the following results in which:
Empirikos.ProstateModule
Prostate

The dataset is from the following reference:

Dinesh Singh, Phillip G. Febbo, Kenneth Ross, Donald G. Jackson, Judith Manola, Christine Ladd, Pablo Tamayo, Andrew A. Renshaw, Anthony V. D’Amico, Jerome P. Richie, Eric S. Lander, Massimo Loda, Philip W. Kantoff, Todd R. Golub, and William R. Sellers. Gene expression correlates of clinical prostate cancer behavior. Cancer cell, 1(2): 203–209, 2002.

See the following monograph for further illustrations of empirical Bayes methods on this dataset:

Bradley Efron. Large-scale inference: Empirical Bayes methods for estimation, testing, and prediction. Cambridge University Press, 2012.