Examples
Ratio of two means
Consider a setting where independent pairs of random variables $(X_1, Y_1), \ldots, (X_n, Y_n)$ are observed, and suppose that interest is in the ratio of the mean of $Y_i$ to the mean of $X_i$, that is $\theta = \mu_Y / \mu_X$, with $\mu_X = E(X_i) \ne 0$ and $\mu_Y = E(Y_i)$ $(i = 1, \ldots, n)$.
Assuming that sampling is from an infinite population, one way of estimating $\theta$ without any further assumptions about the joint distribution of $(X_i, Y_i)$ is to solve the unbiased estimating equation $\sum_{i = 1}^n (Y_i - \theta X_i) = 0$. The resulting $M$-estimator is $\hat\theta = s_Y/s_X$, where $s_X = \sum_{i = 1}^n X_i$ and $s_Y = \sum_{i = 1}^n Y_i$.
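To see this, note that $\sum_{i = 1}^n (Y_i - \theta X_i) = s_Y - \theta s_X$, which is zero exactly when $\theta = s_Y / s_X$.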
The estimator $\hat\theta$ is generally biased, as can be shown, for example, by an application of Jensen's inequality when $X_i$ is independent of $Y_i$, and its bias can be reduced using the empirically adjusted estimating functions approach of Kosmidis & Lunardon (2020).
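A sketch of why, assuming additionally that the $X_i$ are positive and that $\mu_Y > 0$: by independence, $E(\hat\theta) = E(s_Y)\,E(1/s_X)$, and Jensen's inequality applied to the convex map $t \mapsto 1/t$ on $(0, \infty)$ gives $E(1/s_X) \ge 1/E(s_X)$, so $E(\hat\theta) \ge E(s_Y)/E(s_X) = \theta$, with strict inequality unless $s_X$ is degenerate.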
This example illustrates how GEEBRA can be used to calculate the $M$-estimator and its reduced-bias version.
julia> using GEEBRA, Random
Define a data type for ratio estimation problems
julia> struct ratio_data
           y::Vector
           x::Vector
       end;
Write a function to compute the number of observations for objects of type ratio_data.
julia> function ratio_nobs(data::ratio_data)
           nx = length(data.x)
           ny = length(data.y)
           if (nx != ny)
               error("length of x is not equal to the length of y")
           end
           nx
       end;
Generate some data to test things out
julia> Random.seed!(123);
julia> my_data = ratio_data(randn(10), rand(10));
julia> ratio_nobs(my_data)
10
The estimating function for the ratio $\theta$ is
$\sum_{i = 1}^n (Y_i - \theta X_i)$
So, the contribution of the $i$th observation to the estimating function can be implemented as
julia> function ratio_ef(theta::Vector,
                         data::ratio_data,
                         i::Int64)
           data.y[i] .- theta * data.x[i]
       end;
The estimating_function_template for the ratio estimation problem can now be set up using ratio_nobs and ratio_ef.
julia> ratio_template = estimating_function_template(ratio_nobs, ratio_ef);
We are now ready to use ratio_template and my_data to compute the $M$-estimator of $\theta$ by solving the estimating equation $\sum_{i = 1}^n (Y_i - \theta X_i) = 0$. The starting value for the nonlinear solver is set to 0.1.
julia> result_m = fit(ratio_template, my_data, [0.1])
M-estimation with estimating function contributions ratio_ef
────────────────────────────────────────────────────────────────────────
Estimate Std. Error z value Pr(>|z|) Lower 95% Upper 95%
────────────────────────────────────────────────────────────────────────
theta[1] 1.07548 0.573615 1.87492 0.0608 -0.0487819 2.19975
────────────────────────────────────────────────────────────────────────
Estimating functions: [-2.051803171809752e-12]
fit uses methods from the NLsolve package for solving the estimating equations. Arguments can be passed directly to NLsolve.nlsolve through keyword arguments to the fit method. For example,
julia> result_m = fit(ratio_template, my_data, [0.1], show_trace = true)
Iter f(x) inf-norm Step 2-norm
------ -------------- --------------
0 4.321212e+00 NaN
1 3.878230e+00 1.000000e-01
2 2.992266e+00 2.000000e-01
3 1.220339e+00 4.000000e-01
4 2.051803e-12 2.754829e-01
M-estimation with estimating function contributions ratio_ef
────────────────────────────────────────────────────────────────────────
Estimate Std. Error z value Pr(>|z|) Lower 95% Upper 95%
────────────────────────────────────────────────────────────────────────
theta[1] 1.07548 0.573615 1.87492 0.0608 -0.0487819 2.19975
────────────────────────────────────────────────────────────────────────
Estimating functions: [-2.051803171809752e-12]
Bias reduction in general $M$-estimation can be achieved by solving the adjusted estimating equation $\sum_{i = 1}^n (Y_i - \theta X_i) + A(\theta) = 0$, where $A(\theta)$ is an empirical bias-reducing adjustment that depends on the first and second derivatives of the estimating function contributions. GEEBRA can use ratio_template and automatic differentiation (via the ForwardDiff package) to construct $A(\theta)$ and then solve the bias-reducing adjusted estimating equations. All this is done simply by
julia> result_br = fit(ratio_template, my_data, [0.1], estimation_method = "RBM")
RBM-estimation with estimating function contributions ratio_ef
Bias reduction method: implicit_trace
────────────────────────────────────────────────────────────────────────
Estimate Std. Error z value Pr(>|z|) Lower 95% Upper 95%
────────────────────────────────────────────────────────────────────────
theta[1] 1.06754 0.573499 1.86146 0.0627 -0.0564928 2.19158
────────────────────────────────────────────────────────────────────────
Adjusted estimating functions: [-1.3371248552829229e-14]
where RBM stands for reduced-bias $M$-estimation.
Kosmidis & Lunardon (2020) show that the reduced-bias estimator of $\theta$ is $\tilde\theta = (s_Y + s_{XY}/s_{X})/(s_X + s_{XX}/s_{X})$, where $s_{XX} = \sum_{i = 1}^n X_i^2$ and $s_{XY} = \sum_{i = 1}^n X_i Y_i$. The code chunk below tests that this is indeed the result GEEBRA returns.
julia> sx = sum(my_data.x);
julia> sxx = sum(my_data.x .* my_data.x);
julia> sy = sum(my_data.y);
julia> sxy = sum(my_data.x .* my_data.y);
julia> isapprox(sy/sx, result_m.theta[1])
true
julia> isapprox((sy + sxy/sx)/(sx + sxx/sx), result_br.theta[1])
true
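To see the adjustment in action beyond a single dataset, here is a minimal simulation sketch (not part of the original example; empirical_biases is a name introduced here, and the sketch relies only on the closed-form expressions for $\hat\theta$ and $\tilde\theta$ above):
julia> using Statistics
julia> function empirical_biases(nsim, n)
           theta = 2.0                      # true ratio: mu_Y / mu_X = 2 / 1
           m_est = Vector{Float64}(undef, nsim)
           br_est = Vector{Float64}(undef, nsim)
           for r in 1:nsim
               x = rand(n) .+ 0.5           # positive covariate, with mu_X = 1
               y = randn(n) .+ 2.0          # mu_Y = 2
               sx, sy = sum(x), sum(y)
               sxx, sxy = sum(x .* x), sum(x .* y)
               m_est[r] = sy / sx                             # M-estimator
               br_est[r] = (sy + sxy / sx) / (sx + sxx / sx)  # reduced-bias version
           end
           (mean(m_est) - theta, mean(br_est) - theta)
       end;
julia> biases = empirical_biases(10000, 10);
The second component of biases (the empirical bias of $\tilde\theta$) should typically be closer to zero than the first, in line with the results in Kosmidis & Lunardon (2020).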
Logistic regression
Using objective_function_template
Here, we use GEEBRA's objective_function_template to estimate a logistic regression model using maximum likelihood and maximum penalized likelihood, with the empirical bias-reducing penalty in Kosmidis & Lunardon (2020).
julia> using GEEBRA
julia> using Random
julia> using Distributions
julia> using Optim
A data type for logistic regression models (consisting of a response vector y, a model matrix x, and a vector of weights m) is
julia> struct logistic_data
           y::Vector
           x::Array{Float64}
           m::Vector
       end
A function to compute the number of observations from logistic_data objects is
julia> function logistic_nobs(data::logistic_data)
           nx = size(data.x)[1]
           ny = length(data.y)
           nm = length(data.m)
           if (nx != ny)
               error("number of rows of x is not equal to the length of y")
           elseif (nx != nm)
               error("number of rows of x is not equal to the length of m")
           elseif (ny != nm)
               error("length of y is not equal to the length of m")
           end
           nx
       end
logistic_nobs (generic function with 1 method)
The logistic regression log-likelihood contribution at a parameter value theta for the $i$th observation of a data object data is $y_i \log \mu_i + (m_i - y_i) \log (1 - \mu_i)$, with $\mu_i = e^{\eta_i} / (1 + e^{\eta_i})$ and $\eta_i = x_i^\top \theta$. In code,
julia> function logistic_loglik(theta::Vector,
                                data::logistic_data,
                                i::Int64)
           eta = sum(data.x[i, :] .* theta)
           mu = exp.(eta) ./ (1 .+ exp.(eta))
           data.y[i] .* log.(mu) + (data.m[i] - data.y[i]) .* log.(1 .- mu)
       end
logistic_loglik (generic function with 1 method)
Let's simulate some logistic regression data with $10$ covariates
julia> Random.seed!(123);
julia> n = 100;
julia> m = 1;
julia> p = 10
10
julia> x = Array{Float64}(undef, n, p);
julia> x[:, 1] .= 1.0;
julia> for j in 2:p
x[:, j] .= rand(n);
end
julia> true_betas = randn(p) * sqrt(p);
julia> y = rand.(Binomial.(m, cdf.(Logistic(), x * true_betas)));
julia> my_data = logistic_data(y, x, fill(m, n));
and set up an objective_function_template for logistic regression
julia> logistic_template = objective_function_template(logistic_nobs, logistic_loglik)
objective_function_template(Main.ex-2.logistic_nobs, Main.ex-2.logistic_loglik)
The maximum likelihood estimates starting at true_betas are
julia> o1_ml = fit(logistic_template, my_data, true_betas, optim_method = NelderMead())
M-estimation with objective contributions logistic_loglik
──────────────────────────────────────────────────────────────────────────
Estimate Std. Error z value Pr(>|z|) Lower 95% Upper 95%
──────────────────────────────────────────────────────────────────────────
theta[1] 1.03786 2.46744 0.42062 0.6740 -3.79824 5.87395
theta[2] -6.33539 1.53213 -4.13502 <1e-4 -9.33832 -3.33247
theta[3] -6.76459 1.63753 -4.13096 <1e-4 -9.9741 -3.55508
theta[4] 0.994292 1.1077 0.89762 0.3694 -1.17676 3.16534
theta[5] 5.4302 1.6185 3.35508 0.0008 2.25799 8.6024
theta[6] 6.58354 1.92567 3.41884 0.0006 2.8093 10.3578
theta[7] 4.92474 1.39467 3.53111 0.0004 2.19123 7.65824
theta[8] 1.29824 1.25277 1.0363 0.3001 -1.15714 3.75362
theta[9] -3.00981 1.43218 -2.10156 0.0356 -5.81682 -0.202796
theta[10] -4.96494 1.85256 -2.68005 0.0074 -8.59589 -1.334
──────────────────────────────────────────────────────────────────────────
Maximum objetive: -31.9044
Takeuchi information criterion: 83.6381
Akaike information criterion: 83.8088
fit uses methods from the Optim package internally. Here, we used the Optim.NelderMead method. Alternative optimization methods and options can be supplied directly through the keyword arguments optim_method and optim_options, respectively. For example,
julia> o2_ml = fit(logistic_template, my_data, true_betas, optim_method = LBFGS(), optim_options = Optim.Options(g_abstol = 1e-05))
M-estimation with objective contributions logistic_loglik
───────────────────────────────────────────────────────────────────────────
Estimate Std. Error z value Pr(>|z|) Lower 95% Upper 95%
───────────────────────────────────────────────────────────────────────────
theta[1] 1.03787 2.46744 0.420628 0.6740 -3.79822 5.87397
theta[2] -6.33546 1.53214 -4.13505 <1e-4 -9.3384 -3.33253
theta[3] -6.76458 1.63752 -4.13098 <1e-4 -9.97407 -3.55509
theta[4] 0.994237 1.1077 0.897571 0.3694 -1.17681 3.16528
theta[5] 5.43023 1.61849 3.35512 0.0008 2.25805 8.60241
theta[6] 6.58354 1.92566 3.41885 0.0006 2.80932 10.3578
theta[7] 4.9247 1.39466 3.53111 0.0004 2.19122 7.65818
theta[8] 1.29827 1.25278 1.03632 0.3001 -1.15712 3.75367
theta[9] -3.00978 1.43216 -2.10156 0.0356 -5.81676 -0.202795
theta[10] -4.96494 1.85254 -2.68006 0.0074 -8.59585 -1.33402
───────────────────────────────────────────────────────────────────────────
Maximum objetive: -31.9044
Takeuchi information criterion: 83.6381
Akaike information criterion: 83.8088
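The Nelder–Mead and L-BFGS fits agree closely. As a quick check (not in the original example; the tolerance is chosen just to reflect the agreement visible in the two tables above):
julia> isapprox(coef(o1_ml), coef(o2_ml); atol = 1e-3)
true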
The reduced-bias estimates starting at the maximum likelihood ones are
julia> o1_br = fit(logistic_template, my_data, coef(o1_ml), estimation_method = "RBM")
RBM-estimation with objective contributions logistic_loglik
Bias reduction method: implicit_trace
───────────────────────────────────────────────────────────────────────────
Estimate Std. Error z value Pr(>|z|) Lower 95% Upper 95%
───────────────────────────────────────────────────────────────────────────
theta[1] 0.928169 2.20201 0.42151 0.6734 -3.38769 5.24402
theta[2] -5.4578 1.20924 -4.51341 <1e-5 -7.82787 -3.08773
theta[3] -5.81835 1.26797 -4.58872 <1e-5 -8.30352 -3.33318
theta[4] 0.81906 0.980035 0.835746 0.4033 -1.10177 2.73989
theta[5] 4.64568 1.30015 3.5732 0.0004 2.09744 7.19393
theta[6] 5.66444 1.54657 3.66258 0.0002 2.63321 8.69566
theta[7] 4.16222 1.11835 3.72174 0.0002 1.97028 6.35415
theta[8] 1.14271 1.13618 1.00575 0.3145 -1.08416 3.36957
theta[9] -2.58574 1.2083 -2.13999 0.0324 -4.95396 -0.217525
theta[10] -4.22949 1.49518 -2.82876 0.0047 -7.15999 -1.299
───────────────────────────────────────────────────────────────────────────
Maximum penalized objetive: -36.5538
Takeuchi information criterion: 81.9311
Akaike information criterion: 84.284
Using estimating_function_template
The same results as above can be reproduced using an estimating_function_template for logistic regression.
The contribution of the $i$th observation to the gradient of the log-likelihood for logistic regression is $x_i (y_i - m_i \mu_i)$, which can be implemented as
julia> function logistic_ef(theta::Vector,
                            data::logistic_data,
                            i::Int64)
           eta = sum(data.x[i, :] .* theta)
           mu = exp.(eta) ./ (1 .+ exp.(eta))
           data.x[i, :] * (data.y[i] - data.m[i] * mu)
       end
logistic_ef (generic function with 1 method)
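As a quick sanity check (not in the original example), logistic_ef matches the ForwardDiff gradient of logistic_loglik:
julia> using ForwardDiff
julia> isapprox(ForwardDiff.gradient(b -> logistic_loglik(b, my_data, 1), true_betas),
                logistic_ef(true_betas, my_data, 1))
true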
Then, solving the bias-reducing adjusted estimating equations
julia> logistic_template_ef = estimating_function_template(logistic_nobs, logistic_ef);
julia> e1_br = fit(logistic_template_ef, my_data, true_betas, estimation_method = "RBM")
RBM-estimation with estimating function contributions logistic_ef
Bias reduction method: implicit_trace
───────────────────────────────────────────────────────────────────────────
Estimate Std. Error z value Pr(>|z|) Lower 95% Upper 95%
───────────────────────────────────────────────────────────────────────────
theta[1] 0.928169 2.20201 0.42151 0.6734 -3.38769 5.24402
theta[2] -5.4578 1.20924 -4.51341 <1e-5 -7.82787 -3.08773
theta[3] -5.81835 1.26797 -4.58872 <1e-5 -8.30352 -3.33318
theta[4] 0.81906 0.980035 0.835746 0.4033 -1.10177 2.73989
theta[5] 4.64568 1.30015 3.5732 0.0004 2.09744 7.19393
theta[6] 5.66444 1.54657 3.66258 0.0002 2.63321 8.69566
theta[7] 4.16222 1.11835 3.72174 0.0002 1.97028 6.35415
theta[8] 1.14271 1.13618 1.00575 0.3145 -1.08416 3.36957
theta[9] -2.58574 1.2083 -2.13999 0.0324 -4.95396 -0.217525
theta[10] -4.22949 1.49518 -2.82876 0.0047 -7.15999 -1.299
───────────────────────────────────────────────────────────────────────────
Adjusted estimating functions: [-5.200950781159008e-12; -7.726777551120279e-12; … ; -5.21382936824466e-12; -4.383812757247085e-12]
returns the reduced-bias estimates from maximum penalized likelihood:
julia> isapprox(coef(o1_br), coef(e1_br))
true
Bias-reduction methods
GEEBRA currently implements two alternative bias reduction methods, called implicit_trace and explicit_trace. implicit_trace adjusts the estimating functions or penalizes the objectives, as we have seen earlier. explicit_trace, on the other hand, forms an estimate of the bias of the $M$-estimator and subtracts it from the $M$-estimates. The default method is implicit_trace.
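Schematically, if $\hat\theta$ denotes the $M$-estimator and $\widehat{B}(\hat\theta)$ the estimate of its first-order bias (notation introduced here just for illustration), explicit_trace returns $\tilde\theta = \hat\theta - \widehat{B}(\hat\theta)$.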
For example, for logistic regression via estimating functions
julia> e2_br = fit(logistic_template_ef, my_data, true_betas, estimation_method = "RBM", br_method = "explicit_trace")
RBM-estimation with estimating function contributions logistic_ef
Bias reduction method: explicit_trace
───────────────────────────────────────────────────────────────────────────
Estimate Std. Error z value Pr(>|z|) Lower 95% Upper 95%
───────────────────────────────────────────────────────────────────────────
theta[1] 0.895301 2.10014 0.426305 0.6699 -3.2209 5.0115
theta[2] -5.11645 1.11115 -4.60463 <1e-5 -7.29427 -2.93863
theta[3] -5.46494 1.15431 -4.73437 <1e-5 -7.72736 -3.20253
theta[4] 0.74801 0.936884 0.798402 0.4246 -1.08825 2.58427
theta[5] 4.34256 1.20242 3.6115 0.0003 1.98585 6.69926
theta[6] 5.29117 1.42302 3.71828 0.0002 2.50211 8.08023
theta[7] 3.87517 1.0339 3.74812 0.0002 1.84877 5.90158
theta[8] 1.07407 1.09374 0.98202 0.3261 -1.06962 3.21776
theta[9] -2.42976 1.13518 -2.14042 0.0323 -4.65466 -0.204855
theta[10] -3.93176 1.37586 -2.85768 0.0043 -6.62839 -1.23512
───────────────────────────────────────────────────────────────────────────
Adjusted estimating functions: [0.04705609374710429; -0.006424930231929443; … ; -0.0238329565743529; -0.034942423671889]
which gives slightly different estimates than those from the implicit_trace fit in e1_br.
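Indeed, the two sets of reduced-bias estimates are not numerically equal (a check not in the original example):
julia> isapprox(coef(e1_br), coef(e2_br))
false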
The same can be done using objective functions, but numerical differentiation (using the FiniteDiff package) is used to approximate the gradient of the bias-reducing penalty (i.e. $A(\theta)$).
julia> o2_br = fit(logistic_template, my_data, true_betas, estimation_method = "RBM", br_method = "explicit_trace")
RBM-estimation with objective contributions logistic_loglik
Bias reduction method: explicit_trace
───────────────────────────────────────────────────────────────────────────
Estimate Std. Error z value Pr(>|z|) Lower 95% Upper 95%
───────────────────────────────────────────────────────────────────────────
theta[1] 0.895301 2.10014 0.426305 0.6699 -3.2209 5.0115
theta[2] -5.11645 1.11115 -4.60463 <1e-5 -7.29427 -2.93863
theta[3] -5.46494 1.15431 -4.73437 <1e-5 -7.72736 -3.20253
theta[4] 0.74801 0.936884 0.798402 0.4246 -1.08825 2.58427
theta[5] 4.34256 1.20242 3.6115 0.0003 1.98585 6.69926
theta[6] 5.29117 1.42302 3.71828 0.0002 2.50211 8.08023
theta[7] 3.87517 1.0339 3.74812 0.0002 1.84877 5.90158
theta[8] 1.07407 1.09374 0.98202 0.3261 -1.06962 3.21776
theta[9] -2.42976 1.13518 -2.14042 0.0323 -4.65466 -0.204855
theta[10] -3.93176 1.37586 -2.85768 0.0043 -6.62839 -1.23512
───────────────────────────────────────────────────────────────────────────
Maximum penalized objetive: -36.607
Takeuchi information criterion: 81.6646
Akaike information criterion: 84.7635
julia> isapprox(coef(e2_br), coef(o2_br))
true