# Examples

## Ratio of two means

Consider a setting where independent pairs of random variables $(X_1, Y_1), \ldots, (X_n, Y_n)$ are observed, and suppose that interest is in the ratio of the mean of $Y_i$ to the mean of $X_i$, that is $\theta = \mu_Y / \mu_X$, with $\mu_X = E(X_i)$ and $\mu_Y = E(Y_i) \ne 0$$(i = 1, \ldots, n)$.

Assuming that sampling is from an infinite population, one way of estimating $\theta$ without any further assumptions about the joint distribution of $(X_i, Y_i)$ is to set the unbiased estimating equation $\sum_{i = 1}^n (Y_i - \theta X_i) = 0$. The resulting $M$-estimator is then $\hat\theta = s_Y/s_X$ where $s_X = \sum_{i = 1}^n X_i$ and $s_Y = \sum_{i = 1}^n Y_i$.

The estimator $\hat\theta$ is generally biased, as can be shown, for example, by an application of the Jensen inequality assuming that $X_i$is independent of $Y_i$, and its bias can be reduced using the empirically adjusted estimating functions approach in Kosmidis & Lunardon (2020).

This example illustrates how GEEBRA can be used to calculate the $M$-estimator and its reduced-bias version.

julia> using GEEBRA, Random

Define a data type for ratio estimation problems

julia> struct ratio_data
y::Vector
x::Vector
end;

Write a function to compute the number of observations for objects of type ratio_data.

julia> function ratio_nobs(data::ratio_data)
nx = length(data.x)
ny = length(data.y)
if (nx != ny)
error("length of x is not equal to the length of y")
end
nx
end;

Generate some data to test things out

julia> Random.seed!(123);

julia> my_data = ratio_data(randn(10), rand(10));

julia> ratio_nobs(my_data)
10

The estimating function for the ratio $\theta$ is

$\sum_{i = 1}^n (Y_i - \theta X_i)$

So, the contribution to the estimating function can be implemented as

julia> function ratio_ef(theta::Vector,
data::ratio_data,
i::Int64)
data.y[i] .- theta * data.x[i]
end;

The estimating_function_template for the ratio estimation problem can now be set up using ratio_nobs and ratio_ef.

julia>     ratio_template = estimating_function_template(ratio_nobs, ratio_ef);

We are now ready use ratio_template and my_data to compute the $M$-estimator of $\theta$ by solving the esitmating equation $\sum_{i = 1}^n (Y_i - \theta X_i) = 0$. The starting value for the nonlinear solver is set to 0.1.

julia> result_m = fit(ratio_template, my_data, [0.1])
M-estimation with estimating function contributions ratio_ef

────────────────────────────────────────────────────────────────────────
Estimate  Std. Error  z value  Pr(>|z|)   Lower 95%  Upper 95%
────────────────────────────────────────────────────────────────────────
theta[1]   1.07548    0.573615  1.87492    0.0608  -0.0487819    2.19975
────────────────────────────────────────────────────────────────────────
Estimating functions:	[-2.051803171809752e-12]

fit uses methods from the NLsolve package for solving the estimating equations. Arguments can be passed directly to NLsolve.nlsolve through keyword arguments to the fit method. For example,

julia> result_m = fit(ratio_template, my_data, [0.1], show_trace = true)
Iter     f(x) inf-norm    Step 2-norm
------   --------------   --------------
0     4.321212e+00              NaN
1     3.878230e+00     1.000000e-01
2     2.992266e+00     2.000000e-01
3     1.220339e+00     4.000000e-01
4     2.051803e-12     2.754829e-01
M-estimation with estimating function contributions ratio_ef

────────────────────────────────────────────────────────────────────────
Estimate  Std. Error  z value  Pr(>|z|)   Lower 95%  Upper 95%
────────────────────────────────────────────────────────────────────────
theta[1]   1.07548    0.573615  1.87492    0.0608  -0.0487819    2.19975
────────────────────────────────────────────────────────────────────────
Estimating functions:	[-2.051803171809752e-12]

Bias reduction in general $M$-estimation can be achieved by solving the adjusted estimating equation $\sum_{i = 1}^n (Y_i - \theta X_i) + A(\theta, Y, X) = 0$, where $A(\theta)$ are empirical bias-reducing adjustments depending on the first and second derivatives of the estimating function contributions. GEEBRA can use ratio_template and automatic differentiation (see, ForwardDiff) to construct $A(\theta, Y, X)$ and, then, solve the bias-reducing adjusted estimating equations. All this is simply done by

julia> result_br = fit(ratio_template, my_data, [0.1], estimation_method = "RBM")
RBM-estimation with estimating function contributions ratio_ef
Bias reduction method: implicit_trace

────────────────────────────────────────────────────────────────────────
Estimate  Std. Error  z value  Pr(>|z|)   Lower 95%  Upper 95%
────────────────────────────────────────────────────────────────────────
theta[1]   1.06754    0.573499  1.86146    0.0627  -0.0564928    2.19158
────────────────────────────────────────────────────────────────────────
Adjusted estimating functions:	[-1.3371248552829229e-14]

where RBM stands for reduced-bias M-estimation.

Kosmidis & Lunardon (2020) show that the reduced-bias estimator of $\theta$ is $\tilde\theta = (s_Y + s_{XY}/s_{X})/(s_X + s_{XX}/s_{X})$. The code chunks below tests that this is indeed the result GEEBRA returns.

julia> sx = sum(my_data.x);

julia> sxx = sum(my_data.x .* my_data.x);

julia> sy = sum(my_data.y);

julia> sxy = sum(my_data.x .* my_data.y);

julia> isapprox(sy/sx, result_m.theta[1])
true

julia> isapprox((sy + sxy/sx)/(sx + sxx/sx), result_br.theta[1])
true

## Logistic regression

### Using objective_function_template

Here, we use GEEBRA's objective_function_template to estimate a logistic regression model using maximum likelihood and maximum penalized likelihood, with the empirical bias-reducing penalty in Kosmidis & Lunardon (2020).

julia> using GEEBRA

julia> using Random

julia> using Distributions

julia> using Optim

A data type for logistic regression models (consisting of a response vector y, a model matrix x, and a vector of weights m) is

julia> struct logistic_data
y::Vector
x::Array{Float64}
m::Vector
end

A function to compute the number of observations from logistic_data objects is

julia> function logistic_nobs(data::logistic_data)
nx = size(data.x)[1]
ny = length(data.y)
nm = length(data.m)
if (nx != ny)
error("number of rows in of x is not equal to the length of y")
elseif (nx != nm)
error("number of rows in of x is not equal to the length of m")
elseif (ny != nm)
error("length of y is not equal to the length of m")
end
nx
end
logistic_nobs (generic function with 1 method)

The logistic regression log-likelihood contribution at a parameter theta for the $i$th observations of data data is

julia> function logistic_loglik(theta::Vector,
data::logistic_data,
i::Int64)
eta = sum(data.x[i, :] .* theta)
mu = exp.(eta)./(1 .+ exp.(eta))
data.y[i] .* log.(mu) + (data.m[i] - data.y[i]) .* log.(1 .- mu)
end
logistic_loglik (generic function with 1 method)

Let's simulate some logistic regression data with $10$ covariates

julia> Random.seed!(123);

julia> n = 100;

julia> m = 1;

julia> p = 10
10

julia> x = Array{Float64}(undef, n, p);

julia> x[:, 1] .= 1.0;

julia> for j in 2:p
x[:, j] .= rand(n);
end

julia> true_betas = randn(p) * sqrt(p);

julia> y = rand.(Binomial.(m, cdf.(Logistic(), x * true_betas)));

julia> my_data = logistic_data(y, x, fill(m, n));

and set up an objective_function_template for logistic regression

julia> logistic_template = objective_function_template(logistic_nobs, logistic_loglik)
objective_function_template(Main.ex-2.logistic_nobs, Main.ex-2.logistic_loglik)

The maximum likelihood estimates starting at true_betas are

julia> o1_ml = fit(logistic_template, my_data, true_betas, optim_method = NelderMead())
M-estimation with objective contributions logistic_loglik

──────────────────────────────────────────────────────────────────────────
Estimate  Std. Error   z value  Pr(>|z|)  Lower 95%  Upper 95%
──────────────────────────────────────────────────────────────────────────
theta[1]    1.03786      2.46744   0.42062    0.6740   -3.79824   5.87395
theta[2]   -6.33539      1.53213  -4.13502    <1e-4    -9.33832  -3.33247
theta[3]   -6.76459      1.63753  -4.13096    <1e-4    -9.9741   -3.55508
theta[4]    0.994292     1.1077    0.89762    0.3694   -1.17676   3.16534
theta[5]    5.4302       1.6185    3.35508    0.0008    2.25799   8.6024
theta[6]    6.58354      1.92567   3.41884    0.0006    2.8093   10.3578
theta[7]    4.92474      1.39467   3.53111    0.0004    2.19123   7.65824
theta[8]    1.29824      1.25277   1.0363     0.3001   -1.15714   3.75362
theta[9]   -3.00981      1.43218  -2.10156    0.0356   -5.81682  -0.202796
theta[10]  -4.96494      1.85256  -2.68005    0.0074   -8.59589  -1.334
──────────────────────────────────────────────────────────────────────────
Maximum objetive:		-31.9044
Takeuchi information criterion:	83.6381
Akaike information criterion:	83.8088

fit uses methods from the Optim package internally. Here, we used the Optim.NelderMead method. Alternative optimization methods and options can be supplied directly through the keyword argumentsmethod and optim.Options, respectively. For example,

julia> o2_ml = fit(logistic_template, my_data, true_betas, optim_method = LBFGS(), optim_options = Optim.Options(g_abstol = 1e-05))
M-estimation with objective contributions logistic_loglik

───────────────────────────────────────────────────────────────────────────
Estimate  Std. Error    z value  Pr(>|z|)  Lower 95%  Upper 95%
───────────────────────────────────────────────────────────────────────────
theta[1]    1.03787      2.46744   0.420628    0.6740   -3.79822   5.87397
theta[2]   -6.33546      1.53214  -4.13505     <1e-4    -9.3384   -3.33253
theta[3]   -6.76458      1.63752  -4.13098     <1e-4    -9.97407  -3.55509
theta[4]    0.994237     1.1077    0.897571    0.3694   -1.17681   3.16528
theta[5]    5.43023      1.61849   3.35512     0.0008    2.25805   8.60241
theta[6]    6.58354      1.92566   3.41885     0.0006    2.80932  10.3578
theta[7]    4.9247       1.39466   3.53111     0.0004    2.19122   7.65818
theta[8]    1.29827      1.25278   1.03632     0.3001   -1.15712   3.75367
theta[9]   -3.00978      1.43216  -2.10156     0.0356   -5.81676  -0.202795
theta[10]  -4.96494      1.85254  -2.68006     0.0074   -8.59585  -1.33402
───────────────────────────────────────────────────────────────────────────
Maximum objetive:		-31.9044
Takeuchi information criterion:	83.6381
Akaike information criterion:	83.8088

The reduced-bias estimates starting at the maximum likelihood ones are

julia> o1_br = fit(logistic_template, my_data, coef(o1_ml), estimation_method = "RBM")
RBM-estimation with objective contributions logistic_loglik
Bias reduction method: implicit_trace

───────────────────────────────────────────────────────────────────────────
Estimate  Std. Error    z value  Pr(>|z|)  Lower 95%  Upper 95%
───────────────────────────────────────────────────────────────────────────
theta[1]    0.928169    2.20201    0.42151     0.6734   -3.38769   5.24402
theta[2]   -5.4578      1.20924   -4.51341     <1e-5    -7.82787  -3.08773
theta[3]   -5.81835     1.26797   -4.58872     <1e-5    -8.30352  -3.33318
theta[4]    0.81906     0.980035   0.835746    0.4033   -1.10177   2.73989
theta[5]    4.64568     1.30015    3.5732      0.0004    2.09744   7.19393
theta[6]    5.66444     1.54657    3.66258     0.0002    2.63321   8.69566
theta[7]    4.16222     1.11835    3.72174     0.0002    1.97028   6.35415
theta[8]    1.14271     1.13618    1.00575     0.3145   -1.08416   3.36957
theta[9]   -2.58574     1.2083    -2.13999     0.0324   -4.95396  -0.217525
theta[10]  -4.22949     1.49518   -2.82876     0.0047   -7.15999  -1.299
───────────────────────────────────────────────────────────────────────────
Maximum penalized objetive:	-36.5538
Takeuchi information criterion:	81.9311
Akaike information criterion:	84.284

### Using estimating_function_template

The same results as above can be returned using an estimating_function_template for logistic regression.

The contribution to the derivatives of the log-likelihood for logistic regression is

julia> function logistic_ef(theta::Vector,
data::logistic_data,
i::Int64)
eta = sum(data.x[i, :] .* theta)
mu = exp.(eta)./(1 .+ exp.(eta))
data.x[i, :] * (data.y[i] - data.m[i] * mu)
end
logistic_ef (generic function with 1 method)

Then, solving the bias-reducing adjusted estimating equations

julia> logistic_template_ef = estimating_function_template(logistic_nobs, logistic_ef);

julia> e1_br = fit(logistic_template_ef, my_data, true_betas, estimation_method = "RBM")
RBM-estimation with estimating function contributions logistic_ef
Bias reduction method: implicit_trace

───────────────────────────────────────────────────────────────────────────
Estimate  Std. Error    z value  Pr(>|z|)  Lower 95%  Upper 95%
───────────────────────────────────────────────────────────────────────────
theta[1]    0.928169    2.20201    0.42151     0.6734   -3.38769   5.24402
theta[2]   -5.4578      1.20924   -4.51341     <1e-5    -7.82787  -3.08773
theta[3]   -5.81835     1.26797   -4.58872     <1e-5    -8.30352  -3.33318
theta[4]    0.81906     0.980035   0.835746    0.4033   -1.10177   2.73989
theta[5]    4.64568     1.30015    3.5732      0.0004    2.09744   7.19393
theta[6]    5.66444     1.54657    3.66258     0.0002    2.63321   8.69566
theta[7]    4.16222     1.11835    3.72174     0.0002    1.97028   6.35415
theta[8]    1.14271     1.13618    1.00575     0.3145   -1.08416   3.36957
theta[9]   -2.58574     1.2083    -2.13999     0.0324   -4.95396  -0.217525
theta[10]  -4.22949     1.49518   -2.82876     0.0047   -7.15999  -1.299
───────────────────────────────────────────────────────────────────────────
Adjusted estimating functions:	[-5.200950781159008e-12; -7.726777551120279e-12; … ; -5.21382936824466e-12; -4.383812757247085e-12]

returns the reduced-bias estimates from maximum penalized likelihood:

julia> isapprox(coef(o1_br), coef(e1_br))
true

### Bias-reduction methods

GEEBRA currently implements 2 alternative bias reduction methods, called implicit_trace and explicit_trace. implicit_trace will adjust the estimating functions or penalize the objectives, as we have seen earlier. explicit_trace, on the other hand, will form an estimate of the bias of the $M$-estimator and subtract that from the $M$-estimates. The default method is implicit_trace.

For example, for logistic regression via estimating functions

julia> e2_br = fit(logistic_template_ef, my_data, true_betas, estimation_method = "RBM", br_method = "explicit_trace")
RBM-estimation with estimating function contributions logistic_ef
Bias reduction method: explicit_trace

───────────────────────────────────────────────────────────────────────────
Estimate  Std. Error    z value  Pr(>|z|)  Lower 95%  Upper 95%
───────────────────────────────────────────────────────────────────────────
theta[1]    0.895301    2.10014    0.426305    0.6699   -3.2209    5.0115
theta[2]   -5.11645     1.11115   -4.60463     <1e-5    -7.29427  -2.93863
theta[3]   -5.46494     1.15431   -4.73437     <1e-5    -7.72736  -3.20253
theta[4]    0.74801     0.936884   0.798402    0.4246   -1.08825   2.58427
theta[5]    4.34256     1.20242    3.6115      0.0003    1.98585   6.69926
theta[6]    5.29117     1.42302    3.71828     0.0002    2.50211   8.08023
theta[7]    3.87517     1.0339     3.74812     0.0002    1.84877   5.90158
theta[8]    1.07407     1.09374    0.98202     0.3261   -1.06962   3.21776
theta[9]   -2.42976     1.13518   -2.14042     0.0323   -4.65466  -0.204855
theta[10]  -3.93176     1.37586   -2.85768     0.0043   -6.62839  -1.23512
───────────────────────────────────────────────────────────────────────────
Adjusted estimating functions:	[0.04705609374710429; -0.006424930231929443; … ; -0.0238329565743529; -0.034942423671889]

which gives slightly different estimates that what are in the implict_trace fit in e1_br.

The same can be done using objective functions, but numerical differentiation (using the FiniteDiff package) is used to approximate the gradient of the bias-reducing penalty (i.e. $A(\theta)$).

julia> o2_br = fit(logistic_template, my_data, true_betas, estimation_method = "RBM", br_method = "explicit_trace")
RBM-estimation with objective contributions logistic_loglik
Bias reduction method: explicit_trace

───────────────────────────────────────────────────────────────────────────
Estimate  Std. Error    z value  Pr(>|z|)  Lower 95%  Upper 95%
───────────────────────────────────────────────────────────────────────────
theta[1]    0.895301    2.10014    0.426305    0.6699   -3.2209    5.0115
theta[2]   -5.11645     1.11115   -4.60463     <1e-5    -7.29427  -2.93863
theta[3]   -5.46494     1.15431   -4.73437     <1e-5    -7.72736  -3.20253
theta[4]    0.74801     0.936884   0.798402    0.4246   -1.08825   2.58427
theta[5]    4.34256     1.20242    3.6115      0.0003    1.98585   6.69926
theta[6]    5.29117     1.42302    3.71828     0.0002    2.50211   8.08023
theta[7]    3.87517     1.0339     3.74812     0.0002    1.84877   5.90158
theta[8]    1.07407     1.09374    0.98202     0.3261   -1.06962   3.21776
theta[9]   -2.42976     1.13518   -2.14042     0.0323   -4.65466  -0.204855
theta[10]  -3.93176     1.37586   -2.85768     0.0043   -6.62839  -1.23512
───────────────────────────────────────────────────────────────────────────
Maximum penalized objetive:	-36.607
Takeuchi information criterion:	81.6646
Akaike information criterion:	84.7635

julia> isapprox(coef(e2_br), coef(o2_br))
true