CausalGPSLC.Confounders - Type
Confounders (U)

Latent confounders that CausalGPSLC performs inference over.

Either a vector (1D) or a matrix (2D) of Float64 values.

CausalGPSLC.Treatment - Type
Treatment (T)

Either a BinaryTreatment, which is an alias for Vector{Bool}, or a ContinuousTreatment, which is an alias for Vector{Float64}. These types also accept other vector types for compatibility with internal libraries.

CausalGPSLC.generateU - Constant

Gen function that generates latent confounders (U) from a multivariate normal (mvnormal) distribution.

CausalGPSLC.paramProposal - Constant
paramProposal(trace, variance, addr)

As in a Gaussian drift proposal, the moments of the proposal are matched to the previous noise sample, using a fixed variance. See https://arxiv.org/pdf/1605.01019.pdf.

CausalGPSLC.Covariates - Type

Covariates (X)

Observed confounders and covariates.

Matrix{Float64} is the only valid structure for covariates.

CausalGPSLC.GPSLCObject - Method

Constructor for GPSLCObject that samples from the posterior before constructing the GPSLCObject.

GPSLCObject(hyperparams, priorparams, SigmaU, obj, X, T, Y)
GPSLCObject(hyperparams, priorparams, SigmaU, obj, nothing, T, Y)
GPSLCObject(hyperparams, priorparams, nothing, nothing, X, T, Y)
GPSLCObject(hyperparams, priorparams, nothing, nothing, nothing, T, Y)

Covers the full model and the variants in which the object structure (SigmaU, obj), the observed covariates (X), or both are passed as nothing.

CausalGPSLC.ObjectLabels - Type
Object Labels for instances (obj)

Optional for CausalGPSLC, but according to the publication, providing object labels improves performance.

CausalGPSLC.Outcome - Type
Outcome (Y)

The outcome for the series of Gaussian Process predictions is a Vector{Float64}. Currently only continuous values are supported as outcomes for input data.

CausalGPSLC.PriorParameters - Type
PriorParameters

A dictionary of shapes and scales for various Inverse Gamma distributions used as priors for kernel parameters and other parameters. More information on each of the attributes can be found in getPriorParameters.

CausalGPSLC.ITEsamples - Method
ITEsamples(MeanITEs, CovITEs, nSamplesPerMixture)

Individual Treatment Effect Samples

Returns nMixtures * nSamplesPerMixture outcome (Y) samples for each individual, as a matrix of size [nMixtures * nSamplesPerMixture, n], where nMixtures is the number of posterior samples (nOuter).
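
The shape of that result can be illustrated with a minimal sketch. It assumes each posterior sample contributes one multivariate normal with mean MeanITEs[k] and covariance CovITEs[k]; the helper name drawITESamples, the rng argument, and the jitter keyword are hypothetical and are not part of the package.

```julia
using LinearAlgebra
using Random

# Hypothetical helper (not the package's ITEsamples): draws nSamplesPerMixture
# samples from N(MeanITEs[k], CovITEs[k]) for every posterior sample k.
function drawITESamples(rng, MeanITEs, CovITEs, nSamplesPerMixture; jitter=1e-10)
    nMixtures = length(MeanITEs)
    n = length(MeanITEs[1])
    samples = zeros(nMixtures * nSamplesPerMixture, n)
    row = 1
    for k in 1:nMixtures
        # Lower Cholesky factor of the jittered covariance, so L * z ~ N(0, Cov).
        L = cholesky(Symmetric(CovITEs[k] + jitter * I)).L
        for _ in 1:nSamplesPerMixture
            samples[row, :] = MeanITEs[k] .+ L * randn(rng, n)
            row += 1
        end
    end
    return samples
end

rng = MersenneTwister(1)
MeanITEs = [[0.0, 1.0, 2.0], [0.5, 1.5, 2.5]]        # 2 posterior samples, n = 3
CovITEs = [Matrix(1.0I, 3, 3), Matrix(0.5I, 3, 3)]
samples = drawITESamples(rng, MeanITEs, CovITEs, 4)  # size (2 * 4, 3)
```

Each row is one draw for all n individuals, and rows are grouped by posterior sample.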

CausalGPSLC.Posterior - Method

Continuous Treatment No Confounders

nESInner is not used to sample anything via elliptical slice sampling (ES) here. It is required for compatibility with the binary treatment version, which uses ES to learn the hyperparameters needed to support binary variables, which Gaussian processes do not usually support.

CausalGPSLC.SATEsamples - Method
SATEsamples(MeanSATEs, VarSATEs, nSamplesPerMixture)

Collect Sample Average Treatment Effect corresponding to each posterior sample.

Returns a vector of nSamplesPerMixture samples for each posterior sample's SATE distribution parameters.
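
As a rough sketch of that sampling step, the snippet below assumes each posterior sample's SATE distribution is a univariate Gaussian described by one mean and one variance. The helper name drawSATESamples and the rng argument are hypothetical; the package's own implementation may differ.

```julia
using Random

# Hypothetical helper: one Normal(mean, sqrt(var)) per posterior sample,
# with nSamplesPerMixture draws taken from each.
function drawSATESamples(rng, MeanSATEs, VarSATEs, nSamplesPerMixture)
    out = Float64[]
    for (m, v) in zip(MeanSATEs, VarSATEs)
        append!(out, m .+ sqrt(v) .* randn(rng, nSamplesPerMixture))
    end
    return out
end

# 3 posterior samples with 5 draws each give a vector of length 15.
sate = drawSATESamples(MersenneTwister(1), [0.2, 0.3, 0.25], [0.01, 0.02, 0.015], 5)
```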

CausalGPSLC.conditionalITE - Method
conditionalITE(g, psindex, doT)

Wrapper for conditionalITE that extracts parameters from g::GPSLCObject at posterior sample psindex and applies intervention doT.

CausalGPSLC.conditionalITE - Method
conditionalITE(nothing, nothing, tyLS, yNoise, yScale, nothing, nothing, T, Y, doT)
conditionalITE(nothing, xyLS, tyLS, yNoise, yScale, nothing, X, T, Y, doT)
conditionalITE(uyLS, nothing, tyLS, yNoise, yScale, U, nothing, T, Y, doT)
conditionalITE(uyLS, xyLS, tyLS, yNoise, yScale, U, X, T, Y, doT)

Conditional Individual Treatment Estimation

conditionalITE takes in parameters (presumably from posterior inference) as well as the observed and inferred data to produce individual treatment effects.

Params:

  • uyLS: (optional) Kernel lengthscale for latent confounders to outcome
  • xyLS: (optional) Kernel lengthscale for covariates to outcome
  • tyLS: Kernel lengthscale for treatment to outcome
  • yNoise: Gaussian noise for outcome prediction
  • yScale: Gaussian scale for outcome prediction
  • U: (optional) Latent confounders
  • X: (optional) Covariates
  • T: Treatment
  • Y: Outcome
  • doT: Treatment intervention

Returns:

  • MeanITE::Vector{Float64}: The mean value for the n individual treatment effects
  • CovITE::Matrix{Float64}: The covariance matrix for the n individual treatment effects.

CausalGPSLC.expit - Method

Expit is the inverse of the logit function, mapping a Real to [0,1]
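
The relationship to logit can be checked with a small standalone sketch. The names logitSketch and expitSketch are hypothetical stand-ins, and the package's own expit may differ in detail.

```julia
# Hypothetical standalone versions of the two functions.
logitSketch(p) = log(p / (1 - p))
expitSketch(x) = 1 / (1 + exp(-x))

half = expitSketch(0.0)                    # 0.5
roundTrip = expitSketch(logitSketch(0.3))  # approximately 0.3
```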

CausalGPSLC.extractParameters - Method
extractParameters(g, posteriorSampleIdx)

Get the inferred parameters from posterior sample number posteriorSampleIdx of g. The parameters are uyLS, xyLS, tyLS, yNoise, yScale, and U, some of which may be nothing.

CausalGPSLC.generateSigmaU - Function
generateSigmaU(nIndividualsArray)
generateSigmaU(nIndividualsArray, eps)
generateSigmaU(nIndividualsArray, eps, cov)

Generate block matrix for U given object counts

SigmaU is shorthand for the object structure of the latent confounder
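
One way to picture such a block matrix is sketched below. It assumes that individuals in the same object share covariance cov, that individuals in different objects have zero covariance, and that the diagonal is 1 + eps. The helper name blockSigmaU is hypothetical, and its defaults are borrowed from the sigmaUNoise and sigmaUCov values listed under getPriorParameters; the package may build the matrix differently.

```julia
using LinearAlgebra

# Hypothetical helper: one dense block per object, zeros elsewhere.
function blockSigmaU(nIndividualsArray, eps=1e-13, cov=1.0)
    n = sum(nIndividualsArray)
    SigmaU = zeros(n, n)
    start = 1
    for nIndividuals in nIndividualsArray
        stop = start + nIndividuals - 1
        SigmaU[start:stop, start:stop] .= cov   # within-object covariance
        start = stop + 1
    end
    SigmaU[diagind(SigmaU)] .= 1 + eps          # small noise on the diagonal
    return SigmaU
end

# Two objects with 2 and 3 individuals give a 5 x 5 matrix with two blocks.
SigmaU = blockSigmaU([2, 3])
```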

CausalGPSLC.getAddresses - Method
getAddresses(choicemap)

Debugging tool to print all available address keys in choicemap

CausalGPSLC.getHyperParameters - Method
getHyperParameters()

Returns default values for hyperparameters

  • nU = 1: Number of latent confounding variables assumed to be influencing all the instances that belong to one object. Inference will be performed over these values.
  • nOuter = 20: Number of posterior samples to draw.
  • nMHInner = 5: Number of internal Metropolis-Hastings updates to make per posterior sample.
  • nESInner = 5: Number of elliptical-slice sampling updates to make per posterior sample for latent confounders and binary treatment.
  • nBurnIn = 5: Number of posterior samples to discard when making predictions and estimates.
  • stepSize = 1: How frequently to use posterior samples (1 being every one after burnIn, higher being every stepSizeth).
  • predictionCovarianceNoise=1e-10: Predicting with Gaussian processes requires use of covariance matrices that are Symmetric Positive Definite, and this covariance noise on the diagonal ensures these operations can be performed in a stable and consistent way.

CausalGPSLC.getNU - Method
getNU(g)

Number of latent confounders to perform inference over (hyperparameter).

CausalGPSLC.getNX - Method
getNX(g)

Number of covariates (and observed confounders) in dataset.

CausalGPSLC.getNumPosteriorSamples - Method
getNumPosteriorSamples(g)

Number of posterior samples that will be used based on hyperparameters.

Computed from the range nBurnIn:stepSize:nOuter, that is, roughly (total posterior samples - burn-in) / step size.
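
For intuition, the range can be evaluated directly in Julia with the default hyperparameter values. This is only a sketch of the arithmetic, not the package's code. Because a Julia range includes both endpoints, its length can be one larger than the quotient.

```julia
# Default values listed under getHyperParameters.
nOuter, nBurnIn, stepSize = 20, 5, 1

kept = nBurnIn:stepSize:nOuter                # indices 5, 6, ..., 20
nKept = length(kept)                          # 16
quotient = div(nOuter - nBurnIn, stepSize)    # 15

# A larger step size thins the retained samples.
nKeptThinned = length(nBurnIn:5:nOuter)       # 4 (indices 5, 10, 15, 20)
```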

CausalGPSLC.getPriorParameters - Method
getPriorParameters()

These are standard values for scale and shape of Inverse Gamma priors over kernel parameters, confounder structure covariance noise, and confounder Gaussian prior covariance.

  • uNoiseShape::Float64=4.0: shape of the InvGamma prior over the noise of U
  • uNoiseScale::Float64=4.0: scale of the InvGamma prior over the noise of U
  • xNoiseShape::Float64=4.0: shape of the InvGamma prior over the noise of X
  • xNoiseScale::Float64=4.0: scale of the InvGamma prior over the noise of X
  • tNoiseShape::Float64=4.0: shape of the InvGamma prior over the noise of T
  • tNoiseScale::Float64=4.0: scale of the InvGamma prior over the noise of T
  • yNoiseShape::Float64=4.0: shape of the InvGamma prior over the noise of Y
  • yNoiseScale::Float64=4.0: scale of the InvGamma prior over the noise of Y
  • xScaleShape::Float64=4.0: shape of the InvGamma prior over kernel scale of X
  • xScaleScale::Float64=4.0: scale of the InvGamma prior over kernel scale of X
  • tScaleShape::Float64=4.0: shape of the InvGamma prior over kernel scale of T
  • tScaleScale::Float64=4.0: scale of the InvGamma prior over kernel scale of T
  • yScaleShape::Float64=4.0: shape of the InvGamma prior over kernel scale of Y
  • yScaleScale::Float64=4.0: scale of the InvGamma prior over kernel scale of Y
  • uxLSShape::Float64=4.0: shape of the InvGamma prior over kernel lengthscale of U and X
  • uxLSScale::Float64=4.0: scale of the InvGamma prior over kernel lengthscale of U and X
  • utLSShape::Float64=4.0: shape of the InvGamma prior over kernel lengthscale of U and T
  • utLSScale::Float64=4.0: scale of the InvGamma prior over kernel lengthscale of U and T
  • xtLSShape::Float64=4.0: shape of the InvGamma prior over kernel lengthscale of X and T
  • xtLSScale::Float64=4.0: scale of the InvGamma prior over kernel lengthscale of X and T
  • uyLSShape::Float64=4.0: shape of the InvGamma prior over kernel lengthscale of U and Y
  • uyLSScale::Float64=4.0: scale of the InvGamma prior over kernel lengthscale of U and Y
  • xyLSShape::Float64=4.0: shape of the InvGamma prior over kernel lengthscale of X and Y
  • xyLSScale::Float64=4.0: scale of the InvGamma prior over kernel lengthscale of X and Y
  • tyLSShape::Float64=4.0: shape of the InvGamma prior over kernel lengthscale of T and Y
  • tyLSScale::Float64=4.0: scale of the InvGamma prior over kernel lengthscale of T and Y
  • sigmaUNoise::Float64=1.0e-13: noise added to matrix to make covariance stable and invertible
  • sigmaUCov::Float64=1.0: assumed covariance over structured confounders
  • drift::Float64=0.5: as in the paper, Metropolis Hastings Gaussian Drift

CausalGPSLC.getProposalAddress - Method
getProposalAddress(name, i, j)

Optimizes paramProposal by providing a compact way to access trace address symbols.

CausalGPSLC.gpslc - Method
gpslc(filename * ".csv")
gpslc(filename * ".csv"; hyperparams=hyperparams, priorparams=priorparams)

gpslc(DataFrame(X1=...,X2=...,T=...,Y=...,obj=...))
gpslc(DataFrame(X1=...,X2=...,T=...,Y=...,obj=...); hyperparams=hyperparams, priorparams=priorparams)

Run posterior inference on the input data.

Datatypes of DataFrame or CSV must follow these standards:

  • T (Boolean/Float64)
  • Y (Float64)
  • X1...XN (Float64...Float64)
  • obj (Any)

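Optional parameters

  • hyperparams: Hyperparameters for inference; see getHyperParameters.
  • priorparams: Prior parameters; see getPriorParameters.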

Returns a GPSLCObject which stores the hyperparameters, prior parameters, data, and posterior samples.

CausalGPSLC.likelihoodDistribution - Method
likelihoodDistribution

A utility that uses multiple dispatch to take in one set of parameters for CausalGPSLC and the observed data, plus an intervention doT, and outputs the necessary matrices to compute MeanITE and CovITE, as well as other predictions, like directly predicting $Y_{cf}$.

CausalGPSLC.loadGPSLCObject - Method
loadGPSLCObject(filename)
loadGPSLCObject("path/to/filename")
loadGPSLCObject("path/to/filename.gpslc")

This function will load and return the GPSLCObject contained in <filename>.gpslc.

Note: the extension .gpslc is optional and will be added if it is not included.
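
The optional-extension behaviour described in the note can be sketched as follows. The helper name withGPSLCExtension is hypothetical and only illustrates the filename handling; it does not read or write any file.

```julia
# Hypothetical helper: append ".gpslc" only when it is missing.
withGPSLCExtension(filename) = endswith(filename, ".gpslc") ? filename : filename * ".gpslc"

a = withGPSLCExtension("path/to/filename")        # "path/to/filename.gpslc"
b = withGPSLCExtension("path/to/filename.gpslc")  # "path/to/filename.gpslc"
```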

CausalGPSLC.predictCounterfactualEffects - Method
predictCounterfactualEffects(g, nSamplesPerMixture)
predictCounterfactualEffects(g, nSamplesPerMixture; fidelity=100)
predictCounterfactualEffects(g, nSamplesPerMixture; fidelity=100, minDoT=0, maxDoT=5)

Params

  • g::GPSLCObject: The GPSLCObject that inference has already been computed for.
  • nSamplesPerMixture::Int64: The number of outcome samples to draw from each set of inferred posterior parameters.
  • fidelity::Int64: How many intervention values to use to cover the domain of treatment values. Higher means more samples.
  • minDoT::Float64=min(g.T...): The lowest interventional treatment to use. Defaults to the data g.T's lowest treatment value.
  • maxDoT::Float64=max(g.T...): The highest interventional treatment to use. Defaults to the data g.T's highest treatment value.

julia> ite, doT = predictCounterfactualEffects(g, 30; fidelity=100)

Returns

  • ite::Matrix{Float64}: An array of size [d, n, numPosteriorSamples * nSamplesPerMixture], where d is the number of interventional values defined by fidelity and the range of treatments in g.T.
  • doTrange::Vector{Float64}: The list of doT values used, in the order that matches the rows of ite.

CausalGPSLC.prepareData - Function
prepareData(df, confounderEps, confounderCov)
prepareData("path/to/filename.csv", confounderEps, confounderCov)

Prepare Data

Creates the latent confounding structure from the object labels in the data and parses matrices for the observed covariates, treatments, and outcomes.

Returns: X, T, Y, SigmaU

CausalGPSLC.processCov - Method

Convert covariance matrix back from log-space, scale and add noise (if passed)

CausalGPSLC.rbfKernelLogScalar - Method

Radial Basis Function Kernel applied element-wise to the two vectors X1 and X2.

Params:

  • X1: First array of values
  • X2: Second array of values
  • LS: Lengthscale array

Output normalized by LS squared
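
To make the log-space convention concrete, here is a minimal sketch of a log RBF kernel and of the conversion back to a covariance matrix described under processCov. It assumes a scalar lengthscale and a pairwise evaluation over all entries of the two vectors. The names rbfLogSketch and covFromLogSketch are hypothetical and the package's functions may differ.

```julia
using LinearAlgebra

# Log of an RBF kernel between every pair of entries in x1 and x2:
# negative squared distance, normalized by the squared lengthscale.
rbfLogSketch(x1, x2, ls) = -((x1 .- x2') .^ 2) ./ ls^2

# Back from log-space, then scale and add noise on the diagonal.
covFromLogSketch(logCov, scale, noise=0.0) = exp.(logCov) .* scale + noise * I

x = [0.0, 1.0, 3.0]
logK = rbfLogSketch(x, x, 2.0)          # logK[1, 2] is -(0 - 1)^2 / 2^2 = -0.25
K = covFromLogSketch(logK, 1.5, 1e-6)   # diagonal entries are 1.5 + 1e-6
```

Keeping the kernel in log-space lets several kernels be combined by addition before a single exponentiation.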

CausalGPSLC.sampleITE - Method
sampleITE(g, doT)
sampleITE(g, doT; samplesPerPosterior=10)

Estimate Individual Treatment Effect with CausalGPSLC model

Params:

  • g::GPSLCObject: Contains data and hyperparameters
  • doT: The requested intervention (e.g. set all treatments to 1.0)
  • samplesPerPosterior: How many ITE samples to draw per posterior sample in g.

Returns:

ITEsamples: n x m matrix where n is the number of individuals, and m is the number of samples.

CausalGPSLC.sampleSATE - Method
sampleSATE(g, doT)
sampleSATE(g, doT; samplesPerPosterior=10)

Estimate Sample Average Treatment Effect with CausalGPSLC model

Using sampleITE, samples can be drawn for the sample average treatment effect

Params:

  • g::GPSLCObject: Contains data and hyperparameters
  • doT: The requested intervention (e.g. set all treatments to 1.0)
  • samplesPerPosterior: How many samples to draw per posterior sample in g.

Returns:

SATEsamples: n x m matrix where n is the number of individuals, and m is the number of samples.

CausalGPSLC.saveGPSLCObject - Method
saveGPSLCObject(g, filename)
saveGPSLCObject(g, "path/to/filename")
saveGPSLCObject(g, "path/to/filename.gpslc")

This function will save the GPSLCObject g to the file <filename>.gpslc. This GPSLCObject, including the posterior samples contained within it can be retrieved with the loadGPSLCObject function.

Note: The extension .gpslc is optional and will be added if it is not included.

CausalGPSLC.summarizeEstimates - Method
summarizeEstimates(samples)
summarizeEstimates(samples; savetofile="ite_samples.csv")

Summarize Predicted Estimates (Counterfactual Outcomes or Individual Treatment Effects)

Create dataframe of mean, lower and upper quantiles of the samples from sampleITE or predictCounterfactualEffects.

Params:

  • samples: The n x m array of samples from sampleSATE or sampleITE
  • savetofile::String: Optionally save the resultant DataFrame as CSV to the filename passed
  • credible_interval::Float64: A real in [0,1] where 0.90 is the default for a 90% credible interval

Returns:

  • df: Dataframe of Individual, Mean, LowerBound, and UpperBound values for the credible intervals around the sample.
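
The summary described above can be approximated with the standard library alone. The sketch below assumes an equal-tailed credible interval and one row of samples per individual. The helper name summarizeSketch is hypothetical, and it returns a vector of named tuples rather than a DataFrame.

```julia
using Statistics

# Hypothetical helper: mean and equal-tailed quantiles for each row of samples.
function summarizeSketch(samples; credible_interval=0.90)
    lowerQ = (1 - credible_interval) / 2
    upperQ = 1 - lowerQ
    return [(Individual=i,
             Mean=mean(row),
             LowerBound=quantile(row, lowerQ),
             UpperBound=quantile(row, upperQ)) for (i, row) in enumerate(eachrow(samples))]
end

# 2 individuals with 5 samples each.
samples = [1.0 2.0 3.0 4.0 5.0; 10.0 10.0 10.0 10.0 10.0]
table = summarizeSketch(samples)
```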

CausalGPSLC.toMatrix - Method
toMatrix(X, n, m)

Convert a vector of vectors or similar to a 2D matrix. Only call if you know all subvectors are same length.
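
A minimal sketch of such a conversion follows, assuming n is the number of subvectors and m is their common length. The name toMatrixSketch is a hypothetical stand-in for the package function.

```julia
# Hypothetical stand-in: copy each subvector into one row of an n x m matrix.
function toMatrixSketch(X, n, m)
    out = zeros(n, m)
    for i in 1:n
        out[i, :] = X[i]   # throws if X[i] does not have length m
    end
    return out
end

M = toMatrixSketch([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], 3, 2)
```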