CausalGPSLC.BinaryTreatment
— Type
Binary Treatment (T)
CausalGPSLC.CausalGPSLCBinaryT
— Constant
Binary Treatment CausalGPSLC with Covariates (X) and Latent Confounders (U)
CausalGPSLC.CausalGPSLCNoCovBinaryT
— Constant
No Covariates (no X), Binary Treatment CausalGPSLC
CausalGPSLC.CausalGPSLCNoCovRealT
— Constant
No Covariates (no X), Continuous Treatment CausalGPSLC
CausalGPSLC.CausalGPSLCNoUBinaryT
— Constant
No Latent Confounders (no U), Binary Treatment CausalGPSLC
CausalGPSLC.CausalGPSLCNoUNoCovBinaryT
— Constant
No Covariates (no X), No Latent Confounders (no U), Binary Treatment CausalGPSLC
CausalGPSLC.CausalGPSLCNoUNoCovRealT
— Constant
No Covariates (no X), No Latent Confounders (no U), Continuous Treatment CausalGPSLC
CausalGPSLC.CausalGPSLCNoURealT
— Constant
No Latent Confounders (no U), Continuous Treatment CausalGPSLC
CausalGPSLC.CausalGPSLCRealT
— Constant
Continuous Treatment CausalGPSLC with Covariates (X) and Latent Confounders (U)
CausalGPSLC.Confounders
— Type
Confounders (U)
Latent confounders that CausalGPSLC performs inference over. Either 1D or 2D matrices of Float64 values.
CausalGPSLC.ContinuousTreatment
— Type
Continuous Treatment (T)
CausalGPSLC.Intervention
— Type
Intervention (doT)
CausalGPSLC.ReshapeableMatrix
— Type
ReshapeableMatrix
Matrix that can be reshaped.
CausalGPSLC.SupportedRBFData
— Type
SupportedRBFData
Viable inputs to the rbfKernelLog function that are nested lists.
CausalGPSLC.SupportedRBFLengthscale
— Type
SupportedRBFLengthscale
Viable inputs to the rbfKernelLog function as kernel lengthscales.
CausalGPSLC.SupportedRBFMatrix
— Type
SupportedRBFMatrix
Viable inputs to the rbfKernelLog function in linear algebra datatypes.
CausalGPSLC.SupportedRBFVector
— Type
SupportedRBFVector
Viable inputs to the rbfKernelLog function in linear algebra datatypes.
CausalGPSLC.Treatment
— Type
Treatment (T)
Is made up of BinaryTreatment, which is an alias for Vector{Bool}, and ContinuousTreatment, which is an alias for Vector{Float64}. These types support other vector types to afford compatibility with internal libraries.
CausalGPSLC.generateBinaryT
— Constant
Gen function to generate binary treatment (T) from real value
CausalGPSLC.generateBinaryTfromPrior
— Constant
Sample binary treatment from prior (T)
CausalGPSLC.generateBinaryTfromU
— Constant
Sample Binary T from confounders (U)
CausalGPSLC.generateBinaryTfromUX
— Constant
Sample Binary T from confounders (U) and covariates (X)
CausalGPSLC.generateBinaryTfromX
— Constant
Sample Binary T from covariates (X)
CausalGPSLC.generateLS
— Constant
Gen function to generate lengthscale parameter for GP
CausalGPSLC.generateNoise
— Constant
Gen function to generate noise from inv_gamma
CausalGPSLC.generateRealTfromPrior
— Constant
Sample continuous treatment from prior (T)
CausalGPSLC.generateRealTfromU
— Constant
Sample Continuous T from confounders (U)
CausalGPSLC.generateRealTfromUX
— Constant
Sample Continuous T from confounders (U) and covariates (X)
CausalGPSLC.generateRealTfromX
— Constant
Sample Continuous T from covariates (X)
CausalGPSLC.generateScale
— Constant
Gen function to generate scale parameter for GP
CausalGPSLC.generateU
— Constant
Gen function to generate latent confounders (U) from mvnormal distribution
CausalGPSLC.generateUfromSigmaU
— Constant
Sample U
CausalGPSLC.generateX
— Constant
Gen function to generate covariates (n,X_k) from mvnormal distribution
CausalGPSLC.generateXfromPrior
— Constant
Sample covariates from prior (X)
CausalGPSLC.generateXfromU
— Constant
Sample X from U
CausalGPSLC.generateYfromT
— Constant
Sample Y from only treatment (T)
CausalGPSLC.generateYfromUT
— Constant
Sample Y from confounders (U) and treatment (T)
CausalGPSLC.generateYfromUXT
— Constant
Sample Y from confounders (U), covariates (X), and treatment (T)
CausalGPSLC.generateYfromXT
— Constant
Sample Y from covariates (X) and treatment (T)
CausalGPSLC.lengthscaleFromPriorT
— Constant
Treatment to outcome lengthscale
CausalGPSLC.lengthscaleFromPriorU
— Constant
Latent confounders to treatment and outcome lengthscale
CausalGPSLC.lengthscaleFromPriorUX
— Constant
Latent confounders to treatment and outcome lengthscale when nX is known
CausalGPSLC.lengthscaleFromPriorX
— Constant
Covariates to treatment and outcome lengthscale
CausalGPSLC.paramProposal
— Constant
paramProposal(trace, variance, addr)
As with a Gaussian drift proposal, the moments of the proposal are matched to the previous noise sample, using a fixed variance. See https://arxiv.org/pdf/1605.01019.pdf.
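The moment-matching idea can be illustrated with a minimal sketch. It assumes the proposal is an Inverse Gamma distribution (in keeping with the Inverse Gamma priors described under getPriorParameters); that choice, and the helper names below, are assumptions for illustration rather than the package's code. For an Inverse Gamma with shape α and scale β, the mean is β/(α-1) and the variance is mean²/(α-2), so fixing the mean at the previous sample and the variance at a constant determines both parameters.

```julia
# Hypothetical helper: pick Inverse Gamma (shape, scale) so that the proposal
# has mean `prev` (the previous sample) and the fixed variance `variance`.
function invgamma_moment_match(prev::Float64, variance::Float64)
    shape = prev^2 / variance + 2      # from variance = mean^2 / (shape - 2)
    scale = prev * (shape - 1)         # from mean = scale / (shape - 1)
    return shape, scale
end

# Closed-form moments of an Inverse Gamma, used to check the match.
invgamma_mean(shape, scale) = scale / (shape - 1)
invgamma_var(shape, scale) = scale^2 / ((shape - 1)^2 * (shape - 2))

shape, scale = invgamma_moment_match(2.0, 0.5)
println((invgamma_mean(shape, scale), invgamma_var(shape, scale)))
```

Because the shape is always greater than 2 under this construction, both the mean and the variance of the proposal are finite.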
CausalGPSLC.sampleNoiseFromPriorT
— Constant
Sample noise from prior for treatment (T)
CausalGPSLC.sampleNoiseFromPriorU
— Constant
Generate noise terms from noise prior
Sample noise from prior for confounders (U)
CausalGPSLC.sampleNoiseFromPriorX
— Constant
Sample noise from prior for covariates (X)
CausalGPSLC.sampleNoiseFromPriorY
— Constant
Sample noise from prior for outcome (Y)
CausalGPSLC.scaleFromPriorT
— Constant
Sample kernel scale from prior for treatment (T)
CausalGPSLC.scaleFromPriorX
— Constant
Generate kernel scales from prior for covariates (X)
CausalGPSLC.scaleFromPriorY
— Constant
Sample kernel scale from prior for outcome (Y)
CausalGPSLC.ConfounderStructure
— Type
SigmaU
Structured prior for U.
CausalGPSLC.Covariates
— Type
Covariates (X)
Observed confounders and covariates. Matrix{Float64} is the only valid structure for covariates.
CausalGPSLC.GPSLCObject
— Type
GPSLCObject
This is the struct in CausalGPSLC.jl that contains the data, hyperparameters, prior parameters, and posterior samples. It provides the primary interfaces to abstract the internals of CausalGPSLC away from the higher-order functions like sampleITE, sampleSATE, and predictCounterfactualEffects.
Returned by gpslc.
CausalGPSLC.GPSLCObject
— Method
Constructor for GPSLCObject that samples from the posterior before constructing the GPSLCObject.
GPSLCObject(hyperparams, priorparams, SigmaU, obj, X, T, Y)
GPSLCObject(hyperparams, priorparams, SigmaU, obj, nothing, T, Y)
GPSLCObject(hyperparams, priorparams, nothing, nothing, X, T, Y)
GPSLCObject(hyperparams, priorparams, nothing, nothing, nothing, T, Y)
Full Model or model with no observed Covariates
CausalGPSLC.GPSLCObject
— Method
No Confounders
CausalGPSLC.GPSLCObject
— Method
No Confounders, No Covariates
CausalGPSLC.HyperParameters
— Type
HyperParameters
Define the high-level attributes of the inference procedure. More information on each of the attributes can be found in getHyperParameters.
CausalGPSLC.ObjectLabels
— Type
Object Labels for instances (obj)
Optional for CausalGPSLC, but per publication it improves performance.
CausalGPSLC.Outcome
— Type
Outcome (Y)
The outcome for the series of Gaussian Process predictions is a Vector{Float64}. Currently only continuous values are supported as outcomes for input data.
CausalGPSLC.PriorParameters
— Type
PriorParameters
A dictionary of shapes and scales for various Inverse Gamma distributions used as priors for kernel parameters and other parameters. More information on each of the attributes can be found in getPriorParameters.
CausalGPSLC.SupportedCovarianceMatrix
— Type
SupportedCovarianceMatrix
Viable inputs to the processCov function.
CausalGPSLC.ITEDistributions
— Method
ITEDistributions(g, doT)
Collect MeanITEs and CovITEs from the posterior with conditionalITE.
CausalGPSLC.ITEsamples
— Method
ITEsamples(MeanITEs, CovITEs, nSamplesPerMixture)
Individual Treatment Effect Samples
Returns nMixtures * nSamplesPerMixture outcome (Y) samples for each individual, as an array of size [nMixtures * nSamplesPerMixture, n], where nMixtures is the number of posterior samples (nOuter).
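The shape of that output can be illustrated with a minimal sketch that draws from one multivariate normal per posterior sample. It assumes MeanITEs is a vector of mean vectors and CovITEs a vector of covariance matrices; those container types, the Cholesky-based sampler, and the name ite_samples_sketch are assumptions for illustration, not the package's implementation.

```julia
using LinearAlgebra, Random

# Hypothetical sketch: draw nSamplesPerMixture samples from each N(mean, cov)
# component and stack them into an [nMixtures * nSamplesPerMixture, n] array.
function ite_samples_sketch(MeanITEs, CovITEs, nSamplesPerMixture;
                            rng=Random.default_rng())
    nMixtures = length(MeanITEs)
    n = length(MeanITEs[1])
    samples = zeros(nMixtures * nSamplesPerMixture, n)
    row = 1
    for k in 1:nMixtures
        # The lower Cholesky factor turns standard normal draws into N(0, cov).
        L = cholesky(Symmetric(CovITEs[k])).L
        for _ in 1:nSamplesPerMixture
            samples[row, :] = MeanITEs[k] + L * randn(rng, n)
            row += 1
        end
    end
    return samples
end

means = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
covs = [[1e-12 0.0; 0.0 1e-12] for _ in 1:3]
S = ite_samples_sketch(means, covs, 2)
println(size(S))
```

Each row is one sample across all n individuals, and rows are grouped by the posterior sample they were drawn from.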
CausalGPSLC.Posterior
— Method
Binary Treatment Full Model
CausalGPSLC.Posterior
— Method
Binary Treatment with No Confounders
CausalGPSLC.Posterior
— Method
Continuous Treatment Full Model
CausalGPSLC.Posterior
— Method
Continuous Treatment No Confounders
nESInner is not used to sample anything via elliptical slice sampling. It is required here for compatibility with the binary treatment version, which uses ES to learn hyperparameters for the support of binary variables, which are not usually supported by Gaussian processes.
CausalGPSLC.Posterior
— Method
Binary Treatment No Covariates
CausalGPSLC.Posterior
— Method
Binary Treatment No Confounders No Covariates
CausalGPSLC.Posterior
— Method
Continuous Treatment No Covariates
CausalGPSLC.Posterior
— Method
Continuous Treatment No Confounders No Covariates
CausalGPSLC.SATEDistributions
— Method
SATEDistributions
Collect SATE Mean and Variance corresponding to each posterior sample.
CausalGPSLC.SATEsamples
— Method
SATEsamples(MeanSATEs, VarSATEs, nSamplesPerMixture)
Collect Sample Average Treatment Effect corresponding to each posterior sample.
Returns a vector of nSamplesPerMixture samples for each posterior sample's SATE distribution parameters.
CausalGPSLC.conditionalITE
— Method
conditionalITE(g, psindex, doT)
Wrapper for conditionalITE that extracts parameters from g::GPSLCObject at posterior sample psindex and applies intervention doT.
CausalGPSLC.conditionalITE
— Method
conditionalITE(nothing, nothing, tyLS, yNoise, yScale, nothing, nothing, T, Y, doT)
conditionalITE(nothing, xyLS, tyLS, yNoise, yScale, nothing, X, T, Y, doT)
conditionalITE(uyLS, nothing, tyLS, yNoise, yScale, U, nothing, T, Y, doT)
conditionalITE(uyLS, xyLS, tyLS, yNoise, yScale, U, X, T, Y, doT)
Conditional Individual Treatment Estimation
conditionalITE takes in parameters (presumably from posterior inference) as well as the observed and inferred data to produce individual treatment effects.
Params:
uyLS: (optional) Kernel lengthscale for latent confounders to outcome
xyLS: (optional) Kernel lengthscale for covariates to outcome
tyLS: Kernel lengthscale for treatment to outcome
yNoise: Gaussian noise for outcome prediction
yScale: Gaussian scale for outcome prediction
U: (optional) Latent confounders
X: (optional) Covariates
T: Treatment
Y: Outcome
doT: Treatment intervention
Returns:
MeanITE::Vector{Float64}: The mean value for the n individual treatment effects
CovITE::Matrix{Float64}: The covariance matrix for the n individual treatment effects.
CausalGPSLC.conditionalSATE
— Method
conditionalSATE
Conditional Sample Average Treatment Effect
CausalGPSLC.expit
— Method
Expit is the inverse of the logit function, mapping a Real to [0,1].
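As a quick illustration of the expit and logit pair, here is a minimal sketch using the standard definitions. The names expit_sketch and logit_sketch are hypothetical stand-ins; the package's own methods may accept other argument types.

```julia
# Standard logistic function: maps any real x into the interval (0, 1).
expit_sketch(x::Real) = 1 / (1 + exp(-x))

# Its inverse: maps a probability p in (0, 1) back onto the real line.
logit_sketch(p::Real) = log(p / (1 - p))

p = expit_sketch(0.0)                   # the midpoint of the unit interval
x = logit_sketch(expit_sketch(2.0))     # round trip recovers the input
println((p, x))
```

A transform like this is one common way to relate a real-valued Gaussian process output to a binary treatment probability.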
CausalGPSLC.extractParameters
— Method
extractParameters(g, posteriorSampleIdx)
Get inferred parameters from g's posteriorSampleIdx-th posterior sample. Parameters are uyLS, xyLS, tyLS, yNoise, yScale, U, some of which are allowed to be Nothing.
CausalGPSLC.generateSigmaU
— Function
generateSigmaU(nIndividualsArray)
generateSigmaU(nIndividualsArray, eps)
generateSigmaU(nIndividualsArray, eps, cov)
Generate block matrix for U given object counts
SigmaU is shorthand for the object structure of the latent confounder
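To make the block structure concrete, here is a minimal sketch of a block matrix built from object counts. It assumes that individuals in the same object share covariance cov, that individuals in different objects have zero covariance, and that eps is added to a unit diagonal for stability; those details and the name sigma_u_sketch are assumptions for illustration, not the package's implementation.

```julia
using LinearAlgebra

# Hypothetical sketch: one dense block per object, zeros elsewhere.
function sigma_u_sketch(nIndividualsArray::Vector{Int},
                        eps::Float64=1e-13, cov::Float64=1.0)
    n = sum(nIndividualsArray)
    SigmaU = zeros(n, n)
    start = 1
    for nIndividuals in nIndividualsArray
        stop = start + nIndividuals - 1
        SigmaU[start:stop, start:stop] .= cov   # shared covariance within an object
        start = stop + 1
    end
    SigmaU[diagind(SigmaU)] .= 1 + eps          # small jitter on the diagonal
    return SigmaU
end

S = sigma_u_sketch([2, 3])   # two objects, with 2 and 3 individuals
display(S)
```

With counts [2, 3] this gives a 5 x 5 matrix with a 2 x 2 block followed by a 3 x 3 block.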
CausalGPSLC.getAddresses
— Method
getAddresses(choicemap)
Debugging tool to print all available address keys in choicemap
CausalGPSLC.getHyperParameters
— Method
getHyperParameters()
Returns default values for hyperparameters
nU = 1: Number of latent confounding variables assumed to be influencing all the instances that belong to one object. Inference will be performed over these values.
nOuter = 20: Number of posterior samples to draw.
nMHInner = 5: Number of internal Metropolis-Hastings updates to make per posterior sample.
nESInner = 5: Number of elliptical-slice sampling updates to make per posterior sample for latent confounders and binary treatment.
nBurnIn = 5: Number of posterior samples to discard when making predictions and estimates.
stepSize = 1: How frequently to use posterior samples (1 being every one after burnIn, higher being every stepSize-th).
predictionCovarianceNoise = 1e-10: Predicting with Gaussian processes requires use of covariance matrices that are Symmetric Positive Definite, and this covariance noise on the diagonal ensures these operations can be performed in a stable and consistent way.
CausalGPSLC.getN
— Method
getN(g)
Number of individuals in dataset.
CausalGPSLC.getNU
— Method
getNU(g)
Number of latent confounders to perform inference over (hyperparameter).
CausalGPSLC.getNX
— Method
getNX(g)
Number of covariates (and observed confounders) in dataset.
CausalGPSLC.getNumPosteriorSamples
— Method
getNumPosteriorSamples(g)
Number of posterior samples that will be used based on hyperparameters.
Roughly (total posterior samples - burn in) / step size, that is, the samples indexed by nBurnIn:stepSize:nOuter.
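The range expression can be evaluated directly. The sketch below takes nBurnIn:stepSize:nOuter literally and counts its elements; whether the package counts in exactly this way is an assumption, and num_posterior_samples_sketch is a hypothetical name.

```julia
# Hypothetical sketch: count the posterior sample indices kept after burn-in.
num_posterior_samples_sketch(nOuter::Int, nBurnIn::Int, stepSize::Int) =
    length(nBurnIn:stepSize:nOuter)

# With the documented defaults (nOuter = 20, nBurnIn = 5, stepSize = 1).
println(num_posterior_samples_sketch(20, 5, 1))

# Thinning: keep every fifth sample after burn-in (indices 5, 10, 15, 20).
println(collect(5:5:20))
```

Note that a Julia range includes both endpoints, so under the defaults the literal range contains 16 indices, one more than (20 - 5) / 1.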
CausalGPSLC.getPriorParameters
— Method
getPriorParameters()
These are standard values for scale and shape of Inverse Gamma priors over kernel parameters, confounder structure covariance noise, and confounder Gaussian prior covariance.
uNoiseShape::Float64=4.0: shape of the InvGamma prior over the noise of U
uNoiseScale::Float64=4.0: scale of the InvGamma prior over the noise of U
xNoiseShape::Float64=4.0: shape of the InvGamma prior over the noise of X
xNoiseScale::Float64=4.0: scale of the InvGamma prior over the noise of X
tNoiseShape::Float64=4.0: shape of the InvGamma prior over the noise of T
tNoiseScale::Float64=4.0: scale of the InvGamma prior over the noise of T
yNoiseShape::Float64=4.0: shape of the InvGamma prior over the noise of Y
yNoiseScale::Float64=4.0: scale of the InvGamma prior over the noise of Y
xScaleShape::Float64=4.0: shape of the InvGamma prior over kernel scale of X
xScaleScale::Float64=4.0: scale of the InvGamma prior over kernel scale of X
tScaleShape::Float64=4.0: shape of the InvGamma prior over kernel scale of T
tScaleScale::Float64=4.0: scale of the InvGamma prior over kernel scale of T
yScaleShape::Float64=4.0: shape of the InvGamma prior over kernel scale of Y
yScaleScale::Float64=4.0: scale of the InvGamma prior over kernel scale of Y
uxLSShape::Float64=4.0: shape of the InvGamma prior over kernel lengthscale of U and X
uxLSScale::Float64=4.0: scale of the InvGamma prior over kernel lengthscale of U and X
utLSShape::Float64=4.0: shape of the InvGamma prior over kernel lengthscale of U and T
utLSScale::Float64=4.0: scale of the InvGamma prior over kernel lengthscale of U and T
xtLSShape::Float64=4.0: shape of the InvGamma prior over kernel lengthscale of X and T
xtLSScale::Float64=4.0: scale of the InvGamma prior over kernel lengthscale of X and T
uyLSShape::Float64=4.0: shape of the InvGamma prior over kernel lengthscale of U and Y
uyLSScale::Float64=4.0: scale of the InvGamma prior over kernel lengthscale of U and Y
xyLSShape::Float64=4.0: shape of the InvGamma prior over kernel lengthscale of X and Y
xyLSScale::Float64=4.0: scale of the InvGamma prior over kernel lengthscale of X and Y
tyLSShape::Float64=4.0: shape of the InvGamma prior over kernel lengthscale of T and Y
tyLSScale::Float64=4.0: scale of the InvGamma prior over kernel lengthscale of T and Y
sigmaUNoise::Float64=1.0e-13: noise added to matrix to make covariance stable and invertible
sigmaUCov::Float64=1.0: assumed covariance over structured confounders
drift::Float64=0.5: as in the paper, Metropolis Hastings Gaussian Drift
CausalGPSLC.getProposalAddress
— Method
getProposalAddress(name, i, j)
Optimizes paramProposal by providing a compact way to access trace address symbols.
CausalGPSLC.gpslc
— Method
gpslc(filename * ".csv")
gpslc(filename * ".csv"; hyperparams=hyperparams, priorparams=priorparams)
gpslc(DataFrame(X1=...,X2=...,T=...,Y=...,obj=...))
gpslc(DataFrame(X1=...,X2=...,T=...,Y=...,obj=...); hyperparams=hyperparams, priorparams=priorparams)
Run posterior inference on the input data.
Datatypes of DataFrame or CSV must follow these standards:
T (Boolean/Float64)
Y (Float64)
X1...XN (Float64...Float64)
obj (Any)
Optional parameters
hyperparams::HyperParameters=getHyperParameters(): Hyperparameters primarily define the high-level amount of inference to perform.
priorparams::PriorParameters=getPriorParameters(): Prior parameters define the high-level priors to draw from when constructing kernel functions and latent confounder structure.
Returns a GPSLCObject which stores the hyperparameters, prior parameters, data, and posterior samples.
CausalGPSLC.likelihoodDistribution
— Method
No Confounders No Covariates Continuous/Binary
CausalGPSLC.likelihoodDistribution
— Method
No Confounders Continuous/Binary
CausalGPSLC.likelihoodDistribution
— Method
likelihoodDistribution
A utility that uses multiple dispatch to take in one set of parameters for CausalGPSLC and the observed data, plus an intervention doT, and outputs the necessary matrices to compute MeanITE and CovITE, as well as other predictions, like directly predicting $Y_{cf}$.
CausalGPSLC.likelihoodDistribution
— Method
No Covariates Continuous/Binary
CausalGPSLC.loadData
— Method
loadData("path/to/filename.csv")
Returns DataFrame of CSV at path.
CausalGPSLC.loadGPSLCObject
— Method
loadGPSLCObject(filename)
loadGPSLCObject("path/to/filename")
loadGPSLCObject("path/to/filename.gpslc")
This function will load and return the GPSLCObject contained in <filename>.gpslc.
Note: the extension .gpslc is optional and will be added if it is not included.
CausalGPSLC.logit
— Method
Logit maps a value in [0,1] onto the Real values.
CausalGPSLC.predictCounterfactualEffects
— Method
predictCounterfactualEffects(g, nSamplesPerMixture)
predictCounterfactualEffects(g, nSamplesPerMixture; fidelity=100)
predictCounterfactualEffects(g, nSamplesPerMixture; fidelity=100, minDoT=0, maxDoT=5)
Params
g::GPSLCObject: The GPSLCObject that inference has already been computed for.
nSamplesPerMixture::Int64: The number of outcome samples to draw from each set of inferred posterior parameters.
fidelity::Int64: How many intervention values to use to cover the domain of treatment values. Higher means more samples.
minDoT::Float64=min(g.T...): The lowest interventional treatment to use. Defaults to the lowest treatment value in the data g.T.
maxDoT::Float64=max(g.T...): The highest interventional treatment to use. Defaults to the highest treatment value in the data g.T.
julia> ite, doT = predictCounterfactualEffects(g, 30; fidelity=100)
Returns
ite::Matrix{Float64}: An array of size [d, n, numPosteriorSamples * nSamplesPerMixture], where d is the number of interventional values defined by fidelity and the range of treatments in g.T.
doTrange::Vector{Float64}: The list of doT values used, in an order that matches the rows of ite.
CausalGPSLC.prepareData
— Function
prepareData(df, confounderEps, confounderCov)
prepareData("path/to/filename.csv", confounderEps, confounderCov)
Prepare Data
Creates the latent confounding structure from the object labels in the data. Parses matrices for the observed covariates, treatments, and outcomes.
Returns: X, T, Y, SigmaU
CausalGPSLC.processCov
— Method
Convert covariance matrix back from log-space, scale and add noise (if passed)
CausalGPSLC.rbfKernelLog
— Method
2D rbfKernelLog
CausalGPSLC.rbfKernelLogScalar
— Method
Radial Basis Function Kernel applied element-wise to the two vectors X1 and X2 passed
Params:
X1: First array of values
X2: Second array of values
LS: Lengthscale array
Output normalized by LS squared
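The relationship between a log-space RBF kernel and the processCov step can be sketched as follows. The sketch assumes the log-kernel entry is -(x1 - x2)^2 / LS^2 with a single scalar lengthscale (reading "normalized by LS squared" literally) and that processing means exponentiating, multiplying by the scale, and adding noise on the diagonal. Both function names are hypothetical and the package's exact formulas may differ.

```julia
using LinearAlgebra

# Hypothetical sketch: pairwise log-space RBF values for two vectors.
rbf_log_sketch(x1::Vector{Float64}, x2::Vector{Float64}, ls::Float64) =
    [-((a - b)^2) / ls^2 for a in x1, b in x2]

# Hypothetical sketch: leave log-space, apply the kernel scale, add diagonal noise.
process_cov_sketch(logcov::Matrix{Float64}, scale::Float64, noise::Float64) =
    exp.(logcov) .* scale + noise * I

x = [0.0, 1.0]
K = process_cov_sketch(rbf_log_sketch(x, x, 1.0), 2.0, 0.1)
display(K)
```

Working in log-space lets the contributions of several inputs (for example U, X, and T) be summed before a single exponentiation.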
CausalGPSLC.removeAdjacent
— Method
removeAdjacent(vector)
Return vector where each element is distinct from the previous one
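A minimal sketch of the described behaviour, assuming that only consecutive duplicates are collapsed and that repeats further apart are kept. The name remove_adjacent_sketch is hypothetical.

```julia
# Hypothetical sketch: drop an element whenever it equals the one kept just before it.
function remove_adjacent_sketch(v::AbstractVector)
    out = eltype(v)[]
    for x in v
        if isempty(out) || out[end] != x
            push!(out, x)
        end
    end
    return out
end

println(remove_adjacent_sketch([1, 1, 2, 2, 1, 3, 3]))
```

This differs from unique, which would also drop the second, non-adjacent 1.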
CausalGPSLC.sampleITE
— Method
sampleITE(g, doT)
sampleITE(g, doT; samplesPerPosterior=10)
Estimate Individual Treatment Effect with CausalGPSLC model
Params:
g::GPSLCObject: Contains data and hyperparameters
doT: The requested intervention (e.g. set all treatments to 1.0)
samplesPerPosterior: How many ITE samples to draw per posterior sample in g.
Returns:
ITEsamples: n x m matrix where n is the number of individuals, and m is the number of samples.
CausalGPSLC.samplePosterior
— Method
samplePosterior(hyperparameters, priorparameters, SigmaU, X, T, Y)
Draw samples from the posterior given the observed data.
Params:
hyperparams::HyperParameters
priorparams::PriorParameters
SigmaU::ConfounderStructure
X::Covariates
T::Treatment
Y::Outcome
Posterior samples are returned as a Vector of Gen choicemaps.
CausalGPSLC.sampleSATE
— Method
sampleSATE(g, doT)
sampleSATE(g, doT; samplesPerPosterior=10)
Estimate Sample Average Treatment Effect with CausalGPSLC model
Using sampleITE, samples can be drawn for the sample average treatment effect
Params:
g::GPSLCObject: Contains data and hyperparameters
doT: The requested intervention (e.g. set all treatments to 1.0)
samplesPerPosterior: How many samples to draw per posterior sample in g.
Returns:
SATEsamples: n x m matrix where n is the number of individuals, and m is the number of samples.
CausalGPSLC.saveGPSLCObject
— Method
saveGPSLCObject(g, filename)
saveGPSLCObject(g, "path/to/filename")
saveGPSLCObject(g, "path/to/filename.gpslc")
This function will save the GPSLCObject g to the file <filename>.gpslc. This GPSLCObject, including the posterior samples contained within it, can be retrieved with the loadGPSLCObject function.
Note: The extension .gpslc is optional and will be added if it is not included.
CausalGPSLC.summarizeEstimates
— Method
summarizeEstimates(samples)
summarizeEstimates(samples; savetofile="ite_samples.csv")
Summarize Predicted Estimates (Counterfactual Outcomes or Individual Treatment Effects)
Create dataframe of mean, lower and upper quantiles of the samples from sampleITE or predictCounterfactualEffects.
Params:
samples: The n x m array of samples from sampleSATE or sampleITE
savetofile::String: Optionally save the resultant DataFrame as CSV to the filename passed
credible_interval::Float64: A real in [0,1] where 0.90 is the default for a 90% credible interval
Returns:
df: DataFrame of Individual, Mean, LowerBound, and UpperBound values for the credible intervals around the sample.
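The summary columns can be illustrated with a minimal sketch that uses only the Statistics standard library. It assumes an equal-tailed interval (quantiles at (1 - credible_interval)/2 and its complement) and returns a NamedTuple instead of a DataFrame to stay dependency-free. Both choices, and the name summarize_sketch, are assumptions for illustration.

```julia
using Statistics

# Hypothetical sketch: per-individual mean and equal-tailed credible bounds
# for an n x m matrix of samples (one row per individual).
function summarize_sketch(samples::AbstractMatrix{<:Real}; credible_interval=0.90)
    lo = (1 - credible_interval) / 2
    hi = 1 - lo
    n = size(samples, 1)
    return (Individual = collect(1:n),
            Mean       = vec(mean(samples; dims=2)),
            LowerBound = [quantile(samples[i, :], lo) for i in 1:n],
            UpperBound = [quantile(samples[i, :], hi) for i in 1:n])
end

samples = [1.0 2.0 3.0; 10.0 10.0 10.0]
s = summarize_sketch(samples)
println(s.Mean)
```

A row with no spread, like the second individual here, yields identical mean and bounds.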
CausalGPSLC.toMatrix
— Method
toMatrix(X, n, m)
Convert a vector of vectors or similar to a 2D matrix. Only call if you know all subvectors are same length.
CausalGPSLC.toTupleOfVectors
— Method
toTupleOfVectors(matrix)
Convert matrix to tuple of vectors.
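The two conversions above can be sketched as a round trip. The sketch assumes each subvector becomes one column of an n x m matrix and that the tuple holds one vector per column; the orientation actually used by toMatrix and toTupleOfVectors is not stated here, so treat it, and both helper names, as assumptions.

```julia
# Hypothetical sketch: m subvectors of length n become the columns of an n x m matrix.
to_matrix_sketch(X, n::Int, m::Int) = [X[j][i] for i in 1:n, j in 1:m]

# Hypothetical sketch: split a matrix back into a tuple of its columns.
to_tuple_of_vectors_sketch(M::AbstractMatrix) = Tuple(M[:, j] for j in 1:size(M, 2))

X = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
M = to_matrix_sketch(X, 3, 2)
cols = to_tuple_of_vectors_sketch(M)
println(size(M))
```

As the docstring warns, the indexing only works when every subvector has the same length.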