DependentBootstrap.DependentBootstrapModule

Module for dependent bootstrap procedures, by Colin T Bowers

Implemented bootstrap methods:

- IID
- Stationary
- Moving Block
- Circular Block
- Non-overlapping Block

Implemented block length selection procedures:

- Patton, Politis, and White (2009) "Correction to Automatic Block Length Selection for the Dependent Bootstrap"

Accepted input dataset types:

- Vector{<:Number}
- Matrix{<:Number} (where rows are observations and columns are variables)
- Vector{Vector{<:Number}} (where elements of inner vectors are observations and outer vectors are variables)
- DataFrame
- TimeSeries.TimeArray{T,N} (only for N = 1 and N = 2)

Additional input dataset types are easily added. Please open an issue at https://github.com/colintbowers/DependentBootstrap.jl

The module has only a single exported type:

- BootInput  <-- Core input type accepted by all exported functions. Typically constructed via keyword method. See ?BootInput for more detail.

All exported functions provide the following two method signatures:

- exported_func(data, bootinput::BootInput)
- exported_func(data ; kwargs...)

Most users will be content with the keyword argument method. In practice, this method passes its keyword arguments to the BootInput keyword constructor and then calls the BootInput method of the same exported function. For more detail on accepted keywords, see ?BootInput. Every exported function uses the input dataset and the bootstrap methodology described by the BootInput to return the appropriate statistics. A list of exported functions follows:

- optblocklength <-- Estimate the optimal block length for the input dataset
- dbootinds      <-- Get a vector of bootstrap resampling indices
- dbootdata      <-- Get a vector of resampled datasets
- dbootlevel1    <-- Get a vector of level 1 resampled bootstrap statistics
- dboot          <-- Get the level 2 bootstrap statistic
- dbootlevel2    <-- Identical to dboot. Included for completeness
- dbootvar       <-- Wrapper on dboot that sets the level 2 statistic as the variance
- dbootconf      <-- Wrapper on dboot that sets the level 2 statistic as a confidence interval

I use the phrases level 1 and level 2 statistics in this package in the same manner discussed in Chapter 1 of Lahiri's textbook Resampling Methods for Dependent Data.
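
To make the level 1 and level 2 terminology concrete, the following is a minimal sketch that uses only the Julia standard library. It does not call this package: the data are simulated, and the index generator is a hypothetical stand-in that performs plain IID resampling (no block structure).

```julia
using Random, Statistics

rng = MersenneTwister(42)
data = randn(rng, 200)          # simulated dataset, for illustration only
numresample = 500

# Hypothetical stand-in for a bootstrap index generator (IID case, no blocks).
resample_inds = [rand(rng, 1:length(data), length(data)) for _ in 1:numresample]

# Level 1: the estimator of interest, evaluated on each resampled dataset.
level1 = [mean(data[inds]) for inds in resample_inds]

# Level 2: a distributional parameter of the level 1 statistics,
# here the bootstrap variance of the sample mean.
level2 = var(level1)
```

In the package's terms, mean plays the role of flevel1 and var plays the role of flevel2.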

This package has an MIT license. Please see associated LICENSE.md file for more detail.

DependentBootstrap.BLPPW2009Type

BLPPW2009 <- Type for using multiple dispatch to get the block length selection procedure of Patton, Politis, and White (2009)

DependentBootstrap.BootInputType
BootInput

Core type that defines all parameters needed to perform a bootstrap procedure. The vast majority of users should use the keyword argument constructor that has the method signature:

BootInput(data ; kwargs...)

where data is the dataset to be bootstrapped, and kwargs denotes a set of keyword arguments that are shared by every exported function in the DependentBootstrap package. The keyword arguments and their default values are:

- blocklength         <- Block length for bootstrapping procedure. Default value is 0. Set to <= 0 to auto-estimate the optimal block length from the dataset. Float64 inputs allowed.
- numresample         <- Number of times to resample the input dataset. Default value is the module constant NUM_RESAMPLE, currently set to 1000.
- bootmethod          <- Bootstrapping methodology to use. Default value is the Symbol :stationary (for the stationary bootstrap).
- blocklengthmethod   <- Block length selection procedure to use if user wishes to auto-estimate the block length. Default value is the Symbol :ppw2009 (use the method described in Patton, Politis, and White (2009)).
- flevel1             <- A function that converts the input dataset to the estimator that the user wishes to bootstrap. Default value is the sample mean.
- flevel2             <- A function that converts a vector of estimators constructed by flevel1 into a distributional parameter. Default value is sample variance.
- numobsperresample   <- Number of observations to be drawn (with replacement) per resample. The default value is the number of observations in the dataset (the vast majority of users will want this default value).
- fblocklengthcombine <- A function for converting a Vector{Float64} of estimated blocklengths to a single Float64 blocklength estimate. Default value is median.

The constructor will attempt to convert all provided keyword arguments to appropriate types, and will notify the user via an error if a supplied keyword argument is not valid.

Note that the bootmethod and blocklengthmethod keyword arguments will accept both Symbol and String inputs, and will convert them to BootMethod and BlockLengthMethod types internally. To see a list of acceptable Symbol or String values for the bootmethod and blocklengthmethod keyword arguments, use:

- collect(keys(DependentBootstrap.BOOT_METHOD_DICT))
- collect(keys(DependentBootstrap.BLOCKLENGTH_METHOD_DICT))

respectively. A small proportion of users may need the fine-grained control that comes from constructing BootMethod and BlockLengthMethod types explicitly and then providing them to the keyword constructor. These users should use ?BootMethod and ?BlockLengthMethod at the REPL for more info.

BootInput is not mutable, but the type is near instantaneous to construct, so if a user wishes to amend a BootInput it is recommended to just construct another one. A special constructor is provided to facilitate this process that has the method definition:

- BootInput(data, bootinput::BootInput ; kwargs...)

where the new BootInput draws its fields from the keyword arguments that are provided, and then the input BootInput for any keyword arguments that are not provided.
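
A short sketch of both constructors, assuming the package is installed and that the keywords behave as described above. The data are simulated and the specific keyword values are arbitrary choices for illustration.

```julia
using DependentBootstrap, Statistics

data = randn(250)   # simulated data, for illustration only

# Keyword constructor: keywords that are not supplied fall back to the documented defaults.
# :stationary is the documented default bootmethod; it is spelled out here for clarity.
bi = BootInput(data; blocklength=5, numresample=200, bootmethod=:stationary)

# "Amend" by constructing a new BootInput: keywords given here take priority,
# and every other field is taken from bi.
bi2 = BootInput(data, bi; numresample=500)

# Pass the BootInput to an exported function. With the default flevel1 (sample mean)
# and flevel2 (sample variance), this is the bootstrap variance of the sample mean.
v = dboot(data, bi2)
```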

Note that all exported functions in the DependentBootstrap package exhibit the method signature:

- exported_func(data ; kwargs...)

which in practice just wraps the keyword argument constructor for a BootInput, and then calls the method signature:

- exported_func(data, bootinput::BootInput)

DependentBootstrap.P2003Type

P2003 <- Type for using multiple dispatch to get the bandwidth selection procedure of Politis (2003)

DependentBootstrap.bandwidth_politis_2003Method
bandwidth_politis_2003(x::AbstractVector{T})::Tuple{Int, Float64, Vector{Float64}} where {T<:Number}

Implements the methodology from Politis (2003) "Adaptive Bandwidth Choice" to obtain a data-driven bandwidth estimate.

Return tuple is, in order, the bandwidth estimate, the variance of x, and the autocorrelations used to get the bandwidth estimate.

Note, most users won't be interested in the second and third output, but sometimes this routine will be called by other functions that need these terms, so they are returned to avoid duplicate computation.

DependentBootstrap.dbootMethod
dboot(data, bi::BootInput)
dboot(data ; kwargs...)

Get the level 2 bootstrapped statistic for the dataset in data, using the bootstrap methodology defined in the BootInput.

A keyword method that calls the keyword constructor for BootInput is also provided. Please use ?BootInput at the REPL for more detail on feasible keywords.

Note, the return type of the output will be determined by bi.flevel2, which must be a function that accepts Vector{T}, where T is the output type of bi.flevel1.

For example, if data is a Vector{<:Number} and bi.flevel1 is mean, then bi.flevel1 will return Float64, and so bi.flevel2 must be some function that accepts Vector{Float64} as input (and can have any output type).

Alternatively, bi.flevel2 could be the anonymous function (x -> quantile(x, [0.025, 0.975])), in which case its input should be Vector{Float64}, and so bi.flevel1 should return Float64. Note, the output of bi.flevel2 in this case will be a 2-element Vector{Float64} whose elements are the lower and upper bounds of a bootstrapped 95% confidence interval for the level 1 statistic of the input dataset.
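
The quantile case can be sketched with the keyword method, assuming the package is installed. The data are simulated and numresample=500 is an arbitrary choice.

```julia
using DependentBootstrap, Statistics

data = randn(300)   # simulated data, for illustration only

# flevel1 maps the dataset to a Float64; flevel2 maps the Vector{Float64} of
# level 1 statistics to the 2.5% and 97.5% quantiles.
ci = dboot(data;
           flevel1 = mean,
           flevel2 = (x -> quantile(x, [0.025, 0.975])),
           numresample = 500)

lower, upper = ci   # the two bounds of the percentile interval
```

Because blocklength is not supplied, it keeps its default of 0, so the block length is estimated from the data.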

DependentBootstrap.dbootconfMethod

dbootconf <- Identical to dboot but with the level 2 statistic set to a confidence interval with width determined by keyword alpha. Default alpha=0.05 corresponds to a 95% confidence interval.

DependentBootstrap.dbootdataMethod
dbootdata(data::T , bi::BootInput)::Vector{T}
dbootdata(data::T ; kwargs...)::Vector{T}

Get the resampled datasets of the input data using the dependent bootstrap methodology defined in BootInput.

A keyword method that calls the keyword constructor for BootInput is also provided. Please use ?BootInput at the REPL for more detail on feasible keywords.

Note, this function should always have output type Vector{T}.

DependentBootstrap.dbootdata_oneMethod
dbootdata_one(data::T, bi::BootInput)::T
dbootdata_one(data::T; kwargs...)::T

Get a single resampled dataset of the input data using the dependent bootstrap methodology defined in BootInput.

A keyword method that calls the keyword constructor for BootInput is also provided. Please use ?BootInput at the REPL for more detail on feasible keywords.

Note, the output type will always be the same as the type of the input data.

DependentBootstrap.dbootindsMethod
dbootinds(data::T, bi::BootInput)::Vector{Vector{Int}}
dbootinds(data::T ; kwargs...)::Vector{Vector{Int}}

Each inner vector of the returned Vector{Vector{Int}} provides indices that, when used to index the original dataset, will provide a single resampled dataset.

A keyword method that calls the keyword constructor for BootInput is also provided. Please use ?BootInput at the REPL for more detail on feasible keywords.

Please use dbootinds_one if you only want to obtain a single Vector{Int} resampling index.
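
To show how a resampling index is applied to a dataset, here is a standard-library sketch. The helper moving_block_inds is hypothetical and is not the package's implementation; it builds one moving block index by concatenating randomly placed contiguous blocks and trimming to the number of observations.

```julia
using Random

# Hypothetical helper, not part of DependentBootstrap: one moving block resampling index.
function moving_block_inds(rng::AbstractRNG, numobs::Int, blocklength::Int)
    1 <= blocklength <= numobs || throw(ArgumentError("blocklength must lie in 1:numobs"))
    inds = Int[]
    while length(inds) < numobs
        start = rand(rng, 1:(numobs - blocklength + 1))   # random block start
        append!(inds, start:(start + blocklength - 1))    # contiguous block of indices
    end
    return inds[1:numobs]                                 # trim the final block
end

rng = MersenneTwister(7)
x = collect(1.0:20.0)          # vector dataset
X = hcat(x, 2 .* x)            # matrix dataset: rows are observations

inds = moving_block_inds(rng, length(x), 4)
x_resampled = x[inds]          # index a vector directly
X_resampled = X[inds, :]       # index the rows (observations) of a matrix
```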

DependentBootstrap.dbootinds_oneMethod

dbootinds_one(bi::BootInput)::Vector{Int}
dbootinds_one(data::T; kwargs...)::Vector{Int}

Returns a single resampling index that, when used to index the original dataset, will provide a single resampled dataset.

A keyword method that calls the keyword constructor for BootInput is also provided. Please use ?BootInput at the REPL for more detail on feasible keywords.

DependentBootstrap.dbootlevel1Method
dbootlevel1(data::T1, bi::BootInput)
dbootlevel1(data::T1; kwargs...)

Get the level 1 bootstrapped statistics for the dataset in data, using the bootstrap methodology defined in the BootInput.

A keyword method that calls the keyword constructor for BootInput is also provided. Please use ?BootInput at the REPL for more detail on feasible keywords.

Note, the return type is determined by bi.flevel1, which must be a function that accepts T1, i.e. typeof(data), as input. It may return any output type T2, as long as bi.flevel2 will accept Vector{T2} as input.

For example, if data is a Vector{<:Number} then bi.flevel1 might be the function mean, which in this case will return Float64, so bi.flevel2 must be some function that can accept Vector{Float64} as input.

A more complicated example: if data is Matrix{<:Number} then bi.flevel1 might be the anonymous function x->mean(x,dims=1), which in this case will return a single row Matrix{Float64}, and so bi.flevel2 must be some function that can accept Vector{Matrix{Float64}} as input.
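
The type flow in the matrix example can be checked with the standard library alone. In this sketch the resampled datasets come from a hypothetical IID row resampler rather than from the package, and the flevel2 shown is just one possible function that accepts Vector{Matrix{Float64}}.

```julia
using Random, Statistics

rng = MersenneTwister(3)
X = randn(rng, 100, 3)                       # rows are observations, columns are variables

# Hypothetical stand-in for resampled datasets (IID row resampling, for brevity).
resamples = [X[rand(rng, 1:size(X, 1), size(X, 1)), :] for _ in 1:200]

flevel1 = x -> mean(x, dims=1)               # returns a single-row (1x3) Matrix{Float64}
level1 = [flevel1(r) for r in resamples]     # Vector{Matrix{Float64}}

# One flevel2 that accepts Vector{Matrix{Float64}}: stack the rows into a 200x3 matrix,
# then take the variance of each column.
flevel2 = v -> var(reduce(vcat, v), dims=1)
level2 = flevel2(level1)                     # 1x3 Matrix{Float64}
```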

DependentBootstrap.dbootlevel2Method

dbootlevel2 <- Identical to the dboot function. This function is only included for naming consistency with dbootlevel1

DependentBootstrap.num_obsMethod

num_obs <- Internal function used to determine the number of observations in the input dataset

DependentBootstrap.optblocklengthMethod
optblocklength(data, bi::BootInput)::Float64
optblocklength(data ; kwargs...)::Float64

Provides an estimate of the optimal block-length to use with a dependent bootstrap.

For multivariate datasets, the optimal block length is estimated for each column of data, and then bi.fblocklengthcombine, which is a function that maps Vector{Float64} to Float64, is called to reduce the multiple estimates to a single estimate. The default value for fblocklengthcombine is median.
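
A sketch of the multivariate case, assuming the package is installed and that the combine step works as described above. The data are simulated, and maximum is used only as an example of an alternative combining function.

```julia
using DependentBootstrap, Statistics

X = randn(400, 3)   # simulated data: rows are observations, columns are variables

# Estimate a block length for each column separately, then combine by hand.
per_column = [optblocklength(X[:, j]) for j in 1:size(X, 2)]
combined = median(per_column)

# Let the package do the combining, first with the default (median),
# then with a more conservative choice (largest per-column estimate).
bl_median = optblocklength(X)
bl_max = optblocklength(X; fblocklengthcombine=maximum)
```

If the description above holds, combined and bl_median should agree, and bl_max can never be smaller than bl_median.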

Block length methods currently implemented include:

 - Patton, Politis, White (2009) "Correction to Automatic Block Length Selection For the Dependent Bootstrap"

For all methods discussed above, bandwidth is estimated following Politis (2003) "Adaptive Bandwidth Choice", using the flat-top kernel suggested in that paper.