Base.names
— Method
variable_ids(result::FWResult{T}) -> Vector{T}
Extract the IDs/names of all variables (nodes) in the network.
FlashWeave.benjamini_hochberg!
— Method
Accelerated version of the Benjamini-Hochberg procedure found in MultipleTesting.jl.
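For orientation, the Benjamini-Hochberg adjustment itself can be sketched in a few lines. This is a plain, unoptimized illustration and not FlashWeave's accelerated in-place routine; the helper name `bh_adjust` is made up for this example.

```julia
# Benjamini-Hochberg adjustment, written for clarity rather than speed.
# `bh_adjust` is a hypothetical helper name, not part of FlashWeave.
function bh_adjust(pvals::AbstractVector{<:Real})
    m = length(pvals)
    order = sortperm(pvals)              # indices of the p-values in ascending order
    adjusted = Vector{Float64}(undef, m)
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in m:-1:1
        idx = order[rank]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    end
    return adjusted
end

pvals = [0.01, 0.04, 0.03, 0.005]
adjusted = bh_adjust(pvals)
```

An edge would then be kept when its adjusted p-value falls below the chosen significance threshold.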
FlashWeave.graph
— Method
graph(result::FWResult{T}) -> SimpleWeightedGraph{Int, Float64}
Extract the underlying weighted graph from network results.
FlashWeave.learn_network
— Function
learn_network(all_data_paths::AbstractVector{<:AbstractString}, meta_data_path::AbstractString) -> FWResult{<:Integer}
Works like learn_network(data_path::AbstractString, meta_data_path::AbstractString), but takes paths to multiple data sets (independent sequencing experiments, e.g. 16S + ITS, for the same biological samples), which are normalized independently.
FlashWeave.learn_network
— Function
learn_network(data_path::AbstractString, meta_data_path::AbstractString) -> FWResult{<:Integer}
Works like learn_network(data::AbstractArray{<:Real, 2}), but instead of a data matrix takes file paths to an OTU table and optionally a meta data table as an input.
data_path - path to a file storing an OTU count matrix (and JLD2 meta data)
meta_data_path - optional path to a file with meta data
*_key - HDF5 keys to access data sets with OTU counts, Meta variables and variable names in a JLD2 file. If a data item is absent the corresponding key should be 'nothing'. See '?load_data' for additional information.
verbose - print progress information
transposed - if true, rows of data are variables and columns are samples
kwargs... - additional keyword arguments passed to learn_network(data::AbstractArray{<:Real, 2})
FlashWeave.learn_network
— Method
learn_network(data::AbstractArray{<:Real, 2}) -> FWResult{<:Integer}
Learn an interaction network from a data matrix (including OTUs and optionally meta variables).
data - data matrix with information on OTU counts and (optionally) meta variables
header - names of variable columns in data
meta_mask - true/false mask indicating which variables are meta variables
Algorithmic parameters
heterogeneous - enable heterogeneous mode for multi-habitat or -protocol data with at least thousands of samples (FlashWeaveHE)
sensitive - enable fine-grained association prediction (FlashWeave-S, FlashWeaveHE-S); sensitive=false results in the fast modes (FlashWeave-F, FlashWeaveHE-F)
max_k - maximum size of conditioning sets; high values can lead to the removal of more spurious edges, but may also strongly increase runtime and reduce statistical power. max_k=0 results in no conditioning (univariate mode)
alpha - statistical significance threshold at which individual edges are accepted
conv - convergence threshold, e.g. if conv=0.01, assume convergence if the number of edges increased by only 1% after 100% more runtime (checked in intervals)
feed_forward - enable feed-forward heuristic
fast_elim - enable fast-elimination heuristic
max_tests - maximum number of conditional tests performed on a variable pair before association is assumed
hps - reliability criterion for statistical tests when sensitive=false
FDR - perform False Discovery Rate correction (Benjamini-Hochberg method) on pairwise associations
n_obs_min - don't compute associations between variables having fewer reliable samples (non-zero samples if heterogeneous=true) than this number. -1: automatically choose a threshold.
time_limit - if feed-forward heuristic is active, determines the interval (seconds) at which neighborhood information is updated
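The conv criterion described above can be read as a relative-growth check between two time points. The sketch below is only an interpretation of that description, not FlashWeave's internal logic; the helper `has_converged` and its arguments are hypothetical.

```julia
# Hypothetical helper, not FlashWeave's internal code: compare the edge count
# at some runtime t with the edge count after the runtime has doubled (2t).
function has_converged(edges_at_t::Integer, edges_at_2t::Integer, conv::Real)
    edges_at_t == 0 && return false       # no baseline to compare against yet
    relative_gain = (edges_at_2t - edges_at_t) / edges_at_t
    return relative_gain <= conv
end

slow_growth = has_converged(1000, 1008, 0.01)   # 0.8% more edges
fast_growth = has_converged(1000, 1050, 0.01)   # 5% more edges
```

With conv=0.01, the first case would count as converged and the second would not.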
General parameters
normalize - automatically choose and perform data normalization method (based on sensitive and heterogeneous)
track_rejections - store for each discarded edge which variable set led to its exclusion (can be memory-intensive for large networks)
verbose - print progress information
transposed - if true, rows of data are variables and columns are samples
prec - precision in bits to use for calculations (16, 32, 64 or 128)
make_sparse - use a sparse data representation (should be left at true in almost all cases)
make_onehot - create one-hot encodings for meta data variables with more than two categories (should be left at true in almost all cases)
update_interval - if verbose=true, determines the interval (seconds) at which network stat updates are printed
extra_data - tuples of the form (data, header) representing counts from additional sequencing experiments (e.g. 16S + ITS) for the same biological samples. These will be normalized independently.
share_data - if local parallel workers are detected, share input data (instead of copying)
experimental_kwargs - experimental keyword arguments that are directly passed to the underlying inference engine
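To illustrate what make_onehot refers to, the sketch below one-hot encodes a single categorical meta variable. It is a generic illustration with a made-up helper (`onehot`) and made-up data; how FlashWeave names or orders the resulting columns is not shown here.

```julia
# Hypothetical helper illustrating one-hot encoding of a categorical variable.
function onehot(values::AbstractVector)
    categories = sort(unique(values))
    # One Bool column per category; an entry is true where the sample has that category.
    encoded = [v == c for v in values, c in categories]
    return encoded, categories
end

habitat = ["soil", "gut", "ocean", "soil"]   # made-up meta variable with three categories
encoded, categories = onehot(habitat)
```

Each sample ends up with exactly one true entry across the new columns, so a variable with three categories becomes three binary variables.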
FlashWeave.load_data
— Function
load_data(data_path::AbstractString, meta_path::AbstractString) -> (AbstractArray{<:Real, 2}, Vector{String}, AbstractArray{<:Real, 2}, Vector{String})
Load matrices with OTU count and optionally meta data from disk. Available formats are '.tsv', '.csv', '.biom' and '.jld2'.
data_path - path to a file storing an OTU count matrix
meta_data_path - optional path to a file with meta variable information
*_key - HDF5 keys to access data sets with OTU counts, Meta variables and variable names in a JLD2 file. If a data item is absent the corresponding key should be 'nothing'. See '?load_data' for additional information.
transposed - if true, rows of data are variables and columns are samples
FlashWeave.load_network
— Method
load_network(net_path::AbstractString) -> FWResult{Int}
Load network results from disk. Available formats are '.edgelist', '.gml' and '.jld2'. For GML, only files with structure identical to save_network('network.gml') output can currently be loaded. FlashWeave parameters that were used for network inference are only available when loading from JLD2.
net_path - path from which to load the network results
FlashWeave.normalize_data
— Method
normalize_data(data::AbstractArray{<:Real, 2}) -> (AbstractArray{<:Real, 2}, Vector{String}, Vector{Bool}, Vector{Bool})
Normalize data using various forms of centered-logratio transformation and discretization. This should only be used manually when experimenting with different normalization techniques.
data - data matrix with information on OTU counts and (optionally) meta variables
header - names of variable columns in data
meta_mask - true/false mask indicating which variables are meta variables
test_name - name of a FlashWeave-specific statistical test mode; the corresponding normalization method will be chosen automatically
norm_mode - identifier of a valid normalization mode ('clr-adapt', 'clr-nonzero', 'clr-nonzero-binned', 'pres-abs', 'tss', 'tss-nonzero-binned')
filter_data - whether to remove samples with no counts and variables with zero variation from data
verbose - print progress information
prec - precision in bits to use for calculations (16, 32, 64 or 128)
make_sparse, make_onehot - see docstring for "learn_network(data::AbstractArray{<:Real, 2})"
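As background for the clr-based modes, a textbook centered log-ratio transform looks roughly like the sketch below. It assumes rows are samples and uses a fixed pseudocount to handle zeros; that is a simplification, not the adaptive or non-zero variants listed above, and the helper `clr` is hypothetical.

```julia
using Statistics

# Plain centered log-ratio transform with a pseudocount; rows are samples.
# Hypothetical helper, simpler than FlashWeave's own normalization modes.
function clr(counts::AbstractMatrix{<:Real}; pseudocount::Real=1.0)
    logged = log.(counts .+ pseudocount)
    # Subtract each sample's mean log value (the log of its geometric mean).
    return logged .- mean(logged, dims=2)
end

counts = [10 0 5; 3 7 0]   # made-up OTU counts: 2 samples, 3 OTUs
transformed = clr(counts)
```

After the transform, the values within each sample sum to zero, which is the defining property of the centered log-ratio.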
FlashWeave.parameters
— Method
parameters(result::FWResult{T}) -> Dict{Symbol, Any}
Extract the used parameters from network results.
FlashWeave.save_network
— Method
save_network(net_path::AbstractString, net_result::FWResult) -> Void
Save network results to disk. Available formats are '.edgelist', '.gml' and '.jld2'.
net_path - output path for the network
net_result - network results object that should be saved
detailed - save additional information, such as discarding sets, if available (output file suffixes: 'rejections', 'unchecked')
FlashWeave.stacktake!
— Method
take!(rr::RemoteChannel, args...)
Fetch value(s) from a RemoteChannel rr, removing the value(s) in the process.
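The removing behaviour can be seen with a plain RemoteChannel from Julia's Distributed standard library. This sketch uses only standard-library calls on the current process and does not touch FlashWeave's own channel type.

```julia
using Distributed

# A RemoteChannel wrapping a buffered Channel that lives on the current process.
rc = RemoteChannel(() -> Channel{Int}(2))

put!(rc, 1)
put!(rc, 2)

first_value = take!(rc)    # removes and returns the oldest value
has_more = isready(rc)     # the second value is still queued
second_value = take!(rc)
is_empty = !isready(rc)    # nothing is left after both values were taken
```

Unlike fetch, which leaves the value in place, each take! consumes one item from the channel.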
FlashWeave.sufficient_power
— Function
Can't be used for MiTestCond since levels_z requires contingency table