Base.names - Method
names(result::FWResult{T}) -> Vector{T}

Extract the IDs/names of all variables (nodes) in the network.

FlashWeave.graph - Method
graph(result::FWResult{T}) -> SimpleWeightedGraph{Int, Float64}

Extract the underlying weighted graph from network results.

FlashWeave.learn_network - Function
learn_network(all_data_paths::AbstractVector{<:AbstractString}, meta_data_path::AbstractString) -> FWResult{<:Integer}

Works like learn_network(data_path::AbstractString, meta_data_path::AbstractString), but takes paths to multiple data sets, i.e. independent sequencing experiments (e.g. 16S + ITS) for the same biological samples, which are normalized independently.

FlashWeave.learn_network - Function
learn_network(data_path::AbstractString, meta_data_path::AbstractString) -> FWResult{<:Integer}

Works like learn_network(data::AbstractArray{<:Real, 2}), but takes file paths to an OTU table and, optionally, a meta data table as input instead of a data matrix.

  • data_path - path to a file storing an OTU count matrix (and, in the case of JLD2, meta data)

  • meta_data_path - optional path to a file with meta data

  • *_key - HDF5 keys to access data sets with OTU counts, meta variables and variable names in a JLD2 file. If a data item is absent, the corresponding key should be 'nothing'. See '?load_data' for additional information.

  • verbose - print progress information

  • transposed - if true, rows of data are variables and columns are samples

  • kwargs... - additional keyword arguments passed to learn_network(data::AbstractArray{<:Real, 2})

FlashWeave.learn_network - Method
learn_network(data::AbstractArray{<:Real, 2}) -> FWResult{<:Integer}

Learn an interaction network from a data matrix (including OTUs and optionally meta variables).

  • data - data matrix with information on OTU counts and (optionally) meta variables

  • header - names of variable columns in data

  • meta_mask - true/false mask indicating which variables are meta variables

Algorithmic parameters

  • heterogeneous - enable heterogeneous mode for multi-habitat or -protocol data with at least thousands of samples (FlashWeaveHE)

  • sensitive - enable fine-grained association prediction (FlashWeave-S, FlashWeaveHE-S); sensitive=false selects the fast modes (FlashWeave-F, FlashWeaveHE-F)

  • max_k - maximum size of conditioning sets. High values can lead to the removal of more spurious edges, but may also strongly increase runtime and reduce statistical power. max_k=0 results in no conditioning (univariate mode)

  • alpha - statistical significance threshold at which individual edges are accepted

  • conv - convergence threshold; e.g. with conv=0.01, convergence is assumed if the number of edges increased by only 1% after 100% more runtime (checked in intervals)

  • feed_forward - enable feed-forward heuristic

  • fast_elim - enable fast-elimination heuristic

  • max_tests - maximum number of conditional tests that are performed on a variable pair before association is assumed

  • hps - reliability criterion for statistical tests when sensitive=false

  • FDR - perform False Discovery Rate correction (Benjamini-Hochberg method) on pairwise associations

  • n_obs_min - don't compute associations between variables with fewer reliable samples (non-zero samples if heterogeneous=true) than this number. -1: automatically choose a threshold.

  • time_limit - if feed-forward heuristic is active, determines the interval (seconds) at which neighborhood information is updated
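
To make the FDR and alpha parameters above more concrete, the following is a minimal sketch of the standard Benjamini-Hochberg adjustment in plain Julia. It illustrates the textbook procedure only and is not FlashWeave's internal code; the function name bh_adjust and the example p-values are hypothetical.

```julia
# Benjamini-Hochberg adjusted p-values (textbook procedure, illustrative only).
# `bh_adjust` is a hypothetical helper, not part of FlashWeave.
function bh_adjust(pvals::AbstractVector{<:Real})
    m = length(pvals)
    order = sortperm(pvals)               # indices of p-values in ascending order
    adjusted = Vector{Float64}(undef, m)
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone
    for rank in m:-1:1
        idx = order[rank]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    end
    return adjusted
end

pvals = [0.001, 0.008, 0.039, 0.041, 0.6]  # made-up pairwise association p-values
alpha = 0.05                               # significance threshold
adjusted = bh_adjust(pvals)
accepted = adjusted .<= alpha              # associations still significant after correction
```

In this toy case only the two smallest p-values stay below the threshold after correction, although four of the five raw p-values are below 0.05.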

General parameters

  • normalize - automatically choose and perform data normalization method (based on sensitive and heterogeneous)

  • track_rejections - store, for each discarded edge, which variable set led to its exclusion (can be memory-intensive for large networks)

  • verbose - print progress information

  • transposed - if true, rows of data are variables and columns are samples

  • prec - precision in bits to use for calculations (16, 32, 64 or 128)

  • make_sparse - use a sparse data representation (should be left at true in almost all cases)

  • make_onehot - create one-hot encodings for meta data variables with more than two categories (should be left at true in almost all cases)

  • update_interval - if verbose=true, determines the interval (seconds) at which network stat updates are printed

  • extra_data - tuples of the form (data, header) representing counts from additional sequencing experiments (e.g. 16S + ITS) for the same biological samples. These will be normalized independently.

  • share_data - if local parallel workers are detected, share input data (instead of copying)

  • experimental_kwargs - experimental keyword arguments that are directly passed to the underlying inference engine
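
The relationship between data, header, meta_mask and transposed can be sketched with a toy input. All values and variable names below are made up for illustration, and passing header and meta_mask as keyword arguments in the commented call is an assumption based on the parameter list above.

```julia
# Hypothetical toy input: 4 samples (rows), 2 OTUs and 1 binary meta variable (columns).
otu_counts = [12 0; 0 7; 5 5; 9 0]
ph_high = [1, 0, 0, 1]                      # made-up meta variable, one value per sample

data = hcat(otu_counts, ph_high)            # samples x variables
header = ["OTU_1", "OTU_2", "pH_high"]      # one name per column of `data`
meta_mask = [false, false, true]            # marks which columns are meta variables

# If variables were stored in rows instead, the same table would look like this
# and would need transposed=true.
data_t = permutedims(data)

# With FlashWeave loaded, these inputs could then be passed on, for example as
# (assumed keyword usage, not executed here):
# netw_results = learn_network(data; header=header, meta_mask=meta_mask)
```

Keeping header and meta_mask the same length as the number of variable columns is the main invariant to check before calling learn_network.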

FlashWeave.load_data - Function
load_data(data_path::AbstractString, meta_path::AbstractString) -> (AbstractArray{<:Real, 2}, Vector{String}, AbstractArray{<:Real, 2}, Vector{String})

Load matrices with OTU counts and, optionally, meta data from disk. Available formats are '.tsv', '.csv', '.biom' and '.jld2'.

  • data_path - path to a file storing an OTU count matrix

  • meta_path - optional path to a file with meta variable information

  • *_key - HDF5 keys to access data sets with OTU counts, meta variables and variable names in a JLD2 file. If a data item is absent, the corresponding key should be 'nothing'. See '?load_data' for additional information.

  • transposed - if true, rows of data are variables and columns are samples

FlashWeave.load_network - Method
load_network(net_path::AbstractString) -> FWResult{Int}

Load network results from disk. Available formats are '.edgelist', '.gml' and '.jld2'. For GML, only files with structure identical to save_network('network.gml') output can currently be loaded. FlashWeave parameters that were used for network inference are only available when loading from JLD2.

  • net_path - path from which to load the network results

FlashWeave.normalize_data - Method
normalize_data(data::AbstractArray{<:Real, 2}) -> (AbstractArray{<:Real, 2}, Vector{String}, Vector{Bool}, Vector{Bool})

Normalize data using various forms of centered-logratio transformation and discretization. This should only be used manually when experimenting with different normalization techniques.

  • data - data matrix with information on OTU counts and (optionally) meta variables

  • header - names of variable columns in data

  • meta_mask - true/false mask indicating which variables are meta variables

  • test_name - name of a FlashWeave-specific statistical test mode, the corresponding normalization method will be chosen automatically

  • norm_mode - identifier of a valid normalization mode ('clr-adapt', 'clr-nonzero', 'clr-nonzero-binned', 'pres-abs', 'tss', 'tss-nonzero-binned')

  • filter_data - whether to remove samples with no counts and variables with zero variation from data

  • verbose - print progress information

  • prec - precision in bits to use for calculations (16, 32, 64 or 128)

  • make_sparse, make_onehot - see docstring for "learn_network(data::AbstractArray{<:Real, 2})"
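
As background for the centered-logratio (clr) transformations mentioned above, here is a minimal sketch of a plain clr transform in Julia. It is not one of FlashWeave's normalization modes (such as 'clr-adapt' or 'clr-nonzero'); the helper name clr_rows and the fixed pseudo-count are assumptions chosen for illustration.

```julia
using Statistics

# Plain centered log-ratio transform of a samples x OTUs count matrix.
# `clr_rows` and `pseudo_count` are illustrative assumptions; the pseudo-count
# only serves to avoid log(0) for absent OTUs.
function clr_rows(counts::AbstractMatrix{<:Real}; pseudo_count::Real=1)
    logged = log.(counts .+ pseudo_count)
    return logged .- mean(logged, dims=2)   # subtract each sample's mean log value
end

counts = [10 0 90; 5 5 5]                   # made-up counts: 2 samples, 3 OTUs
transformed = clr_rows(counts)
```

After the transform, the values of each sample (row) sum to zero, and a sample with identical counts for all OTUs maps to all zeros.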

FlashWeave.parameters - Method
parameters(result::FWResult{T}) -> Dict{Symbol, Any}

Extract the used parameters from network results.

FlashWeave.save_network - Method
save_network(net_path::AbstractString, net_result::FWResult) -> Nothing

Save network results to disk. Available formats are '.edgelist', '.gml' and '.jld2'.

  • net_path - output path for the network

  • net_result - network results object that should be saved

  • detailed - save additional information, such as discarding sets, if available (output file suffixes: 'rejections', 'unchecked')
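
A possible end-to-end workflow that combines the functions documented above might look as follows. This is a sketch under assumptions: the file paths are hypothetical placeholders, FlashWeave must be installed, and graph and parameters are called with the module prefix so the example does not depend on whether they are exported.

```julia
using FlashWeave

# Hypothetical input and output paths; replace them with your own files.
data_path = "/my/example/data.tsv"
meta_data_path = "/my/example/meta_data.tsv"
net_path = "/my/example/network_output.jld2"

# Learn a network from an OTU table and a meta data table
netw_results = learn_network(data_path, meta_data_path, sensitive=true, heterogeneous=false)

# Save in JLD2 format, including additional information if available
save_network(net_path, netw_results, detailed=true)

# Reload the results and inspect them
reloaded = load_network(net_path)
G = FlashWeave.graph(reloaded)              # underlying weighted graph
ids = names(reloaded)                       # IDs/names of all variables (nodes)
params = FlashWeave.parameters(reloaded)    # parameters used for inference
```

JLD2 is used here because, as noted for load_network, the inference parameters are only available when loading from that format.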