add_percentage(s; ispct = false, digits = 2)

Add "%" to s.


  • s: numbers.
  • ispct: whether the original data are already percentages; if not, the values are multiplied by 100.
  • digits: rounds to the specified number of digits after the decimal place.
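
The behavior described above can be sketched in a few lines of plain Julia. This is only an illustration under stated assumptions: add_percentage_sketch is a hypothetical stand-in, not the package's implementation, and the exact string formatting of the real function may differ.

```julia
# Hypothetical stand-in for add_percentage; not the package's own code.
function add_percentage_sketch(s; ispct = false, digits = 2)
    # Scale ratios to percentages unless the data are already percentages.
    scaled = ispct ? s : s .* 100
    # Round each value, then append the percent sign.
    return string.(round.(scaled; digits = digits), "%")
end

add_percentage_sketch([0.5, 0.25])             # ratios, scaled by 100
add_percentage_sketch([50.0]; ispct = true)    # already percentages
```

Broadcasting keeps the sketch usable for both a single number and a vector of numbers.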
            id = r"Pre.*_(.*)_.*", 
            type = :accuracy, 
            pct = true, 
            colanalyte = :Analyte,
            colstats = :Stats,
            colday = :Day,
            collevel = :Level

Compute accuracy and precision. A NamedTuple is returned with two elements: daily is a DataFrame containing accuracy and standard deviation for each day, and summary is a DataFrame containing overall accuracy, repeatability and reproducibility.


  • at: AnalysisTable.
  • id: Regex identifier for the AP experiment samples. The day and concentration level are captured in the identifier; the order can be set by order.
  • order: a string for setting the order of captured values from id; D is day; L is concentration level.
  • type: data type for calculation.
  • pct: whether to convert ratio data into percentages (*100).
  • colanalyte: column name of analytes.
  • colstats: column name of statistics.
  • colday: column name of validation day.
  • collevel: column name of level.
            matrix = r"Post.*_(.*)_.*", 
            stds = r"STD.*_(.*)_.*", 
            type = :area, 
            pct = true, 
            colanalyte = :Analyte,
            colstats = :Stats,
            collevel = :Level

Compute matrix effects.


  • at: AnalysisTable.
  • matrix: Regex identifier for samples with matrix. The concentration level is captured in the identifier.
  • stds: Regex identifier for standard solution. The concentration level is captured in the identifier.
  • type: data type for calculation.
  • pct: whether to convert ratio data into percentages (*100).
  • colanalyte: column name of analytes.
  • colstats: column name of statistics.
  • collevel: column name of level.
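
As a rough illustration of the quantity involved, a matrix effect can be expressed as the response of samples with matrix relative to the response of standard solutions. The sketch below assumes a ratio of means; matrix_effect_sketch and the area values are hypothetical, and the package may aggregate differently.

```julia
using Statistics

# Hypothetical helper: ratio of the mean response with matrix to the mean
# response of the standard solution, optionally scaled to a percentage.
function matrix_effect_sketch(matrix_area, std_area; pct = true)
    ratio = mean(matrix_area) / mean(std_area)
    return pct ? ratio * 100 : ratio
end

# Made-up peak areas for one analyte at one concentration level.
matrix_effect_sketch([900.0, 1100.0], [1250.0, 1250.0])
```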
mean_plus_minus_std(m, s; digits = 2)

Round and merge mean values and standard deviations with "±".


  • m: mean values.
  • s: standard deviations.
  • digits: rounds to the specified number of digits after the decimal place.
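
A minimal sketch of the rounding and merging step, assuming the two values are joined with a space on each side of "±" (the real function's spacing is not specified here). mean_plus_minus_std_sketch is a hypothetical name used only for illustration.

```julia
# Hypothetical stand-in for mean_plus_minus_std; not the package's own code.
function mean_plus_minus_std_sketch(m, s; digits = 2)
    # Round means and standard deviations separately, then join them.
    return string.(round.(m; digits = digits), " ± ", round.(s; digits = digits))
end

mean_plus_minus_std_sketch([10.0, 20.5], [1.25, 0.5])
```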
normalize(df::DataFrame, normalizer::DataFrame; id = [:Analyte, :L], stats = (All(), "Accuracy"), colstats = :Stats)

Normalize DataFrame by the given normalizer.


  • normalizer: the DataFrame to normalize df.
  • df: the DataFrame to be normalized.
  • id: the column(s) (Symbol, string or integer) with a unique key for each row.
  • stats: a Tuple specifying the statistics involved in normalization. The first element applies to df, and the second applies to normalizer. All() includes all statistics.
  • colstats: column name of statistics.
pivot(df::DataFrame, col; rows = [], prefix = true, notsort = ["Stats", "File"], drop = [])
pivot(df::DataFrame, cols::AbstractVector; rows = [], prefix = true, notsort = ["Stats", "File"], drop = [])

Transform DataFrame into wide format.


  • df: target DataFrame.
  • col: the column (Symbol or String) holding the column names in wide format.
  • cols: the column(s) (Vector) holding the column names in wide format.

Keyword Arguments

  • rows: the column(s) (Symbol, String, or Vector) preserved as row keys in wide format.
  • prefix: whether to preserve col or cols in column names.
  • notsort: columns (Vector); do not sort by these columns.
  • drop: columns (Vector); drop these columns.
            id = r"PooledQC", 
            type = :estimated_concentration, 
            pct = true,
            stats = [mean, std, pct ? rsd_pct : rsd], 
            names = ["Mean", "Standard Deviation", "Relative Standard Deviation" * (pct ? "(%)" : "")], 
            colanalyte = :Analyte,
            colstats = :Stats

Compute statistics of QC data.


  • at: AnalysisTable.
  • id: Regex identifier for the QC samples.
  • pct: whether to convert ratio data into percentages (*100).
  • type: data type for calculation.
  • stats: statistics functions.
  • names: names of statistics. When nothing is given, stats serves as names.
  • colanalyte: column name of analytes.
  • colstats: column name of statistics.
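
The pairing of stats with names can be sketched as follows. The definitions of rsd_sketch and rsd_pct_sketch are assumptions (standard deviation divided by mean, and the same value times 100); they stand in for the rsd and rsd_pct functions named in the signature, whose actual definitions are not shown here.

```julia
using Statistics

# Assumed definitions, standing in for the package's rsd and rsd_pct.
rsd_sketch(x) = std(x) / mean(x)
rsd_pct_sketch(x) = rsd_sketch(x) * 100

# Made-up estimated concentrations of one analyte in pooled QC samples.
qc = [9.0, 10.0, 11.0]

stat_funs  = [mean, std, rsd_pct_sketch]
stat_names = ["Mean", "Standard Deviation", "Relative Standard Deviation(%)"]

# Apply each statistic to the QC data and label it with the matching name.
qc_summary = Dict(n => f(qc) for (n, f) in zip(stat_names, stat_funs))
```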
        lod = nothing, 
        loq = nothing, 
        lloq = nothing, 
        uloq = nothing, 
        lodsub = "<LOD", 
        loqsub = "<LOQ", 
        lloqsub = "<LLOQ", 
        uloqsub = ">ULOQ")

Replace data out of acceptable range.

  • lod: limit of detection; values are promoted to match columns whose name starts with "Data".
  • loq: limit of quantification; values are promoted to match columns whose name starts with "Data".
  • lloq: lower limit of quantification; values are promoted to match columns whose name starts with "Data".
  • uloq: upper limit of quantification; values are promoted to match columns whose name starts with "Data".
  • lodsub: substitution for values smaller than LOD.
  • loqsub: substitution for values smaller than LOQ.
  • lloqsub: substitution for values smaller than LLOQ.
  • uloqsub: substitution for values larger than ULOQ.
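
The substitution rule can be illustrated on a plain vector. This is a simplified sketch that handles only lloq and uloq; qualify_sketch is a hypothetical name, and the real function works on tables rather than vectors.

```julia
# Hypothetical, simplified stand-in: replace out-of-range values with labels.
function qualify_sketch(x; lloq = nothing, uloq = nothing,
                        lloqsub = "<LLOQ", uloqsub = ">ULOQ")
    return map(x) do v
        if lloq !== nothing && v < lloq
            lloqsub          # below the lower limit of quantification
        elseif uloq !== nothing && v > uloq
            uloqsub          # above the upper limit of quantification
        else
            v                # within range: keep the value
        end
    end
end

qualify_sketch([0.5, 5.0, 120.0]; lloq = 1.0, uloq = 100.0)
```

Limits left as nothing are skipped, so a call without limits returns the values unchanged.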
read(file, source = :mh)

Read data into AnalysisTable from various sources.

Currently, only data from Agilent MassHunter Quantitative analysis is implemented. The table needs to be flat. There must be a column whose name contains "Data File" as id for each file.

The returned AnalysisTable contains multiple SampleDataTable objects representing different data types, each with samplecol set to :File.

                pre = r"Pre.*_(.*)_.*", 
                post = r"Post.*_(.*)_.*", 
                type = :area, 
                pct = true, 
                colanalyte = :Analyte,
                colstats = :Stats,
                collevel = :Level

Compute recovery.


  • at: AnalysisTable.
  • pre: Regex identifier for prespiked samples. The concentration level is captured in the identifier.
  • post: Regex identifier for postspiked samples. The concentration level is captured in the identifier.
  • type: data type for calculation.
  • pct: whether to convert ratio data into percentages (*100).
  • colanalyte: column name of analytes.
  • colstats: column name of statistics.
  • collevel: column name of level.
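
A per-level recovery calculation might look like the sketch below, assuming recovery is the mean prespiked response divided by the mean postspiked response at the same level. recovery_sketch, the level labels, and the areas are hypothetical and only illustrate the idea.

```julia
using Statistics

# Hypothetical helper: recovery per concentration level as the ratio of
# mean prespiked to mean postspiked response.
function recovery_sketch(pre::Dict, post::Dict; pct = true)
    scale = pct ? 100 : 1
    return Dict(level => mean(pre[level]) / mean(post[level]) * scale
                for level in keys(pre) if haskey(post, level))
end

# Made-up peak areas keyed by the level captured from the file names.
pre_area  = Dict("L" => [42.0, 38.0], "H" => [450.0, 470.0])
post_area = Dict("L" => [50.0, 50.0], "H" => [500.0, 500.0])

recovery_result = recovery_sketch(pre_area, post_area)
```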
sample_report(at::AnalysisTable; id = r"Sample_(\d*).*", type = :estimated_concentration, colanalyte = :Analyte)

Compute mean of sample data.


  • at: AnalysisTable.
  • id: Regex identifier for the samples.
  • type: data type for calculation.
  • colanalyte: column name of analytes.
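
To show how the default id could group replicate files, the sketch below captures the sample number from each file name and averages the values that share it. sample_means_sketch and the file names are hypothetical; the real function operates on an AnalysisTable.

```julia
using Statistics

# Hypothetical helper: group values by the captured sample id and average them.
function sample_means_sketch(files, values; id = r"Sample_(\d*).*")
    groups = Dict{String, Vector{Float64}}()
    for (file, value) in zip(files, values)
        m = match(id, file)
        m === nothing && continue          # skip files that are not samples
        key = String(m.captures[1])        # captured sample number
        push!(get!(groups, key, Float64[]), value)
    end
    return Dict(key => mean(vals) for (key, vals) in groups)
end

files  = ["Sample_1_r1", "Sample_1_r2", "Sample_2_r1", "Blank_1"]
values = [1.0, 3.0, 5.0, 9.0]
sample_result = sample_means_sketch(files, values)
```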
selectby(df::DataFrame, col, col_pairs...; 
        pivot = false, 
        rows = [], 
        notsort = ["Stats", "File"], 
        prefix = true, 
        drop = [], 
        kwargs...)

Select values by col, and apply select! as if the values are columns.


  • df: target DataFrame.
  • col: column name.
  • col_pairs: DataFrames.jl syntax to manipulate columns. They will be put in internal select! function.

Keyword Arguments

  • rows: the column(s) (Symbol, String, or Vector) preserved as row keys.
  • pivot: whether to pivot the DataFrame by col.
  • notsort: columns (Vector); do not sort by these columns.
  • drop: columns (Vector); drop these columns.
  • kwargs: keyword arguments for internal select! function.
                    day0 = r"S.*_(.*)_.*", 
                    stored = r"S.*_(.*)_(.*)_(.*)_.*", 
                    order = "CDL", 
                    type = :accuracy, 
                    pct = true,                             
                    colanalyte = :Analyte,
                    colstats = :Stats,
                    colcondition = :Condition,
                    colday = :Day,
                    collevel = :Level,
                    isaccuracy = true

Compute stability. A NamedTuple is returned with three elements: day0 is a DataFrame containing day0 samples, stored is a DataFrame containing stored samples, and stored_over_day0 is stored divided by day0.

If day0 is not available, both day0 and stored_over_day0 are nothing.


  • at: AnalysisTable.
  • day0: Regex identifier for day0 samples. The concentration level is captured in the identifier.
  • stored: Regex identifier for stored samples. The storage condition, concentration level, and storage days are captured in the identifier; the order can be set by order.
  • order: a string for setting the order of captured values from stored; C is storage condition; D is storage days; L is concentration level.
  • type: data type for calculation.
  • pct: whether to convert ratio data into percentages (*100).
  • colanalyte: column name of analytes.
  • colstats: column name of statistics.
  • colcondition: column name of storage condition.
  • colday: column name of storage days.
  • collevel: column name of level.
  • isaccuracy: whether the input data is accuracy.
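
The relationship between stored and order can be sketched with Julia's built-in match: each letter of order labels the capture group at the same position. parse_stored_name and the file name "S_RT_7_H_1" are hypothetical and only illustrate the naming scheme described above.

```julia
# Hypothetical helper: label the captures of `stored` with the letters of `order`.
function parse_stored_name(name; stored = r"S.*_(.*)_(.*)_(.*)_.*", order = "CDL")
    m = match(stored, name)
    m === nothing && return nothing
    # Pair each letter of `order` with the capture at the same position.
    return Dict(letter => String(capture)
                for (letter, capture) in zip(order, m.captures))
end

# Made-up file name: condition "RT", 7 storage days, level "H".
parsed = parse_stored_name("S_RT_7_H_1")
```

Changing order to, say, "LDC" would relabel the same three captures without touching the regex.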
unpivot(df::DataFrame, col; rows = [], notsort = ["Stats", "File"], drop = [])
unpivot(df::DataFrame, cols::AbstractVector; rows = [], notsort = ["Stats", "File"], drop = [])

Transform DataFrame into long format.


  • df: target DataFrame.
  • col: the column name (Symbol or String) in long format.
  • cols: the column(s) (Vector) in long format.

Keyword Arguments

  • rows: the column(s) (Symbol, String, or Vector) preserved as row keys in long format.
  • notsort: columns (Vector); do not sort by these columns.
  • drop: columns (Vector); drop these columns.