Filtering data

Bigleaf.setinvalid_qualityflag! - Function
setinvalid_qualityflag!(df; 
  vars=["LE","H","NEE","Tair","VPD","wind"],
  qc_suffix="_qc",
  good_quality_threshold = 1.0,
  missing_qc_as_bad = true,
  setvalmissing = true, 
)

Set records whose quality flag indicates a problem to false in the :valid column.

Arguments

  • df: DataFrame with the columns named in vars and their corresponding quality-flag columns

optional

  • vars=["LE","H","NEE","Tair","VPD","wind"]: columns to check for quality
  • qc_suffix="_qc": naming of the corresponding quality-flag column
  • good_quality_threshold = 1.0: threshold in quality flag up to which data is considered good quality
  • missing_qc_as_bad = true: set to false to not mark records with missing quality flag as invalid
  • setvalmissing = true: set to false to keep values whose quality flag is problematic instead of replacing them with missing.

Value

df with modified :valid and value columns.

Example

using DataFrames, Bigleaf
df = DataFrame(
  NEE = 1:3, NEE_qc = [1,1,2],
  GPP = 10:10:30, GPP_qc = [1,missing,1])
setinvalid_qualityflag!(df; vars = ["NEE", "GPP"])
df.valid == [true, false, false]
ismissing(df.GPP[2]) && ismissing(df.NEE[3])
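
The flagging rule described above can be sketched on plain vectors, without DataFrames. This is a minimal illustration under stated assumptions, not Bigleaf's implementation: qc_valid is a hypothetical helper, and the sketch assumes a record stays valid only if every checked variable passes its quality check, which is consistent with the example result.

```julia
# Hypothetical helper, not part of Bigleaf: quality check for one QC column.
function qc_valid(qc; good_quality_threshold = 1.0, missing_qc_as_bad = true)
    map(qc) do q
        # A missing flag is bad unless missing_qc_as_bad is switched off.
        ismissing(q) ? !missing_qc_as_bad : q <= good_quality_threshold
    end
end

NEE_qc = [1, 1, 2]
GPP_qc = [1, missing, 1]

# Assumption: a record is valid only if all checked variables pass.
valid = qc_valid(NEE_qc) .& qc_valid(GPP_qc)
```

With these inputs, the second record fails because of the missing GPP flag and the third because the NEE flag exceeds the threshold.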

Bigleaf.setinvalid_range! - Function
setinvalid_range!(df, var_ranges...; setvalmissing = true, ...)

Set records with values outside specified ranges to false in :valid column.

If there is no limit towards small or large values, supply -Inf or Inf as the minimum or maximum, respectively. Records that were already false in the :valid column stay false. In addition, values outside the ranges are set to missing.

Arguments

  • df: DataFrame with the columns named in var_ranges
  • var_ranges: one or more Pairs Varname_symbol => (min,max), each giving the closed valid interval for the respective column

optional

  • setvalmissing: set to false to keep values outside the ranges instead of replacing them with missing.

Value

df with modified :valid and value columns.

Example

using DataFrames, Bigleaf
df = DataFrame(NEE = [missing, 2,3], GPP = 10:10:30)
setinvalid_range!(df, :NEE => (-2.0,4.0), :GPP => (8.0,28.0))
df.valid == [false, true, false]
ismissing(df.GPP[3])
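
The closed-interval check can be sketched on plain vectors as follows. This is an illustration only: in_range is a hypothetical helper, not a Bigleaf function, and the sketch assumes that missing values count as invalid, as the example above suggests.

```julia
# Hypothetical helper, not part of Bigleaf: closed-interval check for one column.
# Missing values are treated as invalid (assumption based on the example above).
in_range(x, lo, hi) = map(v -> !ismissing(v) && lo <= v <= hi, x)

NEE = [missing, 2, 3]
GPP = 10:10:30

# A record is valid only if every checked column lies within its interval.
valid = in_range(NEE, -2.0, 4.0) .& in_range(GPP, 8.0, 28.0)

# No upper limit: pass Inf as the maximum.
valid_open = in_range(GPP, 8.0, Inf)
```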

Bigleaf.setinvalid_nongrowingseason! - Function
setinvalid_nongrowingseason!(df, tGPP; kwargs...)

Set records outside the growing season to false in the :valid column.

Arguments

  • df: DataFrame with columns :GPP and :datetime
  • tGPP: scalar threshold of daily GPP (see get_growingseason)

optional:

Value

df with modified column :valid, and optionally :GPPd_smoothed, where all non-growing-season records are set to false.


Bigleaf.get_growingseason - Function
get_growingseason(GPPd, tGPP; ws=15, min_int=5, warngap=true)

Filters annual time series for growing season based on smoothed daily GPP data.

Arguments

  • GPPd: daily GPP (any unit)
  • tGPP: GPP threshold (fraction of 95th percentile of the GPP time series). Takes values between 0 and 1.

optional

  • ws: window size used for GPP time series smoothing
  • min_int: minimum time interval in days for a given state of growing season
  • warngap: set to false to suppress warning on too few non-missing data

Details

The basic idea behind the growing season filter is that vegetation is considered to be active when its carbon uptake (GPP) is above a specified threshold, which is defined relative to the peak GPP (95th percentile) observed in the year. The GPP-threshold is calculated as:

$GPP_{threshold} = \mathrm{quantile}(GPPd, 0.95) \cdot tGPP$
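
As a small worked illustration of this formula, the threshold can be computed with the Statistics standard library. The data are made up, and skipping missing values before taking the quantile is an assumption of this sketch, not a statement about how Bigleaf treats gaps.

```julia
using Statistics

# Made-up daily GPP values with one gap.
GPPd = [0.5, 1.0, missing, 4.0, 8.0, 10.0, 9.0, 3.0, 1.0, 0.2]
tGPP = 0.4   # fraction of the 95th percentile

# Assumption: missing values are skipped when computing the quantile.
gpp_threshold = quantile(skipmissing(GPPd), 0.95) * tGPP
```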

GPPd time series are smoothed with a moving average to avoid fluctuations in the delineation of the growing season. The window size defaults to 15 days, but depending on the ecosystem, other values can be appropriate.
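
The smoothing step can be sketched as a centered moving average. The helper moving_average is hypothetical, and truncating the window at the series boundaries is an assumption; Bigleaf's own edge handling may differ.

```julia
using Statistics

# Hypothetical helper, not Bigleaf's implementation: centered moving average
# whose window is truncated at the start and end of the series.
function moving_average(x, ws)
    half = ws ÷ 2
    [mean(x[max(1, i - half):min(length(x), i + half)]) for i in eachindex(x)]
end

GPPd = [1.0, 2.0, 9.0, 2.0, 1.0]
smoothed = moving_average(GPPd, 3)
```

A larger ws flattens short peaks more strongly, which makes the growing-season boundaries less sensitive to single days.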

The argument min_int avoids short fluctuations between the states growing season and non-growing season by defining a minimum length for each state. If a time interval shorter than min_int is labeled as growing season or non-growing season, it is changed to the status of the neighboring values, i.e. its opposite.
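
The effect of min_int can be sketched with a run-based filter. The helper fill_short_runs is hypothetical and only illustrates the idea: the order in which short runs are processed (here, from the start of the series) is an assumption and may differ from Bigleaf's implementation.

```julia
# Hypothetical helper, not Bigleaf's implementation: flip runs shorter than min_int.
function fill_short_runs(state::AbstractVector{Bool}, min_int::Integer)
    s = Vector{Bool}(state)
    while true
        # Start index of every run of equal values.
        starts = [i for i in eachindex(s) if i == 1 || s[i] != s[i-1]]
        length(starts) <= 1 && return s   # a single run has no neighbors to adopt
        stops = [starts[2:end] .- 1; length(s)]
        # First run that is shorter than min_int, if any.
        k = findfirst(j -> stops[j] - starts[j] + 1 < min_int, eachindex(starts))
        k === nothing && return s
        # Flipping a Boolean run gives it the status of its neighbors.
        s[starts[k]:stops[k]] .= .!s[starts[k]:stops[k]]
    end
end

# 6 dormant days, a 2-day spike, 5 dormant days, then 7 growing-season days.
state = [falses(6); trues(2); falses(5); trues(7)]
filtered = fill_short_runs(state, 5)
```

Here the 2-day spike is shorter than min_int = 5 and is relabeled as dormant, leaving 13 dormant days followed by 7 growing-season days.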

Value

A NamedTuple with entries

  • is_growingseason: a BitVector of the same length as the input GPPd, in which false indicates no growing season (dormant season) and true indicates growing season.
  • GPPd_smoothed: smoothed GPPd

Bigleaf.setinvalid_afterprecip! - Function
setinvalid_afterprecip!(df; min_precip = 0.01, hours_after = 24.0)

Set records after precipitation to false in :valid column.

Arguments

  • df: DataFrame with columns :datetime and :precip sorted by :datetime in increasing order.

optional:

  • min_precip (in mm per timestep): minimum precip to be considered effective precipitation.
  • hours_after: time in hours after a precipitation event during which records are considered invalid

Value

df with modified column :valid.
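
The filter can be sketched on plain vectors using the Dates standard library. The helper valid_after_precip is hypothetical, not a Bigleaf function. The sketch assumes that the threshold is inclusive, that records with missing precipitation are ignored, and that the record of the event itself is also marked invalid; Bigleaf may handle these cases differently.

```julia
using Dates

# Hypothetical helper, not Bigleaf's implementation.
function valid_after_precip(datetime, precip; min_precip = 0.01, hours_after = 24.0)
    valid = trues(length(datetime))
    window = Millisecond(round(Int, hours_after * 3_600_000))
    for (t_event, p) in zip(datetime, precip)
        # Assumption: missing precip is ignored and the threshold is inclusive.
        (ismissing(p) || p < min_precip) && continue
        # Assumption: the event record itself is also marked invalid.
        valid .&= .!(t_event .<= datetime .<= t_event + window)
    end
    valid
end

# Six-hourly records over two days with one rain event at 12:00 on the first day.
datetime = collect(DateTime(2021, 1, 1, 0):Hour(6):DateTime(2021, 1, 3, 0))
precip = [0.0, 0.0, 1.2, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
valid = valid_after_precip(datetime, precip)
```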