CommonDataModel.AbstractDataset
— TypeAbstractDataset
is a collection of multidimensional variables (for example a NetCDF or GRIB file)
A data set ds
of a type derived from AbstractDataset
should implemented at minimum:
Base.key(ds)
: return a list of variable names as stringsvariable(ds,varname::String)
: return an array-like data structure (derived fromAbstractVariable
) of the variables corresponding tovarname
. This array-like data structure should follow the CF semantics.dimnames(ds)
: should be an iterable with all dimension names in the data setds
dim(ds,name)
: dimension value corresponding to name
Optionally a data set can have attributes and groups:
attribnames(ds)
: should be an iterable with all attribute namesattrib(ds,name)
: attribute value corresponding to namegroupnames(ds)
: should be an iterable with all group namesgroup(ds,name)
: group corresponding to the name
For a writable dataset, one should also implement:
defDim
: define a dimensiondefAttrib
: define a attributedefVar
: define a variabledefGroup
: define a group
CommonDataModel.AbstractVariable
— TypeAbstractVariable{T,N}
is a subclass of AbstractArray{T, N}
. A variable v
of a type derived from AbstractVariable
should implement:
name(v)
: should be the name of variable within the data setdimnames(v)
: should be a iterable data structure with all dimension names of the variablev
dataset(v)
: the parent dataset containingv
Base.size(v)
: the size of the variableBase.getindex(v,indices...)
: get the data ofv
at the provided indices
Optionally a variable can have attributes:
attribnames(v)
: should be an iterable with all attribute namesattrib(v,name)
: attribute value corresponding to name
For a writable dataset, one should also implement:
defAttrib
: define a attributeBase.setindex!(v,data,indices...)
: set the data inv
at the provided indices
CommonDataModel.Attributes
— TypeA collection of attributes with a Dict-like interface dispatching to attribnames
, attrib
, defAttrib
for keys
, getindex
and setindex!
respectively.
CommonDataModel.CFVariable
— TypeVariable (with applied transformations following the CF convention) attrib can have different attributes as the parent variables (used in GRIBDatasets to map from grib attributes to CF attributes)
CommonDataModel.Dimensions
— TypeA collection of dimensions with a Dict-like interface dispatching to dimnames
, dim
, defDim
for keys
, getindex
and setindex!
respectively.
CommonDataModel.Groups
— TypeA collection of groups with a Dict-like interface dispatching to groupnames
and group
for keys
and getindex
respectively.
Base.collect
— Methodcollect always returns an array. Even if the result of the indexing is a scalar, it is wrapped into a zero-dimensional array.
Base.delete!
— MethodBase.delete!(a::Attributes, name)
Delete the attribute name
from the attribute list a
.
Base.filter
— Methoddata = CommonDataModel.filter(ncv, indices...; accepted_status_flags = nothing)
Load and filter observations by replacing all variables without an acepted status flag to missing
. It is used the attribute ancillary_variables
to identify the status flag.
# da["data"] is 2D matrix
good_data = NCDatasets.filter(ds["data"],:,:, accepted_status_flags = ["good_data","probably_good_data"])
Base.getindex
— Methodv = getindex(ds::AbstractDataset, varname::SymbolOrString)
Return the variable varname
in the dataset ds
as a CFVariable
. The following CF convention are honored when the variable is indexed:
_FillValue
ormissing_value
(which can be a list) will be returned asmissing
.scale_factor
andadd_offset
are applied (output =scale_factor
*data_in_file
+add_offset
)- time variables (recognized by the units attribute and possibly the calendar attribute) are returned usually as
DateTime
object. Note thatCFTime.DateTimeAllLeap
,CFTime.DateTimeNoLeap
andCF.TimeDateTime360Day
cannot be converted to the proleptic gregorian calendar used in julia and are returned as such. (SeeCFTime.jl
for more information about those date types.) If a calendar is defined but not among the ones specified in the CF convention, then the data in the file is not converted into a date structure.
A call getindex(ds, varname)
is usually written as ds[varname]
.
If variable represents a cell boundary, the attributes calendar
and units
of the related variables are used, if they are not specified. For example:
dimensions:
time = UNLIMITED; // (5 currently)
nv = 2;
variables:
double time(time);
time:long_name = "time";
time:units = "hours since 1998-04-019 06:00:00";
time:bounds = "time_bnds";
double time_bnds(time,nv);
In this case, the variable time_bnds
uses the units and calendar of time
because both variables are related thought the bounds attribute following the CF conventions.
See also cfvariable(ds, varname)
.
Base.getindex
— Methodgetindex(a::Attributes,name::SymbolOrString)
Return the value of the attribute called name
from the attribute list a
. Generally the attributes are loaded by indexing, for example:
using NCDatasets
ds = NCDataset("file.nc")
title = ds.attrib["title"]
Base.getindex
— Methodgroup = getindex(g::Groups,groupname::AbstractString)
Return the NetCDF group
with the name groupname
from the parent group g
.
For example:
ds = NCDataset("results.nc", "r");
forecast_group = ds.group["forecast"]
forecast_temp = forecast_group["temperature"]
Base.getindex
— Methodvar = getindex(ds::Union{AbstractDataset,AbstractVariable},cfname::CFStdName)
Return the NetCDF variable var
with the standard name cfname
from a dataset. If the first argument is a variable, then the search is limited to all variables with the same dimension names.
Base.keys
— MethodBase.keys(a::Attributes)
Return a list of the names of all attributes.
Base.keys
— Methodkeys(d::Dimensions)
Return a list of all dimension names in NCDataset ds
.
Examples
ds = NCDataset("results.nc", "r");
dimnames = keys(ds.dim)
Base.keys
— Methodnames = keys(g::Groups)
Return the names of all subgroubs of the group g
.
Base.reduce
— Methodgr = reduce(f,gv::GroupedVariable)
Reduce the grouped variable gv
along grouped dimension using the function f
. The function f
will be called as f(x,dims=d)
where x
array (an element of gv
) and d
is an integer of the dimension overwhich one need to reduce x
.
Base.setindex!
— MethodBase.setindex!(a::Attributes,data,name::SymbolOrString)
Set the attribute called name
to the value data
in the attribute list a
. data
can be a vector or a scalar. A scalar is handeld as a vector with one element in the NetCDF data model.
Generally the attributes are defined by indexing, for example:
ds = NCDataset("file.nc","c")
ds.attrib["title"] = "my title"
close(ds)
Base.setindex!
— Methodsetindex!(d::Dimensions,len,name::AbstractString)
Defines the dimension called name
to the length len
, for example:
ds = NCDataset("file.nc","c")
ds.dim["longitude"] = 100
If len
is the special value Inf
, then the dimension is considered as unlimited
, i.e. it will grow as data is added to the NetCDF file.
Base.size
— Methodsz = size(var::CFVariable)
Return a tuple of integers with the size of the variable var
.
Note that the size of a variable can change, i.e. for a variable with an unlimited dimension.
Base.view
— Methodsv = view(v::CommonDataModel.AbstractVariable,indices...)
Returns a view of the variable v
where indices are only lazily applied. No data is actually copied or loaded. Modifications to a view sv
, also modifies the underlying array v
. All attributes of v
are also present in sv
.
Examples
using NCDatasets
fname = tempname()
data = zeros(Int,10,11)
ds = NCDataset(fname,"c")
ncdata = defVar(ds,"temp",data,("lon","lat"))
ncdata_view = view(ncdata,2:3,2:4)
size(ncdata_view)
# output (2,3)
ncdata_view[1,1] = 1
ncdata[2,2]
# outputs 1 as ncdata is also modified
close(ds)
Base.write
— Methodwrite(dest::AbstractDataset, src::AbstractDataset; include = keys(src), exclude = [])
Write the variables of src
dataset into an empty dest
dataset (which must be opened in mode "a"
or "c"
). The keywords include
and exclude
configure which variable of src
should be included (by default all), or which should be excluded
(by default none).
If the first argument is a file name, then the dataset is open in create mode ("c"
).
This function is useful when you want to save the dataset from a multi-file dataset.
To save a subset, one can use the view function view
to virtually slice a dataset:
Example
NCDataset(fname_src) do ds
write(fname_slice,view(ds, lon = 2:3))
end
All variables in the source file fname_src
with a dimension lon
will be sliced along the indices 2:3
for the lon
dimension. All attributes (and variables without a dimension lon
) will be copied over unmodified.
CommonDataModel.ancillaryvariables
— Methodncvar = CommonDataModel.ancillaryvariables(ncv::CFVariable,modifier)
Return the first ancillary variables from the NetCDF (or other format) variable ncv
with the standard name modifier modifier
. It can be used for example to access related variable like status flags.
CommonDataModel.attrib
— MethodCommonDatamodel.attrib(ds::Union{AbstractDataset,AbstractVariable},attribname::SymbolOrString)
Return the value of the attribute attribname
in the data set ds
.
CommonDataModel.attribnames
— MethodCommonDatamodel.attribnames(ds::Union{AbstractDataset,AbstractVariable})
Return an iterable of all attribute names in ds
.
CommonDataModel.bounds
— Methodb = bounds(ncvar::NCDatasets.CFVariable)
Return the CFVariable corresponding to the bounds
attribute of the variable ncvar
. The time units and calendar from the ncvar
are used but not the attributes controling the packing of data scale_factor
, add_offset
and _FillValue
.
CommonDataModel.cfvariable
— Methodv = cfvariable(ds::NCDataset,varname::SymbolOrString; <attrib> = <value>)
Return the variable varname
in the dataset ds
as a NCDataset.CFVariable
. The keyword argument <attrib>
are the attributes (fillvalue
, missing_value
, scale_factor
, add_offset
, units
and calendar
) relevant to the CF conventions. By specifing the value of these attributes, the one can override the value specified in the data set. If the attribute is set to nothing
, then the attribute is not loaded and the corresponding transformation is ignored. This function is similar to ds[varname]
with the additional flexibility that some variable attributes can be overridden.
Example:
NCDataset("foo.nc","c") do ds
defVar(ds,"data",[10., 11., 12., 13.], ("time",), attrib = Dict(
"add_offset" => 10.,
"scale_factor" => 0.2))
end
# The stored (packed) valued are [0., 5., 10., 15.]
# since 0.2 .* [0., 5., 10., 15.] .+ 10 is [10., 11., 12., 13.]
ds = NCDataset("foo.nc");
@show ds["data"].var[:]
# returns [0., 5., 10., 15.]
@show cfvariable(ds,"data")[:]
# returns [10., 11., 12., 13.]
# neither add_offset nor scale_factor are applied
@show cfvariable(ds,"data", add_offset = nothing, scale_factor = nothing)[:]
# returns [0, 5, 10, 15]
# add_offset is applied but not scale_factor
@show cfvariable(ds,"data", scale_factor = nothing)[:]
# returns [10, 15, 20, 25]
# 0 is declared as the fill value (add_offset and scale_factor are applied as usual)
@show cfvariable(ds,"data", fillvalue = 0)[:]
# returns [missing, 11., 12., 13.]
# Use the time units: days since 2000-01-01
@show cfvariable(ds,"data", units = "days since 2000-01-01")[:]
# returns [DateTime(2000,1,11), DateTime(2000,1,12), DateTime(2000,1,13), DateTime(2000,1,14)]
close(ds)
CommonDataModel.chunking
— Methodstorage,chunksizes = chunking(v::MFVariable)
storage,chunksizes = chunking(v::MFCFVariable)
Return the storage type (:contiguous
or :chunked
) and the chunk sizes of the varable v
corresponding to the first file. If the first file in the collection is chunked then this storage attributes are returned. If not the first file is not contiguous, then multi-file variable is still reported as chunked with chunk size equal to the size of the first variable.
CommonDataModel.coord
— Methodcv = coord(v::Union{CFVariable,Variable},standard_name)
Find the coordinate of the variable v
by the standard name standard_name
or some standardized heuristics based on units. If the heuristics fail to detect the coordinate, consider to modify the file to add the standard_name
attribute. All dimensions of the coordinate must also be dimensions of the variable v
.
Example
using NCDatasets
ds = NCDataset("file.nc")
ncv = ds["SST"]
lon = coord(ncv,"longitude")[:]
lat = coord(ncv,"latitude")[:]
v = ncv[:]
close(ds)
CommonDataModel.dataset
— Methodds = CommonDataModel.dataset(v::AbstractVariable)
Return the data set ds
to which a the variable v
belongs to.
CommonDataModel.defAttrib
— MethodCommonDatamodel.defAttrib(ds::Union{AbstractDataset,AbstractVariable},name::SymbolOrString,data)
Create an attribute with the name attrib
in the data set or variable ds
.
CommonDataModel.defDim
— MethodCommonDatamodel.defDim(ds::AbstractDataset,name::SymbolOrString,len)
Create dimension with the name name
in the data set ds
with the length len
. len
can be Inf
for unlimited dimensions.
CommonDataModel.defGroup
— Methodgroup = CommonDatamodel.defGroup(ds::AbstractDataset,name::SymbolOrString)
Create an empty sub-group with the name name
in the data set ds
. The group
is a sub-type of AbstractDataset
.
CommonDataModel.defVar
— Methodv = CommonDataModel.defVar(ds::AbstractDataset,src::AbstractVariable)
v = CommonDataModel.defVar(ds::AbstractDataset,name::SymbolOrString,src::AbstractVariable)
Defines and return the variable in the data set ds
copied from the variable src
. The dimension name, attributes and data are copied from src
as well as the variable name (unless provide by name
).
CommonDataModel.delAttrib
— MethodCommonDatamodel.delAttrib(ds::Union{AbstractDataset,AbstractVariable},name::SymbolOrString,data)
Deletes an attribute with the name attrib
in the data set or variable ds
.
CommonDataModel.dim
— MethodCommonDatamodel.dim(ds::AbstractDataset,dimname::SymbolOrString)
Return the length of the dimension dimname
in the data set ds
.
CommonDataModel.dimnames
— MethodCommonDataModel.dimnames(v::AbstractVariable)
Return an iterable of the dimension names of the variable v
.
CommonDataModel.dimnames
— MethodCommonDatamodel.dimnames(ds::AbstractDataset)
Return an iterable of all dimension names in ds
. This information can also be accessed using the property ds.dim
:
Examples
ds = NCDataset("results.nc", "r");
dimnames = keys(ds.dim)
CommonDataModel.dimnames
— Methoddimnames(v::CFVariable)
Return a tuple of strings with the dimension names of the variable v
.
CommonDataModel.dims
— MethodCommonDatamodel.dims(ds::Union{AbstractDataset,AbstractVariable})
Return a dict-like of all dimensions and their corresponding length defined in the the data set ds
(or variable).
CommonDataModel.fillvalue
— Methodfillvalue(::Type{Int8})
fillvalue(::Type{UInt8})
fillvalue(::Type{Int16})
fillvalue(::Type{UInt16})
fillvalue(::Type{Int32})
fillvalue(::Type{UInt32})
fillvalue(::Type{Int64})
fillvalue(::Type{UInt64})
fillvalue(::Type{Float32})
fillvalue(::Type{Float64})
fillvalue(::Type{Char})
fillvalue(::Type{String})
Default fill-value for the given type from NetCDF.
CommonDataModel.find_gv
— MethodA = find_gv(T,As)
returns the first type T among the arguments.
CommonDataModel.group
— MethodCommonDatamodel.group(ds::AbstractDataset,groupname::SymbolOrString)
Return the sub-group data set with the name groupname
.
CommonDataModel.groupby
— Methodgv = CommonDataModel.groupby(v::AbstractVariable,:coordname => group_fun)
gv = CommonDataModel.groupby(v::AbstractVariable,"coordname" => group_fun)
Create a grouped variable gv
whose elements composed by all elements in v
whose corresponding coordinate variable (with the name coordname
) map to the same value once the group function group_fun
is applied to the coordinate.
The grouped variable gv
and be reduced using the functions sum
mean
, median
, var
or std
, for example gr = mean(gv)
. The result gr
is a lazy structure representing the outcome of these operations performed over the grouped dimension. Only when the result gr
is indexed the actually values are computed.
Broadcasting for gv
and gr
is overloaded. Broadcasting over all elements of gv
means that a mapping function is to be applied to all elements of gv
before a possible the reduction. Broadcasting over gr
, for example v .- gr
mean that gr
is broadcasted over the full size of v
according to the grouping function.
Example:
using NCDatasets, Dates
using CommonDataModel: @groupby, groupby
# create same test data
time = DateTime(2000,1,1):Day(1):DateTime(2009,12,31); # 10 years
data = rand(Float32.(-9:99),360,180,length(time));
fname = "test_file.nc"
ds = NCDataset(fname,"c");
defVar(ds,"time",time,("time",));
defVar(ds,"data",data,("lon","lat","time"));
# group over month
gv = @groupby(ds["data"],Dates.Month(time))
# or
# gv = groupby(ds["data"],:time => Dates.Month)
length(gv)
# output 12 as they are all 12 months in this dataset
size(gv[1])
# 360 x 180 x 310 array with all time slices corresponding to the 1st month
# the variable `gv` is equivalent to the following operation
time_month = Dates.Month.(ds[:time][:])
gv2 = [ds[:data][:,:,findall(time_month .== m)] for m in sort(unique(time_month))];
# compute basic statistics
using Statistics
monthly_mean = mean(gv);
size(monthly_mean)
# 360 x 180 x 12 array with the monthly mean
# get a regular julia array
monthly_mean_array = monthly_mean[:,:,:];
typeof(monthly_mean_array)
# Array{Float32, 3}
# substact from data the corresponding monthly mean
monthly_anomalies = data .- mean(gv);
close(ds)
CommonDataModel.groupnames
— MethodCommonDatamodel.groupnames(ds::AbstractDataset)
All the subgroup names of the data set ds
. For a data set containing only a single group, this will be an empty vector of String
.
CommonDataModel.groups
— MethodCommonDatamodel.groups(ds::AbstractDataset)
Return all sub-group data as a dict-like object.
CommonDataModel.load!
— MethodCommonDataModel.load!(ncvar::CFVariable, data, buffer, indices)
Loads a NetCDF (or other format) variables ncvar
in-place and puts the result in data
(an array of eltype(ncvar)
) along the specified indices
. buffer
is a temporary array of the same size as data but the type should be eltype(ncv.var)
, i.e. the corresponding type in the files (before applying scale_factor
, add_offset
and masking fill values). Scaling and masking will be applied to the array data
.
data
and buffer
can be the same array if eltype(ncvar) == eltype(ncvar.var)
.
Example:
# create some test array
Dataset("file.nc","c") do ds
defDim(ds,"time",3)
ncvar = defVar(ds,"vgos",Int16,("time",),attrib = ["scale_factor" => 0.1])
ncvar[:] = [1.1, 1.2, 1.3]
# store 11, 12 and 13 as scale_factor is 0.1
end
ds = Dataset("file.nc")
ncv = ds["vgos"];
# data and buffer must have the right shape and type
data = zeros(eltype(ncv),size(ncv)); # here Vector{Float64}
buffer = zeros(eltype(ncv.var),size(ncv)); # here Vector{Int16}
NCDatasets.load!(ncv,data,buffer,:,:,:)
close(ds)
CommonDataModel.name
— MethodCommonDatamodel.name(ds::AbstractDataset)
Name of the group of the data set ds
. For a data set containing only a single group, this will be always the root group "/"
.
CommonDataModel.name
— MethodCommonDataModel.name(v::AbstractVariable)
Return the name of the variable v
as a string.
CommonDataModel.parentdataset
— Methodpds = CommonDatamodel.parentdataset(ds::AbstractDataset)
The data set pds
containing ds
as a sub-group. pds
is nothing for the root group.
CommonDataModel.path
— MethodCommonDatamodel.path(ds::AbstractDataset)
File path of the data set ds
.
CommonDataModel.rolling
— Methodgv = CommonDataModel.rolling(v::AbstractVariable,:coordname => n)
gv = CommonDataModel.rolling(v::AbstractVariable,:coordname => rolling_classes)
Create a grouped variable gv
whose elements composed by all elements in v
grouped by a rolling window of length n
along the coordinate variable coordname
. One can also specify a vector classes (rolling_classes
) with as many elements as they are groups and the elements coorespond to the values of the coordinate. Unlike CommonDataModel.groupby
, the groups defined by rolling
can overlap.
The grouped variable gv
and be reduced using the functions sum
mean
, median
, var
or std
, for example gr = mean(gv)
. The result gr
is a lazy structure representing the outcome of these operations performed over the grouped dimension. Only when the result gr
is indexed the actually values are computed.
Broadcasting for gv
is overloaded. Broadcasting over all elements of gv
means that a mapping function is to be applied to all elements of gv
before a possible the reduction.
This operations is also called "running mean" when using mean
as reduction function.
Example:
using NCDatasets, Dates
using CommonDataModel: rolling
# create same test data
time = DateTime(2000,1,1):Day(1):DateTime(2009,12,31); # 10 years
data = rand(Float32.(-9:99),360,180,length(time));
fname = "test_file.nc"
ds = NCDataset(fname,"c");
defVar(ds,"time",time,("time",));
defVar(ds,"data",data,("lon","lat","time"));
# running weekly mean
gv = rolling(ds["data"],:time => 7)
length(gv)
# output 3653 as a mean will be compute for all time instance (including a
# partial mean and the beginning and the end)
size(gv[1])
# output 360 x 180 x 4: the first time slice will only be a mean of 4 values
size(gv[4])
# output 360 x 180 x 7: data from the first 7 days
# compute basic statistics
using Statistics
weekly_running_mean = mean(gv);
size(weekly_running_mean)
# 360 x 180 x 3653 array with the running weekly mean
# get a regular julia array
weekly_running_mean_array = weekly_running_mean[:,:,:];
typeof(weekly_running_mean_array)
# Array{Float32, 3}
# computing a centred 3 monthly mean taking into account that month do not have the
# same length
rolling_classes = [(t - Dates.Month(1)):Day(1):(t + Dates.Month(1)) for t in time]
extrema(length.(rolling_classes))
# output (60, 63)
data_3monthly = mean(rolling(ds["data"],:time => rolling_classes))[:,:,:];
close(ds)
CommonDataModel.select
— Methodvsubset = CommonDataModel.select(v,param1 => condition1, param2 => condition2,...)
dssubset = CommonDataModel.select(ds,param1 => condition1, param2 => condition2,...)
Return a subset of the variable v
(or dataset ds
) satisfying the conditions applied to the corresponding parameters. param1
, param2
... are symbols or strings with variable names and condition1
, condition2
... are functions taking as a argument the values of the corresponding parameter and return either false (ignore the corresponding elements) or true (select the corresponding elements for loading).
Examples
Create a sample file with random data:
using NCDatasets, Dates
using CommonDataModel: select, Near
fname = "sample_file.nc"
lon = -180:180
lat = -90:90
time = DateTime(2000,1,1):Day(1):DateTime(2000,1,3)
SST = randn(length(lon),length(lat),length(time));
ds = NCDataset(fname,"c")
defVar(ds,"lon",lon,("lon",));
defVar(ds,"lat",lat,("lat",));
defVar(ds,"time",time,("time",));
defVar(ds,"SST",SST,("lon","lat","time"));
# load by bounding box
v = select(ds["SST"],
:lon => lon -> 30 <= lon <= 60,
:lat => lat -> 40 <= lat <= 80)
# You can also select based on `ClosedInterval`s from `IntervalSets.jl`.
# Both 30..60 and 60 ± 20 construct `ClosedInterval`s, see their documentation for details. `∈` can be typed `\in` followed by the TAB-key.
using IntervalSets
v = select(ds["SST"], :lon => ∈(30..60), :lat => ∈(60 ± 20))
# get the indices matching the select query
(lon_indices,lat_indices,time_indices) = parentindices(v)
# get longitude matchting the select query
v_lon = v["lon"]
# find the nearest time instance
v = select(ds["SST"],:time => Near(DateTime(2000,1,4)))
# find the nearest time instance but not earlier or later than 2 hours
# an empty array is returned if no time instance is present
v = select(ds["SST"],:time => Near(DateTime(2000,1,3,1),Hour(2)))
close(ds)
See also @select
for more information.
CommonDataModel.set_attribute_color
— Method CommonDataModel.set_attribute_color(color::Symbol)
Set the attribute color. The default color is cyan
.
CommonDataModel.set_section_color
— Method CommonDataModel.set_section_color(color::Symbol)
Set the section color. The default color is red
.
CommonDataModel.set_variable_color
— Method CommonDataModel.set_variable_color(color::Symbol)
Set the variable color. The default color is green
.
CommonDataModel.show_dim
— MethodCommonDatamodel.show_dim(io,dim)
Print a list all dimensions (key/values pairs where key is the dimension names and value the corresponding length) in dim
to IO stream io
. The IO property :level
is used for indentation.
CommonDataModel.time_factor
— Methodtf = CommonDataModel.time_factor(v::CFVariable)
The time unit in milliseconds. E.g. seconds would be 1000., days would be 86400000. The result can also be nothing
if the variable has no time units.
CommonDataModel.unlimited
— MethodCommonDatamodel.unlimited(ds::AbstractDataset)
Iterator of strings with the name of the unlimited dimension.
CommonDataModel.unlimited
— Methodunlimited(d::Dimensions)
Return the names of all unlimited dimensions.
CommonDataModel.varbyattrib
— Methodvarbyattrib(ds, attname = attval)
Returns a list of variable(s) which has the attribute attname
matching the value attval
in the dataset ds
. The list is empty if the none of the variables has the match. The output is a list of CFVariable
s.
Examples
Load all the data of the first variable with standard name "longitude" from the NetCDF file results.nc
.
julia> ds = NCDataset("results.nc", "r");
julia> data = varbyattrib(ds, standard_name = "longitude")[1][:]
CommonDataModel.variable
— MethodCommonDataModel.variable(ds::AbstractDataset,variablename::SymbolOrString)
Return the variable with the name variablename
from the data set ds
.
CommonDataModel.varnames
— MethodCommonDataModel.varnames(ds::AbstractDataset)
Return an iterable of all the variable name.
CommonDataModel.@groupby
— Macrogv = CommonDataModel.@groupby(v,group_fun(coordname))
Create a grouped variable gv
whose elements are composed by all elements in v
whose corresponding coordinate variable (with the name coordname
) map to the same value once the group function group_fun
is applied to the coordinate variable.
See groupby
for more information.
CommonDataModel.@select
— Macrovsubset = CommonDataModel.@select(v,expression)
dssubset = CommonDataModel.@select(ds,expression)
Return a subset of the variable v
(or dataset ds
) satisfying the condition expression
as a view. The condition has the following form:
condition₁ && condition₂ && condition₃ ... conditionₙ
Every condition should involve a single 1D variable (typically a coordinate variable, referred as coord
below). If v
is a variable, the related 1D variable should have a shared dimension with the variable v
. All local variables need to have a $
prefix (see examples below). This macro is experimental and subjected to change.
Every condition can either perform:
a nearest match:
coord ≈ target_coord
(for≈
type\approx
followed by the TAB-key). Only the data corresponding to the index closest totarget_coord
is loaded.a nearest match with tolerance:
coord ≈ target_coord ± tolerance
. As before, but if the difference between the closest value incoord
andtarget_coord
is larger (in absolute value) thantolerance
, an empty array is returned.a condition operating on scalar values. For example, a
condition
equal to10 <= lon <= 20
loads all data with the longitude between 10 and 20 orabs(lat) > 60
loads all variables with a latitude north of 60° N and south of 60° S (assuming that the has the 1D variableslon
andlat
for longitude and latitude).
Only the data which satisfies all conditions is loaded. All conditions must be chained with an &&
(logical and). They should not contain additional parenthesis or other logical operators such as ||
(logical or).
To convert the view into a regular array one can use collect
, Array
or regular indexing. As in julia, views of scalars are wrapped into a zero dimensional arrays which can be dereferenced by using []
. Modifying a view will modify the underlying file (if the file is opened as writable, otherwise an error is issued).
As for any view, one can use parentindices(vsubset)
to get the indices matching a select query.
Examples
Create a sample file with random data:
using NCDatasets, Dates
using CommonDataModel: @select
# or
# using NCDatasets: @select
fname = "sample_file.nc"
lon = -180:180
lat = -90:90
time = DateTime(2000,1,1):Day(1):DateTime(2000,1,3)
SST = randn(length(lon),length(lat),length(time))
ds = NCDataset(fname,"c")
defVar(ds,"lon",lon,("lon",));
defVar(ds,"lat",lat,("lat",));
defVar(ds,"time",time,("time",));
defVar(ds,"SST",SST,("lon","lat","time"));
# load by bounding box
v = @select(ds["SST"],30 <= lon <= 60 && 40 <= lat <= 80)
# substitute a local variable in condition using $
lonr = (30,60) # longitude range
latr = (40,80) # latitude range
v = @select(ds["SST"],$lonr[1] <= lon <= $lonr[2] && $latr[1] <= lat <= $latr[2])
# You can also select based on `ClosedInterval`s from `IntervalSets.jl`.
# Both 30..60 and 60 ± 20 construct `ClosedInterval`s, see their documentation for details.
lon_interval = 30..60
lat_interval = 60 ± 20
v = @select(ds["SST"], lon ∈ $lon_interval && lat ∈ $lat_interval)
# get the indices matching the select query
(lon_indices,lat_indices,time_indices) = parentindices(v)
# get longitude matchting the select query
v_lon = v["lon"]
# find the nearest time instance
v = @select(ds["SST"],time ≈ DateTime(2000,1,4))
# find the nearest time instance but not earlier or later than 2 hours
# an empty array is returned if no time instance is present
v = @select(ds["SST"],time ≈ DateTime(2000,1,3,1) ± Hour(2))
close(ds)
Any 1D variable with the same dimension name can be used in @select
. For example, if we have a time series of temperature and salinity, the temperature values can also be selected based on salinity:
using NCDatasets, Dates
using CommonDataModel: @select
fname = "sample_series.nc"
# create a sample time series
time = DateTime(2000,1,1):Day(1):DateTime(2009,12,31)
salinity = randn(length(time)) .+ 35
temperature = randn(length(time))
NCDataset(fname,"c") do ds
defVar(ds,"time",time,("time",));
defVar(ds,"salinity",salinity,("time",));
defVar(ds,"temperature",temperature,("time",));
end
ds = NCDataset(fname)
# load all temperature data from January where the salinity is larger than 35.
v = @select(ds["temperature"],Dates.month(time) == 1 && salinity >= 35)
# this is equivalent to:
v2 = ds["temperature"][findall(Dates.month.(time) .== 1 .&& salinity .>= 35)]
v == v2
# returns true
close(ds)
For optimal performance, one should try to load contiguous data ranges, in particular when the data is loaded over HTTP/OPeNDAP.