DimensionalDataModule

DimensionalData

Build StatusCodecov

DimensionalData.jl provides tools and abstractions for working with datasets that have named dimensions. It's a pluggable, generalised version of AxisArrays.jl with a cleaner syntax, and additional functionality found in NamedDimensions.jl. It has similar goals to pythons xarray, and is primarily written for use with spatial data in GeoData.jl.

Status

This is a work in progress under active development, it may be a while before the interface stabilises and things are fully documented.

Dimensions

Dimensions are just wrapper types. They store the dimension index and define details about the grid and other metadata, and are also used to index into the array, wrapping a value or a Selector. X, Y, Z and Ti are the exported defaults.

A generalised Dim type is available to use arbitrary symbols to name dimensions. Custom dimensions can be defined using the @dim macro.

We can use dim wrappers for indexing, so that the dimension order in the underlying array does not need to be known:

a[X(1:10), Y(1:4)]

The core component is the AbstractDimension, and types that inherit from it, such as Time, X, Y, Z, the generic Dim{:x} or others you define manually using the @dim macro.

Dims can be used for indexing and views without knowing dimension order: a[X(20)], view(a, X(1:20), Y(30:40)) and for indicating dimesions to reduce mean(a, dims=Time), or permute permutedims(a, [X, Y, Z, Time]) in julia Base and Statistics functions that have dims arguments.

Selectors

Selectors find indices in the dimension based on values At, Near, or Between the index value(s). They can be used in getindex, setindex! and view to select indices matching the passed in value(s)

  • At(x) : get indices exactly matching the passed in value(s)
  • Near(x) : get the closest indices to the passed in value(s)
  • Between(a, b) : get all indices between two values (inclusive)

We can use selectors with dim wrappers:

a[X(Between(1, 10)), Y(At(25.7))]

Without dim wrappers selectors must be in the right order:

usin Unitful
a[Near(23u"s"), Between(10.5u"m", 50.5u"m")]

It's easy to write your own custom Selector if your need a different behaviour.

Example usage:

using Dates, DimensionalData
timespan = DateTime(2001,1):Month(1):DateTime(2001,12)
A = DimensionalArray(rand(12,10), (Ti(timespan), X(10:10:100)))

julia> A[X(Near(35)), Ti(At(DateTime(2001,5)))]
0.658404535807791

julia> A[Near(DateTime(2001, 5, 4)), Between(20, 50)]
DimensionalArray with dimensions:
 X: 20:10:50
and referenced dimensions:
 Time (type Ti): 2001-05-01T00:00:00
and data: 4-element Array{Float64,1}
[0.456175, 0.737336, 0.658405, 0.520152]

Dim types or objects can be used instead of a dimension number in many Base and Statistics methods:

Methods where dims can be used containing indices or Selectors

getindex, setindex!view

Methods where dims can be used

  • size, axes, firstindex, lastindex
  • cat
  • reverse
  • dropdims
  • reduce, mapreduce
  • sum, prod, maximum, minimum,
  • mean, median, extrema, std, var, cor, cov
  • permutedims, adjoint, transpose, Transpose
  • mapslices, eachslice

Example usage:

A = DimensionalArray(rand(20,10), (X, Y))
size(A, Y)
mean(A, dims=X)
std(A; dims=Y())

To learn how to use this package, see the Crash course.