A dictionary of common names of dimensions (as strings) to actual dimension types.

ClimArray(A::Array, dims::Tuple; name = "", attrib = nothing)

ClimArray is a structure that contains numerical array data bundled with dimensional information, a name and an attrib field (typically a dictionary) that holds general attributes. You can think of ClimArray as a in-memory representation of a CFVariable.

At the moment, a ClimArray is using DimArray from DimensionalData.jl, and all basic handling of ClimArray is offered by DimensionalData (see below).

ClimArray is created by passing in standard array data A and a tuple of dimensions dims. See ncread to automatically create a ClimArray from a .nc file. For obtaining the raw numeric values of a ClimArray or any of its dimensions, use the function gnv.


using ClimateBase, Dates
Time = ClimateBase.Ti # more intuitive name for time dimension
lats = -90:5:90
lons = 0:10:359
t = Date(2000, 3, 15):Month(1):Date(2020, 3, 15)
# dimensional information:
dimensions = (Lon(lons), Lat(lats), Time(t))
data = rand(36, 37, 241) # numeric data
A = ClimArray(data, dimensions)

Space information is represented by a single dimension Coord, whose elements are coordinates, i.e. 2-element SVector(longitude, latitude). Each coordinate represents the center of an arbitrary polygon in space. The actual limits of each polygon are not included in the dimension for performance reasons.

In statistical functions such as spaceagg, it is assumed that entry of the coordinates covers an equal amount of area. If this is not the case, you can simply provide an additional weights vector which would correspond to the area covered.

This dimension also allows indexing by latitude ranges, e.g. you can do

A # some `ClimArray` with a `Coord` dimension

Most functions of ClimateBase.jl implicitly assume that the coordinates are sorted by latidude. You can achieve this with the following code:

A # some `ClimArray` with a `Coord` dimension
coords = gnv(dims(A, Coord))
si = sortperm(coords; by = reverse)
A = A[Coord(si)]

This is done automatically by ncread.


Space information is represented by two orthogonal dimensions Lon, Lat, one being longitude and the other being latitude.

climplot(A::ClimArray; kwargs...) → fig, ax, el, cb

Main plotting function that dispatches to climscatter! if A has an CoordinateSpace dimension, or to climsurface! for OrthogonalSpace.

Return the figure, axis, plotted element, and colorbar.

Optionally you can provide keyword scatter = true to force using the scatterplot.

Plotting from ClimateBase.jl also works with Observables that enclose a ClimArray. You can update the values of the observable with another ClimArray with the same spatial dimension, and the plot will be updated. See documentation online for examples.

collapse(f, A, dim)

Reduce A towards dimension dim using the collapsing function f (e.g. mean). This means that f is applied across all other dimensions of A, each of which are subsequently dropped, leaving only the collapsed result of A vs. the remaining dimension.

create_dims(ds::NCDatasets.AbstractDataset, dnames, cfvar, sel = selecteverything(A))

Create a tuple of Dimensions from the dnames (tuple of strings).

dropagg(f, A::ClimArray, d [, W])

Apply statistics/aggregating function f (e.g. sum or mean) on array A across dimension(s) d and drop the corresponding dimension(s) from the result (Julia inherently keeps singleton dimensions).

If A is one dimensional, dropagg will return the single number of applying f(A).

Optionally you can provide statistical weights in the form of a W::ClimArray. W must either have same size as A or have only one dimension and satisfy length(W) == length(dims(A, d)) (i.e., a weight for each value of the dimension d). The latter case can only work when d is a single dimension. See also missing_weights for (properly) dealing with data that have missing values.

gnv(object) → x

Short for "get numeric value", this function will return the pure numeric value(s) of the given object. Convenience function for quickly getting the numeric data of dimensional arrays or dimensions.


Converts height above ground [m] to pressure [Pa]. The calculation is done under the assumption of hydrostatic equilibrium with a constant lapse rate of -6.5 K/km, starting from a surface temperature T0=288.15 K. See also equations 39 and 40 in: Berberan-Santos, M. N., Bodunov, E. N., & Pogliani, L. (1997). On the barometric formula. American Journal of Physics, 65(5), 404-412.

hemisphere_indices(coords) → nhi, shi

Return the indices of coordinates belonging to the north and south hemispheres. Coordinates with latitude exactly 0 are included in both hemispheres.

hemispheric_functions(A::ClimArray) → north, south

Return two arrays north, south, by splitting A to its northern and southern hemispheres, appropriately translating the latitudes of south so that both arrays have the same latitudinal dimension (and thus can be compared and do opperations between them).

hemispheric_means(A [,W]) → nh, sh

Return the (proper) averages of A over the northern and southern hemispheres. Notice that this function explicitly does both zonal as well as meridional averaging. Use hemispheric_functions to just split A into two hemispheres.

Optionally provide weights W that need to have the same structure as spaceagg.

insolation(t, ϕ; kwargs...)

Calculate daily averaged insolation in W/m² at given time and latitude t, φ. φ is given in degrees, and t in days (real number or date).


Ya = DAYS_IN_ORBIT # = 365.26 # days
t_VE = 76.0 # days of vernal equinox
S_0 = 1362.0 # W/m^2
e=0.017 # eccentricity
interpolate_height2pressure(A::ClimArray, pressure_levels::Vector; extrapolation_bc=NaN )

Return a ClimArray where the vertical coordinate is pressure. Pressure levels are calculated with the height2pressure function, based on hydrostatic equilibrium, and then interpolated to the given levels.

A: ClimArray with height above ground [m] as height coordinate. pressure_levels: Vector which contains the pressure levels in Pascal that A should be interpolated on. extrapolation_bc: Extrapolation is set to NaN by default, but can be any value or linear extrapolation with extrapolation_bc = Line(). For other extrapolation methods use the Interpolations package.

interpolate_pressure2height(A::ClimArray,heights::Vector; extrapolation_bc=NaN)

Return a ClimArray where the vertical coordinate is height above ground. Height above ground is calculated with the pressure2height function, based on hydrostatic equilibrium, and then interpolated to the given heights.

A: ClimArray with pressure [Pa] as height coordinate. heights: Vector which contains the heights that A should be interpolated on in meters above ground. extrapolation_bc: Extrapolation is set to NaN by default, but can be any value or linear extrapolation with extrapolation_bc = Line(). For other extrapolation methods use the Interpolations package.

interpolation2pressure(A::ClimArray, P::ClimArray, plevels::Vector; kwargs...) → B

Interpolate A, which has some arbitrary height dimension (height, model levels, etc.), into a new B::ClimArray where the vertical dimension is pressure. P is another ClimArray, that has the pressure values in Pascal, and otherwise same spatiotemporal structure as A. The plevels variable denotes which pressure levels B should have.

The keword vertical_dim = Hei() denotes which dimension of A denotes "height". The keyword extrapolation_bc = Line() configures extrapolation. It is linear by default, but can be any value such as extrapolation_bc = NaN. For other extrapolation methods use the Interpolations.jl package.

Keyword descending = true specifies whether pressure is ordered descendingly or ascendingly along the vertical dimension.


Return the latitude-mean A (mean across dimension Lat). This function properly weights by the cosine of the latitude.

lon_distance(λ1, λ2, Δλ = 360) → δ

Calculate distance δ (also in degrees) between longitudes λ1, λ2, but taking into account the periodic nature of longitude, which has period Δλ = 360ᵒ.

longitude_circshift(X::ClimArray [, l]; wrap = true) → Y::ClimArray

Perform the same action as Base.circshift, but only for the longitudinal dimension of X with shift l. If wrap = true the longitudes are wrapped to (-180, 180) degrees using the modulo operation.

If l is not given, it is as much as necessary so that all longitudes > 180 are wrapped.

lonlatfirst(A::ClimArray, args...) → B

Permute the dimensions of A to make a new array B that has first dimension longitude, second dimension latitude, with the remaining dimensions of A following (useful for most plotting functions). Optional extra dimensions can be given as args..., specifying a specific order for the remaining dimensions.


B = lonlatfirst(A)
C = lonlatfirst(A, Time)
maxyearspan(A::ClimArray) = maxyearspan(dims(A, Time))
maxyearspan(t::Vector{<:DateTime}) → i

Find the maximum index i of t so that t[1:i] covers exact(*) multiples of years.

(*) For monthly spaced data i is a multiple of 12 while for daily data we find the largest possible multiple of DAYS_IN_ORBIT.


Return the value that represents "missing" data in A, according to A's metadata. If A does not have the _FillValue metadata, return 0 instead.

missing_weights(A::ClimArray, val = missing_val(A)) → B, W

Generate a new array B with values like A, but with A's missing values replaced with val. Also generate an array of weights, which has the value 0 when A had missing, and the value 1 otherwise.

The output of this function should be used in conjunction with any of ClimateBase.jl aggregating functions like spacemean, timemean, ..., when your data have missing values which you want to completely skip during the aggregation process.

This function returns A, nothing if A has no missing values.


Convert x to the amount of months (starting from 0).

monthday_indices(times, date = times[1])

Find the indices in times at which the date in times gives the same day and month as date.

monthlyagg(A::ClimArray, f = mean; mday = 15) -> B

Create a new array where the temporal information has been aggregated into months using the function f. The dates of the new array always have day number of mday.


Return a vector of daily spaced Dates that span the entire month that t belongs in.

ncdetails(file, io = stdout)

Print details about the .nc file in file on io.

ncread(file, var [, selection]; kwargs...) → A

Load the variable var from the file and convert it into a ClimArray with proper dimension mapping and also containing the variable attributes as a dictionary. Dimension attributes are also given to the dimensions of A, if any exist. See keywords below for specifications for unstructured grids.

file can be a string to a .nc file. Or, it can be an NCDataset, which allows you to lazily combine different .nc data (typically split by time), e.g.

using Glob # for getting all files
alldata = glob("toa_fluxes_*.nc")
file = NCDataset(alldata; aggdim = "time")
A = ClimArray(file, "tow_sw_all")

var is a String denoting which variable to load. For .nc data containing groups var can also be a tuple ("group_name", "var_name") that loads a specific variable from a specific group. In this case, the attributes of both the group and the CF-variable are attributed to the created ClimArray.

Optionally you can provide a selection for selecting a smaller part of the full array. The selection must be a tuple of indices that compose the selection and you must specify exactly as many ranges as the dimensions of the array and in the correct order. For example, if var corresponds to an array with three dimensions, such syntaxes are possible:

(:, :, 1:3)
(1:5:100, 1:1, [1,5,6])

The function ncsize can be useful for selection.

See also ncdetails, nckeys and ncwrite.

Smart loading

The following things make loading data with ncread smarter than directly trying to use NCDatasets.jl and then convert to some kind of dimensional container.

  1. Data are directly transformed into ClimArray, conserving metadata and dimension names.
  2. If there are no missing values in the data (according to CF standards), the returned array is automatically converted to a concrete type (i.e. Union{Float32, Missing} becomes Float32).
  3. Dimensions that are ranges (i.e. sampled with constant step size) are automatically transformed to a standard Julia Range type (which makes sub-selecting faster).
  4. Automatically deducing whether the spatial information is in an orthogonal grid or not, and creating a single Coord dimension if not.


  • name optionally rename loaded array.
  • grid = nothing optionally specify whether the underlying grid is grid = OrthogonalSpace() or grid = CoordinateSpace(), see Types of spatial information. If nothing, we try to deduce automatically based on the names of dimensions and other keys of the NCDataset.
  • lon, lat. These two keywords are useful in unstructured grid data where the grid information is provided in a separate .nc file. What we need is the user to provide vectors of the central longitude and central latitude of each grid point. This is done e.g. by
    ds = NCDataset("path/to/");
    lon = Array(ds["clon"]);
    lat = Array(ds["clat"]);
    If lon, lat are given, grid is automatically assumed CoordinateSpace().
ncsize(file, var)

Return the size of the variable of the .nc file without actually loading any data.

ncwrite(file::String, Xs; globalattr = Dict())

Write the given ClimArray instances (any iterable of ClimArrays or a single ClimArray) to a .nc file following CF standard conventions using NCDatasets.jl. Optionally specify global attributes for the .nc file.

The metadata of the arrays in Xs, as well as their dimensions, are properly written in the .nc file and any necessary type convertions happen automatically.

WARNING: We assume that any dimensions shared between the Xs are identical.

See also ncread.

no_hour_datetype(d::TimeType) → D

Return a type D that contains no hour (or less) information, if possible.

otheridxs(A::ClimArray, Dim)

Return an iterator of indices, that when used can access all indices of A except those belonging to dimension(s) Dim.

For example, if A has dims (Lon, Lat, Time) you can get all timeseries of A:

for i in otheridxs(A, Time())
    x = A[i...] # this is a timeseries (Vector) for each lon × lat combination

or all time+latitude slices with

for i in otheridxs(A, (Time(), Lat()))
    x = A[i...] # matrix of time × latitude slice for each longitude

(notice that splatting i... is necessary because otheridxs generates tuples)


Converts pressure [Pa] to height above ground [m]. The calculation is done under the assumption of hydrostatic equilibrium with a constant lapse rate of -6.5 K/km, starting from a surface temperature T0=288.15 K. See also equations 39 and 40 in: Berberan-Santos, M. N., Bodunov, E. N., & Pogliani, L. (1997). On the barometric formula. American Journal of Physics, 65(5), 404-412.

quantile_space(A, B; n = 50) → Aq, Bq, q

Express the array B into the quantile space of A. B, A must have the same indices.

The array B is binned according to the quantile values of the elements of A. n bins are created in total and the bin edges p are returned. The i-th bin contains data whose A-quantile values are ∈ [q[i], q[i+1]). The indices of these A values are used to group B.

In each bin, the binned values of A, B are averaged, resulting in Aq, Bq.

The amount of datapoints per quantile is by definition length(A)/n.

realtime_days(t::AbstractVector{<:TimeType}, T = Float32)

Convert the given sequential date time vector t in a vector in a format of "real time", where time is represented by real numbers, increasing cumulatively, as is the case when representing a timeseries x(t). As only differences matter in this form, the returned vector always starts from 0. The measurement unit of time here is days.

For temporal sampling less than daily return realtime_milliseconds(t) ./ (24*60*60*1000).


julia> t = Date(2004):Month(1):Date(2004, 6)

julia> realtime_days(t)
6-element Vector{Float32}:
realtime_milliseconds(t::AbstractArray{<:TimeType}, T = Float64)

Similar with realtime_days, but now the measurement unit is millisecond. For extra accuracy, direct differences in t are used.

sametimespan(Xs; mintime = nothing, maxtime = nothing) → Ys

Given a container of ClimArrays, return the same ClimArrays but now accessed in the Time dimension so that they all span the same time interval. Also works for dictionaries with values ClimArrays.

Optionally you can provide Date or DateTime values for the keywords mintime, maxtime that can further limit the minimum/maximum time span accessed.

sametimespan takes into consideration the temporal sampling of the arrays for better accuracy.

season(date) → s::Int

Return the season of the given date, 1 for DJF, 2 for MAM, 3 for JJA, 4 for SON. Complements functions like Dates.year, Dates.month,

seasonal_decomposition(A::ClimArray, fs = [1, 2]) → seasonal, residual

Decompose A into a seasonal and residual components, where the seasonal contains the periodic parts of A, with frequencies given in fs, and residual contains what's left.

fs is measured in 1/year. This function works even for non-equispaced time axis (e.g. monthly averages) and uses LPVSpectral.jl and SignalDecomposition.jl.

seasonality(t, x; y0 = year(t[1])) → dates, vals

Calculate the "seasonality" of a vector x defined with respect to a datetime vector t and return dates, vals. dates are all unique dates present in t disregarding the year (so only the month and day are compared). The dates have as year entry y0. vals is a vector of vectors, where vals[i] are all the values of x that have day and month same as dates[i]. The elements of vals are sorted as encountered in x.

Typically one is interested in mean.(vals), which actually is the seasonality, and std.(vals) which is the interannual variability at each date.

seasonality(A::ClimArray) → dates, vals

If given a ClimArray, then the array must have only one dimension (time).

seasonalyagg(A::ClimArray, f = mean) -> B

Create a new array where the temporal information has been aggregated into seasons using the function f. By convention, seasons are represented as Dates spaced 3-months apart, where only the months December, March, June and September are used to specify the date, with day 1.

sinusoidal_continuation(T::ClimArray, fs = [1, 2]; Tmin = -Inf, Tmax = Inf)

Fill-in the missing values of spatiotemporal field T, by fitting sinusoidals to the non-missing values, and then using the fitted sinusoidals for the missing values.

Frequencies are given per year (frequency 2 means 1/2 of a year).

Tmin, Tmax limits are used to clamp the result into this range (no clamping in the default case).

spaceagg(f, A::ClimArray, W = nothing)

Aggregate A using function f (e.g. mean, std) over all available space (i.e. longitude and latitude) of A, weighting every part of A by its spatial area.

W can be extra weights, to weight each spatial point with. W can either be a ClimArray with same spatial information as A, or having exactly same dimensions as A.

spacemean(A::ClimArray [, W]) = spaceagg(mean, A, W)

Average A over its spatial coordinates. Optionally provide statistical weights in W.

spatialidxs(A::ClimArray) → idxs

Return an iterable that can be used to access all spatial points of A with the syntax

idxs = spatialidxs(A)
for i in idxs
    slice_at_give_space_point = A[i...]

Works for all types of space (... is necessary because i is a Tuple).

temporal_sampling(x) → symbol

Return the temporal sampling type of x, which is either an array of Dates or a dimensional array (with Time dimension).

Possible return values are:

  • :hourly, where the temporal difference between successive entries is exactly 1 hour.
  • :daily, where the temporal difference between successive entries is exactly 1 day.
  • :monthly, where all dates have the same day, but different month.
  • :yearly, where all dates have the same month and day, but different year.
  • :other, which means that x doesn't fall to any of the above categories.
temporalranges(A::ClimArray, f = Dates.month) → r
temporalranges(t::AbstractVector{<:TimeType}}, f = Dates.month) → r

Return a vector of ranges so that each range of indices are values of t that belong in either the same month, year, day, or season, depending on f. f can take the values: Dates.year, Dates.month, or season (all are functions).

Used in e.g. monthlyagg, yearlyagg or seasonalyagg.

timeagg(f, A::ClimArray, W = nothing)

Perform a proper temporal aggregation of the function f (e.g. mean, std) on A where:

  • Only full year spans of A are included, see maxyearspan (because most processes are affected by yearly cycle, and averaging over an uneven number of cycles typically results in artifacts)
  • Each month in A is weighted with its length in days (for monthly sampled data)

If you don't want these features, just do dropagg(f, A, Time, W). This is also done in the case where the time sampling is unknown.

W are possible statistical weights that are used in conjuction to the temporal weighting, to weight each time point differently. If they are not a vector (a weight for each time point), then they have to be a dimensional array of same dimensional layout as A (a weight for each data point).

See also monthlyagg, yearlyagg, seasonalyagg.

timeagg(f, t::Vector, x::Vector, w = nothing)

Same as above, but for arbitrary vector x accompanied by time vector t.

transform_to_coord(A::ClimArray) → B

Transform given A to a new B::ClimArray so that the Lon, Lat dimensions in A are transformed to a Coord dimension in B.

tropics_extratropics(A::ClimArray; lower_lat=30, higher_lat=90) → tropics, extratropics

Separate the given array into two arrays: one having latitudes ℓ ∈ [-lowerlat, +lowerlat], and one having [-higherlat:-lowerlat, lowerlat:higherlat].

uniquelats(A::ClimArray) → idxs, lats
uniquelats(c::Vector{<:AbstractVector}) → idxs, lats

Find the unique latitudes of A. Return the indices (vector of ranges) that each latitude in lats covers, as well as the latitudes themselves.

value_space(A, B, C; Arange, Brange) → Cmeans, bin_indices

Express the array C into the joint value space of A, B. A, B, C must have the same indices.

A 2D histogram is created based on the given ranges, the and elements of C are binned according to the values of A, B. Then, the elements are averaged, which returns a matrix Cmeans, defined over the joint space S = Arange × Brange. bin_indices is also a matrix (with vector elements). Cmeans is NaN for bins without any elements.

value_space(A, B; Arange) → Bmeans, bin_indices

Express the array B into the value space of A. B, A must have the same indices. This means that A is binned according to the given Arange, and the same indices that binned A are also used to bin B. The i-th bin contains data whose A values ∈ [Arange[i], Arange[i+1]). In each bin, the binned values of B are averaged, resulting in Bmeans.

Elements of A that are not ∈ Arange are skipped. The returned bin_indices are the indices in each bin (hence, the weights for the means are just length.(bin_indices))

By default Arange = range(minimum(A), nextfloat(maximum(A)); length = 100).

yearlyagg(A::ClimArray, f = mean) -> B

Create a new array where the temporal information has been aggregated into years using the function f. By convention, the dates of the new array always have month and day number of 1.

zonalmean(A::ClimArray [, W])

Return the zonal mean of A. Works for both OrthogonalSpace as well as CoordinateSpace. Optionally provide statistical weights W. These can be the same size as A or only having the same latitude structure as A.