DIVAnd.TimeSelectorRunningAverage
— TypeTS = TimeSelectorRunningAverage(times,window)
The structure TS
handles the time aggregation based on vector of central times
and the time window given in days. Observations at the i-th time instance will be selected if the dates is between times[i]-w/2
and time[i]+w/2
where w
is the time window expressed as days.
DIVAnd.TimeSelectorYearListMonthList
— TypeTS = TimeSelectorYearListMonthList(yearlists,monthlists)
The structure TS
handles the time aggregation based on yearlists
and monthlists
. yearlists
is a vector of ranges (containing start and end years), for example [1980:1989,1990:1999,2000:2009,2010:2019]
.
monthlists
is a vector of two-element vector (containing start and end months), for example [1:3,4:6,7:9,10:12]
.
The upper bound of a yearlist
and monthlist
element is considered inclusive. The range of years of 2010:2019 consideres all years upto and including the year 2009.
If a month range spans beyond December, then all Months must be specified, e.g. example [2:4,5:6,7:9,[10,11,12,1]]
or [2:4,5:6,7:9,[10:12;1]]
. However using [2:4,5:6,7:9,10:1]
(bug!) will result in an empty month range.
Example
# seasonal climatology using all data from 1900 to 2020
# for winter (December-February), spring, summer, autumn
TS = DIVAnd.TimeSelectorYearListMonthList([1900:2020],[[12,1,2],[3,4,5],[6,7,8],[9,10,11]])
Note that for seasonal analyses, DIVAnd will only select observations within the provided year range (and not pick year-1 for December), for example
using DIVAnd
TS = DIVAnd.TimeSelectorYearListMonthList([1900:2020],[[12,1,2],[3,4,5],[6,7,8],[9,10,11]])
DIVAnd.select(TS,1,[DateTime(1899,12,31), DateTime(1900,1,1)])
This returns [0,1]
i.e. the 1st observation is not used, while the second is used. There is no special case for the month 12.
If the data from e.g. December 1899 should be considered for a seasonal analysis for the year 1900-2020, one should shift the observations as follows:
obstime_shifted = copy(obstime)
obstime_shifted[Dates.month.(obstime) .== 12] .+= Dates.Year(1)
The analysis function should then use obstime_shifted
while for the function saveobs
is it recommended to use the original obstime
vector.
DIVAnd.statevector
— Methodsv = statevector_init((mask1, mask2, ...))
Initialize structure for packing and unpacking multiple variables given their corresponding land-sea mask.
Input: mask1, mask2,...: land-sea mask for variable 1,2,... Sea grid points correspond to one and land grid points to zero. Every mask can have a different shape.
Output: sv: structure to be used with statevectorpack and statevectorunpack.
Note: see also statevectorpack, statevectorunpack
Author: Alexander Barth, 2009,2017 <a.barth@ulg.ac.be> License: GPL 2 or later
DIVAnd.DIVAnd_GCVKii
— FunctionComputes an estimate of the mean value of the diagonal of HK using GCV and the already solved analysisand it structure s
Kii = DIVAnd_GCVKii(s);
DIVAnd.DIVAnd_GCVKiiobs
— FunctionKii = DIVAnd_GCVKiiobs(s)
Computes an estimate of the mean value of the diagonal of HK using GCV and the already solved analysis and it structure s Only using real data locations.
DIVAnd.DIVAnd_Lpmnrange
— MethodLpmnrange = DIVAnd_Lpmnrange(pmn,len);
In each direction, searches for the minimum and maximum value of the length scale times the metric in this direction
So it basically looks at the worst and the best resolution found in the grid
Input:
pmn
: scale factor of the grid. pmn is a tuple with n elements. Every element represents the scale factor of the corresponding dimension. Its inverse is the local resolution of the grid in a particular dimension.len
: correlation length
Output:
Lpmnrange
: Array of range tuples (minimum and maximum of L times metric)
DIVAnd.DIVAnd_adaptedeps2
— MethodDIVAnd_adaptedeps2(yo, residual, diagR, ignoreobs)
Using Deroziers adaptive approach provides a multiplicative factor for the current epsilon2 value so that factor*epsilon2 is a better estimate of the R matrix.
yo
the observations (minus the background), residual
the obserations minus the analysis, diagR
, the diagonal of the rel. obs. error covariance matrix and ignoreobs
is true if an observation is out of the grid or should be ignored for other reasons.
For unscaled R and assuming that the background is zero, Deroziers showed that:
mean((yᵒ - Hxᵃ) ⋅ yᵒ) = ϵ² mean(yᵒ ⋅ yᵒ) = σ² + ϵ²
mean(yᵒ ⋅ yᵒ) / mean((yᵒ - Hxᵃ) ⋅ yᵒ) = σ²/ϵ² + 1 λ = σ²/ϵ² = 1 - mean(yᵒ ⋅ yᵒ) / mean((yᵒ - Hxᵃ) ⋅ yᵒ)
ϵ² / σ² = 1 / λ
DIVAnd.DIVAnd_adaptedeps2
— Methodfactor = DIVAnd_adaptedeps2(s,fi);
Input:
s
: structure returned byDIVAndrun
fi
: analysis returned byDIVAndrun
Output:
factor
: multiplicative factor to apply to epsilon2
Using Deroziers adaptive approach provides a multiplicative factor for the current epsilon2 value so that factor*epsilon2 is a better estimate of the R matrix. If you cannot use DIVAndrun
but use DIVAndgo
, the latter provides automatically this pamater as result.
DIVAnd.DIVAnd_addc
— Methods = DIVAnd_addc(s,c)
Add a constraint c
to the cost function defined by s
. The structure s
is typically created by DIVAnd_background and the contrain c
has the following fields: R
(a covariance matrix), H
(extraction operator) and yo
(specified value for the constrain). The added contrain Jc(x) is quadratic and has the following structure.
Jc(x) = (H x - yo)ᵀ R⁻¹ (H x - yo)
DIVAnd.DIVAnd_aexerr
— Methodaexerr,Bref,fa,sa = DIVAnd_aexerr(mask,pmn,xi,x,f,len,epsilon2;...);
Input: same as for DIVAndrun
mask
: binary mask delimiting the domain. true is inside and false outside. For oceanographic application, this is the land-sea mask.pmn
: scale factor of the grid. pmn is a tuple with n elements. Every element represents the scale factor of the corresponding dimension. Its inverse is the local resolution of the grid in a particular dimension.xi
: tuple with n elements. Every element represents a coordinate of the final grid on which the observations are interpolatedx
: tuple with n elements. Every element represents a coordinate of the observationsf
: value of the observations minus the background estimate (m-by-1 array). (see note)len
: correlation lengthepsilon2
: error variance of the observations (normalized by the error variance of the background field).epsilon2
can be a scalar (all observations have the same error variance and their errors are decorrelated), a vector (all observations can have a difference error variance and their errors are decorrelated) or a matrix (all observations can have a difference error variance and their errors can be correlated). Ifepsilon2
is a scalar, it is thus the inverse of the signal-to-noise ratio.
Optional input arguments specified as keyword arguments also as for DIVAnd
Output:
aexerr
: the almost exact errorBref
: the background error for error scaling by backgroundaexerr./Bref
fa
: the analysis (with low impact fake data): DO NOT USE UNLESS YOU KNOW WHAT YOU ARE DOINGsa
: the associated structure
Compute a variational analysis of arbitrarily located observations to calculate the almost exact error
DIVAnd.DIVAnd_averaged_bg
— Methodfma,faanom = DIVAnd_averaged_bg(mask,pmn,xi,x,f,len,epsilon2,toaverage;moddim=[])
Input:
As for DIVAndrun, including all dimensions before averaging
additional argument:
- toaverage: Array of ndims of boolean telling if in the corresponding direction averaging must be done
Presently NO optional arguments from DIVAndrun supported except moddim
Output:
- fma: Analysis where in the directions where toaverage is true, the same value is found
- faanom: Data anomalies when the analysis is subtracted from the input field.
DIVAnd.DIVAnd_background
— FunctionForm the inverse of the background error covariance matrix. s = DIVAnd_background(mask,pmn,Labs,alpha,moddim) Form the inverse of the background error covariance matrix with finite-difference operators on a curvilinear grid
Input:
- mask: binary mask delimiting the domain. 1 is inside and 0 outside. For oceanographic applications, this is the land-sea mask.
- pmn: scale factor of the grid.
- Labs: correlation length
- alpha: dimensional coefficients for norm, gradient, laplacian,... alpha is usually [1,2,1] in 2 dimensions.
Output:
- s: structure containing
- s.iB: inverse of the background error covariance
- s.L: spatially averaged correlation length
- s.n: number of dimensions
- s.coeff: scaling coefficient such that the background variance diag(inv(iB)) is one far away from the boundary.
DIVAnd.DIVAnd_background_components
— MethodiB = DIVAnd_background_components(s,D,alpha; kwargs...)
Form the different components of the background error covariance matrix. Compute the components of the background error covariance matrix s.iB_
and their sum based on alpha (the adimensional coefficients for norm, gradient, laplacian,...).
If the optional arguments contains btrunc, the calculation of iB is limited to the term up and including alpha[btrunc]
DIVAnd.DIVAnd_bc_stretch
— FunctionDIVAnd.DIVAnd_constr_constcoast
— Methodc = DIVAnd_constr_constcoast(mask,eps2)
Constrain imposing that the gradients along the coastline defined by mask
are close to zero controlled by the parameter eps2
which represents the scaled error variance on the gradients.
This constrain is useful to indirectly impose that a stream function does not have a current component perpendicular to the coastline.
DIVAnd.DIVAnd_constr_fluxes
— Methodc = DIVAnd_constr_fluxes(s,topographyforfluxes,fluxes,epsfluxes,pmnin)
Creates integral constraints for each latitude so that a barotropic correction step leads to an additional flux prescribed.
Input: s: structure topographyforfluxes: tuple of two 2D arrays with the bottom topography used for the flux calculations DO NOT USE NaN in it. If an array is replaced by a scalar zero, the constraint is not used. for fluxes calculated with geostrophy apply g/f to h fluxes: tuple of two arrays of fluxes. The barotropic correction on elevation should be such that Sum over longitude at each latidute of Sum h δ(eta)/δx δx = - fluxes[1] Sum over latitude at each longitude of Sum h δ(eta)/δy δ y = -fluxes[2] WARNING: This has been coded to directly use geostrophy.jl output and flux directions epsfluxes: error variance on constraint. Scaling to be verified pmnin: metrics from the calling routine
Output: c: structure to be used by DIVAnd_addc with the following fields: R (a covariance matrix), H (extraction operator) and yo (specified value for the constrain).
DIVAnd.DIVAnd_constr_fractions
— Methodc = DIVAnd_constr_fractions(s,epsfractions)
Creates integral constraints so that fractions sum up to zero. It is assumed that the different fractions to be analyzed are in the last dimension of the geometry and are anomalies. To impose a total value different of zero (for example 100 if the observations are percentages), this needs to be incorporated in the reference field which must ensure that the total value is respected (in the example of the percentages for example a reference value of 20 for each of five variables, or 10 30 20 20 20)
Input: s: structure
epsfraction: error variance on constraint.
Output: c: structure to be used by DIVAnd_addc with the following fields: R (a covariance matrix), H (extraction operator) and yo (specified value for the constrain).
DIVAnd.DIVAnd_cpme
— Methodcpme = DIVAnd_cpme(mask,pmn,xi,x,f,len,epsilon2;...);
Input: Same as for DIVAndrun
mask
: binary mask delimiting the domain. true is inside and false outside.
For oceanographic application, this is the land-sea mask.
pmn
: scale factor of the grid. pmn is a tuple with n elements. Every element represents the scale factor of the corresponding dimension. Its inverse is the local resolution of the grid in a particular dimension.xi
: tuple with n elements. Every element represents a coordinate of the final grid on which the observations are interpolatedx
: tuple with n elements. Every element represents a coordinate of the observationsf
: value of the observations minus the background estimate (m-by-1 array). (see note)len
: correlation lengthepsilon2
: error variance of the observations (normalized by the error variance of the background field).epsilon2
can be a scalar (all observations have the same error variance and their errors are decorrelated), a vector (all observations can have a difference error variance and their errors are decorrelated) or a matrix (all observations can have a difference error variance and their errors can be correlated). Ifepsilon2
is a scalar, it is thus the inverse of the signal-to-noise ratio.keywords
: undocumented for the moment how to use iterative solver with coarser grid as preconditionner. seeDIVAndjog
forcsteps
,lmask
andalphapc
parameters
Optional input arguments specified as keyword arguments also as for DIVAnd
Output:
cpme
: the clever poor mans error
Perform an n-dimensional variational analysis of the observations f
located at the coordinates x
. The array cpme
represent the error field at the grid defined by the coordinates xi
and the scales factors pmn
. If you cannot run DIVAndrun
you can use DIVAndgo
with error field calculation :cpme
DIVAnd.DIVAnd_cpme_go
— Methoderri = DIVAnd_cpme_go(mask,pmn,xi,x,f,len,epsilon2; ...);
Input:
- Same arguments as DIVAndrun with in addition
MEMTOFIT=
: keyword controlling how to cut the domain depending on the memory remaining available for inversion (not total memory)RTIMESONESCALES=
: if you provide a tuple of length scales, data are weighted differently depending on the numbers of neighbours they have. Seeweight_RtimesOne
for details
Output:
erri
: relative error field using the clever poor man's error approach. Result on the same grid as fi.
ONLY USE THIS VERSION IF YOU CANNOT RUN DIVAndgo
with :cmpe
activated (or directly DIVAnd_cpme
if you can run DIVAndrun
)
DIVAnd.DIVAnd_cutter
— Methodwindowlist,csteps,lmask,alphapc = DIVAnd_cutter(Lpmnrange,gridsize,moddim,MEMTOFIT);
Creates a list of windows for subsequent domain decomposition. Also calculates already the subsampling steps csteps for the preconditionners as well as the mask lmask to apply to the length scales in the preconditionner, allowing to reduce the problem size
Input:
Lpmnrange
:gridsize
: number of points in each direction (size(mask))moddim
:MEMTOFIT
overlapfactor
: describes how many times the length scale is used for the overlapping. default is 3.3. use lower values ONLY for very good data coverage.
Output:
windowlist
: vector of tuples (iw1,iw2,isol1,isol2,istore1,istore2,) where(iw1,iw2)
correspond to the start and end indices in the (global) grid(isol1,isol2)
correspond to the start and end indices solution to be retained in the window (not all is retained due to overlapping) and(istore1,istore2)
correspond to the start and end indices of the solution relative to the global grid. They define thus where the local solution has to be stored in the combined global solution.csteps
: Array of steps for the coarse grid preconditionner.csteps
is zero for the direct solver.lmask
: Array of multiplication factors for length scale of preconditionneralphapc
: Norm defining coefficients for preconditionner
DIVAnd.DIVAnd_cv
— Functionbestfactorl,bestfactore, cvval,cvvalues, x2Ddata,y2Ddata,cvinter,xi2D,yi2D = DIVAnd_cv(mask,pmn,xi,x,f,len,epsilon2,nl,ne,method;...);
Performs a cross validation to estimate the analysis parameters (correlation length and signal-to-noise ratio).
Input
Same as for DIVAndrun
with three more parameters nl
,ne
and method
mask
: binary mask delimiting the domain. true is inside and false outside.
For oceanographic application, this is the land-sea mask.
pmn
: scale factor of the grid. pmn is a tuple with n elements. Every element represents the scale factor of the corresponding dimension. Its inverse is the local resolution of the grid in a particular dimension.xi
: tuple with n elements. Every element represents a coordinate of the final grid on which the observations are interpolatedx
: tuple with n elements. Every element represents a coordinate of the observationsf
: value of the observations minus the background estimate (m-by-1 array). (see note)len
: correlation lengthepsilon2
: error variance of the observations (normalized by the error variance of the background field).epsilon2
can be a scalar (all observations have the same error variance and their errors are decorrelated), a vector (all observations can have a difference error variance and their errors are decorrelated) or a matrix (all observations can have a difference error variance and their errors can be correlated). Ifepsilon2
is a scalar, it is thus the inverse of the signal-to-noise ratio.nl
: number of testing points around the current value of L.1
means one additional point on both sides of the current L.0
is allowed and means the parameter is not optimised.ne
: number of testing points around the current value of epsilon2.0
is allowed as fornl
method
: cross validation estimator method 1: full CV 2: sampled CV 3: GCV 0: automatic choice between the three possible ones, default valueOptional input arguments specified via keyword arguments are the same as for
DIVAnd
Output:
bestfactorl
: best estimate of the multiplication factor to apply to lenbestfactore
: best estimate of the multiplication factor to apply to epsilon2cvvales
: the cross validation values calculatedfactors
: the tested multiplication factorscvinter
: interpolated cv values for final optimisationX2Data, Y2Data
: coordinates of sampled cross validation inL,epsilon2
space . Normally only used for debugging or plottingXi2D, Yi2D
: coordinates of interpolated estimator . Normally only used for debugging or plotting
The output bestfactorl
and bestfactore
represent multiplication factors which should be applied to L
and epsilon2
.
The len
and epsilon2
provided should be close the real one as the tests will be performed around.
DIVAnd.DIVAnd_cvestimator
— Methodtheta = DIVAnd_cvestimator(s,residual)
Computes the cross validation estimator $(d-\hat{d})^T \mathbf R^{-1} (d-\hat{d}) / ( \mathbf 1^T \mathbf R^{-1} \mathbf 1)$ where the $\hat{d}$ is the analysis not using a data point.
DIVAnd.DIVAnd_datainboundingbox
— Methodxn,fn,indexes,Rn = DIVAnd_datainboundingbox(xi,x,f;Rmatrix=())
Input:
xi
: tuple with n elements. Every element represents a coordinate of the final grid on which the observations are interpolated
x
: tuple with n elements. Every element represents a coordinate of the observationsf
: value of the observationsRmatrix
: error variance of the observations (normalized by the error variance of the background field).epsilon2
can be a scalar (all observations have the same error variance and their errors are decorrelated), a vector (all observations can have a difference error variance and their errors are decorrelated) or a matrix (all observations can have a difference error variance and their errors can be correlated). Ifepsilon2
is a scalar, it is thus the inverse of the signal-to-noise ratio.
Output:
xn
: tuple with n elements. Every element represents a coordinate of the observations which falls in the bounding box defined by xi fn
: the corresponding data indexes:
the indexes in the original array retained Rn
: the new error variance
DIVAnd.DIVAnd_diagHK
— MethodComputes the diagonal terms of the so called hat-matrix HK, using the already solved analysis and it structure s. Warning: might take some time
diagonalterms = DIVAnd_diagHK(s);
DIVAnd.DIVAnd_diagHKobs
— Functiondiagonalterms = DIVAnd_diagHKobs(s)
Computes the diagonal terms of the so called hat-matrix HK, using the already solved analysis and it structure s. Warning: might take some time
This version only uses the real data (not those related to additional constraints)
DIVAnd.DIVAnd_diagapp
— Method errormap=DIVAnd_diagapp(P,pmn,len,sv;wheretocalculate=fill(true,size(pmn[1])),Hobs=(),Rmatrix=(),Binv=false,iterB=0)
Calculates an appriximation to the error map exploiting the fact that inv(P) is available via a cholesky decomposition Q'CC'Q where Q is a permutation matrix. The diagonal component i is then z'z where z= inv(Q'C) ei where ei is normally an array with zeros except in position i where it has a value of 1. Here we exploit that z in that case have only values in specific locations and hence if there is another 1 in ei at a very different location, we can make two calculation at the price of one by summing up the relevant parts of z'z only. And of course you can repeat the reasoning and covering the domain with well placed ones. Advantage compared to AEXERR: works with any number of data points. Should be particularly efficient in higher dimensions and situations where len are small compared to the domain size.
WARNING: if you provide the wheretocalculate array it WILL BE CHANGED IN PLACE
Input
P
: is the covariance matrix found in the structure s from the analysis s.P`pmn
: are the metrics of the gridlen
: the length scales usedsv
: the statevector from structure s : s.svwheretocalculate
: a boolean array of the same dimensions as pmn[1] specifying if in a point the error is to be calculated. Default is everywhereusefull if you are only interested in part of the error field or when calculating tiles with overlaps
Hobs
: s.obsconstrain.H only needed if Binv is trueRmatrix
: epsilon2 only needed if Binv is trueBinv
: boolean forcing the calculation of diag(B) by an iterative correction to the estimate of diag(inv(P))iterB
: number of iterations in case B is calculated, the higher epsilon2 the better the iterations converge, so something like 100/epsilon2 ?
Output
errormap
: array of the same dimensions as the analysis including the error variance at the point, relative to a constant background variance.diagB
: array of the same dimensions as the analysis including the spatial structure of the background variance.
DIVAnd.DIVAnd_erroratdatapoints
— Methoderrorvariance = DIVAnd_erroratdatapoints(s;restrictedlist=[])
Computes the error at the real data locations using the analysis structure s
If a restricedlist is provided erros are only calculated at the indexes where restricedlist==true
DIVAnd.DIVAnd_fill!
— MethodDIVAnd_fill!(A::AbstractArray,fillvalue)
Replace values in A equal to fillvalue (possibly NaN) with average of surrounding grid points
DIVAnd.DIVAnd_fittocpu
— Function stepsize,overlapping,isdirect = DIVAnd_fittocpu(Lpmnrange,gridsize,latercsteps,moddim,MEMTOFIT;forcedirect=false,overlapfactor=3.3);
Creates a list of windows for subsequent domain decomposition
Also calculates already the subsampling steps csteps
for the preconditionners
Input:
Lpmnrange
: for every dimension the minimum and maximum correlation length scaled by the local resolution (i.e. the product between L and pm (pn,...))gridsize
: number of points in each direction (size(mask))latercsteps
: coarsening steps used later if a lower resolution model is used for preconditioning.moddim
: modulo for cyclic dimension (vector with n elements). Zero is used for non-cyclic dimensions.memtofit
: parameter describing how much memory is expected to be available in Gbforcedirect
: if true forces direct solver even if iterative solver might allow for larger tilesoverlapfactor
: describes how many times the length scale is used for the overlapping. default is 3.3. use lower values ONLY for very good data coverage.
Output:
stepsize
: spatial (and temporal) shift in grid points between subdomains for every dimension (?)overlapping
: number of overlapping grid points for every dimensionisdirect
: true is the direct solver is activated
DIVAnd.DIVAnd_gradient
— MethodDx1,Dx2,...,Dxn = DIVAnd.DIVAnd_gradient(operatortype,mask,pmn,iscyclic)
Form the gradient using finite differences in all n-dimensions. mask
is a binary mask delimiting the domain. 1 is inside and 0 outside. For oceanographic application, this is the land-sea mask. pmn
is a tuple of arrays with the scale factor of the grid. The output Dx1,Dx2,...,Dxn
are sparse matrices represeting a gradient along different dimensions.
DIVAnd.DIVAnd_gradient
— MethodDx1,Dx2,...,Dxn = DIVAnd.DIVAnd_gradient(operatortype,mask,pmn,iscyclic)
Form the gradient using finite differences in all n-dimensions. mask
is a binary mask delimiting the domain. 1 is inside and 0 outside. For oceanographic application, this is the land-sea mask. pmn
is a tuple of arrays with the scale factor of the grid. The output Dx1,Dx2,...,Dxn
are sparse matrices represeting a gradient along different dimensions.
DIVAnd.DIVAnd_heatmap
— MethodComputes a heatmap based on locations of observations using kernel density estimation (probability density field whose integral over the domain is one)
dens,Ltuple,LCV,LSCV = DIVAnd_heatmap(mask,pmn,xi,x,inflation,Labs;Ladaptiveiterations=0,myheatmapmethod="DataKernel", optimizeheat=true,nmax=1000,otherargs...)
Input:
mask
: mask as usualpmn
: tuple of metrics as usualxi
: tuple of coordinates of the grid for the heatmapx
: tuple of coordinates of observationsinflation
: array generally of ones. For some applications an observation can carry a different weight which is then encoded in the arrayLabs
: the length scales for DIVAnd. Here their meaning is the spread (bandwidth) of the observations for the Kernel calculationif zero is provided, the routine applies an empirical estimate, returned in the Ltuple output.
Ladaptiveiterations
: adaptive scaling where the length scales are adapted on the data density already estimated. You can iterate. Default "0"optimizeheat
: boolean which can turn on or off an algorithmic optimisation. Results should be identical. Default is to optimizemyheatmapmethod
: can be "Automatic", "GridKernel" or "DataKernel" (Results should be very similar except near boundaries)nmax
: maximum number of data points. If actual data size is larger, approximatively nmax superobservations are calculated and a warning issued.otherargs...
: all other optional arguments DIVAndrun can take (advection etc)
Output:
dens
: data density field (integral is one)Ltuple
: The bandwidthth used (either the input value or the calculated ones)LCV
: Likelihood Cross Validation estimator value (the higher the better) leave one out approachLSCV
: Least Square Cross Validation estimator (the lower the better) leave one out approach
DIVAnd.DIVAnd_integral
— MethodComputes an N-dimensional volume integral
DIVAnd_integral(mask,pmn,fieldin)
Input:
mask
: mask as usualpmn
: tuple of metrics as usualfieldin
: field of the same dimensions as mask and which is integrated over the domain
Output:
integratedfield
: The integral
DIVAnd.DIVAnd_kernel
— Methodmu,K,len_scale = DIVAnd_kernel(n,alpha)
Return the analytical kernel and normalization factor.
Analytical (normalized) kernels K
for infinite domain in dimension n
and for coefficients alpha
and normalization factor mu
.
K(r)
is the kernel function (function of the normalized distance r
), len_scale
is the distance at which K(len_scale)
= 0.6019072301972346 (which is besselk(1,1))
DIVAnd.DIVAnd_laplacian
— MethodCreate the laplacian operator.
Lap = DIVAnd_laplacian(mask,pmn,nu,iscyclic)
Form a Laplacian using finite differences assumes that gradient is zero at "coastline"
Input: mask: binary mask delimiting the domain. 1 is inside and 0 outside. For oceanographic application, this is the land-sea mask. pmn: scale factor of the grid. nu: diffusion coefficient of the Laplacian field of the size mask or cell arrays of fields
Output: Lap: sparce matrix represeting a Laplacian
DIVAnd.DIVAnd_metric
— Methodpm,pn = DIVAnd_metric(lon,lat)
Compute metric scale factors pm
and pn
based on the arrays longitude lon
and latitude lat
. The variables pm and pn represent the inverse of the local resolution in meters using the mean Earth radius.
DIVAnd.DIVAnd_obs
— Methods = DIVAnd_obs(s,xi,x,R,I)
Include the constrain from the observations. It is assumed that each coordinate depends only on one index. If this is not the case, then matrix I must be provided.
Input: s: structure created by DIVAnd_background xi: coordinates of observations (tuple of vectors) x: coordinates of grid (tuple of arrays) R: obs. error covariance matrix (normalized) I (optional): fractional indexes of location of observation within the grid
Output: s: structure to be used by DIVAnd_factorize
Note: make sure not to mix Float32 and Float64 for DIVAnd_constrain.
DIVAnd.DIVAnd_obscovar
— MethodR = DIVAnd_obscovar(epsilon2,m)
Create a matrix representing the observation error covariance R of size m x m.
If epsilon2 is a scalar, then R = epsilon2 * I If epsilon2 is a vector, then R = diag(epsilon2) If epsilon2 is a matrix, then R = epsilon2
DIVAnd.DIVAnd_pc_none
— Methodfun = DIVAnd_pc_none(iB,H,R)
Dummy function for requiring that no preconditioner is used in DIVAnd.
See also: diavndpcsqrtiB
DIVAnd.DIVAnd_pc_sqrtiB
— MethodCompute a preconditioner using the Cholesky decomposition.
[M1,M2] = DIVAnd_pc_michol(iB,H,R)
Compute preconditioner matrices M1 and M2 based on the Cholesky decomposition of iB. The matrices H and R are not used. M2 is the transpose of M1 for this preconditioner.
DIVAnd.DIVAnd_qc
— Functionqcvalues = DIVAnd_qc(fi,s,method);
Perform a quality control of the observations using the interpolated field.
Input:
fi
: interpolated field from aDIVAndrun
executions
: corresponding structure returned byDIVAnd
method
: optional argument, which describes the method to be used:
1 as for standard cross validation, 3 as for GCV, 4 with CV estimator to be used outside the routine, 5 Poor man's GCV using data instead of random vector, 0 automatic selection of method.
Output
qcvalues
: quality check values, one for each data point.
The higher the value, the more suspect a data point is. Absolute values of qcvalues
might be not robust when analysis parameters are uncertain. The ranking is however quite robust.
If you cannot run DIVAndrun
but use DIVAndgo
(which does not provide a structure s at the output), the latter provides qcvalues
if you call DIVAndgo
with a keyword parameter QCMETHOD=
DIVAnd.DIVAnd_rectdom
— Methodmask,pmn,xyi = DIVAnd_rectdom(coord1,coord2,...)
Create a "rectangular" domain in n
dimensions with the coordinates coord1
coord2
... assuming a Catersian metric. This functions returns the mask mask
, the coordinates (xi,yi,...)
and the metric (pm,pn...)
.
For example:
julia> mask,(pm,pn),(xi,yi) = DIVAnd_rectdom(range(0,stop=1,length=50),linspace(0,stop=1,length=50))
DIVAnd.DIVAnd_residual
— Methoddataresidual = DIVAnd_residual(s,fi)
Computes the generalized residual yo - H xa using the analysis on the grid fi
and the solution structure s
.
DIVAnd.DIVAnd_residualobs
— Methoddataresidual = DIVAnd_residualobs(s,fi);
Computes the residual yo - H xa only at real data points using the analysis. on the grid fi
and the solution structure s
.
DIVAnd.DIVAnd_sampler
— Methodsamplesteps = DIVAnd_sampler(pmn,len);
Defines steps for sub-sampling in the discrete grid which would still allow
one to resolve the provided lengthscales
Input:
pmn
: scale factor of the grid. pmn is a tuple with n elements. Every element represents the scale factor of the corresponding dimension. Its inverse is the local resolution of the grid in a particular dimension.len
: correlation length
Output:
samplesteps
: vector of integers with steps in subsampling [1 2 4 1] means every grid point in x direction, every fifth in y etc
DIVAnd.DIVAnd_scaleL
— MethodComputes a relative length based on the mask, metrics and a density field, typically measuring the observation density calculated with DIVAnd_heatmap
lambda = DIVAnd_scaleL(mask,pmn,dens)
Input:
mask
: mask as usualpmn
: tuple of metrics as usualdens
: field of the same dimensions as mask. Higher values of dens will result in lower values of lambda.
Output:
lambda
: field to be applied to a reference length field. Values are around 1 so some regions will have smaller L and some higher L
DIVAnd.DIVAnd_solve!
— MethodSolve the variational problem.
fi = DIVAnd_solve(s)
Derive the analysis based on all contraints included in s and using the observations yo
Input: s: structure created by DIVAnd_factorize fi0: starting point for iterative primal methods f0: starting point for the iterative dual method
btrunc: the value at which the stored value of s.iB was truncated and needs to be completed on the fly using jmBix
Output: fi: analyzed field
DIVAnd.DIVAnd_squaredom
— Methodmask,pmn,xyi = DIVAnd_squaredom(n,coord)
Create a "square" domain in n
dimensions with the coordinates coord
assuming a Cartesian metric. This functions returns the mask mask
, the coordinates (xi,yi,...)
and the metrics (pm,pn...)
.
Example
mask,(pm,pn),(xi,yi) = DIVAnd_squaredom(2,range(0,stop=1,length=50))
DIVAnd.DIVAnd_superobs
— MethodComputes superobservations to reduce the number of observation point. IMPORTANT: assumes all points are located in the domain of interest. Also, if topology (not taken into account in the present version) is complicated, a superobs could fall on land. In that case possible solution: take as coordinates those of the closest real observation (in water). Also consider that when using superobs for heatmap calculations, the automatic calculation of bandwidth can become too optimistic (and yield to small bandwidth).
newx,newval,sumw,varp,idx=DIVAnd_superobs(x,val,nmax;weights=[],intensive=true)
Input:
x
: tuple of coordinates of the original observationsval
: array which contains the values of the observationnmax
: approximate number of superobservations to calculate (real number will depend on distribution in space)weights
: array of weight used to calculate weighted averages (including when calculating positions of the superobs)Default is unit weights.
intensive
: boolean which when true (default) calculates the average of values, otherwise the sum
Output:
newx
: tuple of coordinates of the new super observationsnewval
: array of values of the new super observationssumw
: array containing for each superobservation value the sum of weights that where involved. Can be used for defining weight of superobservationvarp
: array of variance (variance around the mean defining the new super observation). Can be used for defining weight of superobservationidx
: array containing the inverse of grid spacing (one in each direction) that were used for the binning
Normal use with default values creates classical superobservations
Alternative use to prepare heatmaps:
newx,newinflation,sumw,varp=DIVAnd_superobs(x,ones(size(inflation)),nmax;weights=inflation,intensive=false)
creates a new inflation array by summing up the inflation from the points involved in the superobservation calculation. For the moment in this case sumw, varp are not meaningful. TODO: include in varp the variance on POSITIONS so that for headmaps it can be taken into acount for adaptive length scales
DIVAnd.DIVAndgo
— Methodfi, erri, residuals, qcvalues, scalefactore = DIVAndgo(mask,pmn,xi,x,f,len,epsilon2,errormethod; ...);
Input:
- Same arguments as DIVAndrun with in addition
errormethod
: you have the choice between:cpme
(clever poorman's method, default method if parameter not provided),:none
or:exact
(only available if windowed analysis are done with DIVAndrun)MEMTOFIT=
: keyword controlling how to cut the domain depending on the memory remaining available for inversion (not total memory)RTIMESONESCALES=
: if you provide a tuple of length scales, data are weighted differently depending on the numbers of neighbours they have. Seeweight_RtimesOne
for detailsQCMETHOD=
: if you provide a qc method parameter, quality flags are calculated. SeeDIVAnd_cv
for detailssolver
(default:auto
:). :direct for the direct solver or :auto for automatic choice between the direct solver or the iterative solver.overlapfactor
: describes how many times the length scale is used for the overlapping. default is 3.3. use lower values ONLY for very good data coverage.
Output:
fi
: the analysed fielderri
: relative error field on the same grid as fi. () if errormethod is fixed to:none
residuals
: array of residuals at data points. For points not on the grid or on land:NaN
qcvalues
: ifQCMETHOD=
is provided, the output array contains the quality flags otherwise qcvalues is (). For points on land or not on the grid: 0scalefactore
: Desroziers et al. 2005 (doi: 10.1256/qj.05.108) scale factor forepsilon2
Perform an n-dimensional variational analysis of the observations f
located at the coordinates x
. The array fi
represent the interpolated field at the grid defined by the coordinates xi
and the scales factors pmn
.
IMPORTANT: DIVAndgo is very similar to DIVAndrun and is only interesting to use if DIVAndrun cannot fit into memory or if you want to parallelize. (In the latter case do not forget to define the number of workers; see addprocs
for example)
DIVAnd.DIVAndjog
— FunctionCompute a variational analysis of arbitrarily located observations.
fi,s = DIVAndjog(mask,pmn,xi,x,f,len,epsilon2,csteps,lmask; alphapc=[1,2,1], otherargs...);
Perform an n-dimensional variational analysis of the observations f
located at the coordinates x
. The array fi
represent the interpolated field at the grid defined by the coordinates xi
and the scales factors pmn
.
Input:
- Same parameters as for divarun. * Two additional parameters: * csteps: array of ndims values defining the sampling steps for the preconditionner * lmask: array of ndims mutilplications factors for length scales * One additional optional parameter * alphapc: The coefficients for the norm used in the preconditionner
Output:
fi
: the analysed fields
: structure with an arrays.P
representing the analysed error covariance
DIVAnd.DIVAndrun
— MethodDIVAndrun(mask,pmn,xi,x,f,len,epsilon2; <keyword arguments>)
Perform an n-dimensional variational analysis of the observations f
located at the coordinates x
. The array fi
represent the interpolated field at the grid defined by the coordinates xi
and the scales factors pmn
.
Input:
mask
: binary mask delimiting the domain. true is inside and false outside.
For oceanographic application, this is the land-sea mask where sea is true and land is false.
pmn
: scale factor of the grid. pmn is a tuple with n elements. Every element represents the scale factor of the corresponding dimension. Its inverse is the local resolution of the grid in a particular dimension. For example, in two dimensions,pmn
is a tuple(pm,pn)
wherepm
is the inverse of the local resolution in first dimension andpn
is the the inverse of the local resolution in second dimension.xi
: tuple with n elements. Every element represents a coordinate of the final grid on which the observations are interpolated.x
: tuple with n elements. Every element represents a coordinate of the observations.f
: value of the observations minus the background estimate (vector ofm
elements wherem
is the number of observations). See also note.len
: tuple with n elements. Every element represents the correlation length for a given dimension.epsilon2
: error variance of the observations (normalized by the error variance of the background field).epsilon2
can be a scalar (all observations have the same error variance and their errors are decorrelated), a vector (all observations can have a different error variance and their errors are decorrelated) or a matrix (all observations can have a different error variance and their errors can be correlated). Ifepsilon2
is a scalar, it is thus the inverse of the signal-to-noise ratio.
Optional input arguments specified as keyword arguments
velocity
: velocity of the advection constraint. It is a tuple of n arrays and each array represents a single velocity component. The individual array should have the same size as the final grid. The first (second,..) element of the velocity is the velocity compomenent along the first (second,...) dimension. Thevelocity
has the units of a length-scale. If this parameter is derived from ocean currents, then the later must be multiplied by a factor (to be determined for example by cross-validation) and this factor has the units of a time-scale. The default is no-advection constraint.alpha
: alpha is vector of coefficients multiplying various terms in the cost function. The first element multiplies the norm. The other i-th element of alpha multiplies the (i+1)-th derivative. Per default, the highest derivative is m = ceil(1+neff/2) where neff is the effective dimension of the problem (the number of dimensions with a nonzero correlation length) andceil
is the ceiling function (rounding up).
The values of alpha is the (m+1)th row of the Pascal triangle:
m=0 1
m=1 1 1
m=1 1 2 1 (n=1,2)
m=2 1 3 3 1 (n=3,4)
...
constraints
: a structure with user specified constraints (seeDIVAnd_addc
).moddim
: modulo for cyclic dimension (vector with n elements). Zero is used for non-cyclic dimensions. One should not include a boundary zone (sometimes called a ghost zone or halo) for cyclic dimensions. For example if the first dimension is cyclic, then the grid point corresponding tomask[1,j]
should be betweenmask[end,j]
(left neighbor) andmask[2,j]
(right neighbor).fracindex
: fractional indices (n-by-m array). If this array is specified, then x and xi are not used.inversion
: direct solver (:chol for Cholesky factorization), an interative solver (:pcg for preconditioned conjugate gradient [1]) can be used or :cgamgsa for a multigrid method with preconditioned conjugate gradient. The two last methods are iterative methods who a controlled by the number of iterationsmaxit
and the tolerancetol
.compPC
: function that returns a preconditioner for the primal formulation if inversion is set to 'pcg'. :The function has the following arguments:fun = compPC(iB,H,R)
where iB is the inverse background error covariance, H the observation operator and R the error covariance of the observation. The function
compPC
returns the preconditionerfun(x,fx)
computing fx =M \ x
(the inverse of M times x) whereM
is a positive defined symmetric matrix [1]. Effectively, the system E⁻¹ A (E⁻¹)ᵀ (E x) = E⁻¹ b is solved for (E x) where E Eᵀ = M. Ideally, M should this be similar to A, so that E⁻¹ A (E⁻¹)ᵀ is close to the identity matrix.fi0
: starting field for iterative primal algorithm (same size asmask
).f0
: starting field for iterative dual algorithm (same size as the observationsf
).operatortype
: Val{:sparse} for using sparse matrices (default) or Val{:MatFun} or using functions to define the constrains.scale_len
: true (default) if the correlation length-scale should be scaled such that the analytical kernel reaches 0.6019072301972346 (besselk(1.,1.)) at the same distance than in 2D. The kernel behaves thus similar to the default kernel in two dimensions (alpha = [1,2,1]).alphabc
: numerical value defining how the last grid points are stretched outward. Ifalphabc
is 1, the default value mimics an infinite domain. To have previous behaviour of finite domain use alphabc equal to0
.btrunc
: if provided defines where to truncate the calculation of the covariance matrix B. Only values up and including alpha[btrunc] will be calculated. If the iterative solution is calculated, the missing terms will be calculated on the fly during the conjugate gradient calculations. Default value is none and full covariance calculation.
Output:
fi
: the analysed fields
: a structure with an arrays.P
representing the analysed error covariance
Note:
If zero is not a valid first guess for your variable (as it is the case for e.g. ocean temperature), you have to subtract the first guess from the observations before calling DIVAnd and then add the first guess back in.
Example:
see DIVAndsimpleexample.jl
References
[1] https://en.wikipedia.org/w/index.php?title=Conjugategradientmethod&oldid=761287292#Thepreconditionedconjugategradientmethod
DIVAnd.Rtimesx!
— MethodRtimesx!(coord,LS,x,Rx)
Gaussian type R
matirx in ndim
dimensions applied to vector x
of length ndata
. The Gaussian scale differs in each direction k
: LS[k]
Coordinates of point i are coord[i,1],coord[i,2],...,coord[i,ndim]
To avoid an ndata² complexity a grid is set up first so as to allow only the calculation of covariances when distances are smaller than 3*LS
Adapted from DIVA3D/src/Fortran/Util/Rtimesx_weighting.f90
DIVAnd.SDNMetadata
— Methodncglobalattrib,ncvarattrib = SDNMetadata(metadata,fi)
Based on the information in the dictionary metadata
and the analysed 4D field fi
produce a list of NetCDF global and variable attributes for DIVAnd_save2
. To list all registered projects call keys(DIVAnd.PROJECTS)
. For example:
julia> using DIVAnd
julia> keys(DIVAnd.PROJECTS)
Base.KeySet for a Dict{String,Dict{String,String}} with 3 entries. Keys:
"EMODNET-chemistry"
"SeaDataNet"
"SeaDataCloud"
DIVAnd.SDNObsMetadata
— MethodSDNObsMetadata(id)
Return a link to the SeaDataNet metadata page of the observation with the identifier id
(a combination of the EDMO code and local CDI ID). This works only in IJulia.
DIVAnd.TimeSelectorYW
— MethodTS = TimeSelectorYW(years,yearwindow,monthlists)
The structure TS
handles the time aggregation based on years
and monthlists
. It is similar to TimeSelectorYearListMonthList
except that the elements of yearlists
are centred around years
and span yearwindow
years. yearlists
is in fact constructed by adding and subtracting yearwindow/2
to every element of years.
DIVAnd._getparam
— Methodhelper function for searchz
DIVAnd.alpha_default
— Methodneff, alpha = alpha_default(Labs,alpha)
Return a default value of alpha.
DIVAnd.approximate_gaussian
— Methodx2 is x squared
DIVAnd.average_background_profile
— Methodaverage_background_profile(
background_filename, (lonr,latr,depthr,TS), (obslon, obslat, obsdepth, obstime), obsvalue,
epsilon2,
varname;
transform = DIVAnd.Anam.notransform(),
searchz = background_profile_searchz(depthr),
Compute the average background profile by averaging the observations within a distance given by searchz
and for each time instance defined in the time selector TS
.
DIVAnd.average_files
— Methodaverage_files(filenames,varnameu::AbstractString,TSvelocity,outfilename,outvarname::AbstractString)
Averaged the gridded variables varnameu
from the NetCDF files from filenames
over time following the time selector TSvelocity
.
DIVAnd.backgroundfile
— Methodfun = backgroundfile(fname,varname,TS)
Return a function fun
which is used in DIVAnd to make anomalies out of observations based relative to the field defined in the NetCDF variable varname
in the NetCDF file fname
. It is assumed that the NetCDF variables has the variable lon
, lat
and depth
. And that the NetCDF variable is defined on the same grid as the analysis and was generated according to the provided time selector TS
(TimeSelectorYearListMonthList or TimeSelectorRunningAverage).
At all vertical levels, there should at least one sea point.
DIVAnd.backgroundfile
— Methodfun = backgroundfile(fname,varname)
Return a function fun
which is used in DIVAnd to make anomalies out of observations based relative to the field defined in the NetCDF variable varname
in the NetCDF file fname
. It is assumed that the NetCDF variables has the variable lon
, lat
and depth
. And that the NetCDF variable is defined on the same grid as the analysis.
DIVAnd.blkdiag
— Methodconcatenate diagonal matrices
DIVAnd.cgradient
— Methodhx,hy = cgradient(pmn,h)
DIVAnd.checkobs
— Methodcheckobs(x,v,ids)
checkobs(io::IO,x,v,ids)
Print some basic information about the coordinates x
(tuple of vector) and values v
(vector) having the identifier ids
(vector of strings) to check erroneous data. It prints wheter NaNs or Infs are found and the minimum and maximum value.
If the argument io
is provided, the information is input/output stream io
.
DIVAnd.checkresolution
— Methodcheckresolution(mask,pmn,len)
Returns a warning of the resolution is too coarse relative to the correlation length. The resolution must be at least 2 times finer than the correlation length.
DIVAnd.checksym
— MethodxAy, yATx = checksym(n,fun!)
Check if the the function fun!
represents a symmetric matrix when applied on random vectors of size n
.
DIVAnd.climatology_bounds
— Methodcbounds = climatology_bounds(TS)
Produce an matrix for DateTimes where cbounds[i,1]
is the start time of all sub-intervals defined by this i
-th time instance and cbounds[i,2]
is the end time of all sub-intervals defined by this i
-th time instance.
DIVAnd.conjugategradient
— Methodx,cgsuccess,niter = conjugategradient(fun!,b)
Solve a linear system with the preconditioned conjugated-gradient method: A x = b where A
is a symmetric positive defined matrix and b
is a vector. Equivalently the solution x
minimizes the cost function J(x) = ½ xᵀ A x - bᵀ x.
The function fun!(x,fx)
computes fx which is equal to A*x
. For example:
function fun!(x,fx)
fx[:] = A*x
end
Note that the following code will NOT work, because a new array fx
would be created and it would not be passed back to the caller.
function fun!(x,fx)
fx = A*x # bug!
end
The function fun!
works in-place to reduce the amount of memory allocations.
Optional input arguments
x0
: starting vector for the interationstol
: tolerance on |Ax-b| / |b|maxit
: maximum of interationspc!
: the preconditioner. The functionspc(x,fx)
computes fx = M⁻¹ x (the inverse of M times x) whereM
is a symmetric positive defined matrix. Effectively, the system E⁻¹ A (E⁻¹)ᵀ (E x) = E⁻¹ b is solved for (E x) where E Eᵀ = M. Ideally, M should this be similar to A, so that E⁻¹ A (E⁻¹)ᵀ is close to the identity matrix. The functionpc!
should be implemented in a similar way thanfun!
(see above).
Output
x
: the solutioncgsuccess
: true if the interation converged (otherwise false)niter
: the number of iterations
DIVAnd.cut
— MethodDIVAnd.cut(filename,varname,filename_cut,polygon_lon,polygon_lat)
Exclude all grid cells points of the netcdf file filename
with the variable varname
to the grid cell included in the polygon whose vertices are defined by the vector of longitude values polygon_lon
and the vector of latitude values polygon_lat
. Values outside of this polygone will be croped (if possible) or masked (if croping is not possible).
Only the NetCDF variable with the dimensions lon
and lat
will be modified. All other variarables (in particular obslon
, obslat
, ...) will not be changed.
Example
filename2 = "Water_body_dissolved_oxygen_concentration_monthly.nc"
# polygon with the area to retain
polygon_lon = [-20., 20, 20, -19]
polygon_lat = [21, 21, 53., 52.]
varname = "Water body dissolved oxygen concentration"
filename_cut = "cut.nc"
DIVAnd.cut(filename2,varname,filename_cut,polygon_lon,polygon_lat)
DIVAnd.dayssince
— Methoddayssince(dt; t0 = DateTime(1900,1,1))
Number of days since a starting day t0
(1900-01-01 per default).
DIVAnd.decompB!
— Methodwork1, work2: size of mask
Symmetric matrix
SB = √(β) (1 + α L)^(nmax / 2) W^{-1}
where W is the volumne of the corresponding grid cell. The background error covariance matrix B is SB W SB
DIVAnd.derived
— MethodDIVAnd.derived(filename,varname,new_filename;
error_thresholds = [("L1", 0.3), ("L2", 0.5)],
)
Compute derived quantities from a DIVAnd analyse (in particular the deepest value of an analysis) using the NetCDF file filename
with the variable varname
. The result will be written in new_filename
.
See DIVAnd.diva3d
for the optional parameter error_thresholds
.
Example:
using DIVAnd
filename = "Water_body_dissolved_oxygen_concentration_monthly.nc"
new_filename = "Water_body_dissolved_oxygen_concentration_monthly2.nc"
varname = "Water body dissolved oxygen concentration"
DIVAnd.derived(filename,varname,new_filename)
DIVAnd.distance
— Methodd = distance(lat1,lon1,lat2,lon2)
Compute the great-circle distance between the points (lat1,
lon1) and (
lat2,lon2
). The units of all input and output parameters are degrees.
DIVAnd.distance
— Methodd = distance([lon1,lat1],[lon2,lat2])
The same as distance(lat1,lon1,lat2,lon2)
but there the arguments are vectors and the order is longitude then latitude.
The units of all input and output parameters are degrees.
DIVAnd.diva3d
— Methoddbinfo = diva3d(xi,x,value,len,epsilon2,filename,varname)
Create a 3D analysis (or a series of 3D analysis) with DIVAnd using the observations value
(vector) at the locations x
(tuple of vectors) onto the regular grid defined by the vectors xi
using the scaled observational error variance epsilon2
and the correlation length len
. The result will be saved in the netCDF file filename
under the variable varname
.
Inputs
xi
: tuple with n elements. Every element represents a coordinate of the final grid on which the observations are interpolatedx
: tuple with n elements. Every element represents a coordinate of the observationsvalue
: value of the observationslen
: tuple with n elements. Every element represents the correlation length. Iffitcorrlen
isfalse
(default), the correlation length should be expressed in meters. Iffitcorrlen
istrue
, thenlen
can be the empty tuple()
or a tuple containing 3 arrays of normalized correlation lengths which will be multiplied by the horizontal and vertical correlation lengths.epsilon2
: error variance of the observations (normalized by the error variance of the background field).
epsilon2
can be a scalar (all observations have the same error variance and their errors are decorrelated), a vector (all observations can have a different error variance and their errors are decorrelated) or a matrix (all observations can have a different error variance and their errors can be correlated). If epsilon2
is a scalar, it is thus the inverse of the signal-to-noise ratio.
filename
: the output netCDF filename.varname
: the name of the variable (used in the netCDF file).
Optional input arguments:
bathname
: path to the netCDF bathymetry (default../../DIVAnd-example-data/Global/Bathymetry/gebco_30sec_16.nc
relative to this source file)bathisglobal
: true (default) is the bathymetry is a global data setplotres
: call-back routine for plotting ((timeindex,sel,fit,erri) -> nothing)timeorigin
: time origin (defaultDateTime(1900,1,1,0,0,0)
)moddim
: modulo for cyclic dimension (vector with n elements). Zero is used for non-cyclic dimensions. Halo points should not be included for cyclic dimensions. For example if the first dimension is cyclic, then the grid point corresponding tomask[1,j]
should be betweenmask[end,1]
(left neighbor) andmask[2,j]
(right neighbor). The default is [0,0,0],zlevel
::surface
(default) for surface analysis and:floor
for analysis from the bottom floor.ncvarattrib
: dictionary of netCDF variable attributes.ncglobalattrib
: dictionary of netCDF global attributes.transform
: Anamorphosis transformation function (default:Anam.notransform()
).fitcorrlen
: true if the correlation length is determined from the observation (defaultfalse
). Note that the parameterlen
is interpreted differently whenfitcorrlen
is set totrue
.fithorzcorrlen
: true if the horizontal correlation length is determined from the observation (default: the value offitcorrlen
) Note that the parameterlen
is interpreted differently whenfithorzcorrlen
is set totrue
.fitvertcorrlen
: true if the vertical correlation length is determined from the observation (default: the value offitcorrlen
) Note that the parameterlen
is interpreted differently whenfitvertcorrlen
is set totrue
.fithorz_param
: dictionary with additional optional parameters forfithorzlen
, for example:Dict(:smoothz => 200., :searchz => 50.)
.fitvert_param
: dictionary with additional optional parameters forfitvertlen
.distfun
: function to compute the distance (default(xi,xj) -> DIVAnd.distance(xi[2],xi[1],xj[2],xj[1])
).mask
: if different fromnothing
, then this mask overrides land-sea mask based on the bathymetry
(default nothing
).
background
: if different fromnothing
, then this parameter allows one
to load the background from a call-back function (default nothing
). The call-back functions has the parameters (x,n,trans_value,trans)
where x
represent the position of the observations, n
the time index, trans_value
, the observations (possibly transformed) and trans
the transformation function. The output of this function is the gridded background field and the observations minus the background field.
background_epsilon2_factor
: multiplication forepsilon2
when computing a vertical profile as a background estimate (default: computed internally based on the amount of data). This parameter is not used when the parameterbackground
orbackground_lenz
is provided.background_lenz
: vertical correlation for background computation (default 20 m). This parameter is not used when the parameterbackground
is provided.background_len
: deprecated option replaced bybackground_lenz
.filterbackground
: number of iterations to filter the background profile (default 0, no filtering)memtofit
: keyword controlling how to cut the domain depending on the memory remaining available for inversion. It is not the total memory (default 3). Use a large value (e.g. 100) to force the usage for the more efficient direct solver if you are not limited by the amount of RAM memory.minfield
: if the analysed field is belowminfield
, its value is replace byminfield
(default -Inf, i.e. no substitution is done).maxfield
: if the analysed field is abovemaxfield
, its value is replace bymaxfield
(default +Inf, i.e. no substitution is done).saveindex
: controls if just a subset of the analysis should be saved to the netCDF file. Per default,saveindex
is(:,:,:)
(corresponding to longitude, latitude and depth indices) meaning that everything is saved. If however, for example the first layer should not be saved thensaveindex
should be(:,:,2:length(depthr))
wheredepthr
is the 3rd element ofxi
.niter_e
: Number of iterations to estimate the optimal scale factor ofepsilon2
using Desroziers et al. 2005 (doi: 10.1256/qj.05.108). The default is 1 (i.e. no optimization is done).coeff_derivative2
(vector of 3 floats): for every dimension where this value is non-zero, an additional term is added to the cost function penalizing the second derivative. A typical value of this parameter is[0.,0.,1e-8]
.surfextend
: create an additional layer on top of the surface layer so that the effective background error variance is more similar to the deep ocean.false
is the default value.stat_per_timeslice
(default false): if true, then the residual values (and possibly qcvalues) are also returned by time slices which is useful if the time slices overlap (see example below).error_thresholds
(default[("L1", 0.3), ("L2", 0.5)]
). This is a list of tuples with the applied error thresholds and the variable names suffixes. If the variable is named e. g."Salinity"
, then the variables"Salinity_L1"
(resp."Salinity_L2"
) will be created where the analysis is masked if the relative error exceeds 0.3 (resp. 0.5).divamethod
:DIVAndgo
(default) orDIVAndrun
Any additional keywoard arguments understood by DIVAndgo
/DIVAndrun
can also be used here (e.g. velocity constrain)
The output is a dictionary with the followings keys:
:residuals
: the difference between the observations and the analysis (interpolated linearly to the
location of the observations). The residual is NaN
if the observations are not within the domain as defined by the mask and the coordinates of the observations x
.
:qcvalues
: quality control scores (if activated)
Example
Example on how to aggredate the residuals per time slice and to retain the maximum residual:
selection_per_timeslice = dbinfo[:selection_per_timeslice]
residuals_per_timeslice = dbinfo[:residuals_per_timeslice]
selection_per_timeslice = dbinfo[:selection_per_timeslice]
max_residuals = fill(-Inf,length(value))
for n = 1:length(selection_per_timeslice)
sel = selection_per_timeslice[n]
max_residuals[sel] = max.(max_residuals[sel],residuals_per_timeslice[n])
end
At all vertical levels, there should at least one sea point.
DIVAnd.divadoxml
— MethodDIVAnd.divadoxml(filepath,varname,project,cdilist,xmlfilename;
ignore_errors = false,
WMSlayername = [],
previewindex = 1,
additionalcontacts = [],
additionalvars = Dict{String,Any}())
Generate the XML metadata file xmlfilename
from the NetCDF file filepath
(or list of files) with the NetCDF variable varname
. Project is either "SeaDataNet", "EMODNET-chemistry" or "SeaDataCloud". cdilist
is the file from https://emodnet-chemistry.maris.nl/download/export.zip.
The XML file contains a list of the data the originators. divadoxml will abort with an error if some combinations of EDMO code, local CDI ID are not present in the cdilist
. Such errors can be ignored if ignore_errors
is set to true. To understand why some EDMO code/local CDI ID could not be found, one can the decompress file from https://emodnet-chemistry.maris.nl/download/export.zip which contains a file called export.csv
. This file has the columns author_edmo
and cdi_identifier
which this function uses to find the data originators (column originator_edmo
).
Information can be overridden with the dictionary additionalvars
. The keys should corresponds to the template tags found the in template
directory. Template tags are the strings inside {{ and }}.
NetCDF_URL should be suppplied since it's a URL of a ZIP file which is usually not from OceanBrowser.
If filepath
is a vector of file names, the argument WMSlayername
can be provided to give additional information to distinguish between the NetCDF files. The elements of the vector of string will be appended to the description of the WMS layer.
The resulting XML file includes the file names (provided by filepath
). Do not change the file names after running this function, otherwise the XML will still contain a reference to the old file names. If you must change the file names please do so before running this script.
If the data is present in a subfolder (e.g. "Winter") later on the OceanBrowser webserver, the filepath
should also contain this subfolder (e.g. "Winter/somefile.nc"). The local directories should mirror the directory structure on OceanBrowser. Relative paths should be used, and if the Julia code isn't right above the NetCDF files, use cd("<path>") before each setting the files parameter which use paths relative to this path.
If the products will go into a sub-directory (which is different from the domain name), the part of the path should with provided with the parameter url_path
.
The link for the NetCDF download for EMODNET-chemistry for example is the following:
PROJECTS["EMODNET-chemistry"]["baseurl_http"] * "/" * url_path * "/" * filepath
where url_path
defaults to the the domain name (from the C19 vocabulary) if it is not provided. The script will print the URLs for verification.
additionalcontacts
is a list of dictionaries with additional condact information to be added in the XML file. Elements are typically create by the function DIVAnd.getedmoinfo
.
Example
download("https://emodnet-chemistry.maris.nl/download/export.zip","export.zip")
files = [
"Winter (January-March) - 6-year running averages/Water_body_chlorophyll-a.4Danl.nc",
"Spring (April-June) - 6-year running averages/Water_body_chlorophyll-a.4Danl.nc",
"Summer (July-September) - 6-year running averages/Water_body_chlorophyll-a.4Danl.nc",
"Autumn (October-December) - 6-year running averages/Water_body_chlorophyll-a.4Danl.nc"
];
additionalcontacts = [
DIVAnd.getedmoinfo(1977,"originator"), # US NODC for World Ocean Database
DIVAnd.getedmoinfo(4630,"originator"), # CORIOLIS for CORA
]
DIVAnd.divadoxml(files,"Water_body_chlorophyll-a","EMODNET-chemistry","export.zip","test.xml";
ignore_errors = true,
additionalvars = Dict("abstract" => "Here goes the abstract"),
additionalcontacts = additionalcontacts,
WMSlayername = ["winter","spring","summer","autumn"]
)
For this function the following global NetCDF attributes are mandatory:
product_id
: UUID identifierparameter_keywords_urn
orparameter_keywords
: P35 code (e.g. "SDN:P35::EPC00007") or preferred label (e.g. "Water body phosphate") respectively.search_keywords_urn
orsearch_keywords
: P02 code or preferred labelarea_keywords_urn
orarea_keywords
: C19 code or preferred labelinstitution_edmo_code
: EDMO code numberproduct_version
: version of the product
Adding a global attribute, can be done using the following:
ds = NCDatafile("DIVA_file.nc","a")
ds.attrib["attribute_name"] = "attribute value"
close(ds)
DIVAnd.divadoxml_originators
— Methoddivadoxml_originators(cdilist,filepaths,xmlfilename;
errname = split(filepaths[1], ".nc")[1] * ".cdi_import_errors.csv",
ignore_errors = false)
Produces the originators/contact XML fragment xmlfilename
from the NetCDF files in filepaths
. See divadoxml
for an explication of the optinal arguments.
Example
using Glob
download(DIVAnd.OriginatorEDMO_URL,"export.zip")
run(`unzip export.zip`) # if unzip is installed
cdilist = "export.csv";
filepaths = glob("*/*/*silicate*nc")
xmlfilename = "out.xml"
ignore_errors = true
DIVAnd.divadoxml_originators(cdilist,filepaths,xmlfilename;
ignore_errors = ignore_errors)
DIVAnd.domain
— Methodmask,(pm,pn),(xi,yi) = DIVAnd.domain(bathname,bathisglobal,lonr,latr)
Generate a 2D geospatial domain based on the topography from the netCDF file bathname
.
DIVAnd.domain
— Methodmask,(pm,pn,po),(xi,yi,zi) = DIVAnd.domain(bathname,bathisglobal,lonr,latr,depthr)
Generate a 3D geospatial domain based on the topography from the netCDF file bathname
. If zlevel
is :surface
, then depthr
is zero for the sea surface and positive in water (positive is down). If zlevel
is :floor
, then depthr
is zero for the sea floor and positive in water (positive is up)
DIVAnd.domain
— Methodmask,(pm,pn,po,pp),(xi,yi,zi,ti) = domain(bathname,bathisglobal,lonr,latr,depthr,timer)
Generate a geospatial domain based on the topography from the netCDF file bathname
.
DIVAnd.empiriccovar
— Methoddistx,covar,corr,varx,count = empiriccovar(x,v,distbin,mincount;
maxpoints = 10000,
distfun = (xi,xj) -> sqrt(sum(abs2,xi-xj)))
Compute the covariance, correlation and variance of a cloud of data points with the value v
(a vector) and the location x
(a tuple of vectors) grouped by distance. Random pairs are choosen and grouped by their distance (computed by distfun
) in bins defined by distbin
. The function try to fill at least mincount
of data points in each bin but always stop after considering maxpoints
pairs.
DIVAnd.encodeWMSStyle
— Methodencode parameters as key-value separated by : and +
DIVAnd.escape
— Methodescape(x)
Escape characters except slash.
DIVAnd.extract_bath
— Methodbx,by,b = DIVAnd.extract_bath(bath_name,isglobal,xi,yi)
Extract the bathymetry from the NetCDF file bathname
. The parameter isglobal
is true if the NetCDF file covers the whole globe and thus the last longitude point can be considered to be right next to the first longitude point. xi
and yi
are vectors defining the bounding box of the data. No interpolation is performed.
Convention: b is positive in the water and negative in the air.
The NetCDF file is expected to have the one dimensional variables lon
and lat
with the longitude (degrees East) and latitude (degrees North) and the two dimentional array bat
with the digital terrain model (negative in water and positive above water). The order of the dimension should follow be: longitude and then latitude in Column-major ordering (or latitude and then longitude if the tool ncdump
is used, which is based on Row-major ordering).
Example of the output of ncdump -h
:
netcdf gebco_30sec_8 {
dimensions:
lat = 2702 ;
lon = 5400 ;
variables:
double lat(lat) ;
lat:long_name = "Latitude" ;
lat:standard_name = "latitude" ;
lat:units = "degrees_north" ;
double lon(lon) ;
lon:long_name = "Longitude" ;
lon:standard_name = "longitude" ;
lon:units = "degrees_east" ;
float bat(lat, lon) ;
bat:long_name = "elevation above sea level" ;
bat:standard_name = "height" ;
bat:units = "meters" ;
// global attributes:
:title = "GEBCO" ;
}
DIVAnd.fit
— Methodvar0opt,lensopt,distx,covar,fitcovar = fit(x,v,distbin,mincount;
alpha = DIVAnd.alpha_default(length(x)),
minlen = zeros(length(x)),
maxlen = ones(length(x)),
tolrel = 1e-4,
lens0 = ones(length(x)),
var0 = 1.,
minvar0 = 0.,
maxvar0 = 2.,
maxpoints = 10000,
distfun = (xi,xj,lens) -> sqrt(sum(abs2,(xi-xj)./lens)),
progress = (iter,var,len,fitness) -> nothing
)
The same as the function fit_isotropic
except that now the correlation length-scale lens0
, minlen
, maxlen
, lensopt
are vectors (one value per dimension). The distance function distfun
uses an additional parameter to compute the normalized distance.
The note of the optional parameters in divafit
also applies here.
DIVAnd.fit_isotropic
— Methodvar0,len,distx,covar,fitcovar = fit_isotropic(x,v,distbin,mincount;
alpha = DIVAnd.alpha_default(length(x)),
minlen = 0.,
maxlen = 10.,
tolrel = 1e-4,
maxpoints = 10000,
nmean = 100,
distfun = (xi,xj) -> sqrt(sum(abs2,xi-xj))),
progress = (iter,var,len,fitness) -> nothing
)
Determines the optimal correlation length len
and variance (for a separation distance approaching zero) var0
of a cloud of data points with value v
and coordiantes x
(tuple of vectors with the coordinates).
The function can find the solution corresponding to a local minimum which is not necessarily the global minimum.
See also empiriccovar
for future information about the output parameters.
Optional input parameters:
alpha
: if one correlation length is forced to zero during the anaylsis the values of alpha sould be set using the effective dimension. For example, if a 2D-analysis is simulated by forcing the vertical correlation length to zero, then alpha should be set to[1,2,1]
, otherwise alpha will be[1,3,3,1]
(for any proper 3D analysis).len
: initial value for the correlation length.minlen
,maxlen
: minimum and maximum values for the correlation length.tolrel
: relative tolerance for the optimizer.maxpoints
: maximum number of data points considered.nmean
: the number of times an empirical covariance is estimated. The average covariance is used for the fitting.distfun
: function to compute the distance between pointxi
(vector) andxj
. Per defaultdistfun
is the Euclidian distance:(xi,xj) -> sqrt(sum(abs2,xi-xj)))
.progress
: call-back function to show the progress of the optimization with the input parametersiter
,var
,len
andfitness
(all scalars).
The length-scale parameters and the variance have the corresponding units from the x
and v
. It is therefore often necessary to provide reasonable values for these default parameters.
The algorithm used to estimate the correlation-length and variance is based on randomly choosen points. Therefore the result can be different if the function is invoked repeately. If nmean
is increased, then these statistical fluctuations should decrease (for a not too large value of mincount
, i.e. about 100 for most cases).
If the lower bound minlen
is too small, then you might get the following error:
AmosException with id 4: input argument magnitude too large, complete loss of accuracy by argument reduction.
In these case, increase minlen
.
DIVAnd.fithorzlen
— Methodlenxy,dbinfo = DIVAnd.fithorzlen(x,value,z)
Determines the horizontal correlation length lenxy
based on the measurments value
at the location x
(tuple of 3 vectors corresponding to longitude, latitude and depth) at the depth levels defined in z
.
Optional arguments:
smoothz
(default 100): spatial filter for the correlation scalesearchz
(default 50): vertical search distance (can also be a function of the depth)maxnsamp
(default 5000): maximum number of sampleslimitlen
(default false): limit correlation length by mean distance between observationslimitfun
(default no function): a function with with the two arguments (depth and
estimated correlation length) which returns an adjusted correlation length. For example to force the correlation length to be between 300 km and 50 km one would use the following: limitfun = (z,len) -> max(min(len,300_000),10_000))
. If provided limitfun
is used before and after the smoothing.
epsilon2
(default is a vector of the same size asvalue
with all elements equal to 1): the relative error variance of the observations. Less reliable observation would have a larger corresponding value.distfun
: function computing the distance between the pointsxi
andxj
.
Per default it represent the Euclidian distance.
DIVAnd.fitlen
— Methodvarbak, RL, dbinfo = fitlen(x::Tuple,d,weight,nsamp; distfun = distfun_euclid, kwargs...)
this function used to be called lfit in fitlsn.f
DIVAnd.fitvertlen
— Methodlenz,dbinfo = DIVAnd.fitvertlen(x,value,z,...)
See also DIVAnd.fithorzlen
DIVAnd.floodfill
— Functionlabel = floodfill(mask)
Attribute an integer number (a numeric label) to every element in mask such that all grid points connected by a von Neumann neighborhood (without crossing elements which are false
in mask) have the same label. Labels are sorted such that the label 1 corresponds to the largest area, label 2 the 2nd largest and so on.
DIVAnd.floodfillpoint
— Functionm = floodfillpoint(mask,I,directions)
Fill the binary mask starting at index I
(CartesianIndex
). All element directly connected to the starting location I
will be true
without crossing any element equal to false
in mask
. Per default the value of I
is the first true element in mask
and directions
correspond to the Von Neumann neighborhood.
DIVAnd.formatsize
— Methoddisplay size as a string
DIVAnd.fzero
— Methodfzero(f,x0,x1,eps; maxiter = Inf) find the zero of the function f between x0 and x1 assuming x0 < x1 at a precision eps.
DIVAnd.getedmoinfo
— Methodcontact = DIVAnd.getedmoinfo(edmo_code,role)
Returns a dictionary with the contact information from the EDMO registry based on the prodivided emdo_code
. role
is the Sextant contact information role, i.e. either "originator" or "author".
DIVAnd.hmerge
— Methodmergedfield = hmerge(field,L)
Merge several field[:,:,1]
, field[:,:,2]
,... into a single 2d field mergedfield
values equal to NaN are ignored. This function is typically used to merge different DIVAnd anayses.
DIVAnd.ind2sub
— Methodsubscripts = ind2sub(sv,index)
Compute from linear index in the packed state vector a tuple of subscripts. The first element of the subscript indicates the variable index and the remaining the spatial subscripts.
DIVAnd.interp!
— Methodinterp!(xi,fi,x,f)
Interpolate field fi
(n-dimensional array) defined at xi
(tuble of n-dimensional arrays or vectors) onto grid x
(tuble of n-dimensional arrays). The interpolated field is stored in f
. The grid in xi
must be align with the axis (e.g. produced by DIVAnd.ndgrid).
DIVAnd.interp
— Methodf = interp(xi,fi,x)
Interpolate field fi
(n-dimensional array) defined at xi
(tuble of n-dimensional arrays or vectors) onto grid x
(tuble of n-dimensional arrays). The grid in xi
must be align with the axis (e.g. produced by DIVAnd.ndgrid).
DIVAnd.len_harmonize
— MethodLen = len_harmonise(len,mask)
Produce a tuple of arrays of the correlation length len
which can be either a scalar (homogeneous and isotropic case), a tuple of scalar (homogeneous case) or already a tuple of arrays (general case). The the later case the size of the arrays are veryfied.
DIVAnd.lengraddepth
— MethodRL = lengraddepth(pmn,h, L;
h2 = h,
hmin = 0.001
)
Create the relative correlation length-scale field RL
based on the bathymetry h
and the metric pmn
(tuple of arrays). Effectively the correlation-length scale is close to zero if the relative bathymetry gradients (|∇h|/h) are smaller than the length-scale L
(in consistent units as pmn
).
R_L = 1 / (1 + L |∇h| / max(h2,hmin))
Per default h2
is equal to h
. The depth h
must be positive. hmin
must have the same units as h (usually meters).
DIVAnd.load_bath
— Methodxi,yi,bath = DIVAnd.load_bath(bath_name,isglobal,xi,yi)
Load the bathymetry from the netCDF file bathname
. The parameter isglobal
is true if the NetCDF file covers the whole globe and thus the last longitude point can be considered to be right next to the first longitude point. xi
and yi
are vectors containing the longitude and latitude grid onto which the bathymetry should be interpolated.
DIVAnd.load_mask
— Methodxi,yi,mask = load_mask(bath_name,isglobal,xi,yi,level::Number)
Generate a land-sea mask based on the topography from the NetCDF file bathname
. The parameter isglobal
is true if the NetCDF file covers the whole globe and thus the last longitude point can be considered to be right next to the first longitude point. xi
and yi
are vectors containing the longitude and latitude grid onto which the bathymetry should be interpolated.
Convention: in the water, level
is positive and in the air level
is negative.
DIVAnd.loadbigfile
— Methodvalue,lon,lat,depth,time,obsid = loadbigfile(filename)
Load data from the text file filename
and returns vectors with the value, longitude, latitude, depth and time (as DateTime). A list string identifiers is also returned.
DIVAnd.loadobs
— Methodobsvalue,obslon,obslat,obsdepth,obstime,obsids = loadobs(T,filename,varname)
obsvalue,obslon,obslat,obsdepth,obstime,obsids = loadobs(T,filenames,varname)
Load the variable varname
from the netCDF file filename
(or list of filenames). Coordinates (the netCDF variables "obslon", "obslat", "obsdepth"), time ("obstime") and identifiers ("obsids") will also be loaded. Numeric output arguments will have the type T
.
DIVAnd.loadoriginators
— Methoddb = loadoriginators(fname)
Load the CDI list from the file fname
(zip with a csv file, or csv file directly).
DIVAnd.localize_separable_grid
— MethodDerive fractional indices on a separable grid.
I = localize_separable_grid(xi,mask,x)
xi is a tuple of vectors and x and tuple of n-dimensional arrays, e.g.
x1,x2 = ndgrid(2 * collect(1:5),collect(1:6)) x = (x1,x2)
Derive fractional indices where xi are the points (typical discrete observations) to localize in the separable grid x
(every dimension in independent on other dimension). The output I
is an n-by-m array where n number of dimensions and m number of observations. The corresponding element of I is negative if xi
is outside of the grid defined by x
.
DIVAnd.localresolution
— MethodΔcoord = localresolution(coord)
Estimate the local resolution of the coordinate coord
by finite differences.
DIVAnd.matfun_diff
— FunctionOperator for differentiation.
diffx = matfun_diff(sz1,m,cyclic)
Operator for differentiation along dimension m for "collapsed" matrix of the size sz1.
Input: sz1: size of rhs m: dimension to differentiate cyclic: true if domain is cyclic along dimension m. False is the default value
DIVAnd.matfun_shift
— FunctionOperator shifting a field in a given dimension.
function S = matfun_shift(sz1,m,cyclic)
Operator shifting a field in the dimension m. The field is a "collapsed" matrix of the size sz1.
Input: sz1: size of rhs m: dimension to shift cyclic: true if domain is cyclic along dimension m. False is the default value
DIVAnd.matfun_stagger
— FunctionS = matfun_stagger(sz1,m,cyclic)
Create an operator for staggering a field in dimension m. The field is a "collapsed" matrix of the size sz1.
Input: sz1: size of rhs m: dimension to stagger cyclic: true if domain is cyclic along dimension m. False is the default value
DIVAnd.matfun_trim
— MethodT = matfun_trim(sz1,m)
Create an operator which trim first and last row (or column) in The field is a "collapsed" matrix of the size sz1
. m
is the dimension to trim.
DIVAnd.meanerr
— Methodmeanerr(obsconstrain)
Compute the mean error variance of the observational contraints obsconstrain
ignoring values equal to Inf
(due to e.g. observations outside of the grid).
DIVAnd.misfit
— Methoderr, var = misfit(distx, covar, covarweight, RL)
This function used to be called forfit in fitlsn.f. err
is the weighted mean square error var
is the variance for a distance equal to zero.
DIVAnd.ncfile
— MethodDIVAnd_save(ds,filename,xyi,fi,varname;
ncvarattrib = Dict(), ncglobalattrib = Dict(), ...)
Save the result of the analysis in a netCDF file .
Input arguments
ds
: the NetCDF datasetfilename
: the name of the NetCDF filemask
: binary mask delimiting the domain. true is inside and false outside. For oceanographic application, this is the land-sea mask where sea is true and land is false.xyi
: tuple with n elements. Every element represents a coordinate of the final grid on which the observations are interpolatedfi
: the analysed fieldvarname
: the name of the NetCDF variable
Optional arguments:
ncglobalattrib
: a dictionary with the global attributesncvarattrib
: a dictionary with the variable attributesrelerr
: relative errortimeorigin
: time origin for the time units attribute (default is 1900-01-01 00:00:00)
DIVAnd.numstring
— Methods = numstring(x)
Converts x
to string and avoid to return "-0.0" if x
is equal to zero. https://github.com/gher-ulg/DIVAnd.jl/issues/28
DIVAnd.pack
— Methodx = pack(sv,(var1, var2, ...))
Pack the different variables var1, var2, ... into the vector x where sv
is a statevector
under the control of a mask. Only sea grid points are retained.
Input: sv: structure generated by statevector_init. var1, var2,...: variables to pack (with the same shape as the corresponding masks).
Output: x: vector of the packed elements. The size of this vector is the number of elements of all masks equal to 1.
Notes: If var1, var2, ... have an additional trailing dimension, then this dimension is assumed to represent the different ensemble members. In this case x is a matrix and its last dimension is the number of ensemble members.
DIVAnd.pc_none!
— Methodpc_none!(x,fx)
Dummy call-back function when no preconditioner is used. fx
will be equal to x
.
DIVAnd.previewURL
— MethodURL = previewURL(filepath,varname,project,domain;
colorbar_quantiles = [0.01,0.99],
basemap = "shadedrelief",
default_field_min = nothing,
default_field_max = nothing)
Generates a WMS preview URL
DIVAnd.random
— Methodfield = DIVAnd.random(mask,pmn,len,Nens)
Create Nens
random fields with the correlation length len
in a domain with the mask mask
and the metric pmn
.
See DIVAnd.DIVAndrun
for more information about these parameters.
DIVAnd.save
— Methodsave(filename,xyi,fi,varname;
ncvarattrib = Dict(), ncglobalattrib = Dict(), ...)
Save the result of the analysis in a netCDF file.
Input arguments
filename
: the name of the NetCDF filexyi
: tuple with n vectors. Every element in this tuple represents a coordinate of the final grid on which the observations are interpolatedfi
: the analysed fieldvarname
: the name of the NetCDF variable
Optional arguments:
ncglobalattrib
: a dictionary with the global attributesncvarattrib
: a dictionary with the variable attributesrelerr
: relative error
DIVAnd.saveobs
— MethodDIVAnd.saveobs(filename,varname,value,xy,ids;
type_save = Float32,
timeorigin = DateTime(1900,1,1,0,0,0),
used = trues(size(ids)),
chunksize = 10_000,
)
Save value
and the location and time of the observation in the NetCDF file filename
and their identifier ids
. xy
is a tuple with the vectors longitude, latitude, depth and time (as a vector of DateTime
). The values will be saved in the variable called varname
.
Optional arguments:
type_save
: the type to save the data (default Float32). However, the time is always saved asFloat64
.timeorigin
: time origin for the time units attribute (default is
1900-01-01 00:00:00)
used
: allows to subset the data to save only used variables in the netCDF file
DIVAnd.saveobs
— MethodDIVAnd.saveobs(filename,xy,ids;
type_save = Float32,
timeorigin = DateTime(1900,1,1,0,0,0),
used = trues(size(ids)),
)
Save the location and time of the observations in the netCDF file filename
and their identifier ids
. xy
is a tuple with the vectors longitude, latitude, depth and time (as a vector of DateTime
).
Optional arguments:
type_save
: the type to save the data (default Float32). However, the time is always saved asFloat64
.timeorigin
: time origin for the time units attribute (default is
1900-01-01 00:00:00)
used
: allows one to subset the data to save only used variables in the netCDF file
DIVAnd.scaleseparation
— MethodPhi1,H1Phi1,Phi2,H2Phi2=scaleseparation(K1andH1K1,K2andH2K2,d;niter=10)
Input:
K1andH1K1
: function, when called with a vector of data d, provides in return Kd,HK d ; i.e. the gridded analysis Kd and analysis HKd at data points for analysis tool 1K2andH2K2
: function, when called with a vector of data d, provides in return Kd,HK d ; i.e. the gridded analysis Kd and analysis HKd at data points for analysis tool 2d
: data arrayniter=
: optional keyword parameter defining the number of iterations used to invert I - H2K2 H1K2. Default is 10
Output:
Phi1
: analysis for tool 1 in which analysis of scale 2 is taken outH1Phi1
: analysis at data locations for tool 1 in which analysis of scale 2 is taken outPhi2
: analysis for tool 2 in which analysis of scale 1 is taken outH2Phi2
: analysis at data locations for tool 2 in which analysis of scale 1 is taken out
Tool to separate scales using two different analysis provided as two input functions
K1 should be related to the larger scales (or scales with high signal/noise ratios) and K2 to smaller or less energetic scales. If in doubt invert both and test with different number of iterations while looking at convergence.
see "Multi-scale optimal interpolation: application to DINEOF analysis spiced with a local optimal interpolation" http://hdl.handle.net/2268/165394
Here the two fields can have different supports (one could be a 3D analysis and the other one a season-depth analysis for example. Only the observational operators must provide the same data array at the output. In other words K1,HK1=K1andH1K1 should provide an output array HK1 of the same dimensions as the data array d and the output HK2 from K2,HK2=K2andH2K2
DIVAnd.smoothfilter
— Methodff = smoothfilter(x,f,scale)
Smooth the function f
defined on x
by solving the diffusion equation
∂ₜ ϕ = ν ∂²ₓ ϕ
scale
is the spatial scales of the removed length-scales. It is defined as 2Tν where T is the integration time.
It uses the Greens functions for 1D diffusion: 1/sqrt(4 π ν t) * exp(-x^2 / (4νt))
DIVAnd.smoothfilter_weighted
— MethodAll weights have to be positive (and different from zero).
DIVAnd.sparse_diff
— Functiondiffx = sparse_diff(sz1,m,cyclic)
Sparse operator for differentiation along dimension m
for "collapsed" matrix of the size sz1
. cyclic
is true if domain is cyclic along dimension m. false
is the default value
DIVAnd.sparse_gradcoast
— MethodH = sparse_gradcoast(mask)
Return a sparse matrix H
computing the gradient of the values along the coastline defined by mask
. Land points correspond to the value false
and sea points to the value true
in mask
.
DIVAnd.sparse_interp
— FunctionH,out = sparse_interp(mask,I)
Create interpolation matrix from mask
and fractional indexes I
.
Input: mask: 0 invalid and 1 valid points (n-dimensional array) I: fractional indexes (2-dim array n by mi, where mi is the number of points to interpolate) Ouput: H: sparse matrix with interpolation coefficients out: true if value outside of grid outbbox: 1 if outise bouding box onland: 1 if point touches land (where mask == 0)
DIVAnd.sparse_interp_g
— Methodsparse_interp(x,mask,xi) Interpolate from x onto xi
DIVAnd.sparse_stagger
— FunctionS = sparse_stagger(sz1,m,cyclic)
Create a sparse operator S
for staggering a field in dimension m
for "collapsed" matrix of the size sz1
. cyclic
is true if domain is cyclic along dimension m. false
is the default value
DIVAnd.statpos
— Methodulon,ulat,meanval,stdval,count = statpos(val,lon,lat)
Return unique positions (ulon
, ulat
) as well as their mean, standard deviation and count of the vector of observations val
located at the positions lon
and lat
.
DIVAnd.statpos
— Methodulon,ulat = statpos(lon,lat)
Return unique positions (ulon
, ulat
) as well their mean, standard deviation and count of the vector of observations val
located at the positions lon
and lat
.
DIVAnd.stats
— Methodmeanx,meany,stdx,stdy,covar,corr = stats(sumx,sumx2,sumy,sumy2,sumxy,N)
Computes the mean meanx
and the standard deviation stdx
from the sum (sumx
) and the sum of squares (sumx2
) from N
numbers and similarly for the variable y
. The function computes also the Pearson correlation corr
and covariance covar
between x
and y
.
DIVAnd.stats
— Methodmeanx,stdx = stats(sumx,sumx2,N)
Computes the mean meanx
and the standard deviation stdx
from the sum (sumx
) and the sum of squares (sumx2
) from N
numbers.
DIVAnd.sub2ind
— Methodind = statevector_sub2ind(sv,subscripts)
Compute from a tuple of subscripts the linear index in the packed state vector. The first element of the subscript indicates the variable index and the remaining the spatial subscripts.
DIVAnd.timesend
— Methodtimecim = timeend(TS::TimeSelectorYearListMonthList)
Return the end date of all intervals defined by TS
.
DIVAnd.timesstart
— Methodtimecim = timestart(TS::TimeSelectorYearListMonthList)
Return the start date of all intervals defined by TS
.
DIVAnd.ufill
— Methodufill(c::Array{T,2},mask::AbstractArray{Bool}) where T
mask
is true where c
is valid.
DIVAnd.ufill
— Methodcfilled = ufill(c,valex)
Replace values in c
equal to valex
by averages of surrounding points. valex
should not be NaN; use ufill(c,isfinite.(c))
or ufill(c,.!isnan.(c))
instead.
DIVAnd.unpack
— Methodvar1, var2, ... = unpack(sv,x)
var1, var2, ... = unpack(sv,x,fillvalue)
Unpack the vector x into the different variables var1, var2, ... where sv
is a statevector
.
Input: sv: structure generated by statevector_init. x: vector of the packed elements. The size of this vector is the number of elements equal to 1 in all masks.
Optional input parameter: fillvalue: The value to fill in var1, var2,... where the masks correspond to a land grid point. The default is zero.
Output: var1, var2,...: unpacked variables.
Notes: If x is a matrix, then the second dimension is assumed to represent the different ensemble members. In this case, var1, var2, ... have also an additional trailing dimension.
DIVAnd.varanalysis
— MethodVariational analysis similar to 3D-var
Input:
x0: start vector for iteration, at output it is the last state of the iteration. Note that x0 is related to the analysis xa by xa = SB^½ * W^½ * xa
| x + W^½ * SB^½ * H' * (R \ (H * SB^½ * W^½ * x )) - W^½ SB^{½} * H' * (R \ yo) | < tol * s.sv.n / length(yo) * | W^½ SB^{½} * H' * (R \ yo) |
Kernel is the solution of the n-dimensional diffusion equation
∂c/∂t = ∇ ⋅ (D ∇ c)
n-dimensional Green’s function
G(x,x',t) = (4πDt)^(-n/2) exp( - |x -x'|² / (4Dt))
G(x,x',t) = det(D)^(-½) (4π t)^(-n/2) exp( - (x -x')ᵀ D⁻¹ (x -x') / (4t))
http://www.rpgroup.caltech.edu/~natsirt/aph162/diffusion_old.pdf
DIVAnd.velocityfile
— Methodfun = velocityfile(fname,(varnameu,varnamev),TSvelocity,scale)
fun = velocityfile(fname,(varnameu,varnamev,varnamew),TSvelocity,scale)
Return a function fun
which is used in DIVAnd as a advection constraint using fields defined in the NetCDF variable varnameu
, varnamev
and varnamew
(zonal, meridional and vertical velocity components) in the NetCDF file fname
. If the parameter varnamew
is omitted, the vertical velicity is neglected. It is assumed that the NetCDF variables has the variable lon
, lat
and depth
and that the fields have been average according to the provided time selector TSvelocity
(TimeSelectorYearListMonthList
or TimeSelectorRunningAverage
).
See also DIVAnd.average_files
.
NetCDF _FillValues are treated as zeros.
DIVAnd.vonNeumannNeighborhood
— Methoddirections = vonNeumannNeighborhood(mask)
Return a vector will all search directions corresponding to the Von Neumann neighborhood in N dimensions where N is the dimension of the boolean array mask
.
DIVAnd.weight_RtimesOne
— Method weights = weight_RtimesOne(x,len)
Compute the weight of the observations at location x
to reduce the influence of locally clustered data. x
is a tuple with n elements: every element represents a coordinate of the observations. len
is a tuple of arrays representing the correlation length. len[i]
is the correlation length in the i-th dimension.
DIVAnd.writeslice
— Methodwriteslice(ncvar, ncvar_relerr, ncvar_Lx, fi, relerr, index)
Write a slice of data in a netCDF file given by the index index
. The variable relerr
can be nothing.