DataStructures

The DataStructures module implements helpful data structures to be used by other ClimaUtilities.jl modules or external packages.

LRUCache

DataStructures implements an LRUCache, which is used in both DataHandlingExt and NCFileReaderExt. LRUCache can be used to store pieces of information that need to be accessed multiple times and may be expensive to compute, such as regridded fields or loaded files. Instead of recomputing the values each time they're needed, the previously-computed information is saved in the cache and retrieved directly from it. To prevent the cache from growing so large that it takes up significant memory, the least-recently-used (LRU) scheme maintains a maximum cache size: every time adding an element to the cache would lead its size to grow larger thant the maximum allowed, the element that was accessed the least recently is deleted first.

Note: all the methods supported by dictionaries are currently implemented.

If you need one that is not implemented, please open an issue or a pull request.

Example

In many ways, LRUCaches behave like Julia dictionaries. To use one, we first need to initialize the empty cache specifying types of keys and values and optionally the maximum allowed size:

import ClimaUtilities.DataStructures
cache = DataStructures.LRUCache{Int, Int}(; max_size = 128)

Once we have the cache, we can access and insert elements with get!. get! retrieves the value associated to the key if avaialble, otherwise it inserts a new key with a provided default. The default can be passed as third argument or can be the return statement of a function provided as first (as typically done in do blocks). Continuing the example above

println("The value associated to 1 is: ", get!(cache, 1, 10))
other_value = get!(cache, 2) do
    return 20
end

This cache can be used to implement some recursive function more efficiently, as in the famous case of the factorial:

function factorial(n)
    n in (0, 1) && return 1
    return n * get!(cache, n - 1, factorial(n - 1))
end
# The first time we call factorial it will compute everything
@time factorial(10)
# The second time it will only compute the last element
@time factorial(11)

API

Base.get!Function
Base.get!(cache::LRUCache{K, V}, key::K, default::V) where {K, V}

Get the value associated with key in cache. If the key is not in the cache, add it with the value default. In any case, update the key's priority and make sure the cache doesn't exceed its maximum size.

Base.get!(default::Callable, cache::LRUCache{K, V}, key::K) where {K, V}

Get the value associated with key in cache. If the key is not in the cache, add it with the value default(). In any case, update the key's priority and make sure the cache doesn't exceed its maximum size.

This method is intended to be used with do block syntax.

Base.getFunction
Base.get(cache::LRUCache{K, V}, key::K, default::V)

Return the value associated with key in cache, or if key is not in cache, return default.

Base.get(default::Callable, cache::LRUCache{K, V}, key::K)

Return the value associated with key in cache, or if the key is not in the cache, return f().

Base.haskeyFunction
Base.haskey(cache::LRUCache{K, V}, key::K)

Return whether key has a mapping in cache.

Base.copyFunction
Base.copy(cache::LRUCache{K, V})

Return a shallow copy of cache.

Base.deepcopyFunction
Base.deepcopy(cache::LRUCache{K, V})

Return a deep copy of cache.

Base.emptyFunction
Base.empty(cache::LRUCache{K, V}, key_type::DataType=K, value_type::DataType=V)

Create an empty LRUCache with keys of type key_type and values of type value_type.

The second and third arguments are optional and default to the input's keytype and valuetype. If only one of the two type is specified, it is assumed to be the value_type, and key_type is assumed to the key type of cache.

Base.empty!Function
Base.empty!(cache::LRUCache{K, V})

Remove all key/value mappings from the input and return the emptied input.

Base.pop!Function
Base.pop!(cache::LRUCache{K, V}, key::K, [default::V])

Delete the mapping for key in cache and return the associated value, or if the key does not exist, return default or throw an error if default is not specified

Base.delete!Function
Base.delete!(cache::LRUCache{K, V}, key::K)

Delete mapping for key, if it is incache, and return cache

Base.getindexFunction
Base.getindex(cache::LRUCache{K, V}, key::K)

Return the mapping for key in cache.

Base.setindex!Function
Base.setindex!(cache::LRUCache{K, V}, value::V, key::K)

Store the mapping from key to value in cache. Then, return cache.

Base.lengthFunction
Base.length(cache::LRUCache{K, V})

Return the number of elements in cache.