module DictTools

Some tools to support Dict and Dictionary (and their abstract supertypes).

_AbstractDict is a Union. A few type-piracy methods are defined so that the APIs of Dict and Dictionary are closer.

Applications are count_map, update_map, update!.

  • hist_to_dist
  • add_counts!

_Dict{T, V}

Either a Dict or a Dictionary. This is a union type.

SparseArray{T, N, DT, K, F, NT} <: AbstractSparseArray{T,N}

This sparse array wraps a dictionary. It may be used as a sparse array, but is also intended to function as an array view on a dictionary. As such, it is a means to convert the dictionary to an Array.

Only N=1, with the alias SparseVector, is supported at the moment.


  • data{K, V} - the dictionary
  • dims - NTuple{N, Int}
  • transform_key::F - a function that takes an object of type K as input and returns i::Int, an index into a corresponding AbstractArray

We currently don't store a function to go the other way, that is, to convert an Int to a key of type K. However, this is probably necessary for better functionality.

_insert!(d::Union{_AbstractDict, AbstractVector}, k, v)

Insert the key-value pair (k, v) in d. For Dictionary, this calls insert!. Otherwise, it calls setindex!.

_set!(d::Union{_AbstractDict, AbstractVector}, k, v)

Set the key value pair (k, v) in d.

update!(dict::_AbstractDict, _key, func, default)

If dict has key _key, replace the value by the result of calling func on the value. Otherwise insert default for _key.

This function supports some subtypes of _AbstractDict.
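
The rule described above can be sketched with Base.Dict alone. The name sketch_update! is hypothetical and is not part of DictTools; it only mirrors the documented behavior (apply func to an existing value, otherwise insert default).

```julia
# Hypothetical stand-in for the documented behavior, using only Base.Dict.
function sketch_update!(dict::AbstractDict, key, func, default)
    if haskey(dict, key)
        dict[key] = func(dict[key])   # key present: replace value by func(value)
    else
        dict[key] = default           # key absent: insert the default
    end
    return dict
end

d = Dict("a" => 1)
sketch_update!(d, "a", v -> v + 1, 1)   # "a" exists, so func is applied
sketch_update!(d, "b", v -> v + 1, 1)   # "b" is missing, so default is inserted
```

Note that func is not applied to default when the key is missing; the default is stored as given.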

add_counts!(counter::Union{_AbstractDict{<:Any,V}, AbstractVector{V}}, itr, ncounts=one(V)) where V

Add ncounts counts to counter for each key in itr. If ncounts is omitted, add one count for each key.

If counter is an AbstractVector, then itr must produce Int indices into counter.
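
A minimal sketch of both cases, assuming Base.Dict as the dictionary type. The name sketch_add_counts! is hypothetical, and the assumption that a missing key starts from zero(V) is ours, not something the text states.

```julia
# Dictionary-like counter: missing keys are assumed to start from zero(V).
function sketch_add_counts!(counter::AbstractDict{<:Any,V}, itr, ncounts=one(V)) where {V}
    for k in itr
        counter[k] = get(counter, k, zero(V)) + ncounts
    end
    return counter
end

# Vector counter: itr must yield valid Int indices into counter.
function sketch_add_counts!(counter::AbstractVector{V}, itr, ncounts=one(V)) where {V}
    for i in itr
        counter[i] += ncounts
    end
    return counter
end

word_counts = sketch_add_counts!(Dict{String,Int}(), ["a", "b", "a"])
slot_counts = sketch_add_counts!(zeros(Int, 3), [1, 3, 3], 2)
```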


Return the typename of T. This fails for some inputs, in general whenever T is a UnionAll type. However, more robust methods for baretype are defined for Dict and Dictionary.

Note: There are better solutions than baretype for many (perhaps all) use cases.


Return the typename of the type of object x. That is, return baretype(typeof(x)).

collect_sparse(dict::Dictionary{K,V}; transform=identity, neutral_element=Val(zero(V)),
                            max_ind::Union{Nothing, Int}=nothing) where {K, V}

Convert dict representing a sparse vector to a dense Vector.

Missing keys in dict are set to neutral_element in the returned vector. transform maps keys of type K to type Int, so that they are valid indices into the Vector.

If max_ind is not nothing, it is the length of the output Vector. It must be large enough for all of the keys in dict. If max_ind is nothing, the length is computed. This costs some additional allocation if transform is not identity.
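
The conversion can be sketched for a Base.Dict as follows. sketch_collect_sparse and letter_index are hypothetical names, and neutral_element is passed as a plain value here rather than wrapped in Val as in the signature above.

```julia
# Hypothetical sketch: densify a dictionary that represents a sparse vector.
function sketch_collect_sparse(dict::AbstractDict{K,V}; transform=identity,
                               neutral_element=zero(V),
                               max_ind::Union{Nothing,Int}=nothing) where {K,V}
    # When no length is given, use the largest transformed key.
    len = max_ind === nothing ? maximum(transform, keys(dict)) : max_ind
    out = fill(neutral_element, len)
    for (k, v) in pairs(dict)
        out[transform(k)] = v
    end
    return out
end

letter_index(c::Char) = c - 'a' + 1   # maps 'a' to 1, 'b' to 2, and so on

sparse = Dict('a' => 5, 'c' => 7)
dense = sketch_collect_sparse(sparse; transform=letter_index, max_ind=4)
auto_len = sketch_collect_sparse(Dict(2 => 1.5))   # length computed from the keys
```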

combine_values!(dest::Dictionary, combine_func, _key, val, defaultval=val)

If dest contains _key, replace the corresponding value old_val with combine_func(val, old_val). Otherwise, set the value for _key to defaultval.
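
A sketch of this rule using Base.Dict in place of Dictionary. The name sketch_combine_values! is hypothetical. The argument order combine_func(val, old_val) follows the description above.

```julia
# Hypothetical sketch of the documented combine-or-insert rule.
function sketch_combine_values!(dest::AbstractDict, combine_func, key, val, defaultval=val)
    if haskey(dest, key)
        dest[key] = combine_func(val, dest[key])   # (val, old_val) order
    else
        dest[key] = defaultval
    end
    return dest
end

totals = Dict(:x => 10)
sketch_combine_values!(totals, +, :x, 5)      # :x present: 5 + 10
sketch_combine_values!(totals, +, :y, 5)      # :y absent: defaultval defaults to val
sketch_combine_values!(totals, +, :z, 5, 0)   # :z absent: explicit defaultval
```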

construct(::Type{T<:_AbstractDict}, inds, vals)

Construct either an AbstractDict or AbstractDictionary from inds and vals.

This is intended to give a common interface. It would be more convenient to commit type piracy with Base.Dict(inds, vals) = Dict(zip(inds, vals)).
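
For the AbstractDict case, the idea amounts to zipping inds and vals, as in the expression mentioned above. A sketch under that assumption, with the hypothetical name sketch_construct:

```julia
# Hypothetical sketch: build a dictionary of type T from parallel collections.
sketch_construct(::Type{T}, inds, vals) where {T<:AbstractDict} = T(zip(inds, vals))

d = sketch_construct(Dict, [:a, :b], [1, 2])
```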

count_map([::Type{T}=Dictionary], itr, filt = x -> true)

Return a dictionary of type T whose keys are elements of itr and whose values count how many times each occurs. Only elements of itr for which filt returns true are counted.

count_map is tested for T being either Dict or Dictionary.

StatsBase.countmap differs in that it has an optimization for integers and that an itr of indeterminate length is first collected internally.
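
The counting rule, including the filter, can be sketched with Base.Dict. sketch_count_map is a hypothetical name; it always returns a Dict and ignores the type argument T of the real function.

```julia
# Hypothetical sketch: count occurrences of the elements that pass filt.
function sketch_count_map(itr, filt = x -> true)
    counts = Dict{eltype(itr),Int}()
    for x in itr
        filt(x) || continue                 # skip elements rejected by the filter
        counts[x] = get(counts, x, 0) + 1
    end
    return counts
end

letters = sketch_count_map("abracadabra")
consonants = sketch_count_map("abracadabra", c -> !(c in "aeiou"))
```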

empty_or_similar(d::Union{_AbstractDict, AbstractVector}, ::Type{KeyT}, ::Type{ValT})

For Dict, call empty. For AbstractVector and Dictionary, call similar. The key type for AbstractVector must be Int.

map_keys!(dest::Dictionary, src::Dictionary, keymap_func, combine_func = +)

This is the same as map_keys except that the destination dest is passed as input. Existing values in dest will be combined with combine_func. However, dest is typically empty when passed to map_keys!.

map_keys(dict::Dictionary, keymap_func, combine_func = +)

Return a Dictionary whose keys are the image of keys(dict) under keymap_func and whose values are created by accumulating with combine_func the values from the preimage of each key in the image of keymap_func.

For example, suppose combine_func is +, and keymap_func is iseven, and the only even keys in dict are in key-value pairs (2, 9), (4, 9), (6, 9). Then the output Dictionary will contain the key-value pair (true, 27).

map_keys is useful for computing a marginal probability distribution. If dict represents counts or a probability distribution, and combine_func is + and keymap_func is many-to-one for some keys, then map_keys effects marginalization of the distribution.
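
The iseven example above can be reproduced with a small sketch over Base.Dict. sketch_map_keys is a hypothetical name, the key type of the result is left as Any for simplicity, and the odd-keyed entries are invented for illustration.

```julia
# Hypothetical sketch: relabel keys with keymap_func and merge collisions.
function sketch_map_keys(dict::AbstractDict, keymap_func, combine_func = +)
    dest = Dict{Any,valtype(dict)}()   # key type left loose for simplicity
    for (k, v) in pairs(dict)
        newkey = keymap_func(k)
        dest[newkey] = haskey(dest, newkey) ? combine_func(dest[newkey], v) : v
    end
    return dest
end

counts = Dict(1 => 4, 2 => 9, 3 => 1, 4 => 9, 6 => 9)
parity = sketch_map_keys(counts, iseven)   # marginalize over parity of the key
```

Here the three even keys collapse onto the key true and the two odd keys onto false, which is the marginalization described above.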

normalize!(dest, src)

Normalize src, writing the result to dest. Both dest and src are of type Union{_AbstractDict, AbstractVector}.

normalize!(container::Union{_AbstractDict, AbstractVector})

Normalize container in place.

If the normalized value cannot be converted to the value type of container, an error such as InexactError will be thrown. For example, if valtype(container) is Int, an InexactError will be thrown.
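
The failure mode can be seen with a plain Vector. sketch_normalize! is a hypothetical stand-in for the in-place method, assuming it divides each value by the sum and writes the quotient back.

```julia
# Hypothetical sketch of in-place normalization on a vector.
function sketch_normalize!(v::AbstractVector)
    total = sum(v)
    for i in eachindex(v)
        v[i] = v[i] / total   # the quotient must be convertible to eltype(v)
    end
    return v
end

weights = [1.0, 3.0]
sketch_normalize!(weights)    # fine: Float64 values can hold the quotients

int_counts = [1, 3]
threw_inexact = try
    sketch_normalize!(int_counts)   # 1 / 4 cannot be stored as an Int
    false
catch err
    err isa InexactError
end
```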


Normalize d in place. That is, scale the values of d so that they sum to one.

normalize(container::Union{_AbstractDict, AbstractVector})

Return a container similar to container but with values normalized so that they sum to one. This can be used to convert a histogram (count map) to a probability distribution.
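
A sketch of the histogram-to-distribution use, written against Base.Dict with the hypothetical name sketch_normalize. It returns a new dictionary whose value type is whatever the division produces.

```julia
# Hypothetical sketch: return a normalized copy, leaving the input untouched.
function sketch_normalize(container::AbstractDict)
    total = sum(values(container))
    return Dict(k => v / total for (k, v) in pairs(container))
end

hist = Dict("heads" => 3, "tails" => 1)
dist = sketch_normalize(hist)
```

Because a new container is built, integer counts do not trigger the conversion error that the in-place version can raise.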

update!(v::AbstractVector, _keys, func, _ignore=nothing)

Call update!(v, k, func) for each k::Int in iterable _keys.

update!(v::AbstractVector, _key::Int, func, _ignore=nothing)

Update the value of v at index _key by calling func on it. That is, set v[_key] = func(v[_key]).
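
Both vector methods can be sketched in a few lines. sketch_vec_update! is a hypothetical name, and the unused _ignore argument of the real signatures is omitted.

```julia
# Single index: set v[key] = func(v[key]).
function sketch_vec_update!(v::AbstractVector, key::Int, func)
    v[key] = func(v[key])
    return v
end

# Iterable of indices: apply the single-index method to each one in turn.
function sketch_vec_update!(v::AbstractVector, keys, func)
    for k in keys
        sketch_vec_update!(v, k, func)
    end
    return v
end

v = [1, 2, 3]
sketch_vec_update!(v, 2, x -> 10x)            # v is now [1, 20, 3]
sketch_vec_update!(v, [1, 3, 3], x -> x + 1)  # index 3 is updated twice
```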

update_map(::Type{T}=_AbstractDict, _keys, func, default)

Like count_map, but instead of incrementing an existing value by one, the value is replaced by the result of calling func on it. Furthermore, the initial value for a new key is default rather than 1.
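
A sketch with Base.Dict under the hypothetical name sketch_update_map, omitting the type argument T. With func = v -> v + 1 and default = 1 it reduces to plain counting.

```julia
# Hypothetical sketch: like counting, but with a custom update rule.
function sketch_update_map(keys_itr, func, default)
    result = Dict{eltype(keys_itr),typeof(default)}()
    for k in keys_itr
        result[k] = haskey(result, k) ? func(result[k]) : default
    end
    return result
end

doubled = sketch_update_map([:a, :b, :a, :a], v -> 2v, 1)       # :a goes 1, 2, 4
counted = sketch_update_map([:a, :b, :a, :a], v -> v + 1, 1)    # same as counting
```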

zchop(d::Dictionary, args...)
zchop!(d::Dictionary, args...)
nchop(d::Dictionary, args...; kwargs...)
nchop!(d::Dictionary, args...; kwargs...)

Apply zchop, zchop!, nchop, or nchop! to each value of d. The keys are not altered.