Node utilities


Various functions in Base are overloaded to treat an AbstractNode as a collection of its nodes.

copy(tree::AbstractExpressionNode; break_sharing::Val=Val(false))

Copy a node, recursively copying all children nodes. This is more efficient than the built-in copy.

If break_sharing is set to Val(true), sharing in a tree will be ignored.

collect(tree::AbstractNode; break_sharing::Val=Val(false))

Collect all nodes in a tree into a flat array in depth-first order.

filter(f::Function, tree::AbstractNode; break_sharing::Val=Val(false))

Filter nodes of a tree, returning a flat array of the nodes for which the function returns true.

count(f::F, tree::AbstractNode; init=0, break_sharing::Val=Val(false)) where {F<:Function}

Count the number of nodes in a tree for which the function returns true.

foreach(f::Function, tree::AbstractNode; break_sharing::Val=Val(false))

Apply a function to each node in a tree without returning the results.

sum(f::Function, tree::AbstractNode; init=0, return_type=Undefined, f_on_shared=_default_shared_aggregation, break_sharing::Val=Val(false)) where {F<:Function}

Sum the results of a function over a tree. For graphs with shared nodes such as GraphNode, the function f_on_shared is called on the result of each shared node. This is used to avoid double-counting shared nodes (default behavior).

mapreduce(f::Function, op::Function, tree::AbstractNode; return_type, f_on_shared, break_sharing)

Map a function over a tree and aggregate the result using an operator op.

any(f::Function, tree::AbstractNode)

Reduce a flag function over a tree, returning true if the function returns true for any node. By using this instead of tree_mapreduce, we can take advantage of early exits.

all(f::Function, tree::AbstractNode)

Reduce a flag function over a tree, returning true if the function returns true for all nodes, false otherwise.

map(f::F, tree::AbstractNode, result_type::Type{RT}=Nothing; break_sharing::Val=Val(false)) where {F<:Function,RT}

Map a function over a tree and return a flat array of the results in depth-first order. Pre-specifying the result_type of the function can be used to avoid extra allocations.

convert(::Type{<:AbstractExpressionNode{T1}}, n::AbstractExpressionNode{T2}) where {T1,T2}

Convert a AbstractExpressionNode{T2} to a AbstractExpressionNode{T1}. This will recursively convert all children nodes to AbstractExpressionNode{T1}, using convert(T1, tree.val) at constant nodes.


  • ::Type{AbstractExpressionNode{T1}}: Type to convert to.
  • tree::AbstractExpressionNode{T2}: AbstractExpressionNode to convert.
hash(tree::AbstractExpressionNode{T}[, h::UInt]; break_sharing::Val=Val(false)) where {T}

Compute a hash of a tree. This will compute a hash differently if nodes are shared in a tree. This is ignored if break_sharing is set to Val(true).


Trees are printed using the string_tree function, which is very configurable:

    # Deprecated
)::String where {T,F1<:Function,F2<:Function}

Convert an equation to a string.


  • tree: the tree to convert to a string
  • operators: the operators used to define the tree

Keyword Arguments

  • f_variable: (optional) function to convert a variable to a string, with arguments (feature::UInt8, variable_names).
  • f_constant: (optional) function to convert a constant to a string, with arguments (val,)
  • variable_names::Union{Array{String, 1}, Nothing}=nothing: (optional) what variables to print for each feature.

The standard show and print methods will use the most recently-created OperatorEnum in a string_tree.


There are also methods for random sampling of nodes:

NodeSampler(; tree, filter::Function=Returns(true), weighting::Union{Nothing,Function}=nothing, break_sharing::Val=Val(false))

Defines a sampler of nodes in a tree.


  • tree: The tree to sample nodes from. For a regular Node, nodes are sampled uniformly. For a GraphNode, nodes are also sampled uniformly (e.g., in sin(x) + {x}, the x has equal probability of being sampled from the sin or the + node, because it is shared), unless break_sharing is set to Val(true).
  • filter::Function: A function that takes a node and returns a boolean indicating whether the node should be sampled. Defaults to Returns(true).
  • weighting::Union{Nothing,Function}: A function that takes a node and returns a weight for the node, if it passes the filter, proportional to the probability of sampling the node. If nothing, all nodes are sampled uniformly.
  • break_sharing::Val: If Val(true), the sampler will break sharing in the tree, and sample nodes uniformly from the tree.
rand(rng::AbstractRNG, tree::AbstractNode)

Sample a node from a tree according to the default sampler NodeSampler(; tree).

rand(rng::AbstractRNG, sampler::NodeSampler)

Sample a node from a tree according to the sampler sampler.

Internal utilities

Almost all node utilities are crafted using the tree_mapreduce function, which evaluates a mapreduce over a tree-like (or graph-like) structure:

    f_on_shared::Function=(result, is_shared) -> result,

Map a function over a tree and aggregate the result using an operator op. op should be defined with inputs (parent, child...) -> so that it can aggregate both unary and binary operators. op will not be called for leafs of the tree. This differs from a normal mapreduce in that it allows different treatment for parent nodes than children nodes. If this is not necessary, you may use the regular mapreduce instead. The argument break_sharing can be used to break connections in a GraphNode.

You can also provide separate functions for leaf (variable/constant) nodes and branch (operator) nodes.


julia> operators = OperatorEnum(; binary_operators=[+, *]);

julia> tree = Node(; feature=1) + Node(; feature=2) * 3.2;

julia> tree_mapreduce(t -> 1, +, tree)  # count nodes. (regular mapreduce also works)

julia> tree_mapreduce(t -> 1, (p, c...) -> p + max(c...), tree)  # compute depth. regular mapreduce would fail!

julia> tree_mapreduce(vcat, tree) do t == 2 ? [t.op] : UInt8[]
end  # Get list of binary operators used. (regular mapreduce also works)
2-element Vector{UInt8}:

julia> tree_mapreduce(vcat, tree) do t
    ( == 0 && t.constant) ? [t.val] : Float64[]
end  # Get list of constants. (regular mapreduce also works)
1-element Vector{Float64}:

Various other utility functions include the following:

filter_map(filter_fnc::Function, map_fnc::Function, tree::AbstractNode, result_type::Type, break_sharing::Val=Val(false))

A faster equivalent to map(map_fnc, filter(filter_fnc, tree)) that avoids the intermediate allocation. However, using this requires specifying the result_type of map_fnc so the resultant array can be preallocated.

filter_map!(filter_fnc::Function, map_fnc::Function, stack::Vector{GT}, tree::AbstractNode)

Equivalent to filter_map, but stores the results in a preallocated array.