AbstractTrees.jl
This package provides an interface for handling tree-like data structures in Julia.
Specifically, a tree consists of a set of nodes (each of which can be represented by any data type) which are connected in a graph with no cycles. For example, each object in the nested array [1, [2, 3]]
can be represented by a tree in which the object [1, [2, 3]]
is itself the root of the tree, 1
and [2,3]
are its children and 1
, 2
and 3
are the leaves.
Using this package involves implementing the abstract tree interface which, at a minimum, requires defining the function AbstractTrees.children
for an object.
See below for a complete guide on how to implement the interface.
The Abstract Tree Interface
Functions
All trees must define children
.
By default children
returns an empty tuple and parent
returns nothing
, meaning that all objects which do not define the abstract trees interface can be considered the sole node of a tree.
AbstractTrees.children
— Functionchildren(node)
Get the immediate children of node node
.
By default, every object is a parent node of an empty set of children. This is to make it simpler to define trees with nodes of different types, for example arrays are trees regardless of their eltype
.
REQUIRED: This is required for all tree nodes with non-empty sets of children. If it is not possible to infer the children from the node alone, this should be defined for a wrapper object which does.
AbstractTrees.nodevalue
— Methodnodevalue(node)
Get the value associated with a node in a tree. This removes wrappers such as Indexed
or TreeCursor
s.
By default, this function is the identity.
OPTIONAL: This should be implemented with any tree for which nodes have some "value" apart from the node itself. For example, integers cannot themselves be tree nodes, to create a tree in which the "nodes" are integers one can do something like
struct IntNode
value::Int
children::Vector{IntNode}
end
AbstractTrees.nodevalue(n::IntNode) = n.value
AbstractTrees.parent
— Functionparent(node)
Get the immediate parent of a node node
.
By default all objects are considered nodes of a trivial tree with no children and no parents. That is, the default method is simply parent(node) = nothing
.
OPTIONAL: The 1-argument version of this function must be implemented for nodes with the StoredParents
trait.
AbstractTrees.nextsibling
— Functionnextsibling(node)
Get the next sibling (child of the same parent) of the tree node node
. The returned node should be the same as the node that would be returned after node
when iterating over (children ∘ parent)(node)
.
OPTIONAL: This function is required for nodes with the StoredSiblings
trait. There is no default definition.
AbstractTrees.prevsibling
— Functionprevsibling(node)
Get the previous sibling (child of the same parent) of the tree node node
. The returned node should be the same as the node that would be returned prior to node
when iterating over (children ∘ parent)(node)
.
OPTIONAL: This function is optional in all cases. Default AbstractTrees method assume that it is impossible to obtain the previous sibling and all iterators act in the "forward" direction.
AbstractTrees.childrentype
— Functionchildrentype(::Type{T})
childrentype(n)
Indicates the type of the children (the collection of children, not individual children) of the tree node n
or its type T
. children
should return an object of this type.
If the childrentype
can be inferred from the type of the node alone, the type ::Type{T}
definition is preferred (the latter will fall back to it).
OPTIONAL: In most cases, childtype
is used instead. If childtype
is not defined it will fall back to eltype ∘ childrentype
.
AbstractTrees.childtype
— Functionchildtype(::Type{T})
childtype(n)
Indicates the type of children of the tree node n
or its type T
.
If childtype
can be inferred from the type of the node alone, the type ::Type{T}
definition is preferred (the latter will fall back to it).
OPTIONAL: It is strongly recommended to define this wherever possible, as without it almost no tree algorithms can be type-stable. If childrentype
is defined and can be known from the node type alone, this function will fall back to eltype(childrentype(T))
. If this gives a correct result it's not necessary to define childtype
.
AbstractTrees.childstatetype
— Functionchildstatetype(::Type{T})
childstatetype(n)
Indicates the type of the iteration state of the tree node n
or its type T
. This is used by tree traversal algorithms which must retain this state. It therefore is necessary to define this to ensure that most tree traversal is type stable.
OPTIONAL: Type inference is used to attempt to
In general nodes of a tree do not all need to have the same type, but it is much easier to achieve type-stability if they do. To specify that all nodes of a tree must have the same type, one should define Base.eltype(::Type{<:TreeIterator{T}})
, see Iteration.
Traits
ParentLinks
The default value of ParentLinks
is ImplicitParents
.
Types with the StoredParents
trait must define parent
.
AbstractTrees.ParentLinks
— TypeParentLinks(::Type{T})
ParentLinks(tree)
A trait which indicates whether a tree node stores references to its parents (StoredParents()
) or if the parents must be inferred from the tree structure (ImplicitParents()
).
Trees for which ParentLinks
returns StoredParents()
MUST implement parent
.
If StoredParents()
, all nodes in the tree must also have StoredParents()
, otherwise use ImplicitParents()
.
OPTIONAL: This should be implemented for a tree if parents of nodes are stored
AbstractTrees.ParentLinks(::Type{<:TreeType}) = AbstractTrees.StoredParents()
parent(t::TreeType) = get_parent(t)
AbstractTrees.ImplicitParents
— TypeImplicitParents <: ParentLinks
Indicates that the tree does not store parents. In these cases parents must be inferred from the tree structure and cannot be inferred through a single node.
AbstractTrees.StoredParents
— TypeStoredParents <: ParentLinks
Indicates that this node stores parent links explicitly. The implementation is responsible for defining the parentind function to expose this information.
If a node in a tree has this trait, so must all connected nodes. If this is not the case, use ImplicitParents
instead.
Required Methods
SiblingLinks
The default value of SiblingLinks
is ImplicitSiblings
.
Types with the StoredSiblings
trait must define nextsibling
and may define prevsibling
.
AbstractTrees.SiblingLinks
— TypeSiblingLinks(::Type{T})
SiblingLinks(tree)
A trait which indicates whether a tree node stores references to its siblings (StoredSiblings()
) or must be inferred from the tree structure (ImplicitSiblings()
).
If a node has the trait StoredSiblings()
, so must all connected nodes in the tree. Otherwise, use ImplicitSiblings()
instead.
OPTIONAL: This should be implemented for a tree if siblings of nodes are stored
AbstractTrees.SiblingLinks(::Type{<:TreeType}) = AbstractTrees.StoredSiblings()
AbstractTrees.ImplicitSiblings
— TypeImplicitSiblings <: SiblingLinks
Indicates that tree nodes do not store references to siblings so that they must be inferred from the tree structure.
AbstractTrees.StoredSiblings
— TypeStoredSiblings <: SiblingLinks
Indicates that this tree node stores sibling links explicitly, or can compute them quickly (e.g. because the tree has a (small) fixed branching ratio, so the current index of a node can be determined by quick linear search).
If a node has this trait, so must all connected nodes in the tree. Otherwise, use ImplicitSiblings()
instead.
Required Methods
ChildIndexing
AbstractTrees.ChildIndexing
— TypeChildIndexing(::Type{N})
ChildIndexing(node)
A trait indicating whether the tree node n
has children (as returned by children
) which can be indexed using 1-based indexing. Options are either NonIndexedChildren
(default) or IndexedChildren
.
To declare that the tree TreeType
supports one-based indexing on the children, define
AbstractTrees.ChildIndexing(::Type{<:TreeType}) = AbstractTrees.IndexedChildren()
If a node has the IndexedChildren()
so must all connected nodes in the tree. Otherwise, use NonIndexedChildren()
instead.
AbstractTrees.NonIndexedChildren
— TypeNonIndexedChildren <: ChildIndexing
Indicates that the object returned by children(n)
where n
is a tree node is not necessarily indexable. This trait applies to any tree which cannot guarantee indexable children in all cases, regardless of whether the tree is indexable in special cases. For example, Array
has this trait even though there is a large class of indexable trees consisting of arrays.
AbstractTrees.IndexedChildren
— TypeIndexedChildren <: ChildIndexing
Indicates that the object returned by children(n)
where n
is a tree node is indexable (1-based).
If a node has this trait so must all connected nodes in the tree. Otherwise, use NonIndexedChildren()
instead.
Required Methods
- A node
node
with this trait must return an indexable object fromchildren(node)
.
The default value of ChildIndexing
is NonIndexedChildren
.
Types with the IndexedChildren
trait must return an indexable object from children
(i.e. children(node)[idx]
must be valid for positive integers idx
).
NodeType
AbstractTrees.NodeType
— TypeNodeType(::Type)
NodeType(node)
A trait which specifies whether a tree has a predictable node type (HasNodeType()
) or not (NodeTypeUnknown()
).
This is analogous to Base.IteratorEltype
. In particular the IteratorEltype
of TreeIterator
is dictated by this trait.
The default value is NodeTypeUnknown()
.
AbstractTrees.NodeTypeUnknown
— TypeNodeTypeUnknown <: NodeType
Indicates that this node is connected to a tree for which it cannot be guaranteed that all nodes have the same type.
AbstractTrees.HasNodeType
— TypeHasNodeType <: NodeType
Indicates that this node is connected to a tree for which all nodes have types descended from eltype(node)
.
Providing the HasNodeType
trait will guarantee that all nodes connected to the node must be of the type returned by nodetype
.
An important use case of this trait is to guarantee the return types of a TreeIterator
. Tree nodes with NodeTypeUnknown
cannot have type-stable iteration over the entire tree.
For example
struct ExampleNode
x::Int
children::Vector{ExampleNode}
end
AbstractTrees.nodevalue(x::ExampleNode) = x.x
AbstractTrees.children(x::ExampleNode) = x.children
AbstractTrees.NodeType(::Type{<:ExampleNode}) = HasNodeType()
AbstractTrees.nodetype(::Type{<:ExampleNode}) = ExampleNode
In this example, iteration over a tree of ExampleNode
s is type-stable with eltype
ExampleNode
.
Providing the nodetype(::Type)
method is preferable to defining nodetype(::ExampleNode)
because it ensures that nodetype
can be involved at compile time even if values are not known.
The Indexed Tree Interface
The abstract tree interface assumes that all information about the descendants of a node is accessible through the node itself. The objects which can implement that interface are the nodes of a tree, not the tree itself.
The interface for trees which do not possess nodes for which tree structure can be inferred from the nodes alone is different. This is done by wrapping nodes in the IndexNode
nodes which allow nodes to be accessed from a centralized tree object.
AbstractTrees.IndexNode
— TypeIndexNode{T,I}
The node of a tree which implements the indexed tree interface. Such a tree consists of an object tree
from which nodes can be obtained with the two-argument method of nodevalue
which by default calls getindex
.
An IndexNode
implements the tree interface, and can be thought of an adapter from an object that implements the indexed tree interface to one that implements the tree interface.
IndexNode
do not store the value associated with the node but can obtain it by calling nodevalue
.
Constructors
IndexNode(tree, node_index)
IndexNode(tree) = IndexNode(tree, rootindex(tree)) # one-argument constructor requires `rootindex`
Here tree
is an object which stores or can obtain information for the entire tree structure, and node_index
is the index of the node for which node_index
is being constructed.
Functions
All indexed trees must implement childindices
.
Indexed trees rely on nodevalue(::Any, ::Any)
for obtaining the value of a node given the tree and index. By default, nodevalue(tree, idx) = tree[idx]
, trees which do not store nodes in this way should define nodevalue
.
Indexed trees can define ParentLinks
or SiblingLinks
. The IndexNode
s of a tree will inherit these traits from the wrapped tree.
AbstractTrees.childindices
— FunctionAbstractTrees.nodevalue
— Methodnodevalue(tree, node_index)
Get the value of the node specified by node_index
from the indexed tree object tree
.
By default, this falls back to tree[node_index]
.
OPTIONAL: Indexed trees only require this if the fallback to getindex
is not sufficient.
AbstractTrees.parentindex
— Functionparentindex(tree, node_index)
Get the index of the parent of the node of tree tree
specified by node_index
.
Nodes that have no parent (i.e. the root node) should return nothing
.
OPTIONAL: Indexed trees with the StoredParents
trait must implement this.
AbstractTrees.nextsiblingindex
— Functionnextsiblingindex(tree, node_index)
Get the index of the next sibling of the node of tree tree
specified by node_index
.
Nodes which have no next sibling should return nothing
.
OPTIONAL: Indexed trees with the StoredSiblings
trait must implement this.
AbstractTrees.prevsiblingindex
— Functionprevsiblingindex(tree, node_index)
Get the index of the previous sibling of the node of tree tree
specified by node_index
.
Nodes which have no previous sibling should return nothing
.
OPTIONAL: Indexed trees that have StoredSiblings
can implement this, but no built-in tree algorithms require it.
AbstractTrees.rootindex
— Functionrootindex(tree)
Get the root index of the indexed tree tree
.
OPTIONAL: The single-argument constructor for IndexNode
requires this, but it is not required for any built-in tree algorithms.
The AbstractNode
Type
It is not required that objects implementing the AbstractTrees.jl interface are of this type, but it can be used to indicate that an object must implement the interface.
AbstractTrees.AbstractNode
— TypeAbstractNode{T}
Abstract type of tree nodes that implement the AbstractTrees.jl interface.
It is NOT necessary for tree nodes to inherit from this type to implement the AbstractTrees.jl interface. Conversely, all AbstractNode
types are required to satisfy the AbstractTrees.jl interface (i.e. they must at least define children
).
Package developers should keep in mind when writing methods that most trees will not be of this type. Therefore, any functions which are intended to work on any tree should not dispatch on AbstractNode
.
The type parameter T
is the type of the nodevalue
of the concrete type descented from AbstractNode
.
Type Stability and Performance
Because of the recursive nature of trees it can be quite challenging to achieve type stability when traversing it in any way such as iterating over nodes. Only trees which guarantee that all nodes are of the same type (with HasNodeType
) can be type stable.
To make it easier to convert trees with non-uniform node types this package provides the StableNode
type.
AbstractTrees.StableNode
— TypeStableNode{T} <: AbstractNode{T}
A node belonging to a tree in which all nodes are of type StableNode{T}
. This type is provided so that trees with NodeTypeUnknown
can implement methods to be converted to type-stable trees with indexable children
which allow for efficient traversal and iteration.
Constructors
StableNode{T}(x::T, ch)
StableNode(x, ch=())
StableNode{T}(𝒻, node)
Arguments
x
: the value of the constructed node, returned bynodevalue
.ch
: the children of the node, each must be of typeStableNode
.𝒻
: A function which, when called on the node of a tree returns a value which should be wrapped by aStableNode
. The return value of𝒻
must be convertible toT
(see example).T
: The value type of theStableNode
s in a tree.node
: A node from a tree which is to be used to construct theStableNode
tree.
Examples
t = [1, [2,3]]
node = StableNode{Union{Int,Nothing}}(t) do n
n isa Integer ? convert(Int, n) : nothing
end
In the above example node
is a tree with HasNodeType
, nodes of type StableNode{Union{Int,Nothing}}
. The nodes in the new tree corresponding to arrays have value nothing
while other nodes have their corresponding Int
value.
To achieve the same performance with custom node types be sure to define at least
AbstractTrees.NodeType(::Type{<:ExampleNode}) = HasNodeType()
AbstractTrees.nodetype(::Type{<:ExampleNode}) = ExampleNode
In some circumstances it is also more efficient for nodes to have ChildIndexing
since this also guarantees the type of the iteration state of the iterator returned by children
.
Additional Functions
AbstractTrees.getdescendant
— Functiongetdescendant(node, idx)
Obtain a node from a tree by indexing each level of the tree with the elements of idx
.
This function is defined for all trees regardless of whether they have the IndexedChildren
. This is because a tree without IndexedChildren
might have special cases in which all children are indexable, a prominent example being Array
which may not have indexable sub-trees (e.g. an array containing a Dict) but there are common special cases in which array trees are fully indexable (e.g. a tree in which every non-leaf node is an array).
The elements of idx
can be any argument to getindex
, not necessarily integers. For example, getdescendant(Dict("a"=>1), ("a",))
returns 1
.
Note that this is a separate concept from indexed trees which by default do not have IndexedChildren()
, see IndexNode
.
Example
v = [1, [2, [3, 4]]]
getdescendant(v, (2, 2, 1)) == 3
AbstractTrees.nodevalues
— Functionnodevalues(itr::TreeIterator)
An iterator which returns the nodevalue
of each node in the tree, equivalent to Iterators.map(nodevalue, itr)
.
AbstractTrees.ischild
— Functionischild(node1, node2; equiv=(===))
Check if node1
is a child of node2
.
By default this iterates through children(node2)
, so performance may be improved by adding a specialized method for given node type.
Equivalence is established with the equiv
function. New methods of this function should include this argument or else it will fall back to ===
.
AbstractTrees.isroot
— Functionisroot(x)
Whether x
is the absolute root of a tree. More specifically, this returns true
if parent(x) ≡ nothing
, or parent(root, x) ≡ nothing
. That is, while any node is the root of some tree, this function only returns true for nodes which have parents which cannot be obtained with the AbstractTrees
interface.
AbstractTrees.intree
— Functionintree(node, root; equiv=(≡))
Check if node
is a member of the tree rooted at root
.
By default this traverses through the entire tree in search of node
, and so may be slow if a more specialized method has not been implemented for the given tree type.
Equivalence is established with the equiv
function. Note that new methods should also define equiv
or calls may fall back to the default method.
See also: isdescendant
AbstractTrees.isdescendant
— Functionisdescendant(node1, node2; equiv=(≡))
Check if node1
is a descendant of node2
. This isequivalent to checking whether node1
is a member of the subtree rooted at node2
(see intree
) except that a node cannot be a descendant of itself.
Internally this calls intree(node1, node2)
and so may be slow if a specialized method of that function is not available.
Equivalence is established with the equiv
function. Note that new methods should also define equiv
or calls may fall back to the default method.
AbstractTrees.treebreadth
— Functiontreebreadth(node)
Get the number of leaves in the tree rooted at node
. Leaf nodes have a breadth of one.
By default this recurses through all nodes in the tree and so may be slow if a more specialized method has not been implemented for the given type.
AbstractTrees.treeheight
— Functiontreeheight(node)
Get the maximum depth from node
to any of its descendants. Leaf nodes have a height of zero.
By default this recurses through all nodes in the tree and so may be slow if a more specialized method has not been implemented for the given type.
AbstractTrees.descendleft
— Functiondescendleft(node)
Descend from the node node
to the first encountered leaf node by recursively calling children
and taking the first child.
AbstractTrees.getroot
— Functiongetroot(node)
Get the root of the tree containing node node
. This requires node
to have the trait StoredParents
.
Example Implementations
- All objects in base which define the abstract trees interface are defined in
builtins.jl
. IDTree
OneNode
OneTree
FSNode
BinaryNode