# AbstractTrees.jl

This package provides an interface for handling tree-like data structures in Julia.

Specifically, a tree consists of a set of nodes (each of which can be represented by any data type) which are connected in a graph with no cycles. For example, each object in the nested array `[1, [2, 3]]`

can be represented by a tree in which the object `[1, [2, 3]]`

is itself the root of the tree, `1`

and `[2,3]`

are its children and `1`

, `2`

and `3`

are the leaves.

Using this package involves implementing the abstract tree interface which, at a minimum, requires defining the function `AbstractTrees.children`

for an object.

See below for a complete guide on how to implement the interface.

## The Abstract Tree Interface

### Functions

All trees *must* define `children`

.

By default `children`

returns an empty tuple and `parent`

returns `nothing`

, meaning that all objects which do not define the abstract trees interface can be considered the sole node of a tree.

`AbstractTrees.children`

— Function`children(node)`

Get the immediate children of node `node`

.

By default, every object is a parent node of an empty set of children. This is to make it simpler to define trees with nodes of different types, for example arrays are trees regardless of their `eltype`

.

**REQUIRED**: This is required for all tree nodes with non-empty sets of children. If it is not possible to infer the children from the node alone, this should be defined for a wrapper object which does.

`AbstractTrees.nodevalue`

— Method`nodevalue(node)`

Get the value associated with a node in a tree. This removes wrappers such as `Indexed`

or `TreeCursor`

s.

By default, this function is the identity.

**OPTIONAL**: This should be implemented with any tree for which nodes have some "value" apart from the node itself. For example, integers cannot themselves be tree nodes, to create a tree in which the "nodes" are integers one can do something like

```
struct IntNode
value::Int
children::Vector{IntNode}
end
AbstractTrees.nodevalue(n::IntNode) = n.value
```

`AbstractTrees.parent`

— Function`parent(node)`

Get the immediate parent of a node `node`

.

By default all objects are considered nodes of a trivial tree with no children and no parents. That is, the default method is simply `parent(node) = nothing`

.

**OPTIONAL**: The 1-argument version of this function must be implemented for nodes with the `StoredParents`

trait.

`AbstractTrees.nextsibling`

— Function`nextsibling(node)`

Get the next sibling (child of the same parent) of the tree node `node`

. The returned node should be the same as the node that would be returned after `node`

when iterating over `(children ∘ parent)(node)`

.

**OPTIONAL**: This function is required for nodes with the `StoredSiblings`

trait. There is no default definition.

`AbstractTrees.prevsibling`

— Function`prevsibling(node)`

Get the previous sibling (child of the same parent) of the tree node `node`

. The returned node should be the same as the node that would be returned prior to `node`

when iterating over `(children ∘ parent)(node)`

.

**OPTIONAL**: This function is optional in all cases. Default AbstractTrees method assume that it is impossible to obtain the previous sibling and all iterators act in the "forward" direction.

`AbstractTrees.childrentype`

— Function```
childrentype(::Type{T})
childrentype(n)
```

Indicates the type of the children (the *collection* of children, not individual children) of the tree node `n`

or its type `T`

. `children`

should return an object of this type.

If the `childrentype`

can be inferred from the type of the node alone, the type `::Type{T}`

definition is preferred (the latter will fall back to it).

**OPTIONAL**: In most cases, `childtype`

is used instead. If `childtype`

is not defined it will fall back to `eltype ∘ childrentype`

.

`AbstractTrees.childtype`

— Function```
childtype(::Type{T})
childtype(n)
```

Indicates the type of children of the tree node `n`

or its type `T`

.

If `childtype`

can be inferred from the type of the node alone, the type `::Type{T}`

definition is preferred (the latter will fall back to it).

**OPTIONAL**: It is strongly recommended to define this wherever possible, as without it almost no tree algorithms can be type-stable. If `childrentype`

is defined and can be known from the node type alone, this function will fall back to `eltype(childrentype(T))`

. If this gives a correct result it's not necessary to define `childtype`

.

`AbstractTrees.childstatetype`

— Function```
childstatetype(::Type{T})
childstatetype(n)
```

Indicates the type of the iteration state of the tree node `n`

or its type `T`

. This is used by tree traversal algorithms which must retain this state. It therefore is necessary to define this to ensure that most tree traversal is type stable.

**OPTIONAL**: Type inference is used to attempt to

In general nodes of a tree do not all need to have the same type, but it is much easier to achieve type-stability if they do. To specify that all nodes of a tree must have the same type, one should define `Base.eltype(::Type{<:TreeIterator{T}})`

, see Iteration.

### Traits

`ParentLinks`

The default value of `ParentLinks`

is `ImplicitParents`

.

Types with the `StoredParents`

trait *must* define `parent`

.

`AbstractTrees.ParentLinks`

— Type```
ParentLinks(::Type{T})
ParentLinks(tree)
```

A trait which indicates whether a tree node stores references to its parents (`StoredParents()`

) or if the parents must be inferred from the tree structure (`ImplicitParents()`

).

Trees for which `ParentLinks`

returns `StoredParents()`

*MUST* implement `parent`

.

If `StoredParents()`

, all nodes in the tree must also have `StoredParents()`

, otherwise use `ImplicitParents()`

.

**OPTIONAL**: This should be implemented for a tree if parents of nodes are stored

```
AbstractTrees.ParentLinks(::Type{<:TreeType}) = AbstractTrees.StoredParents()
parent(t::TreeType) = get_parent(t)
```

`AbstractTrees.ImplicitParents`

— Type`ImplicitParents <: ParentLinks`

Indicates that the tree does not store parents. In these cases parents must be inferred from the tree structure and cannot be inferred through a single node.

`AbstractTrees.StoredParents`

— Type`StoredParents <: ParentLinks`

Indicates that this node stores parent links explicitly. The implementation is responsible for defining the parentind function to expose this information.

If a node in a tree has this trait, so must all connected nodes. If this is not the case, use `ImplicitParents`

instead.

**Required Methods**

`SiblingLinks`

The default value of `SiblingLinks`

is `ImplicitSiblings`

.

Types with the `StoredSiblings`

trait *must* define `nextsibling`

and *may* define `prevsibling`

.

`AbstractTrees.SiblingLinks`

— Type```
SiblingLinks(::Type{T})
SiblingLinks(tree)
```

A trait which indicates whether a tree node stores references to its siblings (`StoredSiblings()`

) or must be inferred from the tree structure (`ImplicitSiblings()`

).

If a node has the trait `StoredSiblings()`

, so must all connected nodes in the tree. Otherwise, use `ImplicitSiblings()`

instead.

**OPTIONAL**: This should be implemented for a tree if siblings of nodes are stored

`AbstractTrees.SiblingLinks(::Type{<:TreeType}) = AbstractTrees.StoredSiblings()`

`AbstractTrees.ImplicitSiblings`

— Type`ImplicitSiblings <: SiblingLinks`

Indicates that tree nodes do not store references to siblings so that they must be inferred from the tree structure.

`AbstractTrees.StoredSiblings`

— Type`StoredSiblings <: SiblingLinks`

Indicates that this tree node stores sibling links explicitly, or can compute them quickly (e.g. because the tree has a (small) fixed branching ratio, so the current index of a node can be determined by quick linear search).

If a node has this trait, so must all connected nodes in the tree. Otherwise, use `ImplicitSiblings()`

instead.

**Required Methods**

`ChildIndexing`

`AbstractTrees.ChildIndexing`

— Type```
ChildIndexing(::Type{N})
ChildIndexing(node)
```

A trait indicating whether the tree node `n`

has children (as returned by `children`

) which can be indexed using 1-based indexing. Options are either `NonIndexedChildren`

(default) or `IndexedChildren`

.

To declare that the tree `TreeType`

supports one-based indexing on the children, define

`AbstractTrees.ChildIndexing(::Type{<:TreeType}) = AbstractTrees.IndexedChildren()`

If a node has the `IndexedChildren()`

so must all connected nodes in the tree. Otherwise, use `NonIndexedChildren()`

instead.

`AbstractTrees.NonIndexedChildren`

— Type`NonIndexedChildren <: ChildIndexing`

Indicates that the object returned by `children(n)`

where `n`

is a tree node is not necessarily indexable. This trait applies to any tree which cannot guarantee indexable children in all cases, regardless of whether the tree is indexable in special cases. For example, `Array`

has this trait even though there is a large class of indexable trees consisting of arrays.

`AbstractTrees.IndexedChildren`

— Type`IndexedChildren <: ChildIndexing`

Indicates that the object returned by `children(n)`

where `n`

is a tree node is indexable (1-based).

If a node has this trait so must all connected nodes in the tree. Otherwise, use `NonIndexedChildren()`

instead.

**Required Methods**

- A node
`node`

with this trait must return an indexable object from`children(node)`

.

The default value of `ChildIndexing`

is `NonIndexedChildren`

.

Types with the `IndexedChildren`

trait *must* return an indexable object from `children`

(i.e. `children(node)[idx]`

must be valid for positive integers `idx`

).

`NodeType`

`AbstractTrees.NodeType`

— Type```
NodeType(::Type)
NodeType(node)
```

A trait which specifies whether a tree has a predictable node type (`HasNodeType()`

) or not (`NodeTypeUnknown()`

).

This is analogous to `Base.IteratorEltype`

. In particular the `IteratorEltype`

of `TreeIterator`

is dictated by this trait.

The default value is `NodeTypeUnknown()`

.

`AbstractTrees.NodeTypeUnknown`

— Type`NodeTypeUnknown <: NodeType`

Indicates that this node is connected to a tree for which it cannot be guaranteed that all nodes have the same type.

`AbstractTrees.HasNodeType`

— Type`HasNodeType <: NodeType`

Indicates that this node is connected to a tree for which *all* nodes have types descended from `eltype(node)`

.

Providing the `HasNodeType`

trait will guarantee that all nodes connected to the node must be of the type returned by `nodetype`

.

An important use case of this trait is to guarantee the return types of a `TreeIterator`

. Tree nodes with `NodeTypeUnknown`

cannot have type-stable iteration over the entire tree.

For example

```
struct ExampleNode
x::Int
children::Vector{ExampleNode}
end
AbstractTrees.nodevalue(x::ExampleNode) = x.x
AbstractTrees.children(x::ExampleNode) = x.children
AbstractTrees.NodeType(::Type{<:ExampleNode}) = HasNodeType()
AbstractTrees.nodetype(::Type{<:ExampleNode}) = ExampleNode
```

In this example, iteration over a tree of `ExampleNode`

s is type-stable with `eltype`

`ExampleNode`

.

Providing the `nodetype(::Type)`

method is preferable to defining `nodetype(::ExampleNode)`

because it ensures that `nodetype`

can be involved at compile time even if values are not known.

## The Indexed Tree Interface

The abstract tree interface assumes that all information about the descendants of a node is accessible through the node itself. The objects which can implement that interface are the nodes of a tree, not the tree itself.

The interface for trees which do not possess nodes for which tree structure can be inferred from the nodes alone is different. This is done by wrapping nodes in the `IndexNode`

nodes which allow nodes to be accessed from a centralized tree object.

`AbstractTrees.IndexNode`

— Type`IndexNode{T,I}`

The node of a tree which implements the indexed tree interface. Such a tree consists of an object `tree`

from which nodes can be obtained with the two-argument method of `nodevalue`

which by default calls `getindex`

.

An `IndexNode`

implements the tree interface, and can be thought of an adapter from an object that implements the indexed tree interface to one that implements the tree interface.

`IndexNode`

do not store the value associated with the node but can obtain it by calling `nodevalue`

.

**Constructors**

```
IndexNode(tree, node_index)
IndexNode(tree) = IndexNode(tree, rootindex(tree)) # one-argument constructor requires `rootindex`
```

Here `tree`

is an object which stores or can obtain information for the entire tree structure, and `node_index`

is the index of the node for which `node_index`

is being constructed.

### Functions

All indexed trees *must* implement `childindices`

.

Indexed trees rely on `nodevalue(::Any, ::Any)`

for obtaining the value of a node given the tree and index. By default, `nodevalue(tree, idx) = tree[idx]`

, trees which do not store nodes in this way should define `nodevalue`

.

Indexed trees can define `ParentLinks`

or `SiblingLinks`

. The `IndexNode`

s of a tree will inherit these traits from the wrapped tree.

`AbstractTrees.childindices`

— Function`AbstractTrees.nodevalue`

— Method`nodevalue(tree, node_index)`

Get the value of the node specified by `node_index`

from the indexed tree object `tree`

.

By default, this falls back to `tree[node_index]`

.

**OPTIONAL**: Indexed trees only require this if the fallback to `getindex`

is not sufficient.

`AbstractTrees.parentindex`

— Function`parentindex(tree, node_index)`

Get the index of the parent of the node of tree `tree`

specified by `node_index`

.

Nodes that have no parent (i.e. the root node) should return `nothing`

.

**OPTIONAL**: Indexed trees with the `StoredParents`

trait must implement this.

`AbstractTrees.nextsiblingindex`

— Function`nextsiblingindex(tree, node_index)`

Get the index of the next sibling of the node of tree `tree`

specified by `node_index`

.

Nodes which have no next sibling should return `nothing`

.

**OPTIONAL**: Indexed trees with the `StoredSiblings`

trait must implement this.

`AbstractTrees.prevsiblingindex`

— Function`prevsiblingindex(tree, node_index)`

Get the index of the previous sibling of the node of tree `tree`

specified by `node_index`

.

Nodes which have no previous sibling should return `nothing`

.

**OPTIONAL**: Indexed trees that have `StoredSiblings`

can implement this, but no built-in tree algorithms require it.

`AbstractTrees.rootindex`

— Function`rootindex(tree)`

Get the root index of the indexed tree `tree`

.

**OPTIONAL**: The single-argument constructor for `IndexNode`

requires this, but it is not required for any built-in tree algorithms.

## The `AbstractNode`

Type

It is not required that objects implementing the AbstractTrees.jl interface are of this type, but it can be used to indicate that an object *must* implement the interface.

`AbstractTrees.AbstractNode`

— Type`AbstractNode{T}`

Abstract type of tree nodes that implement the AbstractTrees.jl interface.

It is *NOT* necessary for tree nodes to inherit from this type to implement the AbstractTrees.jl interface. Conversely, all `AbstractNode`

types are required to satisfy the AbstractTrees.jl interface (i.e. they must at least define `children`

).

Package developers should keep in mind when writing methods that most trees *will not* be of this type. Therefore, any functions which are intended to work on any tree should not dispatch on `AbstractNode`

.

The type parameter `T`

is the type of the `nodevalue`

of the concrete type descented from `AbstractNode`

.

## Type Stability and Performance

Because of the recursive nature of trees it can be quite challenging to achieve type stability when traversing it in any way such as iterating over nodes. Only trees which guarantee that all nodes are of the same type (with `HasNodeType`

) can be type stable.

To make it easier to convert trees with non-uniform node types this package provides the `StableNode`

type.

`AbstractTrees.StableNode`

— Type`StableNode{T} <: AbstractNode{T}`

A node belonging to a tree in which all nodes are of type `StableNode{T}`

. This type is provided so that trees with `NodeTypeUnknown`

can implement methods to be converted to type-stable trees with indexable `children`

which allow for efficient traversal and iteration.

**Constructors**

```
StableNode{T}(x::T, ch)
StableNode(x, ch=())
StableNode{T}(𝒻, node)
```

**Arguments**

`x`

: the value of the constructed node, returned by`nodevalue`

.`ch`

: the children of the node, each must be of type`StableNode`

.`𝒻`

: A function which, when called on the node of a tree returns a value which should be wrapped by a`StableNode`

. The return value of`𝒻`

must be convertible to`T`

(see example).`T`

: The value type of the`StableNode`

s in a tree.`node`

: A node from a tree which is to be used to construct the`StableNode`

tree.

**Examples**

```
t = [1, [2,3]]
node = StableNode{Union{Int,Nothing}}(t) do n
n isa Integer ? convert(Int, n) : nothing
end
```

In the above example `node`

is a tree with `HasNodeType`

, nodes of type `StableNode{Union{Int,Nothing}}`

. The nodes in the new tree corresponding to arrays have value `nothing`

while other nodes have their corresponding `Int`

value.

To achieve the same performance with custom node types be sure to define at least

```
AbstractTrees.NodeType(::Type{<:ExampleNode}) = HasNodeType()
AbstractTrees.nodetype(::Type{<:ExampleNode}) = ExampleNode
```

In some circumstances it is also more efficient for nodes to have `ChildIndexing`

since this also guarantees the type of the iteration state of the iterator returned by `children`

.

## Additional Functions

`AbstractTrees.getdescendant`

— Function`getdescendant(node, idx)`

Obtain a node from a tree by indexing each level of the tree with the elements of `idx`

.

This function is defined for all trees regardless of whether they have the `IndexedChildren`

. This is because a tree without `IndexedChildren`

might have special cases in which all children are indexable, a prominent example being `Array`

which may not have indexable sub-trees (e.g. an array containing a Dict) but there are common special cases in which array trees are fully indexable (e.g. a tree in which every non-leaf node is an array).

The elements of `idx`

can be any argument to `getindex`

, not necessarily integers. For example, `getdescendant(Dict("a"=>1), ("a",))`

returns `1`

.

Note that this is a separate concept from indexed trees which by default do not have `IndexedChildren()`

, see `IndexNode`

.

**Example**

```
v = [1, [2, [3, 4]]]
getdescendant(v, (2, 2, 1)) == 3
```

`AbstractTrees.nodevalues`

— Function`nodevalues(itr::TreeIterator)`

An iterator which returns the `nodevalue`

of each node in the tree, equivalent to `Iterators.map(nodevalue, itr)`

.

`AbstractTrees.ischild`

— Function`ischild(node1, node2; equiv=(===))`

Check if `node1`

is a child of `node2`

.

By default this iterates through `children(node2)`

, so performance may be improved by adding a specialized method for given node type.

Equivalence is established with the `equiv`

function. New methods of this function should include this argument or else it will fall back to `===`

.

`AbstractTrees.isroot`

— Function`isroot(x)`

Whether `x`

is the absolute root of a tree. More specifically, this returns `true`

if `parent(x) ≡ nothing`

, or `parent(root, x) ≡ nothing`

. That is, while any node is the root of some tree, this function only returns true for nodes which have parents which cannot be obtained with the `AbstractTrees`

interface.

`AbstractTrees.intree`

— Function`intree(node, root; equiv=(≡))`

Check if `node`

is a member of the tree rooted at `root`

.

By default this traverses through the entire tree in search of `node`

, and so may be slow if a more specialized method has not been implemented for the given tree type.

Equivalence is established with the `equiv`

function. Note that new methods should also define `equiv`

or calls may fall back to the default method.

See also: `isdescendant`

`AbstractTrees.isdescendant`

— Function`isdescendant(node1, node2; equiv=(≡))`

Check if `node1`

is a descendant of `node2`

. This isequivalent to checking whether `node1`

is a member of the subtree rooted at `node2`

(see `intree`

) except that a node cannot be a descendant of itself.

Internally this calls `intree(node1, node2)`

and so may be slow if a specialized method of that function is not available.

Equivalence is established with the `equiv`

function. Note that new methods should also define `equiv`

or calls may fall back to the default method.

`AbstractTrees.treebreadth`

— Function`treebreadth(node)`

Get the number of leaves in the tree rooted at `node`

. Leaf nodes have a breadth of one.

By default this recurses through all nodes in the tree and so may be slow if a more specialized method has not been implemented for the given type.

`AbstractTrees.treeheight`

— Function`treeheight(node)`

Get the maximum depth from `node`

to any of its descendants. Leaf nodes have a height of zero.

By default this recurses through all nodes in the tree and so may be slow if a more specialized method has not been implemented for the given type.

`AbstractTrees.descendleft`

— Function`descendleft(node)`

Descend from the node `node`

to the first encountered leaf node by recursively calling `children`

and taking the first child.

`AbstractTrees.getroot`

— Function`getroot(node)`

Get the root of the tree containing node `node`

. This requires `node`

to have the trait `StoredParents`

.

## Example Implementations

- All objects in base which define the abstract trees interface are defined in
`builtins.jl`

. `IDTree`

`OneNode`

`OneTree`

`FSNode`

`BinaryNode`