# Evaluation & Derivatives

## Evaluation

Given an expression tree specified with a `Node`

type, you may evaluate the expression over an array of data with the following command:

`DynamicExpressions.EvaluateModule.eval_tree_array`

— Method```
eval_tree_array(
tree::AbstractExpressionNode{T},
cX::AbstractMatrix{T},
operators::OperatorEnum;
eval_options::Union{EvalOptions,Nothing}=nothing,
) where {T}
```

Evaluate a binary tree (equation) over a given input data matrix. The operators contain all of the operators used. This function fuses doublets and triplets of operations for lower memory usage.

**Arguments**

`tree::AbstractExpressionNode`

: The root node of the tree to evaluate.`cX::AbstractMatrix{T}`

: The input data to evaluate the tree on.`operators::OperatorEnum`

: The operators used in the tree.`eval_options::Union{EvalOptions,Nothing}`

: See`EvalOptions`

for documentation on the different evaluation modes.

**Returns**

`(output, complete)::Tuple{AbstractVector{T}, Bool}`

: the result, which is a 1D array, as well as if the evaluation completed successfully (true/false). A`false`

complete means an infinity or nan was encountered, and a large loss should be assigned to the equation.

**Notes**

This function can be represented by the following pseudocode:

```
def eval(current_node)
if current_node is leaf
return current_node.value
elif current_node is degree 1
return current_node.operator(eval(current_node.left_child))
else
return current_node.operator(eval(current_node.left_child), eval(current_node.right_child))
```

The bulk of the code is for optimizations and pre-emptive NaN/Inf checks, which speed up evaluation significantly.

You can also use the following shorthand by using the expression as a function:

```
(tree::AbstractExpressionNode)(X, operators::OperatorEnum; kws...)
Evaluate a binary tree (equation) over a given input data matrix. The
operators contain all of the operators used. This function fuses doublets
and triplets of operations for lower memory usage.
# Arguments
- `tree::AbstractExpressionNode`: The root node of the tree to evaluate.
- `cX::AbstractMatrix{T}`: The input data to evaluate the tree on.
- `operators::OperatorEnum`: The operators used in the tree.
- `kws...`: Passed to [`eval_tree_array`](@ref).
# Returns
- `output::AbstractVector{T}`: the result, which is a 1D array.
Any NaN, Inf, or other failure during the evaluation will result in the entire
output array being set to NaN.
```

For example,

```
using DynamicExpressions
operators = OperatorEnum(; binary_operators=[+, -, *], unary_operators=[cos])
tree = Node(; feature=1) * cos(Node(; feature=2) - 3.2)
tree([1 2 3; 4 5 6.], operators)
```

```
3-element Vector{Float64}:
0.6967067435533686
-0.4544040965128262
-2.8266669740855317
```

This is possible because when you call `OperatorEnum`

, it automatically re-defines `(::Node)(X)`

to call the evaluation operation with the given `operators loaded. It also re-defines`

print`,`

show`, and the various operators, to work with the`

Node` type.

The `Node`

type does not know about which `OperatorEnum`

you used to create it. Thus, if you define an expression with one `OperatorEnum`

, and then try to evaluate it or print it with a different `OperatorEnum`

, you will get undefined behavior!

For safer behavior, you should use `Expression`

objects.

Evaluation options are specified using `EvalOptions`

:

`DynamicExpressions.EvaluateModule.EvalOptions`

— Type`EvalOptions{T,B,E}`

This holds options for expression evaluation, such as evaluation backend.

**Fields**

`turbo::Val{T}=Val(false)`

: If`Val{true}`

, use LoopVectorization.jl for faster evaluation.`bumper::Val{B}=Val(false)`

: If`Val{true}`

, use Bumper.jl for faster evaluation.`early_exit::Val{E}=Val(true)`

: If`Val{true}`

, any element of any step becoming`NaN`

or`Inf`

will terminate the computation. For`eval_tree_array`

, this will result in the second return value, the completion flag, being`false`

. For calling an expression using`tree(X)`

, this will result in`NaN`

s filling the entire buffer. This early exit is performed to avoid wasting compute cycles. Setting`Val{false}`

will continue the computation as usual and thus result in`NaN`

s only in the elements that actually have`NaN`

s.

You can also work with arbitrary types, by defining a `GenericOperatorEnum`

instead. The notation is the same for `eval_tree_array`

, though it will return `nothing`

when it can't find a method, and not do any NaN checks:

`DynamicExpressions.EvaluateModule.eval_tree_array`

— Method`eval_tree_array(tree::AbstractExpressionNode, cX::AbstractMatrix, operators::GenericOperatorEnum; throw_errors::Bool=true)`

Evaluate a generic binary tree (equation) over a given input data, whatever that input data may be. The `operators`

enum contains all of the operators used. Unlike `eval_tree_array`

with the normal `OperatorEnum`

, the array `cX`

is sliced only along the first dimension. i.e., if `cX`

is a vector, then the output of a feature node will be a scalar. If `cX`

is a 3D tensor, then the output of a feature node will be a 2D tensor. Note also that `tree.feature`

will index along the first axis of `cX`

.

However, there is no requirement about input and output types in general. You may set up your tree such that some operator nodes work on tensors, while other operator nodes work on scalars. `eval_tree_array`

will simply return `nothing`

if a given operator is not defined for the given input type.

This function can be represented by the following pseudocode:

```
function eval(current_node)
if current_node is leaf
return current_node.value
elif current_node is degree 1
return current_node.operator(eval(current_node.left_child))
else
return current_node.operator(eval(current_node.left_child), eval(current_node.right_child))
```

**Arguments**

`tree::AbstractExpressionNode`

: The root node of the tree to evaluate.`cX::AbstractArray`

: The input data to evaluate the tree on.`operators::GenericOperatorEnum`

: The operators used in the tree.`throw_errors::Bool=true`

: Whether to throw errors if they occur during evaluation. Otherwise, MethodErrors will be caught before they happen and evaluation will return`nothing`

, rather than throwing an error. This is useful in cases where you are unsure if a particular tree is valid or not, and would prefer to work with`nothing`

as an output.

**Returns**

`(output, complete)::Tuple{Any, Bool}`

: the result, as well as if the evaluation completed successfully (true/false). If evaluation failed,`nothing`

will be returned for the first argument. A`false`

complete means an operator was called on input types that it was not defined for.

Likewise for the shorthand notation:

```
(tree::Node)(X::AbstractMatrix, operators::GenericOperatorEnum; throw_errors::Bool=true)
# Arguments
- `X::AbstractArray`: The input data to evaluate the tree on.
- `operators::GenericOperatorEnum`: The operators used in the tree.
- `throw_errors::Bool=true`: Whether to throw errors
if they occur during evaluation. Otherwise,
MethodErrors will be caught before they happen and
evaluation will return `nothing`,
rather than throwing an error. This is useful in cases
where you are unsure if a particular tree is valid or not,
and would prefer to work with `nothing` as an output.
# Returns
- `output`: the result of the evaluation.
If evaluation failed, `nothing` will be returned for the first argument.
A `false` complete means an operator was called on input types
that it was not defined for. You can change this behavior by
setting `throw_errors=false`.
```

## Derivatives

`DynamicExpressions.jl`

can efficiently compute first-order derivatives of expressions with respect to variables or constants. This is done using either `eval_diff_tree_array`

, to compute derivative with respect to a single variable, or with `eval_grad_tree_array`

, to compute the gradient with respect all variables (or, all constants). Both use forward-mode automatic, but use `Zygote.jl`

to compute derivatives of each operator, so this is very efficient.

`DynamicExpressions.EvaluateDerivativeModule.eval_diff_tree_array`

— Method`eval_diff_tree_array(tree::AbstractExpressionNode{T}, cX::AbstractMatrix{T}, operators::OperatorEnum, direction::Integer; turbo::Union{Bool,Val}=Val(false))`

Compute the forward derivative of an expression, using a similar structure and optimization to eval*tree*array. `direction`

is the index of a particular variable in the expression. e.g., `direction=1`

would indicate derivative with respect to `x1`

.

**Arguments**

`tree::AbstractExpressionNode`

: The expression tree to evaluate.`cX::AbstractMatrix{T}`

: The data matrix, with each column being a data point.`operators::OperatorEnum`

: The operators used to create the`tree`

.`direction::Integer`

: The index of the variable to take the derivative with respect to.`turbo::Union{Bool,Val}`

: Use LoopVectorization.jl for faster evaluation. Currently this does not have any effect.

**Returns**

`(evaluation, derivative, complete)::Tuple{AbstractVector{T}, AbstractVector{T}, Bool}`

: the normal evaluation, the derivative, and whether the evaluation completed as normal (or encountered a nan or inf).

`DynamicExpressions.EvaluateDerivativeModule.eval_grad_tree_array`

— Method`eval_grad_tree_array(tree::AbstractExpressionNode{T}, cX::AbstractMatrix{T}, operators::OperatorEnum; variable::Union{Bool,Val}=Val(false), turbo::Union{Bool,Val}=Val(false))`

Compute the forward-mode derivative of an expression, using a similar structure and optimization to eval*tree*array. `variable`

specifies whether we should take derivatives with respect to features (i.e., cX), or with respect to every constant in the expression.

**Arguments**

`tree::AbstractExpressionNode{T}`

: The expression tree to evaluate.`cX::AbstractMatrix{T}`

: The data matrix, with each column being a data point.`operators::OperatorEnum`

: The operators used to create the`tree`

.`variable::Union{Bool,Val}`

: Whether to take derivatives with respect to features (i.e.,`cX`

- with`variable=true`

), or with respect to every constant in the expression (`variable=false`

).`turbo::Union{Bool,Val}`

: Use LoopVectorization.jl for faster evaluation. Currently this does not have any effect.

**Returns**

`(evaluation, gradient, complete)::Tuple{AbstractVector{T}, AbstractMatrix{T}, Bool}`

: the normal evaluation, the gradient, and whether the evaluation completed as normal (or encountered a nan or inf).

You can compute gradients this with shorthand notation as well (which by default computes gradients with respect to input matrix, rather than constants).

```
(tree::Node{T})'(X::AbstractMatrix{T}, operators::OperatorEnum; turbo::Bool=false, variable::Bool=true)
Compute the forward-mode derivative of an expression, using a similar
structure and optimization to eval_tree_array. `variable` specifies whether
we should take derivatives with respect to features (i.e., X), or with respect
to every constant in the expression.
# Arguments
- `X::AbstractMatrix{T}`: The data matrix, with each column being a data point.
- `operators::OperatorEnum`: The operators used to create the `tree`.
- `variable::Bool`: Whether to take derivatives with respect to features (i.e., `X` - with `variable=true`),
or with respect to every constant in the expression (`variable=false`).
- `turbo::Bool`: Use `LoopVectorization.@turbo` for faster evaluation.
# Returns
- `(evaluation, gradient, complete)::Tuple{AbstractVector{T}, AbstractMatrix{T}, Bool}`: the normal evaluation,
the gradient, and whether the evaluation completed as normal (or encountered a nan or inf).
```

Alternatively, you can compute higher-order derivatives by using `ForwardDiff`

on the function `differentiable_eval_tree_array`

, although this will be slower.

`DynamicExpressions.EvaluateModule.differentiable_eval_tree_array`

— Method`differentiable_eval_tree_array(tree::AbstractExpressionNode, cX::AbstractMatrix, operators::OperatorEnum)`

Evaluate an expression tree in a way that can be auto-differentiated.

### Enzyme

`DynamicExpressions.jl`

also supports automatic differentiation with `Enzyme.jl`

. Note that this is **extremely experimental**. You should expect to see occasional incorrect gradients. Be sure to explicitly verify gradients are correct for a particular space of operators (e.g., with finite differences).

Let's look at an example. First, let's create a tree:

```
using DynamicExpressions
operators = OperatorEnum(binary_operators=(+, -, *, /), unary_operators=(cos, sin))
x1 = Node{Float64}(feature=1)
x2 = Node{Float64}(feature=2)
tree = 0.5 * x1 + cos(x2 - 0.2)
```

Now, say we want to take the derivative of this expression with respect to x1 and x2. First, let's evaluate it normally:

```
X = [1.0 2.0 3.0; 4.0 5.0 6.0] # 2x3 matrix (2 features, 3 rows)
tree(X, operators)
```

Now, let's use `Enzyme.jl`

to compute the derivative of the outputs with respect to x1 and x2, using reverse-mode autodiff:

```
using Enzyme
function my_loss_function(tree, X, operators)
# Get the outputs
y = tree(X, operators)
# Sum them (so we can take a gradient, rather than a jacobian)
return sum(y)
end
dX = begin
storage=zero(X)
autodiff(
Reverse,
my_loss_function,
Active,
## Actual arguments to function:
Const(tree),
Duplicated(X, storage),
Const(operators),
)
storage
end
```

This will get returned as

```
2×3 Matrix{Float64}:
0.5 0.5 0.5
0.611858 0.996165 0.464602
```

which one can confirm is the correct gradient!

This will take a while the first time you run it, as Enzyme needs to take the gradients of the actual LLVM IR code. Subsequent runs won't spend any time compiling and be much faster.

Some general notes about this:

- We want to take a reverse-mode gradient, so we pass
`Reverse`

to`autodiff`

. - Since we want to take the gradient of the
*output*of`my_loss_function`

, we declare`Active`

as the third argument. - Following this, we pass our actual arguments to the function.
- Objects which we don't want to take gradients with respect to, and also don't temporarily store any data during the computation (such as
`tree`

and`operators`

here) should be wrapped with`Const`

. - Objects which we wish to take derivatives with respect to, we need to use
`Duplicated`

, and explicitly create a copy of it, with all numerical values set to zero. Enzyme will then store the derivatives in this object.

- Objects which we don't want to take gradients with respect to, and also don't temporarily store any data during the computation (such as

Note that you should never use anything other than `turbo=Val(false)`

with Enzyme, as Enzyme and LoopVectorization are not compatible, and will cause a segfault. *Even using turbo=false will not work, because it would cause Enzyme to trace the (unused) LoopVectorization code!*