ExpressionExplorer.FunctionNameSignaturePairType

For an expression like function Base.sqrt(x::Int)::Int x; end, it has the following fields:

  • name::FunctionName: the name, [:Base, :sqrt]
  • signature_hash::UInt: a UInt that is unique for the type signature of the method declaration, ignoring argument names. In the example, this is equals hash(ExpressionExplorer.canonalize( :(Base.sqrt(x::Int)::Int) )), see canonalize for more details.
ExpressionExplorer.ReactiveNodeType

Every cell is a node in the reactive graph. The nodes/point/vertices are the cells, and the edges/lines/arrows are the dependencies between cells. In a reactive notebook, these dependencies are the global variable references and definitions. (For the mathies: a reactive notebook is represented by a directed multigraph. A notebook without reactivity errors is an acyclic directed multigraph.) This struct contains the back edges (references) and forward edges (definitions, soft_definitions, funcdefs_with_signatures, funcdefs_without_signatures) of a single node.

Before 0.12.0, we could have written this struct with just two fields: references and definitions (both of type Set{Symbol}) because we used variable names to form the reactive links. However, to support defining multiple methods of the same function in different cells (https://github.com/fonsp/Pluto.jl/issues/177), we needed to change this. You might want to think about this old behavior first (try it on paper) before reading on.

The essential idea is that edges are still formed by variable names. Simple global variables (x = 1) are registered by their name as Symbol, but function definitions f(x::Int) = 5 are sometimes stored in two ways:

  • by their name (f) as Symbol, in funcdefs_without_signatures, and
  • by their name with its method signature as FunctionNameSignaturePair, in funcdefs_with_signatures.

The name without signature is most important: it is used to find the reactive dependencies between cells. The name with signature is needed to detect multiple cells that define methods with the same signature (f(x) = 1 and f(x) = 2) - this is illegal. This is why we do not collect definitions, funcdefs_with_signatures and funcdefs_without_signatures onto a single pile: we need them separately for different searches.

ExpressionExplorer.ReactiveNodeMethod

Turn a SymbolsState into a ReactiveNode. The main differences are:

  • A SymbolsState is a nested structure of function definitions inside function definitions inside... This conversion flattens this structure by merging SymbolsStates from defined functions.
  • ReactiveNode functions as a cache to improve efficiently, by turning the nested structures into multiple Set{Symbol}s with fast lookups.
ExpressionExplorer.SymbolsStateType

SymbolsState trickles down the ASTree: it carries referenced and defined variables from endpoints down to the root.

ExpressionExplorer.UsingsImportsType
UsingsImports()

This struct is generated by compute_usings_imports(ex::Expr) and represents the using and import statements present in the given expr ex. Additionally, the usings_isglobal and imports_isglobal represents whether the corresponding using or import statement is in global scope or not (nested in a module).

ExpressionExplorer.canonalizeMethod

Turn a function definition expression (Expr) into a "canonical" form, in the sense that two methods that would evaluate to the same method signature have the same canonical form. Part of a solution to https://github.com/fonsp/Pluto.jl/issues/177. Such a canonical form cannot be achieved statically with 100% correctness (impossible), but we can make it good enough to be practical.

Wait, "evaluate to the same method signature"?

In Pluto, you cannot do definitions of the same global variable in different cells. This is needed for reactivity to work, and it avoid ambiguous notebooks and stateful stuff. This rule used to also apply to functions: you had to place all methods of a function in one cell. (Go and read more about methods in Julia if you haven't already.) But this is quite annoying, especially because multiple dispatch is so important in Julia code. So we allow methods of the same function to be defined across multiple cells, but we still want to throw errors when you define multiple methods with the same signature, because one overrides the other. For example:

julia> f(x) = 1
f (generic function with 1 method)

julia> f(x) = 2
f (generic function with 1 method)
``

After adding the second method, the function still has only 1 method. This is because the second definition overrides the first one, instead of being added to the method table. This example should be illegal in Julia, for the same reason that `f = 1` and `f = 2` is illegal. So our problem is: how do we know that two cells will define overlapping methods? 

Ideally, we would just evaluate the user's code and **count methods** afterwards, letting Julia do the work. Unfortunately, we need to know this info _before_ we run cells, otherwise we don't know in which order to run a notebook! There are ways to break this circle, but it would complicate our process quite a bit.

Instead, we will do _static analysis_ on the function definition expressions to determine whether they overlap. This is non-trivial. For example, `f(x)` and `f(y::Any)` define the same method. Trickier examples are here: https://github.com/fonsp/Pluto.jl/issues/177#issuecomment-645039993

# Wait, "function definition expressions"?
For example:

julia e = :(function f(x::Int, y::String) x + y end)

dump(e, maxdepth=2)

#= gives:

Expr head: Symbol function args: Array{Any}((2,)) 1: Expr 2: Expr =#


This first arg is the function head:

julia e.args[1] == :(f(x::Int, y::String))


# Mathematics
Our problem is to find a way to compute the equivalence relation ~ on `H × H`, with `H` the set of function head expressions, defined as:

`a ~ b` iff evaluating both expressions results in a function with exactly one method.

_(More precisely, evaluating `Expr(:function, x, Expr(:block))` with `x ∈ {a, b}`.)_

The equivalence sets are isomorphic to the set of possible Julia methods.

Instead of finding a closed form algorithm for `~`, we search for a _canonical form_: a function `canonical: H -> H` that chooses one canonical expression per equivalence class. It has the property 
    
`canonical(a) = canonical(b)` implies `a ~ b`.

We use this **canonical form** of the function's definition expression as its "signature". We compare these canonical forms when determining whether two function expressions will result in overlapping methods.

# Example

julia e1 = :(f(x, z::Any)) e2 = :(g(x, y))

canonalize(e1) == canonalize(e2)

julia e1 = :(f(x)) e2 = :(g(x, y))

canonalize(e1) != canonalize(e2)

julia e1 = :(f(a::X, b::wow(ie), c, d...; e=f) where T) e2 = :(g(z::X, z::wow(ie), z::Any, z... ) where T)

canonalize(e1) == canonalize(e2) ```

ExpressionExplorer.compute_symbols_stateMethod

computesymbolsstate(ex::Any)::SymbolsState

Return the global references, assignment, function calls and function definitions inside an arbitrary expression, in a SymbolsState object.

ExpressionExplorer.compute_usings_importsMethod
compute_usings_imports(ex)::UsingsImports

Get the list of subexpressions like using Module.Z, SomethingElse or import Module that are contained in this expression.

ExpressionExplorer.explore_funcdef!Method

Return the function name and the SymbolsState from argument defaults. Add arguments as hidden globals to the scopestate.

Is also used for struct and abstract.

ExpressionExplorer.explore_module_definition!Method

Goes through a module definition, and picks out import ..x's, which are references to the outer module. We need module_depth + 1 dots before the specifier, so nested modules can still access Pluto.

ExpressionExplorer.generate_funcnamesMethod

Generates a vector of all possible variants from a function name

julia> generate_funcnames([:Base, :Foo, :bar])
3-element Vector{Symbol}:
 Symbol("Base.Foo.bar")
 Symbol("Foo.bar")
 :bar
ExpressionExplorer.get_rootassigneeFunction
get_rootassignee(ex)::Union{Symbol,Nothing}

If the expression is a (simple) assignment at its root, return the assignee as Symbol, return nothing otherwise.