HierarchicalUtils.jl

Mill.jl builds on HierarchicalUtils.jl, which provides tree printing, traversal encoding, and counting and iteration utilities for data and model trees.

using HierarchicalUtils

Printing

For instance, Base.show with the text/plain MIME type calls HierarchicalUtils.printtree, so displaying a node in the REPL prints a truncated tree:

julia> ds = BagNode(ProductNode((BagNode(ArrayNode(randn(4, 10)),
                                         [1:2, 3:4, 5:5, 6:7, 8:10]),
                                 ArrayNode(randn(3, 5)),
                                 BagNode(BagNode(ArrayNode(randn(2, 30)),
                                                 [i:i+1 for i in 1:2:30]),
                                         [1:3, 4:6, 7:9, 10:12, 13:15]),
                                 ArrayNode(randn(2, 5)))),
                    [1:1, 2:3, 4:5])
BagNode with 3 obs
  └── ProductNode with 5 obs
        ├── BagNode with 5 obs
        │     ⋮
        ├── ArrayNode(3×5 Array with Float64 elements) with 5 obs
        ⋮
        └── ArrayNode(2×5 Array with Float64 elements) with 5 obs

julia> printtree(ds; htrunc=3)
BagNode with 3 obs
  └── ProductNode with 5 obs
        ├── BagNode with 5 obs
        │     ⋮
        ├── ArrayNode(3×5 Array with Float64 elements) with 5 obs
        ├── BagNode with 5 obs
        │     ⋮
        └── ArrayNode(2×5 Array with Float64 elements) with 5 obs

Calling printtree directly, without the htrunc keyword, prints the full, non-truncated tree:

julia> printtree(ds)
BagNode with 3 obs
  └── ProductNode with 5 obs
        ├── BagNode with 5 obs
        │     └── ArrayNode(4×10 Array with Float64 elements) with 10 obs
        ├── ArrayNode(3×5 Array with Float64 elements) with 5 obs
        ├── BagNode with 5 obs
        │     └── BagNode with 15 obs
        │           └── ArrayNode(2×30 Array with Float64 elements) with 30 obs
        └── ArrayNode(2×5 Array with Float64 elements) with 5 obs

Traversal encoding

Calling printtree with trav=true prints a traversal code next to each node, which enables convenient string indexing:

julia> m = reflectinmodel(ds)
BagModel … ↦ ⟨SegmentedMean(10), SegmentedMax(10)⟩ ↦ ArrayModel(Dense(21, 10))
  └── ProductModel … ↦ ArrayModel(Dense(40, 10))
        ├── BagModel … ↦ ⟨SegmentedMean(10), SegmentedMax(10)⟩ ↦ ArrayModel(Dense(21, 10))
        │     ⋮
        ├── ArrayModel(Dense(3, 10))
        ⋮
        └── ArrayModel(Dense(2, 10))

julia> printtree(m; trav=true)
BagModel … ↦ ⟨SegmentedMean(10), SegmentedMax(10)⟩ ↦ ArrayModel(Dense(21, 10)) [""]
  └── ProductModel … ↦ ArrayModel(Dense(40, 10)) ["U"]
        ├── BagModel … ↦ ⟨SegmentedMean(10), SegmentedMax(10)⟩ ↦ ArrayModel(Dense(21, 10)) ["Y"]
        │     └── ArrayModel(Dense(4, 10)) ["a"]
        ├── ArrayModel(Dense(3, 10)) ["c"]
        ├── BagModel … ↦ ⟨SegmentedMean(10), SegmentedMax(10)⟩ ↦ ArrayModel(Dense(21, 10)) ["g"]
        │     └── BagModel … ↦ ⟨SegmentedMean(10), SegmentedMax(10)⟩ ↦ ArrayModel(Dense(21, 10)) ["i"]
        │           └── ArrayModel(Dense(2, 10)) ["j"]
        └── ArrayModel(Dense(2, 10)) ["k"]

This way, any node in the model tree is quickly accessible, which comes in handy when inspecting model parameters or when deleting, replacing, or inserting nodes in the tree (for instance, when constructing adversarial samples). Any tree node can be accessed by indexing with its traversal code:

julia> m["Y"]
BagModel … ↦ ⟨SegmentedMean(10), SegmentedMax(10)⟩ ↦ ArrayModel(Dense(21, 10))
  └── ArrayModel(Dense(4, 10))

Indexing by traversal code returns the same object as reaching the node through the model's fields:

julia> m["Y"] === m.im.ms[1]
true
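
Traversal codes pay off most for deeply nested nodes, where the equivalent field-access chain gets long. The following is a minimal sketch that rebuilds ds and m from the examples above and compares a few codes from the printed tree with field access. It assumes the same Mill.jl version as the rest of this page, that BagModel keeps its inner model in the field im and ProductModel its children in ms (as m.im.ms[1] above suggests), and that traversal codes depend only on the shape of the tree.

```julia
using Mill, HierarchicalUtils

# Same structure as `ds` above, so the traversal codes printed earlier
# are assumed to carry over unchanged.
ds = BagNode(ProductNode((BagNode(ArrayNode(randn(4, 10)),
                                  [1:2, 3:4, 5:5, 6:7, 8:10]),
                          ArrayNode(randn(3, 5)),
                          BagNode(BagNode(ArrayNode(randn(2, 30)),
                                          [i:i+1 for i in 1:2:30]),
                                  [1:3, 4:6, 7:9, 10:12, 13:15]),
                          ArrayNode(randn(2, 5)))),
             [1:1, 2:3, 4:5])
m = reflectinmodel(ds)

# Codes are taken from the `printtree(m; trav=true)` output above.
product_same = m["U"] === m.im               # the ProductModel under the root
second_same  = m["c"] === m.im.ms[2]         # second child of the ProductModel
deep_same    = m["j"] === m.im.ms[3].im.im   # leaf under two nested BagModels
```

Each comparison is expected to hold in the same way as m["Y"] === m.im.ms[1] above; the last one shows how a single one-character code replaces a chain of four field accesses.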

Counting functions

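HierarchicalUtils.jl also provides functions for counting nodes and leaves, and iterators over all nodes, leaves, nodes of a given type, and nodes satisfying a predicate: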

julia> nnodes(ds)
9

julia> nleafs(ds)
4

julia> NodeIterator(ds) |> collect
9-element Array{AbstractNode,1}:
 BagNode with 3 obs
 ProductNode with 5 obs
 BagNode with 5 obs
 ArrayNode(4×10 Array with Float64 elements) with 10 obs
 ArrayNode(3×5 Array with Float64 elements) with 5 obs
 BagNode with 5 obs
 BagNode with 15 obs
 ArrayNode(2×30 Array with Float64 elements) with 30 obs
 ArrayNode(2×5 Array with Float64 elements) with 5 obs

julia> NodeIterator(ds, m) |> collect
9-element Array{Tuple{AbstractNode,AbstractMillModel},1}:
 (BagNode with 3 obs, BagModel … ↦ ⟨SegmentedMean(10), SegmentedMax(10)⟩ ↦ ArrayModel(Dense(21, 10)))
 (ProductNode with 5 obs, ProductModel … ↦ ArrayModel(Dense(40, 10)))
 (BagNode with 5 obs, BagModel … ↦ ⟨SegmentedMean(10), SegmentedMax(10)⟩ ↦ ArrayModel(Dense(21, 10)))
 (ArrayNode(4×10 Array with Float64 elements) with 10 obs, ArrayModel(Dense(4, 10)))
 (ArrayNode(3×5 Array with Float64 elements) with 5 obs, ArrayModel(Dense(3, 10)))
 (BagNode with 5 obs, BagModel … ↦ ⟨SegmentedMean(10), SegmentedMax(10)⟩ ↦ ArrayModel(Dense(21, 10)))
 (BagNode with 15 obs, BagModel … ↦ ⟨SegmentedMean(10), SegmentedMax(10)⟩ ↦ ArrayModel(Dense(21, 10)))
 (ArrayNode(2×30 Array with Float64 elements) with 30 obs, ArrayModel(Dense(2, 10)))
 (ArrayNode(2×5 Array with Float64 elements) with 5 obs, ArrayModel(Dense(2, 10)))

julia> LeafIterator(ds) |> collect
4-element Array{ArrayNode{Array{Float64,2},Nothing},1}:
 ArrayNode(4×10 Array with Float64 elements) with 10 obs
 ArrayNode(3×5 Array with Float64 elements) with 5 obs
 ArrayNode(2×30 Array with Float64 elements) with 30 obs
 ArrayNode(2×5 Array with Float64 elements) with 5 obs

julia> TypeIterator(BagModel, m) |> collect
4-element Array{BagModel{T,Aggregation{Float32,Tuple{SegmentedMean{Float32,Array{Float32,1}},SegmentedMax{Float32,Array{Float32,1}}}},ArrayModel{Dense{typeof(identity),Array{Float32,2},Array{Float32,1}}}} where T<:AbstractMillModel,1}:
 BagModel … ↦ ⟨SegmentedMean(10), SegmentedMax(10)⟩ ↦ ArrayModel(Dense(21, 10))
 BagModel … ↦ ⟨SegmentedMean(10), SegmentedMax(10)⟩ ↦ ArrayModel(Dense(21, 10))
 BagModel … ↦ ⟨SegmentedMean(10), SegmentedMax(10)⟩ ↦ ArrayModel(Dense(21, 10))
 BagModel … ↦ ⟨SegmentedMean(10), SegmentedMax(10)⟩ ↦ ArrayModel(Dense(21, 10))

julia> PredicateIterator(x -> nobs(x) ≥ 10, ds) |> collect
3-element Array{AbstractNode,1}:
 ArrayNode(4×10 Array with Float64 elements) with 10 obs
 BagNode with 15 obs
 ArrayNode(2×30 Array with Float64 elements) with 30 obs
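
The counting functions and the iterators describe the same tree, so they can be cross-checked against each other. The following is a minimal sketch on a smaller, hypothetical tree (the names small and model are introduced only for illustration). It uses only functions shown on this page and assumes the same Mill.jl version as the examples above.

```julia
using Mill, HierarchicalUtils

# A smaller tree than `ds` above: a bag of products with two array leaves.
small = BagNode(ProductNode((ArrayNode(randn(3, 5)),
                             ArrayNode(randn(2, 5)))),
                [1:1, 2:3, 4:5])
model = reflectinmodel(small)

# Iterator lengths can be compared with nnodes(small) and nleafs(small).
n_all    = length(collect(NodeIterator(small)))
n_leaves = length(collect(LeafIterator(small)))

# Selecting by type or by predicate picks out the same leaves in this tree.
by_type = collect(TypeIterator(ArrayNode, small))
by_pred = collect(PredicateIterator(x -> x isa ArrayNode, small))

# Data and model trees can be walked in lockstep, one pair per node.
for (node, submodel) in NodeIterator(small, model)
    println(nameof(typeof(node)), " => ", nameof(typeof(submodel)))
end
```

Collecting the iterators first, as in the examples above, avoids relying on length being defined for the iterator types themselves.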

For the complete showcase of possibilities, refer to HierarchicalUtils.jl and this notebook.