Distributed table structures and data manipulation operations built on top of Dagger.jl

The package is registered in the General registry, so you can add it by typing:

julia> ]add DTables


Below is a quick example of how to get started with DTables.

There's a lot more you can do though, so please refer to the documentation!
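
The example that follows expects a session with several threads or worker processes. Here is a minimal sketch of one way to set that up, using only Julia's standard `Distributed` library. The worker count of 2 is an arbitrary assumption, and threads have to be requested at startup (for example with `julia --threads=4`).

```julia
using Distributed

# Threads are fixed at startup; this only reports how many are available.
println("threads: ", Threads.nthreads())

# Add worker processes from inside the session if none exist yet.
# The count of 2 is an arbitrary choice for illustration.
if nworkers() < 2
    addprocs(2)
end
println("workers: ", nworkers())

# Assumption: DTables is installed in the active environment.
# Uncomment to load it on every process:
# @everywhere using DTables
```

Either threads or workers alone should be enough to see partitions processed in parallel; which one suits you depends on your workload.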

# launch a Julia session with threads/workers

julia> using DTables

julia> dt = DTable((a=rand(100), b=rand(100)), 10)
DTable with 10 partitions
Tabletype: NamedTuple

julia> m = map(r -> (x=sum(r), id=Threads.threadid(),), dt)
DTable with 10 partitions
Tabletype: NamedTuple

julia> xsum = reduce((x, y) -> x + y, m, init=0, cols=[:x])
EagerThunk (running)

julia> threads_used = reduce((acc, el) -> union(acc, el), m, init=Set(), cols=[:id])
EagerThunk (running)

julia> fetch(xsum)
(x = 95.71209812014976,)

julia> fetch(threads_used)
(id = Set(Any[5, 4, 6, 13, 2, 10, 9, 12, 8, 3]),)
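
To see what the `map` and `reduce` calls above compute, here is a serial sketch in plain Julia, without DTables or Dagger. It assumes, as the example's use of `sum(r)` suggests, that each row reaches the mapping function as a NamedTuple. The names `table`, `rows`, and `mapped` are illustrative only and not part of the package.

```julia
# Column table shaped like the one passed to DTable above.
table = (a=rand(100), b=rand(100))

# Materialize the rows as NamedTuples (assumed row shape).
rows = [(a=table.a[i], b=table.b[i]) for i in eachindex(table.a)]

# Same mapping function as the example, minus the thread id:
# sum(r) adds the values of the row, i.e. a + b.
mapped = map(r -> (x=sum(r),), rows)

# Serial counterpart of reduce(..., cols=[:x]): fold the x column only.
xsum = reduce((x, y) -> x + y, (r.x for r in mapped); init=0)

println((x=xsum,))
```

The distributed version differs in that the work is split across partitions and `reduce` returns immediately with a handle, which is why the example calls `fetch` to obtain the result.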