MoYe


The MoYe.jl library draws significant inspiration from NVIDIA's CuTe and is built with similar underlying structures.

The name Mo Ye is derived from an ancient Chinese legend of swordsmiths.

Installation

pkg> add MoYe

Quick Start

julia> data = [i for i in 1:48];
julia> a = MoYeArray(data, @Layout((6,8)))
6×8 MoYeArray{Int64, 2, ViewEngine{Int64, Ptr{Int64}}, Layout{2, Tuple{Static.StaticInt{6}, Static.StaticInt{8}}, Tuple{Static.StaticInt{1}, Static.StaticInt{6}}}}:
 1   7  13  19  25  31  37  43
 2   8  14  20  26  32  38  44
 3   9  15  21  27  33  39  45
 4  10  16  22  28  34  40  46
 5  11  17  23  29  35  41  47
 6  12  18  24  30  36  42  48

julia> subtile_a = @tile a static((3,4)) (1, 2) # partition a into subtiles of shape 3 x 4, returns the subtile at (1,2)
3×4 MoYeArray{Int64, 2, ViewEngine{Int64, Ptr{Int64}}, Layout{2, Tuple{Static.StaticInt{3}, Static.StaticInt{4}}, Tuple{Static.StaticInt{1}, Static.StaticInt{6}}}}:
 25  31  37  43
 26  32  38  44
 27  33  39  45

julia> workitems_a = @parallelize subtile_a static((3,2)) (1,1) # 3 x 2 threads, returns what thread (1,1) is working on
1×2 MoYeArray{Int64, 2, ViewEngine{Int64, Ptr{Int64}}, Layout{2, Tuple{Static.StaticInt{1}, Static.StaticInt{2}}, Tuple{Static.StaticInt{0}, Static.StaticInt{12}}}}:
 25  37

julia> for i in eachindex(workitems_a)
           workitems_a[i] = 0
       end

julia> a
6×8 MoYeArray{Int64, 2, ViewEngine{Int64, Ptr{Int64}}, Layout{2, Tuple{Static.StaticInt{6}, Static.StaticInt{8}}, Tuple{Static.StaticInt{1}, Static.StaticInt{6}}}}:
 1   7  13  19   0  31   0  43
 2   8  14  20  26  32  38  44
 3   9  15  21  27  33  39  45
 4  10  16  22  28  34  40  46
 5  11  17  23  29  35  41  47
 6  12  18  24  30  36  42  48
 
julia> @tile subtile_a static((3,1)) (1, 2) # a subtile can itself be tiled again
3×1 MoYeArray{Int64, 2, ViewEngine{Int64, Ptr{Int64}}, Layout{2, Tuple{Static.StaticInt{3}, Static.StaticInt{1}}, Tuple{Static.StaticInt{1}, Static.StaticInt{0}}}}:
 31
 32
 33
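
To make the index arithmetic behind the Quick Start more concrete, here is a minimal sketch in plain Julia that reproduces the same numbers with `Base` views only. It does not use MoYe and is not how MoYe implements `@tile` or `@parallelize`; the helpers `tile_view` and `thread_view` are hypothetical names introduced for illustration, and the strided thread distribution is an assumption inferred from the output shown above.

```julia
# Plain-Julia sketch (no MoYe): mimic the Quick Start with Base views.
data = collect(1:48)
a = reshape(data, 6, 8)   # column-major 6×8 array sharing memory with `data`

# Hypothetical helper: the tile at `coord` when `x` is cut into blocks of shape `tile`.
function tile_view(x::AbstractMatrix, tile::NTuple{2,Int}, coord::NTuple{2,Int})
    rows = ((coord[1] - 1) * tile[1] + 1):(coord[1] * tile[1])
    cols = ((coord[2] - 1) * tile[2] + 1):(coord[2] * tile[2])
    return view(x, rows, cols)
end

# Hypothetical helper: the elements handled by thread `coord` when a grid of
# `threads` threads strides over `x` (assumed distribution, see lead-in).
function thread_view(x::AbstractMatrix, threads::NTuple{2,Int}, coord::NTuple{2,Int})
    return view(x, coord[1]:threads[1]:size(x, 1), coord[2]:threads[2]:size(x, 2))
end

subtile = tile_view(a, (3, 4), (1, 2))       # rows 1:3, columns 5:8 of `a`
work = thread_view(subtile, (3, 2), (1, 1))  # 1×2 view holding 25 and 37
work .= 0                                    # writes through to `a` and `data`

println(a[1, :])   # [1, 7, 13, 19, 0, 31, 0, 43]
```

Because every step returns a view rather than a copy, zeroing `work` modifies the original storage, which mirrors the behaviour of the MoYe session above.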

Tile Iterator

julia> data = collect(1:36);

julia> A = MoYeArray(data, @Layout((4,9)))
4×9 MoYeArray{Int64, 2, ViewEngine{Int64, Ptr{Int64}}, Layout{2, Tuple{Static.StaticInt{4}, Static.StaticInt{9}}, Tuple{Static.StaticInt{1}, Static.StaticInt{4}}}} with indices static(1):static(4)×static(1):static(9):
 1  5   9  13  17  21  25  29  33
 2  6  10  14  18  22  26  30  34
 3  7  11  15  19  23  27  31  35
 4  8  12  16  20  24  28  32  36

julia> tiled_A = zipped_divide(A, (@Layout(2), @Layout(3))) # 2 × 3 tile
6×6 MoYeArray{Int64, 2, ViewEngine{Int64, Ptr{Int64}}, Layout{2, Tuple{Tuple{Static.StaticInt{2}, Static.StaticInt{3}}, Tuple{Static.StaticInt{2}, Static.StaticInt{3}}}, Tuple{Tuple{Static.StaticInt{1}, Static.StaticInt{4}}, Tuple{Static.StaticInt{2}, Static.StaticInt{12}}}}} with indices static(1):static(6)×static(1):static(6):
  1   3  13  15  25  27
  2   4  14  16  26  28
  5   7  17  19  29  31
  6   8  18  20  30  32
  9  11  21  23  33  35
 10  12  22  24  34  36

julia> for i in axes(tiled_A, 2)
           @show view(tiled_A, :, i)
       end
view(tiled_A, :, i) = [1, 2, 5, 6, 9, 10]
view(tiled_A, :, i) = [3, 4, 7, 8, 11, 12]
view(tiled_A, :, i) = [13, 14, 17, 18, 21, 22]
view(tiled_A, :, i) = [15, 16, 19, 20, 23, 24]
view(tiled_A, :, i) = [25, 26, 29, 30, 33, 34]
view(tiled_A, :, i) = [27, 28, 31, 32, 35, 36]
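
The tile ordering above follows directly from the shape and stride printed in the type of `tiled_A`: the first mode, shape `(2,3)` with stride `(1,4)`, walks the elements inside one tile, and the second mode, shape `(2,3)` with stride `(2,12)`, walks across tiles. The following plain-Julia sketch recomputes the six tiles from those numbers alone. It does not call MoYe, and `mode_offsets` is a hypothetical helper written for this illustration.

```julia
# Plain-Julia sketch (no MoYe): rebuild the tiles from shape/stride pairs.

# Hypothetical helper: zero-based offsets of one layout mode, first dimension fastest.
function mode_offsets(shape::NTuple{N,Int}, stride::NTuple{N,Int}) where {N}
    return vec([sum((Tuple(I) .- 1) .* stride) for I in CartesianIndices(shape)])
end

tile_offsets = mode_offsets((2, 3), (1, 4))    # offsets inside one tile
rest_offsets = mode_offsets((2, 3), (2, 12))   # offset of each tile's first element

data = collect(1:36)
tiles = [data[1 .+ r .+ tile_offsets] for r in rest_offsets]

for t in tiles
    println(t)
end
```

Each printed vector should match the corresponding `view(tiled_A, :, i)` line above, since both enumerate the same offsets in the same order.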

Current Status

Tensor Core MMA: supported at the traits level (see mma_unpack).

A future goal is to support composable tiled MMA. Contributions from the community are welcome: if you are interested in helping, please get in touch or submit a pull request.

Notes on WMMA

Supporting WMMA is not a priority for MoYe because it is considered an outdated class of API.