Transformers.jl

Julia implementation of Transformer models

This is the documentation of Transformers.jl: a Julia package for working with Transformer models, built on Flux.jl.

Installation

In the Julia REPL:

julia> ]add Transformers

To use the GPU, make sure CUDA.jl is runnable on your machine:

julia> ]add CUDA; build
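
Once CUDA.jl is working, models and data can be moved to the GPU. The snippet below is a minimal sketch, assuming the enable_gpu and todevice helpers exported by Transformers.jl; bert_model and sample refer to the BERT example further down:

using CUDA
using Transformers

enable_gpu(true)                   # assumption: make todevice move arrays onto the GPU

gpu_model  = todevice(bert_model)  # bert_model and sample are defined in the example below
gpu_sample = todevice(sample)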

Implemented models

You can find the example code in the example folder of the repository.

Example

Using a pretrained BERT model with Transformers.jl.

using Transformers
using Transformers.Basic
using Transformers.Pretrain

ENV["DATADEPS_ALWAYS_ACCEPT"] = true

# load the pretrained uncased BERT-base weights (downloaded on first use)
bert_model, wordpiece, tokenizer = pretrain"bert-uncased_L-12_H-768_A-12"
vocab = Vocabulary(wordpiece)

text1 = "Peter Piper picked a peck of pickled peppers" |> tokenizer |> wordpiece
text2 = "Fuzzy Wuzzy was a bear" |> tokenizer |> wordpiece

text = ["[CLS]"; text1; "[SEP]"; text2; "[SEP]"]
@assert text == [
    "[CLS]", "peter", "piper", "picked", "a", "peck", "of", "pick", "##led", "peppers", "[SEP]", 
    "fuzzy", "wu", "##zzy",  "was", "a", "bear", "[SEP]"
]

token_indices = vocab(text)  # map each wordpiece token to its vocabulary index
# segment 1 covers "[CLS] text1 [SEP]", segment 2 covers "text2 [SEP]"
segment_indices = [fill(1, length(text1)+2); fill(2, length(text2)+1)]

sample = (tok = token_indices, segment = segment_indices)

bert_embedding = sample |> bert_model.embed                    # token, segment, and position embeddings
feature_tensors = bert_embedding |> bert_model.transformers    # run the stack of Transformer encoder layers
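
Both bert_embedding and feature_tensors use a (hidden size, sequence length) layout. As a hedged illustration, assuming the 768-dimensional hidden size of this base configuration, the first column corresponds to the [CLS] token and is commonly used as a sequence-level feature:

# assumption: base BERT hidden size of 768, one column per token in text
@assert size(feature_tensors) == (768, length(text))

cls_feature = feature_tensors[:, 1]   # feature vector of the [CLS] token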

Module Hierarchy

Transformers.Basic: Basic functionality of Transformers.jl; provides the Transformer encoder/decoder implementation and other convenience functions.

Transformers.Pretrain: Functions for downloading and loading pretrained models.

Transformers.Stacks: Helper struct and DSL for stacking functions/layers (see the sketch after this list, which combines it with a Transformer block from Transformers.Basic).

Transformers.Datasets: Functions for loading some common datasets.

Transformers.GenerativePreTrain: Implementation of the GPT-1 model.

Transformers.BidirectionalEncoder: Implementation of the BERT model.
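
As a rough illustration of the Basic and Stacks modules, the following minimal sketch (not taken from the package documentation) assumes Stack and @nntopo are exported by Transformers and that Transformer(hidden, heads, head_size, ffn_size) constructs a single encoder block:

using Transformers
using Transformers.Basic

# assumption: each Transformer(512, 8, 64, 2048) is one encoder block with hidden size 512,
# 8 attention heads, 64 dimensions per head, and a 2048-dimensional feed-forward layer
encoder = Stack(
    @nntopo(x => x => x),          # @nntopo DSL: thread x through both layers in order
    Transformer(512, 8, 64, 2048),
    Transformer(512, 8, 64, 2048),
)

h = encoder(randn(Float32, 512, 10))   # input layout: (hidden size, sequence length)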
