# Transformers.jl

Julia implementation of Transformer models.

This is the documentation of Transformers.jl: the Julia solution for using Transformer models, based on Flux.jl.
## Installation

In the Julia REPL:

```julia
julia> ]add Transformers
```

For GPU support, make sure CuArrays is runnable on your machine:

```julia
julia> ]add CuArrays; build
```
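As a quick sanity check that the GPU backend works, you can try loading CuArrays directly (the `CuArrays.functional()` call is an assumption about the installed CuArrays version, not part of Transformers.jl itself):

```julia
using CuArrays

# Returns true when a working CUDA toolkit and driver were found
# (assumption: the installed CuArrays version provides `functional()`)
CuArrays.functional()
```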
## Implemented models

You can find the example code in the `example` folder.
- Attention is all you need
- Improving Language Understanding by Generative Pre-Training
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
## Example

Using the pretrained BERT model with Transformers.jl:
```julia
using Transformers
using Transformers.Basic
using Transformers.Pretrain

# Accept the data-download prompt automatically
ENV["DATADEPS_ALWAYS_ACCEPT"] = true

# Load the pretrained uncased BERT-base model, its WordPiece table, and the tokenizer
bert_model, wordpiece, tokenizer = pretrain"bert-uncased_L-12_H-768_A-12"
vocab = Vocabulary(wordpiece)

# Tokenize two sentences into WordPiece tokens
text1 = "Peter Piper picked a peck of pickled peppers" |> tokenizer |> wordpiece
text2 = "Fuzzy Wuzzy was a bear" |> tokenizer |> wordpiece

# Concatenate them with the special [CLS]/[SEP] markers
text = ["[CLS]"; text1; "[SEP]"; text2; "[SEP]"]
@assert text == [
    "[CLS]", "peter", "piper", "picked", "a", "peck", "of", "pick", "##led", "peppers", "[SEP]",
    "fuzzy", "wu", "##zzy", "was", "a", "bear", "[SEP]"
]

# Map tokens to vocabulary indices and build the per-token segment indices
token_indices = vocab(text)
segment_indices = [fill(1, length(text1) + 2); fill(2, length(text2) + 1)]

sample = (tok = token_indices, segment = segment_indices)

# Run the embedding layer, then the transformer encoder stack
bert_embedding = sample |> bert_model.embed
feature_tensors = bert_embedding |> bert_model.transformers
```
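To make the indexing step above concrete, here is a dependency-free sketch of what a vocabulary lookup does. The `Dict`-based `toy_vocab`, the `lookup` helper, and the `"[UNK]"` fallback are illustrative assumptions, not the actual `Vocabulary` implementation:

```julia
# Toy stand-in for Vocabulary: map tokens to integer ids, falling back
# to the id of "[UNK]" for out-of-vocabulary tokens.
toy_vocab = Dict("[UNK]" => 1, "[CLS]" => 2, "[SEP]" => 3,
                 "peter" => 4, "piper" => 5, "was" => 6, "a" => 7, "bear" => 8)

lookup(vocab, tokens) = [get(vocab, tok, vocab["[UNK]"]) for tok in tokens]

tokens = ["[CLS]", "peter", "piper", "[SEP]", "was", "a", "bear", "[SEP]"]
ids = lookup(toy_vocab, tokens)
# ids == [2, 4, 5, 3, 6, 7, 8, 3]
```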
## Module Hierarchy

- `Transformers.Basic`: basic functionality of Transformers.jl, providing the Transformer encoder/decoder implementation and other convenience functions.
- `Transformers.Pretrain`: functions for downloading and loading pretrained models.
- `Transformers.Stacks`: helper struct and DSL for stacking functions/layers.
- `Transformers.Datasets`: functions for loading some common datasets.
- `Transformers.GenerativePreTrain`: implementation of the GPT-1 model.
- `Transformers.BidirectionalEncoder`: implementation of the BERT model.
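As a rough sketch of how the stacking helpers are meant to be used (the exact `@nntopo` syntax and `Stack` constructor shown here are assumptions based on the NNTopo DSL covered in the outline, and the `Dense` layer sizes are arbitrary):

```julia
using Flux: Dense
using Transformers.Stacks

# Assumed usage: @nntopo declares the dataflow x -> h -> y, and Stack
# pairs that topology with one layer per arrow.
model = Stack(@nntopo(x => h => y),
              Dense(10, 5),   # x -> h
              Dense(5, 2))    # h -> y

model(rand(Float32, 10))  # a length-2 output vector (sketch)
```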
## Outline

- Tutorial
- Transformers.Basic
- Transformers.Stacks
  - The Stack NNTopo DSL
    - NNTopo Syntax
      - "Chain" the functions
      - Loop unrolling
      - Multiple argument & jump connection
      - Specify the variables you want
      - Interpolation
      - Nested Structure
      - Collect Variables
  - Stack
- Transformers.Pretrain
- Transformers.GenerativePreTrain
- Transformers.BidirectionalEncoder
- Transformers.Datasets (not complete)