FranklinParser.EOSConstant
EOS

Mark the end of the string to parse (helps with corner cases where a token ends a document without being followed by a space).

FranklinParser.F_DIV_OPENConstant
F_DIV_OPEN

Finder for @@div openers, checking that the div name matches a simplified rule for allowed CSS class names. The complete rule is -?[_a-zA-Z]+[_a-zA-Z0-9-]*, which we simplify here to [a-zA-Z]+[_a-zA-Z0-9-]. We also allow , as a separator, so @@d1,d2 is allowed and corresponds to passing two classes: class="d1 d2".
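A hypothetical regex capturing this simplified rule plus the comma separator (an illustration, not the package's actual finder):

```julia
# Hypothetical sketch (not the package's actual pattern): one or more
# simplified class names, separated by commas.
const DIV_NAME = r"^[a-zA-Z][_a-zA-Z0-9-]*(?:,[a-zA-Z][_a-zA-Z0-9-]*)*$"

@assert occursin(DIV_NAME, "d1,d2")    # two classes --> class="d1 d2"
@assert occursin(DIV_NAME, "my-div")
@assert !occursin(DIV_NAME, "1bad")    # must start with a letter
```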

FranklinParser.F_EMOJIConstant
F_EMOJI

Finder for emojis (those will have to be validated separately to check Julia recognises them).

FranklinParser.F_LANG_3Constant
F_LANG_*

Finder for code blocks, i.e. something like a sequence of 3, 4 or 5 backticks followed by a valid combination of letters defining a language.

FranklinParser.F_LINE_RETURNConstant
F_LINE_RETURN

Finder for a line return (`\n`) followed by any number of whitespaces or tabs. These will subsequently be checked to see whether they are followed by something that constitutes a list item or not.

FranklinParser.F_LX_COMMANDConstant
F_LX_COMMAND

Finder for a LaTeX command. The first character must be in [a-zA-Z]. Numbers are allowed (there is no ambiguity because \com1 is not allowed to mean \com{1}, unlike in LaTeX). Underscores are allowed inside the command but not at the very start or very end, to avoid confusion with the escaped _ character and with markdown emphasis respectively. The character * is not allowed anywhere (including at the end). See also the check pattern.

FranklinParser.HTML_ENTITY_PATConstant
HTML_ENTITY_PAT

Pattern for an html entity. Ref: https://dev.w3.org/html5/html-author/charref.

Examples:

  • &sqcap;
  • &SquareIntersection;
  • &#8851;
  • &#x2293;

(all of which render as ⊓).

Note: the longest entity is &CounterClockwiseContourIntegral;, so we cap the maximum number of characters at 32.
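In the same spirit, a hypothetical regex (an illustration, not the package's actual HTML_ENTITY_PAT) covering named, decimal and hex references, with the name branch capped in length:

```julia
# Hypothetical sketch: '&' + (name | decimal | hex reference) + ';'.
const ENTITY_PAT = r"&(?:[a-zA-Z][a-zA-Z0-9]{1,31}|#[0-9]{1,6}|#x[0-9a-fA-F]{1,6});"

@assert occursin(ENTITY_PAT, "&sqcap;")
@assert occursin(ENTITY_PAT, "&#8851;")
@assert occursin(ENTITY_PAT, "&#x2293;")
@assert !occursin(ENTITY_PAT, "& nope;")   # '&' must be glued to the reference
```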

FranklinParser.LX_COMMAND_PATConstant
LX_COMMAND_PAT

Allowed LaTeX command names. Underscores are allowed inside the command but not at the extremities. The star * is not allowed anywhere.

Examples:

  • \com (valid)
  • \ab1_cd* (invalid: * is not allowed)
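The naming rule can be sketched as a regex (hypothetical, not the package's actual LX_COMMAND_PAT):

```julia
# Hypothetical sketch: a letter first, letters/digits after, underscores
# only strictly inside the name, and no '*' anywhere.
const LX_NAME = r"^\\[a-zA-Z][a-zA-Z0-9]*(?:_[a-zA-Z0-9]+)*$"

@assert occursin(LX_NAME, "\\com")
@assert occursin(LX_NAME, "\\ab1_cd")
@assert !occursin(LX_NAME, "\\ab1_cd*")   # '*' not allowed
@assert !occursin(LX_NAME, "\\_com")      # no leading underscore
@assert !occursin(LX_NAME, "\\com_")      # no trailing underscore
```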

FranklinParser.MD_IGNOREConstant
MD_IGNORE

Tokens that may be left over after partition but should be ignored in text blocks.

FranklinParser.MD_TOKENSConstant
MD_TOKENS

Dictionary of tokens for Markdown. Note that for each, there may be several possibilities to consider in which case the order is important: the first case that works will be taken.

Dev: the F_* finders use greedy matching, see `mdutils.jl`.

Try: https://spec.commonmark.org/dingus

FranklinParser.SPACE_CHARConstant
SPACE_CHAR

List of characters that correspond to the \s regex, plus EOS.

Ref: https://github.com/JuliaLang/julia/blob/master/base/strings/unicode.jl.

FranklinParser.AbstractSpanType
AbstractSpan

Section of a parent String with a specific meaning for Franklin. All subtypes of AbstractSpan must have an ss field corresponding to the substring associated with the span. This field is necessarily of type SubString{String}.

FranklinParser.BlockType
Block <: AbstractSpan

Blocks are defined by an opening and a closing Token and may be nested. For instance, brace blocks are formed of an opening { and a closing }.

FranklinParser.BlockTemplateType
BlockTemplate

Template for a block to find. A block goes from a token with a given opening name to one of several possible closing names. Blocks can allow or disallow nesting: for instance, brace blocks can be nested {.{.}.} but comments cannot. When nesting is enabled, Franklin will try to find the closing token taking the balance of opening and closing tokens into account.

FranklinParser.ChompType
Chomp

Structure to encapsulate the rules around a token, such as whether it may occur at the end of a string, which characters are allowed to follow it and, in the greedy case, which characters are allowed within it.

FranklinParser.GroupType
Group <: AbstractSpan

A Group contains one or more Blocks and will map to either a Paragraph or something else such as a code block.

FranklinParser.TokenType
Token <: AbstractSpan

A token is a subtype of AbstractSpan which typically determines the start or end of a block. It can also be used for special characters.

FranklinParser.TokenFinderType
TokenFinder

Structure to find a token, keeping track of how many characters should be seen, rules with respect to positioning or following characters (see Chomp), and possibly a validator that checks whether a candidate respects a rule.

FranklinParser.TextBlockFunction
TextBlock

Spans of text which should be left to the fallback engine (for instance CommonMark). Text blocks can also contain inner tokens that are non-block delimiters, such as emojis or html entities.

FranklinParser._find_blocks!Function
_find_blocks!(...)

Helper function to resolve each of the passes looking at a different set of templates.

FranklinParser.aggregate!Method
aggregate!(blocks, items, acc, case)

Merge a set of blocks into a parent block. For instance, at this point there may be multiple BLOCKQUOTE_LINE blocks; this function aggregates them into one BLOCKQUOTE block.

Arguments

* blocks: the current vector of blocks we're working with
* items:  list of names of blocks that would trigger the aggregation
* acc:    list of names of blocks that would be taken in the aggregation
* case:   name of the block resulting from the aggregation

FranklinParser.checkMethod
check(tokenfinder, ss)

Check whether a substring verifies the regex of a token finder.

FranklinParser.contentMethod
content(block)

Return the content of a Block; for instance, the content of a {...} block is the ... part. Note: EOS is a special '0 length' case to deal with the fact that a text can end with a token (which would then be an overlapping token and an EOS).

FranklinParser.dedentMethod
dedent(s)

Remove the common leading whitespace from each non-empty line. The returned text is decoupled from the original text (forced to String).

This is used in the context of lxdef in Franklin for instance, see tryformlxdef.
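A minimal sketch of the idea (assuming plain space indentation; the actual implementation may differ):

```julia
# Strip the common leading whitespace of all non-empty lines and
# return a fresh String, decoupled from the input.
function dedent_sketch(s::AbstractString)
    lines   = split(s, '\n')
    indents = [length(l) - length(lstrip(l)) for l in lines if !isempty(strip(l))]
    pad     = isempty(indents) ? 0 : minimum(indents)
    return String(join((isempty(strip(l)) ? l : l[pad+1:end] for l in lines), '\n'))
end

dedent_sketch("    a\n      b\n    c")  # → "a\n  b\nc"
```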

FranklinParser.env_not_closed_exceptionMethod
env_not_closed_exception(b)

Throw a FranklinParserException caused by an environment opened by a block b and either not formed properly or left open.

FranklinParser.find_blocksMethod
find_blocks(tokens, templates)

Given a list of tokens and a dictionary of block templates, find all blocks matching templates. The blocks are sorted by order of appearance and inner blocks are weeded out.

FranklinParser.find_tokensMethod
find_tokens(s, templates)

Go through a text left to right, one (valid) char at a time, and keep track of sequences of chars that match specific tokens. The list of tokens found is returned.

Arguments

  • s: the initial text
  • templates: dictionary of possible tokens

Errors

This should not throw any error, everything should be explicitly handled by a code path.

FranklinParser.fixed_lookaheadMethod
fixed_lookahead(tokenfinder, candidate, at_eos)

Applies a fixed lookahead step corresponding to a token finder. This is used as a helper function in find_tokens.

FranklinParser.form_dbb!Method
form_dbb!(blocks)

Find CU_BRACKETS blocks that start with {{ and end with }} and mark them as :DBB.

FranklinParser.form_links!Method
form_links!(blocks)

Here we catch the following:

* [A]     LINK_A   for <a href="ref(A)">html(A)</a>
* [A][B]  LINK_AR  for <a href="ref(B)">html(A)</a>
* [A](B)  LINK_AB  for <a href="escape(B)">html(A)</a>
* ![A]    IMG_A    <img src="ref(A)" alt="esc(A)" />
* ![A](B) IMG_AB   <img src="escape(B)" alt="esc(A)" />
* [A]: B  REF      (--> aggregate B, will need to distinguish later)

where 'A' is necessarily non-empty and 'B' may be empty.

Note: for simplicity, we currently DO NOT support links with titles such as [A]: B C; this avoids having to check whether B is a link and C is text. If the user wants links with titles, they should create a command for it. We also do not support link destinations between <...>.

Note: in the case of a LINK_A, we check around if the previous non whitespace character and the next non whitespace character don't happen to be } {. In that specific case, the link is

FranklinParser.forward_matchFunction
forward_match(refstring, next_chars, is_followed)

Return a TokenFinder corresponding to a forward lookup, checking whether a sequence of characters matches refstring and is followed (or, if is_followed==false, not followed) by a char out of the list next_chars.

FranklinParser.fromMethod
from(o)

Given a SubString ss, return a valid string index where the substring starts. If ss is a String, return 1. Returns an Int.
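A sketch of the idea relying on SubString internals (the offset field; the package may compute this differently):

```julia
# Hypothetical illustration: the first valid index of a SubString in its
# parent is its byte offset + 1; a plain String starts at 1.
from_sketch(ss::SubString) = ss.offset + 1
from_sketch(::String)      = 1

s  = "hello world"
ss = SubString(s, 7, 11)   # "world"
from_sketch(ss)            # → 7
```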

FranklinParser.get_classesMethod
get_classes(divblock)

Return the class(es) of a div block. E.g. @@c1,c2 will return "c1 c2" so that it can be injected in a <div class="...">.
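The transformation can be sketched in one line (a hypothetical helper, not the package function):

```julia
# Turn the class part of an `@@c1,c2` opener into a space-separated list.
div_classes(s::AbstractString) = join(split(strip(s, '@'), ','), ' ')

div_classes("@@c1,c2")  # → "c1 c2"
```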

FranklinParser.greedy_lookaheadMethod
greedy_lookahead(tokenfinder, nchars, probe_char)

Applies a greedy lookahead step corresponding to a token finder. This is used as a helper function in find_tokens.

FranklinParser.greedy_matchMethod
greedy_match(head_chars, tail_chars, check)

Lazily accept the next char and stop as soon as it fails to verify λ(c).

FranklinParser.insertMethod
insert(token)

For tokens representing special characters, insert the relevant string.

FranklinParser.md_grouperMethod
md_grouper(blocks)

Form begin-end spans keeping track of tokens, and group text and inline blocks after partition; this helps in forming paragraphs.

FranklinParser.next_charsFunction
next_chars(o, n)

Return the n characters just after the object, or an empty vector if there aren't enough characters after it.

FranklinParser.parent_stringMethod
parent_string(o)

Return the parent string corresponding to o, i.e. o itself if it is a String, or the parent string if o is a SubString. Returns a String.
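A minimal sketch consistent with this description (the actual method may differ):

```julia
# A String is its own parent; a SubString carries its parent in `string`.
parent_str(s::String) = s
parent_str(s::SubString{String}) = s.string

parent_str(SubString("abcdef", 2, 4))  # → "abcdef"
```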

FranklinParser.partitionMethod
partition(s, tokenizer, blockifier, tokens; disable, postproc)

Go through a piece of text, either with an existing tokenization or an empty one, tokenize if needed with the given tokenizer, blockify with the given blockifier, and return a partition of the text into a vector of Blocks.

Args

* s:          the text to partition
* tokenizer:  function used to tokenize s if no tokenization is given
* blockifier: function used to form blocks out of the tokens
* tokens:     existing tokenization, if any

KwArgs

* disable:  list of token names to ignore (e.g. if we want to allow math)
* postproc: postprocessing to apply to the blocks before returning them

FranklinParser.previous_charsFunction
previous_chars(o, n)

Return the n characters just before the object, or an empty vector if there aren't enough characters before it.

FranklinParser.process_emphasis_tokens!Method
process_emphasis_tokens!(tokens)

Process emphasis token candidates and either take them or discard them if they don't look correct.

  • sTs: token T is discarded if s is a space on both sides
  • xTs: token T is a valid CLOSE if x is a character and s a space
  • sTx: token T is a valid OPEN if x is a character and s a space
  • xTy: token T is a valid MIXED if x and y are characters

FranklinParser.process_line_return!Method
process_line_return!(blocks, tokens, i)

Process a line return followed by any number of white spaces and one or more characters. Depending on these characters, it will lead to a different interpretation and an update of the token.

if the next non-space character(s) is/are:

  • another line return --> interpret as a paragraph break (double line skip)
  • two -, * or _ --> a hrule candidate that will need to be validated later
  • one *, +, -, etc. --> an item candidate
  • | --> a table row candidate
  • > --> a blockquote line (starts with >).

We disambiguate the different cases based on the two characters after the whitespaces of the line return (the line return token captures [ ]*).

FranklinParser.remove_inner!Method
remove_inner!(blocks)

Remove blocks which are part of larger blocks (these will get re-formed and re-processed at a later step).

FranklinParser.split_argsMethod
split_args(s)

Take a string like 'foo "bar baz" 1' and return it split along whitespaces while preserving quoted strings, i.e. ["foo", "\"bar baz\"", "1"].
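The splitting rule can be sketched with a regex (illustrative; the package's implementation may handle escaping differently):

```julia
# Match either a quoted span or a run of non-whitespace characters.
split_args_sketch(s) = [String(m.match) for m in eachmatch(r"\"[^\"]*\"|\S+", s)]

split_args_sketch("foo \"bar baz\" 1")  # → ["foo", "\"bar baz\"", "1"]
```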

FranklinParser.subsMethod
subs(...)

Facilitate taking a SubString of an AS. The bounds given are expected to be valid String indices. Returns a SubString.

FranklinParser.toMethod
to(o)

Given a SubString ss, returns a valid string index where the substring ends. If ss is a String, return the last index. Returns an Int.

FranklinParser.tokenizer_factoryMethod
tokenizer_factory(; templates, postproc)

Arguments:

* templates: a dictionary of matchers used to find tokens.
* postproc:  a function to apply to the tokens after they've been found, e.g. to merge or filter them.

Returns:

A function that takes a string and returns a vector of tokens.
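The shape of such a factory can be sketched as a closure (names here are illustrative, not the package API):

```julia
# Close over the templates and the postprocessing step and return a
# one-argument tokenizer. `find` stands in for a token-finding function.
make_tokenizer(find, templates; postproc = identity) =
    s -> postproc(find(s, templates))

# Toy usage with a fake "finder" that just splits on whitespace:
tok = make_tokenizer((s, _) -> split(s), nothing)
tok("a b c")  # → ["a", "b", "c"]
```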