CitableParserBuilder.AnalysisType

Citable analysis of a string value.

An Analysis has five members: a token string value, and four abbreviated URNs, one each for the lexeme, form, rule and stem.

Base.:==Method

Override Base.== for Analysis.

==(a1, a2)
Base.:==Method

Override Base.== for AnalyzedToken.

==(atoken1, atoken2)
Base.:==Method

Override Base.== for AbbreviatedUrn.

==(au1, au2)
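
The three `==` overrides above follow a common Julia pattern. The sketch below illustrates that pattern on a hypothetical `DemoToken` type. The type and its field names are assumptions for illustration, not the package's actual definitions. A struct holding a vector needs an explicit `==`, because the default falls back to identity comparison of the vector.

```julia
# Hypothetical stand-in type; the field names are assumptions, not the package's API.
struct DemoToken
    text::String
    analyses::Vector{String}
end

# Compare field by field so that tokens with equal contents are equal.
Base.:(==)(t1::DemoToken, t2::DemoToken) =
    t1.text == t2.text && t1.analyses == t2.analyses

# Keep hash consistent with == so the type behaves in Dicts, Sets and unique().
Base.hash(t::DemoToken, h::UInt) = hash(t.analyses, hash(t.text, h))

t1 = DemoToken("arma", ["demo.n1"])
t2 = DemoToken("arma", ["demo.n1"])
println(t1 == t2)    # equal by content
println(t1 === t2)   # not identical: the two vectors are distinct objects
```

Defining `hash` alongside `==` is a design choice of this sketch; whether the package does the same is not stated in the docstrings above.
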
Base.eltypeMethod

Implement base element type for AnalyzedTokens.

eltype(analyses)
Base.iterateMethod

Implement iteration with state for AnalyzedTokens.

iterate(analyses, state)
Base.iterateMethod

Implement iteration for AnalyzedTokens.

iterate(analyses)
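
As a rough illustration of how `eltype` and the two `iterate` methods fit together, here is a minimal sketch of Julia's iteration protocol on a hypothetical wrapper type. `DemoCollection` and its `items` field are assumptions; the real AnalyzedTokens type may store its data differently.

```julia
# Hypothetical wrapper around a vector; not the package's AnalyzedTokens type.
struct DemoCollection
    items::Vector{String}
end

# First call: return the first element and the next state, or nothing if empty.
Base.iterate(c::DemoCollection) =
    isempty(c.items) ? nothing : (c.items[1], 2)

# Later calls: the state is the index of the next element to return.
Base.iterate(c::DemoCollection, state) =
    state > length(c.items) ? nothing : (c.items[state], state + 1)

# Element type and length let collect() allocate a correctly typed result.
Base.eltype(::Type{DemoCollection}) = String
Base.length(c::DemoCollection) = length(c.items)

c = DemoCollection(["a", "b", "c"])
for item in c
    println(item)
end
```

With these methods in place, generic functions such as `collect`, `map` over generators, and `for` loops work on the wrapper without further code.
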
Base.objectidMethod

Default implementation of the function that finds the object identifier of an AbbreviatedUrn.

objectid(au)
Base.showMethod

Override Base.show for AnalyzedTokens.

show(io, analyses)
Base.showMethod

Override Base.show for AnalyzedToken.

show(io, atoken)
Base.showMethod

Override Base.show for AbbreviatedUrn.

show(io, au)
CitableBase.cexMethod

Format an AnalyzedTokens collection as a delimited-text string.

cex(analyses; delimiter)

Required function for Citable abstraction.

CitableBase.cexMethod

Serialize an AnalyzedToken as delimited text (required for Citable interface).

cex(at; delimiter)

Uses abbreviated URNs. These can be expanded to full CITE2 URNs when read back with a URN registry, or the delimited function can be used with a URN registry to write full CITE2 URNs.

CitableBase.fromcexMethod

Parse a one-line delimited-text representation into an AnalyzedToken, using abbreviated URNs for identifiers. Note that for a single CEX line, the AnalyzedToken will have a single Analysis in its vector of analyses.

fromcex(s, ; delimiter, configuration, strict)
CitableBase.fromcexMethod

Parse a delimited-text string into an AnalyzedTokens collection.

fromcex(trait, s, ; delimiter, configuration, strict)
CitableBase.labelMethod

Label for analyses.

label(analyses)

Required function for Citable abstraction.

CitableBase.labelMethod

Label for AnalyzedToken (required for Citable interface).

label(at)
CitableBase.urnMethod

Unique identifier for AnalyzedToken (required for Citable interface).

urn(at)
CitableBase.urntypeMethod

Type of URN identifying analyses in an AnalyzedTokens collection.

urntype(analyses)

Required function for Citable abstraction.

CitableBase.urntypeMethod

Identify URN type for an AnalyzedToken as CtsUrn.

urntype(at)

Required function for Citable abstraction.

CitableParserBuilder.abbreviateMethod

Constructs an AbbreviatedUrn string from a Cite2Urn.

abbreviate(urn)

Example:

julia> abbreviate(Cite2Urn("urn:cite2:kanones:lsj.v1:n123"))
"lsj.n123"

Example: a pipeline abbreviating a Cite2Urn and forming a LexemeUrn from the abbreviated string value.

julia> Cite2Urn("urn:cite2:kanones:lsj.v1:n123") |> abbreviate |> LexemeUrn
LexemeUrn("lsj", "n123")
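
The abbreviation shown in the examples can be approximated on plain strings. The sketch below is not the package's implementation: `demo_abbreviate` is a hypothetical helper that assumes a five-part, colon-delimited CITE2 URN whose fourth part has the form `collection.version`.

```julia
# Hypothetical string-only helper; the real abbreviate works on Cite2Urn values.
function demo_abbreviate(urn::AbstractString)
    parts = split(urn, ":")
    length(parts) == 5 ||
        throw(ArgumentError("expected 5 colon-delimited components in $urn"))
    # Drop the version from "collection.version", keeping only the collection id.
    collection = first(split(parts[4], "."))
    string(collection, ".", parts[5])
end

println(demo_abbreviate("urn:cite2:kanones:lsj.v1:n123"))   # lsj.n123
```
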
CitableParserBuilder.analysisFunction

Parse a delimited-text representation into an Analysis. If the delimited-text form uses full Cite2Urns, these are abbreviated.

analysis(s)
analysis(s, delim)
CitableParserBuilder.delimitedMethod

Serialize an Analysis to delimited text. Abbreviated URNs are expanded to full CITE2 URNs using registry as the expansion dictionary.

delimited(a; delim, registry)
CitableParserBuilder.delimitedMethod

Serialize an AnalyzedTokens object as delimited text (required for Citable interface).

delimited(atcollection; delim, registry)

Uses abbreviated URNs. These can be expanded to full CITE2 URNs when read back with a URN registry, or the delimited function can be used with a URN registry to write full CITE2 URNs.

CitableParserBuilder.expandMethod

Constructs a Cite2Urn from an AbbreviatedUrn and a dictionary mapping collection identifiers in AbbreviatedUrns to full Cite2Urns for a versioned collection.
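
The registry lookup described here can be sketched with plain strings. Everything in this block is an assumption made for illustration: `demo_expand` is a hypothetical helper, and the registry is assumed to map a collection identifier to a versioned collection URN without a trailing colon, which may differ from the format the package expects.

```julia
# Hypothetical string-only helper; the real expand returns a Cite2Urn.
function demo_expand(abbr::AbstractString, registry::Dict{String,String})
    parts = split(abbr, ".")
    length(parts) == 2 ||
        throw(ArgumentError("expected collection.objectid, got $abbr"))
    collection = String(parts[1])
    objectid = String(parts[2])
    haskey(registry, collection) || throw(KeyError(collection))
    # Append the object identifier to the versioned collection URN.
    string(registry[collection], ":", objectid)
end

# Assumed registry format: collection id => versioned collection URN.
registry = Dict("lsj" => "urn:cite2:kanones:lsj.v1")
println(demo_expand("lsj.n123", registry))   # urn:cite2:kanones:lsj.v1:n123
```
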

CitableParserBuilder.fstsafeMethod

Compose SFST representation of an AbbreviatedUrn.

fstsafe(au)

Example:

julia> LexemeUrn("lexicon.lex123") |> fstsafe
"<u>lexicon\.lex123</u>"
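
A minimal sketch of the transformation in the example, assuming that the backslash in the output is a literal escape placed before the period and that the result is wrapped in `<u>` tags. `demo_fstsafe` is a hypothetical helper; the real function may escape further characters that are special in SFST.

```julia
# Hypothetical helper: escape periods and wrap the value in <u>...</u>.
demo_fstsafe(abbr::AbstractString) =
    string("<u>", replace(abbr, "." => "\\."), "</u>")

# println shows the raw characters, with a single backslash before the period.
println(demo_fstsafe("lexicon.lex123"))
```
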
CitableParserBuilder.lexemedictionaryMethod

From a vector of AnalyzedTokens and an index of tokens in a corpus, construct a dictionary keyed by lexeme, in which each lexeme maps to a further dictionary from surface forms to the passages where they occur.

lexemedictionary(parses, tokenindex)
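
The nested structure of the result can be sketched as follows. The input shapes here are assumptions made only for illustration: `parses` is taken to be a vector of `(token, lexemes)` tuples and `tokenindex` a dictionary from token strings to passage identifiers. The identifiers are invented, and `demo_lexemedictionary` is a hypothetical helper, not the package's function.

```julia
# Hypothetical helper building lexeme => (surface form => passages).
function demo_lexemedictionary(parses, tokenindex::Dict{String,Vector{String}})
    result = Dict{String,Dict{String,Vector{String}}}()
    for (token, lexemes) in parses
        passages = get(tokenindex, token, String[])
        for lex in unique(lexemes)
            # Create the inner dictionary for this lexeme on first sight.
            forms = get!(result, lex) do
                Dict{String,Vector{String}}()
            end
            forms[token] = passages
        end
    end
    result
end

# Invented demo data: two surface forms of one lexeme.
tokenindex = Dict(
    "arma" => ["urn:cts:demo:text.v1:1.1"],
    "armis" => ["urn:cts:demo:text.v1:2.3", "urn:cts:demo:text.v1:4.1"],
)
parses = [("arma", ["demo.n1"]), ("armis", ["demo.n1"])]
lexdict = demo_lexemedictionary(parses, tokenindex)
println(sort(collect(keys(lexdict["demo.n1"]))))
```
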
CitableParserBuilder.lexemehistoMethod

Compute histogram of lexemes in AnalyzedTokens.

lexemehisto(parses)

All distinct lexemes for a token are counted; there is no weighting of counts for lexically ambiguous tokens.
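
The counting rule can be illustrated with a small sketch. It assumes each token's parse is reduced to a vector of lexeme identifier strings, and `demo_lexemehisto` is a hypothetical helper. Each distinct lexeme of a token adds one to that lexeme's count, however many analyses the token has.

```julia
# Hypothetical helper: count lexemes, one per distinct lexeme per token.
function demo_lexemehisto(parses::Vector{Vector{String}})
    counts = Dict{String,Int}()
    for lexemes in parses
        for lex in unique(lexemes)
            counts[lex] = get(counts, lex, 0) + 1
        end
    end
    # Most frequent first; ties broken alphabetically for a stable order.
    sort(collect(counts); by = p -> (-p.second, p.first))
end

# Invented demo data: the first token is lexically ambiguous.
tokenlexemes = [["demo.n1", "demo.n2"], ["demo.n1"], ["demo.n1", "demo.n1"]]
histo = demo_lexemehisto(tokenlexemes)
for (lex, n) in histo
    println(lex, " ", n)
end
```

In this sketch the ambiguous first token contributes a full count to both of its lexemes, which is what "no weighting" means here.
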

CitableParserBuilder.parsecorpusMethod

Use a CitableParser to parse a CitableTextCorpus in which each citable node contains a single token of type LexicalToken.

parsecorpus(c, p; data, countinterval)

Returns an AnalyzedTokens object.

CitableParserBuilder.parselistMethod

Parse a list of tokens with a CitableParser.

parselist(vocablist, p; data, countinterval)

Returns a Dict mapping strings to a (possibly empty) vector of Analysis objects. Blank lines in input are silently ignored.
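
The shape of the return value can be sketched without the package. In this sketch the parser is assumed to be an ordinary function from a string to a vector of analysis strings, and `demo_parselist` is a hypothetical helper. Tokens with no analysis map to an empty vector, and blank entries are skipped.

```julia
# Hypothetical helper: apply a parsing function to each non-blank token.
function demo_parselist(vocab::Vector{String}, parsefn::Function)
    results = Dict{String,Vector{String}}()
    for word in vocab
        isempty(strip(word)) && continue   # silently ignore blank lines
        results[word] = parsefn(word)
    end
    results
end

# Invented lookup table standing in for a real parser.
lookup = Dict("arma" => ["demo.n1"])
demo_parser(w) = get(lookup, w, String[])

parsed = demo_parselist(["arma", "", "xyz"], demo_parser)
println(parsed["arma"])          # analyses found
println(isempty(parsed["xyz"]))  # no analysis: empty vector, key still present
```
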

CitableParserBuilder.parselistMethod

Read a list of tokens from file f and parse with p.

parselist(f, p, reader; data, countinterval)

Returns a Dict mapping strings to a (possibly empty) vector of Analysis objects.

CitableParserBuilder.parselistMethod

Read a list of tokens from URL u and parse with p.

parselist(u, p, reader; data, countinterval)

Returns a Dict mapping strings to a (possibly empty) vector of Analysis objects.

CitableParserBuilder.parsepassageMethod

Parse a CitablePassage with text for a single token with a CitableParser.

parsepassage(cn, p; data)

Returns a single AnalyzedToken.

CitableParserBuilder.parsepassageMethod

Parse a CitablePassage with text for a single token with a CitableParser.

parsepassage(ct, p; data)

Returns a single AnalyzedToken.

CitableParserBuilder.readfstMethod

Read SFST output from file f, and parse into a dictionary keying tokens to a (possibly empty) array of SFST strings.

readfst(f)
CitableParserBuilder.relationsblockFunction

Compose a CEX relationset block for a set of analyses.

relationsblock(urn, label, v; ...)
relationsblock(urn, label, v, delim; registry)