NewlineLexers.jl
Quote-aware newline finder.
julia> data = collect(codeunits(""" abc\n "efg\n" \n """));
# Quote-unaware newline finder
julia> newlines = findall(==(UInt8('\n')), data)
3-element Vector{Int64}:
5
11
14
# Lexer arguments after the input: escape byte, open-quote byte, close-quote byte
julia> l = Lexer(IOBuffer(data), UInt8('\\'), UInt8('"'), UInt8('"'));
julia> out = Int32[];
# Doesn't include the newline that appears inside a string
julia> find_newlines!(l, data, out); # max size of `data` is 2GiB
julia> out
2-element Vector{Int32}:
5
14
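To make the expected result concrete, here is a minimal scalar sketch of what "quote-aware" means in the example above. It is not how NewlineLexers.jl is implemented and is only meant as a reference to compare against. The name `naive_find_newlines` is hypothetical, and the sketch assumes that the escape byte differs from the quote bytes and that escapes are honored only inside a quoted string.

```julia
# Hypothetical reference function, NOT part of NewlineLexers.jl.
# Assumptions: escape byte != quote bytes; escapes only matter inside quotes.
function naive_find_newlines(data::AbstractVector{UInt8};
                             escape::UInt8=UInt8('\\'),
                             open_quote::UInt8=UInt8('"'),
                             close_quote::UInt8=UInt8('"'))
    out = Int32[]
    in_string = false   # are we between an open quote and its close quote?
    escaped = false     # was the previous byte an active escape byte?
    for i in eachindex(data)
        b = data[i]
        if in_string
            if escaped
                escaped = false        # escaped byte has no special meaning
            elseif b == escape
                escaped = true
            elseif b == close_quote
                in_string = false
            end
        elseif b == open_quote
            in_string = true
        elseif b == UInt8('\n')
            push!(out, Int32(i))       # newline outside any quoted string
        end
    end
    return out
end

data = collect(codeunits(""" abc\n "efg\n" \n """))
naive_find_newlines(data)   # Int32[5, 14]
```

A byte-at-a-time loop like this is easy to check by hand, which makes it useful for testing a faster lexer on small inputs.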
Acknowledgement
This package was heavily inspired by the simdjson library by Daniel Lemire, in particular by his branchless approach to finding escape characters, which we reused almost verbatim where applicable.
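For readers unfamiliar with that technique, the following is a rough Julia transliteration of the branchless escape detection as it is commonly described for simdjson. It is a sketch from the published idea, not code taken from this package. The names `find_escaped` and `byte_mask` are assumptions used for illustration, and `byte_mask` builds the bitmask with a plain loop where a real implementation would use SIMD comparisons.

```julia
const EVEN_BITS = 0x5555555555555555   # bits 0, 2, 4, ... (UInt64 literal)

# Illustrative helper (assumption): bit k-1 is set when the k-th byte equals `byte`.
# Only valid for blocks of at most 64 bytes.
function byte_mask(block::AbstractVector{UInt8}, byte::UInt8)
    mask = UInt64(0)
    for (k, b) in enumerate(block)
        b == byte && (mask |= UInt64(1) << (k - 1))
    end
    return mask
end

# Given the backslash bitmask of one 64-byte block and the carry from the
# previous block (1 if that block ended with an active escape), return
# (mask of escaped bytes, carry for the next block) without branching.
function find_escaped(backslash::UInt64, prev_escaped::UInt64)
    backslash &= ~prev_escaped                  # first byte is escaped, not an escape
    follows_escape = (backslash << 1) | prev_escaped
    # Backslash runs that start on an odd bit position
    odd_sequence_starts = backslash & ~EVEN_BITS & ~follows_escape
    # The addition carries through odd-starting runs; UInt64 addition wraps in Julia
    sequences_starting_on_even_bits = odd_sequence_starts + backslash
    overflow = sequences_starting_on_even_bits < odd_sequence_starts
    invert_mask = sequences_starting_on_even_bits << 1
    # Every other byte after a run start is escaped; flip parity for even-starting runs
    escaped = (EVEN_BITS ⊻ invert_mask) & follows_escape
    return escaped, UInt64(overflow)
end

# The block `a\"b`: the backslash is byte 2, so the quote at byte 3 is escaped.
mask = byte_mask(codeunits("a\\\"b"), UInt8('\\'))
escaped, carry = find_escaped(mask, UInt64(0))   # escaped == 0b0100, carry == 0
```

The returned carry is passed as `prev_escaped` when processing the next block, so a backslash in the last byte of one block correctly escapes the first byte of the following one.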