# Palindromes<:CombinedParser: a Tutorial for writing your combinable Parser

Palindromes are an interesting example for parsing because intuitively programmers as well as laymen understand the problem:

The text is identical when read from left to right, as we are used to do, or when read from right to left in reverse, when we read only the letters and discard all non-word characters.

This example enables you to write and optimize your custom CombinedParser based off a minimal template.

using CombinedParsers
using CombinedParsers.Regexp

## 1. Regular Expression

The PCRE test case contains nice examples of non-trivial palindromes.

# Defines parsers and output for pcre tests:
CombinedParsers.Regexp.@pcre_tests;

# The regular expression in the first quoted line is cryptic.
pt = pcre_test"""
/^\W*+(?:((.)\W*+(?1)\W*+\2|)|((.)\W*+(?3)\W*+\4|\W*+.\W*+))\W*+$/i 1221 0: 1221 1: 1221 2: 1 Satan, oscillate my metallic sonatas! 0: Satan, oscillate my metallic sonatas! 1: <unset> 2: <unset> 3: Satan, oscillate my metallic sonatas 4: S A man, a plan, a canal: Panama! 0: A man, a plan, a canal: Panama! 1: <unset> 2: <unset> 3: A man, a plan, a canal: Panama 4: A Able was I ere I saw Elba. 0: Able was I ere I saw Elba. 1: <unset> 2: <unset> 3: Able was I ere I saw Elba 4: A \= Expect no match The quick brown fox No match """ Pattern: r(e)"^\W*+(?:((.)\W*+(?1)\W*+\2|)|((.)\W*+(?3)\W*+\4|\W*+.\W*+))\W*+$"i
Test Examples:
1. 1221
2. Satan, oscillate my metallic sonatas!
3. A man, a plan, a canal: Panama!
4. Able was I ere I saw Elba.
Not Examples:
1. The quick brown fox


As to why and how this PCRE pattern matches palindromes requires arcane reasoning even to the initiated:

re = Regex(pt.pattern...)
r"^\W*+(?:((.)\W*+(?1)\W*+\2|)|((.)\W*+(?3)\W*+\4|\W*+.\W*+))\W*+$"i I figure the expression is hard to construct and come up with. The easy part is that the pattern needs to ignore case "i and whitespace \W. The pattern makes intense use of backreferences and subroutines. s=pt.test[3].sequence match(re, s) RegexMatch("A man, a plan, a canal: Panama!", 1=nothing, 2=nothing, 3="A man, a plan, a canal: Panama", 4="A") The matched captures are purely technical (Pattern 4 is the first character). ### Tree display of regex I find it hard to understand the compact captures (.), even in a nested tree display: cp = Regcomb(pt.pattern...) ๐ Sequence |> map(#29) |> regular expression combinator with 4 capturing groups โโ ๐ Sequence โ โโ ^ AtStart โ โโ (?>[^\W]*) ValueNotIn |> Repeat |> Atomic โโ ๐ Sequence โโ |๐ Either โ โโ (|๐) Either |> Capture 1 โ โ โโ ๐ Sequence |> map(#29) โ โ โ โโ ๐ Sequence โ โ โ โโ ๐ Sequence โ โ โโ Always โ โโ (|๐) Either |> Capture 3 โ โโ ๐ Sequence |> map(#29) โ โ โโ ๐ Sequence โ โ โโ ๐ Sequence โ โโ ๐ Sequence โ โโ (?>[^\W]*) ValueNotIn |> Repeat |> Atomic โ โโ [^ โ โ ] ValueNotIn โ โโ (?>[^\W]*) ValueNotIn |> Repeat |> Atomic โโ (?>[^\W]*) ValueNotIn |> Repeat |> Atomic โโ |๐ Either โโ$ AtEnd
โโ (?=๐) Sequence |> map(#56) |> PositiveLookahead
โโ
โ

## Next...

• optimize with memoization
• match also palindromes with odd number of letters
• elaborate on iteration documentation
• comparative benchmarking, conditional on palindrome length