CURRENT BIBLIOGRAPHY
I'm just keeping this here, to keep me honest about what I'm reading.

A Play on Regular Expressions: A Functional Pearl. Sebastian Fischer, Frank Huch, Thomas Wilke.
 Solid introduction to parsing with semirings in Haskell
 Haskell is straightforward and accessible
 Readable, breezy style
 Semirings covered: Recognition, Counting, ViterbiBest (Maxlong, Firstlong)

Algebraic Foundations of Semiring Parsing. Yudong Liu
 Discusses semiring parsing as unified framework
 Unites semiring parsing and finite state transducers
 Describes a mechanical C++ framework for the pieces/parts
 Looks a lot like Barge/Barre
 Good overview, fairly accessible.

An Algorithm for RELAX NG Validation James Clark.
 Describes the Interleaf operator in some detail.

Bit Coded Regular Expression Parsing. Lasse Nielsen.
 Presentation, so short on details
 Describes bit coding as a way to do comparisons on regexs with less than machinewidth alternatives

Bitsets Match Regular Expressions, Compactly. Paul Khuong
 Describes the Bitap algorithm for bitmap encoding
 Describes graph minimization as assuming that all states are singular, and breaking them up as we prove the assumption wrong.

Constructing Human Grade Parsers. Joe Karter.
 Wishlist of parser capabilities outside the actual parsing engine
 Parse errors should be recoverable
 Parser should be able to produce partial output
 Requires tooling outside the formalities of regex and semiring parsing

Design and Analysis of String Representation. R. Sekar.
 Describes Owens, et. al.'s DFA generator as a realization of McNaughtonYamada
 Describes alternative search algorithms

Efficient and Flexible Incremental Parsing. S. L. Graham
 Describes incremental parsing, in which a document tree and parse tree are maintained and updated together.

ErrorCorrecting Parser Combinators S. Doaitse Swierstra, Pablo R. Azero Alcocer
 Haskellbased parsing language
 Describes errorhandling in parsing processes
 Describes recovering from errors to continue parsing
 Uses combinators, though

Total Parser Combinators Guillaume Allais
 Describes a combinatorbased parser with guaranteed termination
 Uses Brzozowski as its basis
 Proves that the combinator recognizer is a semiring
 Written in Agda (eek!)

Kleene Meets Church. Fritz Henglein
 Has the "theory vs practice" description
 Ideal: parsing + catamorphic processing
 Actual: FAs + adhoc instrumentation

On the Complexity and Performance of Parsing with Derivatives Michael D. Adams, Celeste Hollenbeck, Matthew Might
 Provides both theoretical and mechanical optimizations
 Replaces the recursion nullability with dataflow analysis
 Provides basis for Darais's implementation

Recognising and Generating Terms using Derivatives of Parsing Expression Grammars. Tony GarnockJones, Mahdi Eslamimehr, Alessandro Warth.
 Describes the derivatives of PEGs
 Describes the δPEG algorithm without backtracking for negation
 Describes the derivation of prioritized choice

Regular Expression Derivatives Reexamined Scott Owens, John Reppy, Aaron Turon.
 The paper that started the current project.
 Section 3.3: Using derivatives for DFA construction.
 Describes equivalence sets in the face of symbol class operators
 Describes regular vectors for multithreaded matching.
 Describes integration of complement and intersection operations

Regular Expressions as Types. Fritz Henglein
 Introduced the "adhoc" slander about PCRE. 😊
 Describes regex as a set membership problem
 Describes types systems as "regular languages"
 Clarifies emitter location for streamprocessing
 On the Kleene star operation
 May require different Kleene stars, or a visitor with some context awareness.

Semiring Parsing. Joshua Goodman.
 Mentioned by Liu
 Covers the initial history of semiring parsing
 Describes derivations forests (see Might & Adams) in a concise way (sec 2.5.1)
 Describes Viterbi nbest as a probablistic disambiguation strategy

Stream Processing Using Grammars and Regular Expressions Urlik Turk Rasmussen

Has a great explanation of the "Regular Expressions as Types" approach.

Leads into regex equivalence sets

Discusses disambiguation strategies with equivalence sets

Discusses bit coding of unions (no μregex or eregex)

(Con) Uses bitcoding as a parse tree extraction tool

Discusses alternative needs
 Recognition
 Matching (capture groups)
 Parsing (parse trees)

Discusses twopass analysis (Might) vs onepass (Adams)

Discusses difficulty of regex composition (see Conway 2018)

Discusses CFGs as an alternative DSL for regex composition (PEG, EBNF)

Is actually 5 papers; the other 4 cover algorithm's evolution

Fantastic overview overall


Taxonomies and Toolkits of Regular Expression Algorithms Bruce William Watson
 Solid overview
 Fairly long
 Excellent coverage of DFA minimization
 Discusses prefix/suffix scanning via AhoCorasik, et.al.
 Cover Brzozowski's algorithm as a performancebased alternative DFA constructor

Yacc is Dead Matthew Might, David Darais
 Started this whole mess
 Still a pretty good introduction.