mirror of
https://https.git.savannah.gnu.org/git/m4.git
synced 2026-01-27 09:54:42 +00:00
While a recursive descent parser is easy to write, it involves a LOT of function calls and boilerplate. Merely parsing "eval(1)" requires descending through ALL 11 levels of operator precedence, only for each layer to discover there is no operator. Better is the Pratt style of LR(1) parsing [1], which can handle any grammar where no two consecutive non-terminals or epsilon appear in the right side of any rule [2]. Now, parsing is done with just two mutually recursive functions; "eval(1)" works with just two function calls (primary() determines the value, and parse_expr() determines no operators are present), while more complicated expressions still produce the correct results but with less recursion. While at it, I noticed that "eval(1||(1/0))" used to produce a cryptic message: m4:stdin:1: bad expression in eval (excess input): 1||(1/0) despite the similar "eval(1||1/0)" suppressing that as part of short-circuiting. It turns out that my initial implementation of short-circuiting in 1.4.8b (back in 2007!) was never fully tested on more complex situations. To test that the new implementation is indeed faster, I wrote an m4 solution [3] to an Advent of Code challenge [4] that required computing 2000 iterations of a 24-bit linear feedback shift register over 2000 input values (--trace shows nearly 20 million eval calls). On my machine, runtime with an unoptimized pre-patch m4 was at 78 seconds, post-patch it completes in 66 seconds. [1] https://journal.stuffwithstuff.com/2011/03/19/pratt-parsers-expression-parsing-made-easy/ [2] https://en.wikipedia.org/wiki/Operator-precedence_parser [3] https://repo.or.cz/aoc_eblake.git/blob/1b122791d4:/2024/day22.m4 [4] https://adventofcode.com/2024/day/22 * NEWS: Document the bug fix. Also document recent compilation fixes. * cfg.mk (indent_args): Teach indent not to mangle int casts. * doc/m4.text (Eval): Add coverage for the bug fix. Adjust one error output that is now more precise. * src/eval.c (logical_or_term, logical_and_term, or_term, xor_term) (and_term, equality_term, cmp_term, shift_term, add_term, mult_term) (exp_term, unary_term, simple_term): Delete, replaced by... (primary, parse_expr): ...new functions. (evaluate): Adjust caller. (cherry picked from commit e3c4d07c56c0f85ef998067b59191c17c2188ab8)
3.1 KiB
3.1 KiB