TinyParser (https://github.com/Sheph/tinyp)
This is a tiny LALR(1) parser for math formulas (e.g. -(3 + 7) * 90 / 200.123).
"Write a math formula parser" is a common coding interview task, and something similar may come in handy in day-to-day work, e.g. when writing a simple DSL such as Wireshark filter expressions.
Sure, you can use flex / bison / llvm or whatever, but for simple tasks that is overkill.
The thing is, almost every time I see people do this, it ends up overly complicated: tons of code, some crazy stack-based state machines or whatever. Here I want to show how simple this task actually is. The code in this repo:
- Pure standard C++11, no third-party dependencies
- Can parse any kind of math formula, e.g.
    -(3 + 7) * 90 / 200.123
- Provides error reporting with line and column numbers, e.g.
    parse error: 1:5: expected ")" instead of end of string
- Supports variables, e.g.
    myvar1 * (123 + myvar2)
  where myvar1 and myvar2 are variables defined outside of the formula
- Supports both numbers and string literals, e.g.
    ("abc" + "hello \" world") * 12 + "test"
  also works
- Has a "pretty-printer" that can print a parsed expression in "normalized form"
- Bonus! Besides parsing math formulas, there's a parser for simple "filter expressions", e.g.
    (a >= 12 & c <> -56) | a > c
And it's all done with very little, very simple code: the parsing code is basically BNF translated into C++.
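To give a feel for what "BNF translated into C++" can look like, here is a minimal hand-written recursive-descent sketch with one function per grammar rule. It is not taken from the repository: the grammar, the `parser_sketch` namespace, the `Parser` class and the `evaluate` helper are all assumptions made for illustration. It handles numbers only, evaluates while parsing instead of building an expression tree, and assumes single-line input.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <stdexcept>
#include <string>

namespace parser_sketch {

// Assumed grammar (illustrative only, not copied from the repository):
//   expr    := term  (('+' | '-') term)*
//   term    := unary (('*' | '/') unary)*
//   unary   := '-' unary | primary
//   primary := NUMBER | '(' expr ')'
class Parser {
public:
    explicit Parser(const std::string& src) : src_(src), pos_(0) {}

    // Parses the whole input and returns its value; throws on a syntax error.
    double parse() {
        double v = expr();
        skipSpaces();
        if (pos_ != src_.size()) fail("expected end of string");
        return v;
    }

private:
    double expr() {
        double v = term();
        for (;;) {
            if (accept('+')) v += term();
            else if (accept('-')) v -= term();
            else return v;
        }
    }

    double term() {
        double v = unary();
        for (;;) {
            if (accept('*')) v *= unary();
            else if (accept('/')) v /= unary();
            else return v;
        }
    }

    double unary() { return accept('-') ? -unary() : primary(); }

    double primary() {
        if (accept('(')) {
            double v = expr();
            if (!accept(')')) fail("expected \")\"");
            return v;
        }
        // accept() has already skipped any leading spaces.
        if (!digitAt(pos_)) fail("expected number");
        std::size_t start = pos_;
        while (digitAt(pos_)) ++pos_;
        if (pos_ < src_.size() && src_[pos_] == '.') {
            ++pos_;
            while (digitAt(pos_)) ++pos_;
        }
        return std::strtod(src_.substr(start, pos_ - start).c_str(), nullptr);
    }

    bool digitAt(std::size_t i) const {
        return i < src_.size() &&
               std::isdigit(static_cast<unsigned char>(src_[i])) != 0;
    }

    void skipSpaces() {
        while (pos_ < src_.size() &&
               std::isspace(static_cast<unsigned char>(src_[pos_]))) ++pos_;
    }

    // Consumes `c` if it is the next non-space character.
    bool accept(char c) {
        skipSpaces();
        assert(pos_ <= src_.size());  // the cursor never runs past the input
        if (pos_ < src_.size() && src_[pos_] == c) { ++pos_; return true; }
        return false;
    }

    // Reports a 1-based column; the line is fixed at 1 (single-line input).
    void fail(const std::string& what) const {
        throw std::runtime_error("1:" + std::to_string(pos_ + 1) + ": " + what);
    }

    std::string src_;
    std::size_t pos_;
};

// Convenience wrapper: parse and evaluate a formula in one call.
inline double evaluate(const std::string& formula) {
    return Parser(formula).parse();
}

}  // namespace parser_sketch
```

Calling `parser_sketch::evaluate("1 + 2 * 3")` returns 7, and an unbalanced input such as `(3 + 7` throws a `std::runtime_error` carrying a line:column prefix. Each grammar rule maps to exactly one function, which is what keeps this style of parser short.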
In spite of their simplicity, LALR(1) parsers are pretty powerful: they can be used to parse languages such as C, Lua and Java. This particular code base can
be easily extended to support more complicated DSLs.
TinyP::Expr and TinyP::ExprVisitor can be extended to support more complicated expressions.
One can also write more TinyP::ExprVisitor implementations that do things other than pretty-printing and
evaluating expressions.
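As an illustration of the "more visitors" idea, the sketch below adds a visitor that collects the variable names used in an expression. The real TinyP::Expr and TinyP::ExprVisitor interfaces are not reproduced here: every type in the `visitor_sketch` namespace (`Expr`, `ExprVisitor`, `NumExpr`, `VarExpr`, `BinExpr`, `VarCollector`) and the `collectVars` helper are hypothetical stand-ins, so treat this as the general pattern rather than the repository's API.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>
#include <utility>

namespace visitor_sketch {

struct NumExpr;
struct VarExpr;
struct BinExpr;

// Hypothetical visitor interface; the real TinyP::ExprVisitor may differ.
struct ExprVisitor {
    virtual ~ExprVisitor() {}
    virtual void visit(const NumExpr& e) = 0;
    virtual void visit(const VarExpr& e) = 0;
    virtual void visit(const BinExpr& e) = 0;
};

// Hypothetical expression base class; the real TinyP::Expr may differ.
struct Expr {
    virtual ~Expr() {}
    virtual void accept(ExprVisitor& v) const = 0;
};

struct NumExpr : Expr {
    double value;
    explicit NumExpr(double v) : value(v) {}
    void accept(ExprVisitor& v) const override { v.visit(*this); }
};

struct VarExpr : Expr {
    std::string name;
    explicit VarExpr(const std::string& n) : name(n) {}
    void accept(ExprVisitor& v) const override { v.visit(*this); }
};

struct BinExpr : Expr {
    char op;
    std::shared_ptr<Expr> lhs, rhs;
    BinExpr(char o, std::shared_ptr<Expr> l, std::shared_ptr<Expr> r)
        : op(o), lhs(std::move(l)), rhs(std::move(r)) {
        assert(lhs && rhs);  // a binary node always has both operands
    }
    void accept(ExprVisitor& v) const override { v.visit(*this); }
};

// A new visitor: walks the tree and records every variable name it meets.
struct VarCollector : ExprVisitor {
    std::set<std::string> names;
    void visit(const NumExpr&) override {}
    void visit(const VarExpr& e) override { names.insert(e.name); }
    void visit(const BinExpr& e) override {
        e.lhs->accept(*this);
        e.rhs->accept(*this);
    }
};

// Convenience wrapper around VarCollector.
inline std::set<std::string> collectVars(const Expr& root) {
    VarCollector collector;
    root.accept(collector);
    return collector.names;
}

}  // namespace visitor_sketch
```

For a hand-built tree representing `myvar1 * (123 + myvar2)`, `collectVars` would return the set containing `myvar1` and `myvar2`. The point of the pattern is that a new operation like this is added without touching the expression classes themselves.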
To build:
    cmake . && make
cd into out/bin and run e.g.:
    echo "-(3 + 7) * 90 / 200.123" | ./tinyp_tool formula
or:
    echo "(a >= 12 & c <> -56) | a > c" | ./tinyp_tool filter a=1 c=2
you can use files as input e.g.:
    cat ./formula1.txt | ./tinyp_tool formula
you can use numeric and string variables (prefix the name with @ to make it a string) e.g.:
    echo "(2 + var1) * var2" | ./tinyp_tool formula @var1=test var2=3

some formula:
    $ echo "(1.34 * -2 / 12.3) * 97.45 - 111.123" | ./tinyp_tool formula
    normalized input: ((((1.34 * (0 - 2)) / 12.3) * 97.45) - 111.123)
    eval result: num(-132.356)

multiply string:
    $ echo "\"test\" * -(-((6)))" | ./tinyp_tool formula
    normalized input: ("test" * (0 - (0 - 6)))
    eval result: str(testtesttesttesttesttest)

javascript banana (well, not exactly 😄):
    $ echo "\"ba\"++\"aa\"" | ./tinyp_tool formula
    normalized input: ("ba" + (0 + "aa"))
    eval result: str(ba0aa)

more like javascript banana 😄:
    $ echo "\"ba\"+0/0+\"a\"" | ./tinyp_tool formula
    normalized input: (("ba" + (0 / 0)) + "a")
    eval result: str(ba-nana)

parse error:
    $ echo "(1.34 * - -2 / 12.3) * 97.45 - 111.123" | ./tinyp_tool formula
    parse error: 1:12: expected number instead of operator(-)

eval error:
    $ echo "5 + \"A\"*v1*\"B\"" | ./tinyp_tool formula v1=5
    vars:
    v1=num(5.000000)
    normalized input: (5 + (("A" * v1) * "B"))
    eval error: 1:5: str(AAAAA) * str(B) - operation not supported

some vars:
    $ echo "5+var1*var2+6" | ./tinyp_tool formula var1=6 var2=5
    vars:
    var2=num(5.000000)
    var1=num(6.000000)
    normalized input: ((5 + (var1 * var2)) + 6)
    eval result: num(41)

now let's just change type of var1 😄:
    $ echo "5+var1*var2+6" | ./tinyp_tool formula @var1=6 var2=5
    vars:
    var2=num(5.000000)
    var1=str(6)
    normalized input: ((5 + (var1 * var2)) + 6)
    eval result: str(5666666)
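
The mixed number/string results above follow a few simple rules that can be read off the outputs: number + number adds, any + involving a string concatenates, string * number repeats the string, and string * string is rejected. The sketch below models just those rules. It is inferred from the example outputs only, not from the repository's code: the `value_sketch` namespace, the `Value` struct and the `makeNum`, `makeStr`, `toText`, `add` and `mul` helpers are all hypothetical, and the handling of number * string and of non-positive repeat counts is a guess.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>

namespace value_sketch {

// Hypothetical tagged value: either a number or a string.
struct Value {
    bool isStr;
    double num;
    std::string str;
};

inline Value makeNum(double n) {
    Value v;
    v.isStr = false;
    v.num = n;
    return v;
}

inline Value makeStr(const std::string& s) {
    Value v;
    v.isStr = true;
    v.num = 0.0;
    v.str = s;
    return v;
}

// Renders a value as text; numbers use the default stream formatting,
// so 5.0 becomes "5".
inline std::string toText(const Value& v) {
    if (v.isStr) return v.str;
    std::ostringstream out;
    out << v.num;
    return out.str();
}

// number + number adds; anything involving a string concatenates.
inline Value add(const Value& a, const Value& b) {
    if (!a.isStr && !b.isStr) return makeNum(a.num + b.num);
    return makeStr(toText(a) + toText(b));
}

// number * number multiplies; string * number repeats the string.
// Assumption: number * string and string * string are both rejected.
inline Value mul(const Value& a, const Value& b) {
    if (!a.isStr && !b.isStr) return makeNum(a.num * b.num);
    if (b.isStr) throw std::runtime_error("operation not supported");
    assert(a.isStr);  // only string * number remains at this point
    // Assumption: a non-positive (or NaN) count yields an empty string.
    std::size_t times = b.num > 0 ? static_cast<std::size_t>(b.num) : 0;
    std::string out;
    for (std::size_t i = 0; i < times; ++i) out += a.str;
    return makeStr(out);
}

}  // namespace value_sketch
```

With these rules, 5 + "6" * 5 + 6 first becomes 5 + "66666", then "566666" + 6, which gives the string 5666666 seen in the last example.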