We started looking into a more flexible way of creating our parser infrastructure. Especially the generation of lexer and parser from the grammar was a long winded process, that included a lot of manual tweaking. The most important advantage of using the MySQL server yacc grammar is however that we always stay in sync easily. Though, this is true only for the server version the grammar we picked is for. But MySQL Workbench needs more flexibility, supporting a whole range of server versions (from 5.1 up to the latest 5.7.8). Hence we decided to switch to a different tool: ANTLR. Not so surprising, however: the yacc based parser is still part of MySQL Workbench, because it’s not possible to switch such an important part in one single step. However over time the ANTLR based parsers will ultimately become our central parsing engine and one day we can rule out the yacc parser entirely.
Files created by ANTLR are the biggest single source files I have ever seen. The MySQLLexer.c file is 40MB in size with almost 590K LOC. No wonder our project metrics have changed remarkably, though not only because of the ANTLR based parser. Here are the current numbers (collected by a script shipped with the source zip):