My Java 5 Antlr grammar
After years of toying with Antlr without ever doing anything useful with it, and after months of merely skimming articles on the new Java 5 language features, I thought I’d deepen my understanding of both by updating the Antlr Java 1.4 grammar to recognize the new language features.
Surprisingly, it only took about a week to polish off, and most of that was spent cleaning up the grammar and deciding between different ways to structure the output AST (rather than struggling with the Antlr tool itself).
For anyone interested, my grammar can be found on my site, and it’s also hosted at antlr.org.
As for how well tested it is: I’ve used it to successfully recognise the entire Java 5 source base, jdigraph’s source base (an early adopter of generics) and some custom tests I developed. I found that the accompanying AST recogniser (as opposed to the token stream recogniser) is a good tool for confirming that your tree is structured the way you expect. I also visually inspected the AST to check its structure. An automated alternative to visual inspection would be a great way to ensure the grammar stays correct over a lifetime of modifications, and I’d love to hear from anyone who’s worked on such a tool. My first approach would be a testing framework that runs XQuery or XPath queries and assertions against a DOM-adapted AST (similar to PMD’s AST interface). Comments or experience, anyone?
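To make the XPath idea a little more concrete, here is a rough sketch of what such an assertion might look like. It is not part of the grammar and does not use Antlr at all: AstNode is a hypothetical stand-in for whatever tree the parser produces, the node labels (VARIABLE_DEF, TYPE, ARRAY_DECLARATOR and so on) are only illustrative, and the class and method names are made up for this example. The only real APIs used are the JDK's DOM and javax.xml.xpath classes.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class AstXPathCheck {

    /** Hypothetical stand-in for a parser AST node: a label plus child nodes. */
    static final class AstNode {
        final String label;
        final AstNode[] children;

        AstNode(String label, AstNode... children) {
            this.label = label;
            this.children = children;
        }
    }

    /** Adapts the tree to a DOM document, one element per AST node. */
    static Document toDom(AstNode root) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            doc.appendChild(toElement(doc, root));
            return doc;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    private static Element toElement(Document doc, AstNode node) {
        // Assumes every label is a legal XML element name.
        Element element = doc.createElement(node.label);
        for (AstNode child : node.children) {
            element.appendChild(toElement(doc, child));
        }
        return element;
    }

    /** Returns how many nodes in the document match the XPath expression. */
    static int count(Document dom, String xpath) {
        try {
            XPath xp = XPathFactory.newInstance().newXPath();
            Double n = (Double) xp.evaluate(
                    "count(" + xpath + ")", dom, XPathConstants.NUMBER);
            return n.intValue();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Hand-built tree for a declaration such as "int[] x;"
        // (illustrative shape only, not the grammar's actual output).
        AstNode ast = new AstNode("VARIABLE_DEF",
                new AstNode("TYPE",
                        new AstNode("ARRAY_DECLARATOR",
                                new AstNode("LITERAL_int"))),
                new AstNode("IDENT"));

        Document dom = toDom(ast);
        if (count(dom, "//TYPE/ARRAY_DECLARATOR") != 1) {
            throw new AssertionError("expected one ARRAY_DECLARATOR under TYPE");
        }
        System.out.println("AST structure check passed");
    }
}
```

In a real test the hand-built tree would be replaced by an adapter over the parser's own AST, and each structural expectation would become a one-line XPath assertion in a unit test.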
More comments on Antlr and Java 5’s new language features later.
Comments
Now you are truly an l33t h4x0r with kung foo 5k1llz! Woot!
Grammar has been updated to:
Dear Mr Ed.
Thanks so much for that obscene appellation. I guess now I'm a 1337 h4x0r, my membership to http://www.hacknot.info will now be revoked?
Michael.
I have a problem with your Java 5 grammar (or at least I think I do). The following Java code for a class variable:
int x [];
parses correctly with an ARRAY_DECLARATOR appearing as a child of the TYPE node, but the code:
int [] x;
does not appear to add an ARRAY_DECLARATOR node at all. What am I doing wrong? As I understand it, both forms of declaration are valid. The older Java parser (1.3) works as I expect.
Any ideas?
Geoff, all fixed! The latest version on ANTLR and this site is now working as expected. It was, in essence, a failed and minute attempt at reuse that caused me to introduce this bug. D'oh!
Would you happen to have the 1.22.1 version of your grammar? Last year, I modified version 1.22.1 of your grammar to fit a code generator that I wrote. Now I've got your version 1.22.3, and I'd like to merge my changes into it. But I do not have the original 1.22.1, so I can't easily see what my changes are!
Charlweed