My Java 5 Antlr grammar

Posted by Michael Studman Sun, 25 Jul 2004 03:00:00 GMT

After years of toying with Antlr without ever doing anything useful with it, and after months of merely skimming articles on the new Java 5 language features, I thought I’d deepen my understanding of both by updating the Antlr Java 1.4 grammar to recognize the new language features.

Surprisingly, it only took about a week to polish off, and most of that was spent cleaning up the grammar and deciding between different ways to structure the output AST, rather than struggling with the Antlr tool itself.

For anyone interested, my grammar can be found on my site, and it’s also hosted at antlr.org.

As for how well tested it is: I’ve used it to successfully recognise the entire Java 5 source base, jdigraph’s source base (an early adopter of generics) and some custom tests I developed. I found the accompanying AST recogniser (as opposed to the token stream recogniser) is a good tool for confirming your tree is structured as you expected it to be. I also visually inspected the AST to ensure its structure was what I expected. An alternative to visual inspection would be great to automatically ensure the grammar stays correct over a lifetime of modifications. I’d love to hear from anyone who’s worked on such a tool. A testing framework that used XQuery or XPath querying/assertions on DOM-adapted AST (similar to PMD’s AST interface) would be my first approach. Comments or experience anyone?
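
To make the XPath idea a little more concrete, here is a minimal sketch of what such an assertion could look like. It is not part of the grammar distribution: `AstNode` is a hypothetical stand-in for whatever node class the parser produces, the tree built in `sampleTree()` is hand-written for illustration rather than real parser output, and the node names other than TYPE and ARRAY_DECLARATOR are assumptions. Only the standard `javax.xml` APIs are used.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class AstXPathSketch {

    /** Hypothetical stand-in for a parser AST node: a type name, optional text, children. */
    static final class AstNode {
        final String type;
        final String text;
        final List<AstNode> children = new ArrayList<AstNode>();

        AstNode(String type, String text) {
            this.type = type;
            this.text = text;
        }

        AstNode add(AstNode child) {
            children.add(child);
            return this;
        }
    }

    /** Adapts the AST to DOM: one element per node, named after the node type. */
    static Element toDom(Document doc, AstNode node) {
        Element element = doc.createElement(node.type);
        if (node.text != null) {
            element.setAttribute("text", node.text);
        }
        for (AstNode child : node.children) {
            element.appendChild(toDom(doc, child));
        }
        return element;
    }

    /** Returns how many nodes in the adapted tree match the XPath expression. */
    static int count(AstNode root, String xpath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            doc.appendChild(toDom(doc, root));
            XPath xp = XPathFactory.newInstance().newXPath();
            NodeList hits = (NodeList) xp.evaluate(xpath, doc, XPathConstants.NODESET);
            return hits.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed: " + xpath, e);
        }
    }

    /** Hand-built, illustrative tree for "int[] x;" (assumed shape, not real parser output). */
    static AstNode sampleTree() {
        return new AstNode("VARIABLE_DEF", null)
                .add(new AstNode("TYPE", null)
                        .add(new AstNode("ARRAY_DECLARATOR", "[")
                                .add(new AstNode("LITERAL_int", "int"))))
                .add(new AstNode("IDENT", "x"));
    }

    static void expect(boolean ok, String what) {
        if (!ok) {
            throw new AssertionError(what);
        }
    }

    public static void main(String[] args) {
        AstNode ast = sampleTree();
        expect(count(ast, "/VARIABLE_DEF/TYPE/ARRAY_DECLARATOR") == 1,
                "array declarator should sit under TYPE");
        expect(count(ast, "//IDENT[@text='x']") == 1,
                "variable name should be x");
        System.out.println("AST structure assertions passed");
    }
}
```

In a real test suite the `AstNode` tree would come from the parser, and each grammar change could be checked against a file of source snippets paired with XPath expectations like the ones in `main`.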

More comments on Antlr and Java 5’s new language features later.

Comments
  1. Mr Ed Mon, 26 Jul 2004 04:08:38 GMT

    Now you are truly an l33t h4x0r with kung foo 5k1llz! Woot!

  2. Michael Studman Wed, 28 Jul 2004 17:45:17 GMT

    The grammar has been updated:

    • Fixed tree structure bug with classOrInterface - thanks to Pieter Vangorpto for spotting this
    • Fixed bug where incorrect handling of SR and BSR tokens would cause type parameters to be recognised as type arguments.
    • Enabled type parameters on constructors, annotations on enum constants and package definitions
    • Fixed problems when parsing if ((char.class.equals(c))) {} - solution by Matt Quail at http://www.cenqua.com (the new name for Cortex)
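
For anyone who wants a quick smoke test, the following is a small, hypothetical Java 5 source file that exercises most of the constructs mentioned in the list above. The class and member names are made up for illustration and are not taken from the grammar's own test suite. Package annotations are left out because they live in a separate package-info.java file.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GrammarFixSamples {

    // Nested type arguments close with ">>" and ">>>", the same characters
    // a lexer normally emits as the shift operator tokens SR and BSR.
    static Map<String, List<String>> twoDeep =
            new HashMap<String, List<String>>();
    static Map<String, Map<String, List<String>>> threeDeep =
            new HashMap<String, Map<String, List<String>>>();

    // A type parameter list (not a type argument list) that also ends in ">>".
    static <T extends Comparable<T>> T max(T a, T b) {
        return a.compareTo(b) >= 0 ? a : b;
    }

    final String description;

    // Type parameter on a constructor.
    <T> GrammarFixSamples(T value) {
        this.description = String.valueOf(value);
    }

    // Annotation on an enum constant.
    enum Level { @Deprecated LOW, HIGH }

    // Class literal of a primitive type inside redundant parentheses.
    static boolean isCharClass(Class<?> c) {
        if ((char.class.equals(c))) {
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(max(3, 7));
        System.out.println(new GrammarFixSamples("sample").description);
        System.out.println(isCharClass(char.class));
    }
}
```

The file compiles and runs with a stock JDK, so it can be fed to both javac and the recogniser to compare results.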

  3. Michael Studman Wed, 28 Jul 2004 17:47:04 GMT

    Dear Mr Ed.

    Thanks so much for that obscene appellation. I guess now that I'm a 1337 h4x0r, my membership of http://www.hacknot.info will be revoked?

    Michael.

  4. Geoff Roy Sun, 17 Oct 2004 19:12:00 GMT

    I have a problem with your Java 5 grammar (or at least I think I do). The following Java code for a class variable:

    int x [];

    parses correctly with an ARRAY_DECLARATOR appearing as a child of the TYPE node, but the code:

    int [] x;

    does not appear to add an ARRAY_DECLARATOR node at all. What am I doing wrong? As I understand it, both forms of declaration are valid. The older Java parser (1.3) works as I expect.

    Any ideas?
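
As a sanity check that the two forms really are equivalent to the compiler, a minimal sketch along the following lines compares the declared field types via reflection. The class and field names are hypothetical and chosen only for this illustration.

```java
public class ArrayDeclaratorForms {

    static int x[];   // brackets after the variable name
    static int[] y;   // brackets after the type

    /** Returns true when both fields are declared with the same type, int[]. */
    static boolean sameType() {
        try {
            Class<?> first = ArrayDeclaratorForms.class.getDeclaredField("x").getType();
            Class<?> second = ArrayDeclaratorForms.class.getDeclaredField("y").getType();
            return first == second && first == int[].class;
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sameType());
    }
}
```

Since javac treats both declarations as the same type, an AST that records the array declarator for one form but not the other is losing information.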

  5. Michael Studman Mon, 01 Nov 2004 15:50:04 GMT

    Geoff, all fixed! The latest version on ANTLR and this site is now working as expected. It was, in essence, a failed and minute attempt at reuse that caused me to introduce this bug. D'oh!

  6. Charlweed Tue, 15 Feb 2005 14:25:11 GMT

    Would you happen to have the 1.22.1 version of your grammar? Last year, I modified version 1.22.1 of your grammar to fit a code generator that I wrote. Now I've got your version 1.22.3, and I'd like to merge my changes into it. But I do not have the original 1.22.1, so I can't easily see what my changes are!

    Charlweed
