[Draft]
Abstract
Grammar is a pure java syntax analyzer library.
Proposal
Grammar library can be use with a simple maven dependency. Grammar are defined with terminals, non-terminals and rules. For example an expression is a non-terminal, a number is a terminal and a rule is 'expression -> expression + number'. You write the rule content in a java file, and you can attach some data to terminals, non-terminals. Data attached to terminals are provided by some lexer, whereas data attached to non-terminals are generated using a method attached to rule. All terminals, non-terminals and rules are java object you will give to one of provided grammars.
Background
Most of java compilation tools ask to define your grammar is a special file. Then you will run the tool on this special file and you will obtain a java source file. Usually the special file contains some piece of pseudo java code (with special token like $n to access to information attached to the n token of the rule). However SableCC propose another approach : no pseudo java code in special file, you will complete generated files after. I think that this approach is cleaner.
Rationale
Because of pre-compilation grammar are not enough use in refractor tools or analyze tools, we prefer to use regular expression. Regular expression are very powerful however they becomes really difficult to understand and maintain when we want to recognize some complex informations.
With a dynamically generated grammar I think refractor tools or analyze tools could be enhanced (if the tools provide you predefined terminals and ask you to define what you need to recognize).
Moreover dynamically generated grammar allow some rules reusability. For example (if then else), (while), (for) can be part of grammar reusable from one language to another. The condition to allow the reusability is to accept different terminals for block start (a '{' in java or 'then' in lua), block stop ... We can also share grammar construction for boolean or arithmetic expressions.
With dynamically generated we can also imagine dynamically grammar transformer, for example the end user decide if he want multiplication having higher priority than addition or not then the grammar is generated (without pre-compilation issues).
In last, the code is not so hard to understand. Java brings Set, List and Map implementations and they are widely use to perform grammar automate construction.
Current Status
Look for a community.
Meritocracy
TODO
Community
A single developer.
Core Developers
Gael Lalire starts to write the code.
Alignment
Apache provide widely used utilities, this project may add a missing utility.
Known Risks
Orphaned Projects
Because there is only one committer, there is a risk of being orphaned.
Inexperience With Open Source
TODO
Reliance On Salaried Developers
The project development occurs in volunteer time. No corporations is sponsoring it.
Relationships with Other Apache Products
As an utility project, grammar should not have many dependencies. In initial code there is no dependency (only JDK 5 classes). Grammar could use Log4J.
Initial Source
http://commons-grammar.googlecode.com/svn/trunk/
Required Resources
Mailing lists, Subversion Directory and Issue Tracking if this project should be a TLP. However commons project could be the right place for such project.
Initial Committers
Gael Lalire (gael dot lalire at gmail dot com)
Sponsors
Volunteers, please.
Champion
Volunteers, please.
Mentors
Volunteers, please.
Sponsoring Entity
TODO