Background

Recently, I've noticed that some people (myself included) are having a hard time with the verbosity of the POM syntax. It seems we could do a lot more with XML attributes and some decent ID validators behind the scenes. Some of this is as simple as reworking the Modello model to use attributes; other parts may require revisions to the underlying Plexus-based configuration mechanism. At any rate, I want to open the discussion for a new syntax here.

Current syntax sample

Below is a sample of what a just-barely-customized POM might look like. Note that it has only three dependencies. Also note the extra lines occupied by simple list delimiters, and the 'implementation' attribute used to specify a String list element (it seems reasonable to assume String could be the default list item type).

<project>
  <modelVersion>4.0.0</modelVersion>
  <parent>
    <groupId>org.myco.somegroup</groupId>
    <artifactId>my-parent</artifactId>
    <version>1.0-SNAPSHOT</version>
  </parent>
  <artifactId>my-artifact</artifactId>
  <version>1.0-SNAPSHOT</version>

  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>servletapi</groupId>
      <artifactId>servletapi</artifactId>
      <version>2.3</version>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>plexus</groupId>
      <artifactId>plexus-utils</artifactId>
      <version>1.0.2</version>
    </dependency>
  </dependencies>

  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-surefire-plugin</artifactId>

        <configuration>
          <something>some value</something>
        </configuration>

        <executions>
          <execution>
            <id>first</id>

            <configuration>
              <excludes>
                <exclude implementation="java.lang.String">**/BadTest.java</exclude>
              </excludes>
            </configuration>

            <goals>
              <goal>test</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>

First proposed alternative

I think we can make use of two things that will dramatically reduce the verbosity of the above POM: XML attributes, and good ID validation. The first will make better use of 'container' elements, and the second has the potential to define and enforce a sub-syntax for IDs, like dependency IDs for example. Consider the following revision of the above POM:

<project modelVersion="4.0.0" id=":my-artifact:1.0-SNAPSHOT">

  <parent groupId="org.myco.somegroup" artifactId="my-parent" version="1.0-SNAPSHOT"/>

  <dependency id="junit:junit:3.8.1" scope="test"/>
  <dependency id="servletapi:servletapi:2.3" scope="provided"/>
  <dependency id="plexus:plexus-utils:1.0.2"/>

  <build>
    <plugin id="org.apache.maven.plugins:maven-surefire-plugin:">
      <configuration>
        <something>some value</something>
      </configuration>

      <execution id="first">
        <configuration>
          <excludes type="java.lang.String">
            <exclude>**/BadTest.java</exclude>
          </excludes>
        </configuration>

        <goal name="test"/>
      </execution>
    </plugin>
  </build>
</project>

Here, IDs are using a terse format that looks like groupId:artifactId:version, where missing information would be expressed through blank values. For example, the project's groupId is inherited from the parent, so its ID looks like ":my-artifact:1.0-SNAPSHOT". Also, a plugin section that doesn't want to tie the build to a particular plugin version would look like "org.apache.maven.plugins:maven-surefire-plugin:".
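
As a rough sketch of the kind of ID validation this would need, the following Java class splits a terse ID into its three parts and treats blank segments as missing. The class name TerseId and the rule that artifactId may never be blank are assumptions made for illustration; nothing like this exists in Maven today.

```java
import java.util.Optional;

/** Hypothetical holder for a parsed "groupId:artifactId:version" ID (not a Maven class). */
final class TerseId {
    final Optional<String> groupId;
    final Optional<String> artifactId;
    final Optional<String> version;

    private TerseId(String groupId, String artifactId, String version) {
        this.groupId = blankToEmpty(groupId);
        this.artifactId = blankToEmpty(artifactId);
        this.version = blankToEmpty(version);
    }

    /** A blank segment means "not specified here", e.g. inherited or left open. */
    private static Optional<String> blankToEmpty(String segment) {
        String trimmed = segment.trim();
        return trimmed.isEmpty() ? Optional.empty() : Optional.of(trimmed);
    }

    static TerseId parse(String id) {
        // A limit of -1 keeps trailing empty strings, so "g:a:" still yields three segments.
        String[] parts = id.split(":", -1);
        if (parts.length != 3) {
            throw new IllegalArgumentException(
                "expected groupId:artifactId:version but got '" + id + "'");
        }
        TerseId parsed = new TerseId(parts[0], parts[1], parts[2]);
        // Assumption: an ID without an artifactId is never meaningful.
        if (!parsed.artifactId.isPresent()) {
            throw new IllegalArgumentException("artifactId is blank in '" + id + "'");
        }
        return parsed;
    }
}
```

With this sketch, TerseId.parse(":my-artifact:1.0-SNAPSHOT") leaves groupId empty, and TerseId.parse("org.apache.maven.plugins:maven-surefire-plugin:") leaves version empty, while an ID with too few or too many colons is rejected.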

Also, XML elements used solely to delimit lists of other elements are eliminated. We can depend on XSD validation for the positioning and grouping of list elements. Things like managed dependencies and managed plugins should probably be grouped under a <managedInformation> section or something similar. Otherwise, list delimiter elements are useless, since multiple occurrences of the same element already imply a list.
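
To show that repetition alone is enough to recover a list, here is a minimal sketch using the JDK's built-in DOM parser. The class RepeatedElements and its method are hypothetical names for illustration; this is not how Maven or Modello actually read the POM.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical helper: gathers repeated child elements without a wrapper element. */
final class RepeatedElements {

    /** Returns the "id" attribute of every direct child named childName, in document order. */
    static List<String> childIds(String xml, String childName) {
        Element root;
        try {
            root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
        List<String> ids = new ArrayList<>();
        // Walk only the direct children; siblings with the same name form the list.
        for (Node n = root.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && childName.equals(n.getNodeName())) {
                ids.add(((Element) n).getAttribute("id"));
            }
        }
        return ids;
    }
}
```

Calling childIds on the revised POM above with "dependency" would collect the three dependency IDs even though no <dependencies> wrapper is present.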

Finally, the 'implementation' attribute on all configuration list elements has been moved to the list delimiter for that configuration (it appears above as the 'type' attribute on <excludes>), which is the only place where I see it making sense to have a list delimiter XML element. This removes the need to replicate the attribute again and again, and adds the benefit of templatizing the list handling for configurations. In other words, it has semantics similar to:

java.util.List<String>
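
A minimal sketch of what that could mean in practice: the type is named once on the list element, and every item is converted with it. The class TypedList is a hypothetical name, and the use of a public single-String constructor via reflection is an assumption for illustration; the real Plexus configuration mechanism may well do the conversion differently.

```java
import java.lang.reflect.Constructor;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: applies one declared item type to every entry of a list. */
final class TypedList {

    /**
     * Converts each raw text value to typeName.
     * Assumption: the type has a public constructor taking a single String.
     */
    static List<Object> convert(String typeName, List<String> rawValues) {
        try {
            // Resolve the type once for the whole list instead of once per item.
            Constructor<?> fromString = Class.forName(typeName).getConstructor(String.class);
            List<Object> converted = new ArrayList<>();
            for (String raw : rawValues) {
                converted.add(fromString.newInstance(raw));
            }
            return converted;
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("cannot build list items of type " + typeName, e);
        }
    }
}
```

For the <excludes type="java.lang.String"> element above, convert("java.lang.String", ...) would simply return the patterns as strings; naming java.io.File instead would yield File objects from the same text.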

This is just a first stab at revising the POM syntax. Other options could include a Maven-handled namespace convention, where plugin configuration can be embedded directly in the build section to improve coherence.


11 Comments

  1. I tried to bring this up twice before but it got shot down :)
    So definitely +1!

    Some comments:

    o id=":maven-compiler-plugin:" should also be allowed, as the POM currently allows specifying just the artifactId for a plugin;

    o Also allow attributes artifactId, groupId, version, type
    instead of id, just like in the <parent/> example above. (why is this one different?)

    o Grouping tags are not all bad; for instance <pluginRepositories> vs. <repositories>, where the child elements of both groups will just be <repository>, since it's the same object but in a different context. So grouping tags that specify a different context should be allowed.
    It also helps in visual editors to be able to collapse parts of the tree.

    o Maybe force usage of attributes for all data types that can be represented in a single line string, like an include/exclude pattern. Use <exclude pattern="**/*Bad*.java"/>. Allowing <exclude>line \n anotherline \n </exclude> is error prone.

    o Add phase="..." to <execution>, and drop the 'id'. IDs in XML are only used to refer to an element from other elements (thus creating a graph instead of a tree). An ID does make sense for a <profile>, but then again, a profile should have a name: <profile name="default"/>. And of course it's mandatory for <repository>.

  2. Just to quickly respond:

    o the parent example above is an oversight, but I agree that we could allow the group/artifact/version as an explicit alternative

    o I agree that anything that's a one-liner should probably be expressed in an attribute...in fact, my inclination is to say anything that won't have complex sub-sections should be an empty element with attributes.

    o the execution id is the execution name, for merging purposes...didn't mean to convey that any of the id attributes above should be XML id's...it's just another oversight on my part.

    o RE: grouping tags, that's not a deal breaker for me...it just seems a little verbose sometimes, esp. when listing the groups in an execution...

  3. Unknown User (cberry)

    I could not agree more. The current XML syntax is far too verbose. If you want a feeling for just how overwhelming it is, go to http://maven.apache.org/maven2/maven-model/maven.html and try to figure out where you are in the document hierarchy. You have to scroll up and down to even get a feel for where you are. I assure you that people will not like it.

    I really like John's simplified version. It's much cleaner.

    IMO, only add "list wrappers" if they are absolutely required – such as in Kenney's example

    Also – should this really wait for 2.1?? As they say; "you're only a virgin once". FWIW, I think 2.0 will get crucified if it goes out with the current XML syntax...

  4. My instinct is to say that we need to get 2.0 out in the wild, to get a better feel for how to make the syntax better (among other things).

    I know that we're going to take a hit for the verbosity of the POM, but one of the principles we used when designing m2 is that the POM could come from ANYWHERE...whether that be an alternate XML syntax, database, etc. Now, we have a ways to go in order to realize that ideal, but we'll get there.

    I think we're going to be too limited in terms of feedback for big design issues like this until we get a wider community providing suggestions and participating. Just look at the number of JIRA issues filed since JavaOne...the number of issues has doubled, and we've picked up something like 4-6 active mojo developers since then. If we release a 2.0, we're more likely to get a wider sample of use cases, etc.

    I really hate to say it, but this may be one of those cases where the X.0 release is cool, but just wait for the X.1 release...!

    Maybe I'm on crack here, but that's my gut reaction.

    1. I have to agree with John here, and I think that the tooling support will pick up soon as it should be easy to generate most of the code required for editing the POM.

      1. I have to say that I'm extremely nervous about going to the tooling answer for anything as core as this. I think EJB is a good lesson in avoiding real design discussions in favor of some vaporware tool.

        My 2-cents, of course, but I'd rather actually solve the problem. I'd rather be able to run Maven reasonably efficiently using Vi and a Sun JDK over an SSH connection, if need be.

  5. an emphatic -1 on id to replace groupId/artifactId/version, especially "org.apache.maven.plugins:maven-surefire-plugin:"

    I think "implementation" shouldn't be needed at all, plexus should be able to do the conversion.

    I agree with Kenney that retaining the grouping tags is probably OK, though maybe we could have an "if you only have one you don't need the grouping tags" equivalent. I think this is where a lot of the verbosity comes from.

    When it comes to attributes, I don't really mind either way. I've argued for them before and lost, but I also don't think they are the big problem here.

    I think 2.0 can go out as is, and then we will have to support the old format. I don't see any problem with supporting a terse format as well as the original.

    If we are to do this though, we need to be very careful about making things confusing:

    • which things are lists and which are not
    • which fields are attributes and which are nested.
  6. Looking more at this, I worry about creating different formats. Theoretically, it's fine, but from a usability standpoint it's confusing. We should stick with one. I understand John's argument about tooling being a crutch, but hopefully the templated POM sections will make this all moot?

  7. Unknown User (jochen.wiedmann@gmail.com)

    I am -1 on introducing another format. Whether the "terse" format would be preferable to the current format is a matter of taste. But that's not the point. My main fear is that, with alternative formats possible, the same debate would restart in every project. User A prefers the terse format and creates a POM. User B prefers the "classic" layout and adds dependencies. User A notices B's changes and reformats the POM. Soon we have a discussion. Another discussion. A meeting. A style guide for POMs. Please, no.

    1. Do you have this issue with Spring bean configuration files?

      I think as long as the simplifications are optional and can be mixed in with the current longer form, it shouldn't be a big deal.

  8. I'm -1 on changing XML attributes/elements into a String that we'll have to parse manually,
    e.g. id="junit:junit:3.8.1" would make us parse the content by hand instead of using the XML parser, increase the complexity of the tools that use the POM, and prevent the use of XML editors and similar tools.

    A simplification might be to add default values to the current POM, e.g. if I don't specify the version or groupId of a dependency, it takes the one from the current artifact's POM.

    I'm +1 on allowing a switch between subelements and attributes where possible, while keeping backwards compatibility.