Versions Compared

Comment: Update obsolete sections that are now implemented.

...

The DFDL Workgroup of the Open Grid Forum has also been collecting issues targeted at DFDL 2.0.

Recursion

One of the first things people want to model in DFDL always seems to be a legacy document format such as RTF or older MS Word documents. These have recursive structures, where a section can contain text and other sections. DFDL v1.0 was not designed with document formats in mind, but rather with more traditional "data sets" or files of data.
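To make the recursive shape concrete, here is a minimal sketch in Python. It assumes a hypothetical toy format, not RTF or any real document format, in which `{` opens a nested section and `}` closes it. The names `Section` and `parse_section` are invented for illustration only.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union


@dataclass
class Section:
    # A section holds text runs and nested sections, in document order.
    children: List[Union[str, "Section"]] = field(default_factory=list)


def parse_section(text: str, pos: int = 0) -> Tuple[Section, int]:
    """Parse children until a closing '}' or end of input.

    Returns the parsed section and the position just past it.
    """
    section = Section()
    buf: List[str] = []
    while pos < len(text):
        ch = text[pos]
        if ch == "{":
            # Flush pending text, then recurse into the nested section.
            if buf:
                section.children.append("".join(buf))
                buf = []
            child, pos = parse_section(text, pos + 1)
            section.children.append(child)
        elif ch == "}":
            # End of this section; consume the delimiter and stop.
            pos += 1
            break
        else:
            buf.append(ch)
            pos += 1
    if buf:
        section.children.append("".join(buf))
    return section, pos


doc, end = parse_section("a{b{c}}d")
```

The point of the sketch is that the type `Section` refers to itself, which is the kind of self-referencing definition the paragraph above says such document formats require.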

...

Layering - Data Source/Target Indirection

(This has an initial implementation now. The API may still evolve.)

The layering feature of Daffodil needs to be extended to enable new external layer transforms to be added via external jars.

...

  1. repeating sequence and choice groups (minOccurs and maxOccurs)
  2. complex type derivations
  3. attributes (already mentioned above)
  4. substitution groups - to enable separate compilation of multi-part DFDL schemas that are very large. (might be overkill - unclear if this is truly needed.)

XML Schema 1.1 / Schematron

(Schematron is now implemented. Rules can be separate or embedded in the schema.)

XML Schema 1.1 supports richer validation rules, which are useful because XML Schema 1.0's validation capabilities are so limited.

Alternatively, embedding schematron rules directly in a DFDL schema is an option.

...

Very often one wants dfdl:lengthKind='delimited' or dfdl:lengthKind='explicit' for simple types, but dfdl:lengthKind='implicit' for complex types. Separating dfdl:lengthKind into two properties, or having the ability to specify it either way, would simplify many schemas that otherwise have an error-prone need for a dfdl:ref='complex' format reference on every element of complex type to override the default dfdl:lengthKind. The only alternative is to split the schema, putting all simple types in one file (and using only those simple types) and all complex types in another.

Table and Range Lookups / Symbolic Enumerations

(This is now implemented; see concrete Proposal.)

Often one has a representation containing enumerations: integer values that have symbolic meanings. The parsed result from such data should contain strings, so that the logical infoset is readable and understandable. A means is needed to specify a table of integer constants and their corresponding strings, to be used for both parsing and unparsing. Ranges are a generalization where a single symbolic string names all the integers that fall in a range.
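The lookup idea can be sketched in a few lines of Python. This is an illustration of the concept only, not Daffodil's implementation or its schema syntax. The table contents and the names `TABLE`, `parse_value`, and `unparse_value` are hypothetical, and the choice to unparse a range name to the lowest integer in its range is an assumption made for the sketch.

```python
from typing import List, Tuple

# Each entry is (low, high, name) with an inclusive integer range.
# A single constant is just a range where low == high.
# These values are made up for illustration.
TABLE: List[Tuple[int, int, str]] = [
    (0, 0, "off"),
    (1, 1, "on"),
    (2, 15, "reserved"),
]


def parse_value(n: int) -> str:
    """Map an integer from the data to its symbolic string."""
    for low, high, name in TABLE:
        if low <= n <= high:
            return name
    raise ValueError(f"no symbolic name for integer {n}")


def unparse_value(name: str) -> int:
    """Map a symbolic string back to an integer.

    For a range, the lowest value is used as the canonical
    representation (an assumption of this sketch).
    """
    for low, _high, entry in TABLE:
        if entry == name:
            return low
    raise ValueError(f"unknown symbolic name {name!r}")
```

Note that a range makes the mapping many-to-one when parsing, so unparsing cannot always reproduce the original integer; some rule for picking a canonical value is needed.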

...