Data Format Description Language (DFDL) v1.0 Specification
OGF Proposed Recommendation GFD-P-R.174, January 31, 2011


1.4 Scope of version 1.0

The goals of version 1.0 are as follows:
  1. Leverage XML technology and concepts
  2. Support very efficient parsers/formatters
  3. Avoid features that require unnecessary data copying
  4. Support round-tripping, that is, read and write data in a described format from the same description
  5. Keep simple cases simple
  6. Make simple descriptions "human readable" to the same degree that XSDL is

The general features of version 1.0 are as follows:
  1. Text and binary data parsing and unparsing
  2. Validation of the data when parsing and unparsing, using XSDL validation
  3. Defaulted input and output for missing values
  4. Reference – use of a previously read value in subsequent expressions
  5. Choice – capability to select among format variations
  6. Hidden sequence of elements – description of an intermediate representation not exposed in the final result
  7. Basic Math – in DFDL expressions
  8. Out-of-band value handling
  9. Speculative parsing to resolve uncertainty
  10. Very general parsing capability: Lookahead/Push-back
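
As a rough illustration of features 5 and 9, the Python sketch below resolves a choice by speculative parsing: each alternative is attempted from the same starting position, and a failed attempt is discarded before the next alternative is tried. This is not DFDL and does not describe how any particular DFDL processor works; the names `ParseError`, `parse_int_field`, `parse_word_field`, and `parse_choice` are hypothetical and exist only for this example.

```python
DIGITS = "0123456789"


class ParseError(Exception):
    """Raised when a branch cannot parse the data at the given position."""


def parse_int_field(data, pos):
    """Hypothetical branch: one or more decimal digits."""
    end = pos
    while end < len(data) and data[end] in DIGITS:
        end += 1
    if end == pos:
        raise ParseError(f"expected a digit at offset {pos}")
    return int(data[pos:end]), end


def parse_word_field(data, pos):
    """Hypothetical branch: one or more letters."""
    end = pos
    while end < len(data) and data[end].isalpha():
        end += 1
    if end == pos:
        raise ParseError(f"expected a letter at offset {pos}")
    return data[pos:end], end


def parse_choice(data, pos, branches):
    """Try each branch at the same position; the first success wins."""
    for branch in branches:
        try:
            return branch(data, pos)
        except ParseError:
            # Speculation failed: nothing was consumed, so simply
            # fall through and try the next alternative from `pos`.
            continue
    raise ParseError(f"no branch matched at offset {pos}")
```

Each branch returns the parsed value together with the position after it, so a caller can continue parsing from there. Because a failed branch returns nothing, backing up needs no explicit push-back buffer in this sketch.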

Version 1.0 of DFDL is a language capable of expressing a wide range of binary and text-based data formats.

DFDL is capable of describing binary data as found in the data structures of COBOL, C, PL/I, Fortran, etc. In particular, it is able to describe repeating sub-arrays whose length is stored in another location of the structure.
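
To make the stored-length case concrete, the Python sketch below parses and unparses a small binary record whose array length is held in a separate field. The layout is an assumption made up for this example (big-endian: a 16-bit count, a 32-bit record identifier, then `count` signed 16-bit values), and `parse_record` and `unparse_record` are hypothetical names. The code is hand-written for one layout, whereas a DFDL schema would describe the layout declaratively.

```python
import struct

# Assumed header layout: uint16 count, uint32 record_id (big-endian, unpadded).
HEADER = ">HI"


def parse_record(buf):
    """Read the count from the header, then read that many int16 values."""
    count, record_id = struct.unpack_from(HEADER, buf, 0)
    offset = struct.calcsize(HEADER)
    values = list(struct.unpack_from(f">{count}h", buf, offset))
    return {"record_id": record_id, "values": values}


def unparse_record(rec):
    """Write the record back; the count is derived from the array itself."""
    values = rec["values"]
    return struct.pack(f">HI{len(values)}h", len(values), rec["record_id"], *values)
```

Parsing a buffer and unparsing the result reproduces the original bytes, which is the round-tripping behavior listed among the goals of version 1.0.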

DFDL is capable of describing a wide variety of textual data formats such as HL7, X12, and SWIFT. Textual data formats often use syntax delimiters, such as initiators, separators, and terminators, to delimit fields.
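
The role of the three delimiter kinds can be sketched in Python as follows. The format here is invented for illustration and is not HL7, X12, or SWIFT: each record begins with an assumed initiator `REC:`, its fields are divided by an assumed separator `*`, and it ends with an assumed terminator `~`. The function names are hypothetical, and escaping of delimiter characters inside field values is deliberately not handled.

```python
def parse_delimited(text, initiator="REC:", separator="*", terminator="~"):
    """Split text into records, and each record into fields."""
    records = []
    pos = 0
    while pos < len(text):
        # Every record must start with the initiator.
        if not text.startswith(initiator, pos):
            raise ValueError(f"missing initiator at offset {pos}")
        start = pos + len(initiator)
        # The terminator marks the end of the record.
        end = text.find(terminator, start)
        if end == -1:
            raise ValueError(f"missing terminator for record at offset {pos}")
        # The separator divides the fields inside the record.
        records.append(text[start:end].split(separator))
        pos = end + len(terminator)
    return records


def unparse_delimited(records, initiator="REC:", separator="*", terminator="~"):
    """Inverse of parse_delimited for field values free of delimiters."""
    return "".join(
        initiator + separator.join(fields) + terminator for fields in records
    )
```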

DFDL has certain composition properties: two formats can be nested or concatenated, and the result is a working format.
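
One informal way to picture this property is with small parsing functions that combine into larger ones. The Python sketch below is an analogy only, not DFDL; `fixed`, `concat`, and `nested` are hypothetical helpers in which a "format" is modeled as a function from `(data, pos)` to `(value, new_pos)`.

```python
def fixed(n):
    """Format: exactly n characters."""
    def parse(data, pos):
        if pos + n > len(data):
            raise ValueError(f"need {n} characters at offset {pos}")
        return data[pos:pos + n], pos + n
    return parse


def concat(first, second):
    """Concatenation: `first` immediately followed by `second`."""
    def parse(data, pos):
        a, pos = first(data, pos)
        b, pos = second(data, pos)
        return (a, b), pos
    return parse


def nested(open_mark, inner, close_mark):
    """Nesting: `inner` wrapped between two literal markers."""
    def parse(data, pos):
        if not data.startswith(open_mark, pos):
            raise ValueError(f"expected {open_mark!r} at offset {pos}")
        value, pos = inner(data, pos + len(open_mark))
        if not data.startswith(close_mark, pos):
            raise ValueError(f"expected {close_mark!r} at offset {pos}")
        return value, pos + len(close_mark)
    return parse
```

For example, `concat(fixed(4), fixed(2))` is itself a usable format, and passing it as the `inner` argument of `nested("<", ..., ">")` yields another one without any change to the parts.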

The following topics have been deferred to future versions of the standard:
  • Extensibility: Existing proprietary data format description languages provide the base of experience from which standard DFDL is derived. However, there are no examples of extensible format description languages. Therefore, while extensibility is desirable in DFDL, there is not yet a base of experience with extensibility from which to derive a standard.
  • Rich Layering: Some formats require data to be described in multiple passes. Combining these passes into one DFDL schema requires very rich layering functionality, in which the value content of one element becomes the representation of another element. DFDL V1.0 allows description of only a limited kind of layering.

Copyright (C) Open Grid Forum (2005-2010). All Rights Reserved.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the OGF or other organizations, except as needed for the purpose of developing Grid Recommendations in which case the procedures for copyrights defined in the OGF Document process must be followed, or as required to translate it into languages other than English.