Data Format Description Language (DFDL) v1.0 Specification
OGF Proposed Recommendation GFD-P-R.174, January 31, 2011
DFDL is a language for describing data formats. A DFDL description allows data to be read from its native format and presented as an instance of an information set, or converted to the corresponding XML document. DFDL also allows data to be taken from an instance of an information set and written out in its native format.
DFDL achieves this by leveraging W3C XML Schema Definition Language (XSDL) 1.0 [XSDLV1].
An XML schema is written for the logical model of the data. The schema is augmented with special DFDL annotations. These annotations are used to describe the native representation of the data. This is an established approach that is already being used today in commercial systems.
For example, consider the following XML data:
<w>5</w>
<x>7839372</x>
<y>8.6E-200</y>
<z>-7.1E8</z>
The logical model for this data can be described by the following XML schema fragment, which gives the name and type of each element:
<xs:complexType name="example1">
  <xs:sequence>
    <xs:element name="w" type="xs:int"/>
    <xs:element name="x" type="xs:int"/>
    <xs:element name="y" type="xs:double"/>
    <xs:element name="z" type="xs:float"/>
  </xs:sequence>
</xs:complexType>
Now, suppose we have the same data but represented in a non-XML format. A binary representation of the data could be visualized like this (shown as hexadecimal):
0000 0005 0077 9e8c
169a 54dd 0a1b 4a3f
ce29 46f6
The following schema fragment annotates the same logical model with DFDL properties that describe this binary representation:
<xs:complexType>
  <xs:sequence dfdl:byteOrder="bigEndian">
    <xs:element name="w" type="xs:int">
      <xs:annotation>
        <xs:appinfo source="http://www.ogf.org/dfdl/">
          <dfdl:element representation="binary"
                        binaryNumberRep="binary"
                        byteOrder="bigEndian"
                        lengthKind="implicit"/>
        </xs:appinfo>
      </xs:annotation>
    </xs:element>
    <xs:element name="x" type="xs:int">
      <xs:annotation>
        <xs:appinfo source="http://www.ogf.org/dfdl/">
          <dfdl:element representation="binary"
                        binaryNumberRep="binary"
                        byteOrder="bigEndian"
                        lengthKind="implicit"/>
        </xs:appinfo>
      </xs:annotation>
    </xs:element>
    <xs:element name="y" type="xs:double">
      <xs:annotation>
        <xs:appinfo source="http://www.ogf.org/dfdl/">
          <dfdl:element representation="binary"
                        binaryFloatRep="ieee"
                        byteOrder="bigEndian"
                        lengthKind="implicit"/>
        </xs:appinfo>
      </xs:annotation>
    </xs:element>
    <xs:element name="z" type="xs:float">
      <xs:annotation>
        <xs:appinfo source="http://www.ogf.org/dfdl/">
          <dfdl:element representation="binary"
                        byteOrder="bigEndian"
                        lengthKind="implicit"
                        binaryFloatRep="ieee"/>
        </xs:appinfo>
      </xs:annotation>
    </xs:element>
  </xs:sequence>
</xs:complexType>
This simple DFDL annotation expresses that the data are represented in a binary format and that the byte order will be big endian. This is all that a DFDL parser needs to read the data.
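To make the binary layout concrete, the following is a minimal sketch in Python of what a reader of this data has to do. It is not part of DFDL and not how a DFDL processor is implemented: the names `DATA`, `LAYOUT`, `parse_example` and `unparse_example` are hypothetical, and the layout is hard-coded from the properties in the schema above (big-endian byte order, two 4-byte binary integers, one 8-byte IEEE double, one 4-byte IEEE float) rather than derived from the annotations.

```python
import struct

# The 20 bytes shown in the hexadecimal dump above.
DATA = bytes.fromhex(
    "0000 0005 0077 9e8c"
    "169a 54dd 0a1b 4a3f"
    "ce29 46f6"
)

# ">" selects big-endian byte order with no padding.
# "i" is a 4-byte signed integer (xs:int), "d" an 8-byte IEEE double
# (xs:double) and "f" a 4-byte IEEE float (xs:float).
LAYOUT = struct.Struct(">iidf")


def parse_example(data: bytes) -> dict:
    """Read the four fields w, x, y, z from their binary representation."""
    w, x, y, z = LAYOUT.unpack(data)
    return {"w": w, "x": x, "y": y, "z": z}


def unparse_example(values: dict) -> bytes:
    """Write the four fields back out in the same binary representation."""
    return LAYOUT.pack(values["w"], values["x"], values["y"], values["z"])


record = parse_example(DATA)
print(record)
```

The sketch covers both directions described earlier: `parse_example` reads the native format into logical values, and `unparse_example` writes those values back out. A DFDL processor would obtain the same layout information from the schema annotations instead of a hand-written format string.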
Now consider the same data represented in a text format:
5,7839372,8.6E-200,-7.1E8
Once again, we can annotate the same data model, this time with properties that provide the character encoding, the field separator (comma) and the decimal separator (period):
<xs:complexType>
  <xs:sequence>
    <xs:annotation>
      <xs:appinfo source="http://www.ogf.org/dfdl/">
        <dfdl:sequence encoding="UTF-8" byteOrder="bigEndian"
                       separator=","/>
      </xs:appinfo>
    </xs:annotation>
    <xs:element name="w" type="xs:int">
      <xs:annotation>
        <xs:appinfo source="http://www.ogf.org/dfdl/">
          <dfdl:element representation="text"
                        encoding="UTF-8"
                        textNumberRep="standard"
                        textNumberPattern="####0"
                        textStandardGroupingSeparator=","
                        textStandardDecimalSeparator="."
                        lengthKind="delimited"/>
        </xs:appinfo>
      </xs:annotation>
    </xs:element>
    <xs:element name="x" type="xs:int">
      <xs:annotation>
        <xs:appinfo source="http://www.ogf.org/dfdl/">
          <dfdl:element representation="text"
                        encoding="UTF-8"
                        textNumberRep="standard"
                        textNumberPattern="#######0"
                        textStandardGroupingSeparator=","
                        textStandardDecimalSeparator="."
                        lengthKind="delimited"/>
        </xs:appinfo>
      </xs:annotation>
    </xs:element>
    <xs:element name="y" type="xs:double">
      <xs:annotation>
        <xs:appinfo source="http://www.ogf.org/dfdl/">
          <dfdl:element representation="text"
                        encoding="UTF-8"
                        textNumberRep="standard"
                        textNumberPattern="0.0E+000"
                        textStandardGroupingSeparator=","
                        textStandardDecimalSeparator="."
                        lengthKind="delimited"/>
        </xs:appinfo>
      </xs:annotation>
    </xs:element>
    <xs:element name="z" type="xs:float">
      <xs:annotation>
        <xs:appinfo source="http://www.ogf.org/dfdl/">
          <dfdl:element representation="text"
                        encoding="UTF-8"
                        textNumberRep="standard"
                        textNumberPattern="0.0E0"
                        textStandardGroupingSeparator=","
                        textStandardDecimalSeparator="."
                        lengthKind="delimited"/>
        </xs:appinfo>
      </xs:annotation>
    </xs:element>
  </xs:sequence>
</xs:complexType>
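As an illustration of what these text properties imply, the sketch below reads the comma-separated line in Python. It is a simplification under stated assumptions, not a DFDL implementation: `TEXT`, `FIELDS` and `parse_text` are hypothetical names, the separator and field order are hard-coded from the schema above, and Python's built-in `int` and `float` conversions stand in for the number patterns. In particular, the sketch does not enforce `textNumberPattern`, does not check the 32-bit range of `xs:int`, and holds the `xs:float` value as a Python double.

```python
# The text representation shown above.
TEXT = "5,7839372,8.6E-200,-7.1E8"

# (name, converter) pairs in schema order; int and float are stand-ins
# for the xs:int, xs:double and xs:float text number conversions.
FIELDS = [("w", int), ("x", int), ("y", float), ("z", float)]


def parse_text(line: str, separator: str = ",") -> dict:
    """Split a delimited line and convert each token to its logical type."""
    tokens = line.split(separator)
    if len(tokens) != len(FIELDS):
        raise ValueError(
            f"expected {len(FIELDS)} fields, found {len(tokens)}"
        )
    return {
        name: convert(token)
        for (name, convert), token in zip(FIELDS, tokens)
    }


record = parse_text(TEXT)
print(record)
```

The resulting values are the same logical values that the XML and binary representations carry, which is the point of annotating a single data model: only the description of the native representation changes.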
Copyright (C) Open Grid Forum (2005-2010). All Rights Reserved.
This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the OGF or other organizations, except as needed for the purpose of developing Grid Recommendations in which case the procedures for copyrights defined in the OGF Document process must be followed, or as required to translate it into languages other than English.