Data Format Description Language (DFDL) v1.0 Specification
OGF Proposed Recommendation GFD-P-R.174, January 31, 2011


11. Properties Common to both Content and Framing

Property Name

Description

byteOrder

Enum or DFDL Expression

Valid values are 'bigEndian', 'littleEndian'.

This property can be computed by way of an expression which returns the string 'bigEndian' or 'littleEndian'. The expression must not contain forward references to elements which have not yet been processed.

Note that there is, intentionally, no such thing as 'native' endian.1

This property also applies to character data in multi-byte character sets when the encoding does not itself specify a byte order, e.g., UTF-16 and UTF-32. When the character set encoding is specific about the byte order (e.g., UTF-16BE), the byteOrder property is ignored when processing text/strings having that encoding.

Note: The Unicode byte order mark is treated as a normal character and does not affect encoding.

Annotation: dfdl:element, dfdl:simpleType, dfdl:sequence, dfdl:choice, dfdl:group
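
The effect of byteOrder on binary numbers and on UTF-16 text can be sketched in Python. This is only an illustration, not part of DFDL: the function names parse_uint and decode_utf16 are hypothetical, and Python's 'utf-16-be' and 'utf-16-le' codecs stand in for a DFDL processor's decoder. Those codecs do not strip a leading byte order mark, which matches the note above that the mark is treated as a normal character.

```python
# Hypothetical helpers; the names are illustrative and not defined by DFDL.
_ORDER = {"bigEndian": "big", "littleEndian": "little"}


def parse_uint(data: bytes, byte_order: str) -> int:
    """Interpret data as an unsigned binary integer under the given byteOrder."""
    if byte_order not in _ORDER:
        # There is intentionally no 'native' value.
        raise ValueError(f"unsupported byteOrder: {byte_order!r}")
    return int.from_bytes(data, _ORDER[byte_order])


def decode_utf16(data: bytes, encoding: str, byte_order: str) -> str:
    """Decode UTF-16 text, consulting byteOrder only when the encoding needs it."""
    name = encoding.upper()
    if name == "UTF-16BE":
        codec = "utf-16-be"  # encoding fixes the order; byteOrder is ignored
    elif name == "UTF-16LE":
        codec = "utf-16-le"  # encoding fixes the order; byteOrder is ignored
    elif name == "UTF-16":
        if byte_order not in _ORDER:
            raise ValueError(f"unsupported byteOrder: {byte_order!r}")
        codec = "utf-16-be" if byte_order == "bigEndian" else "utf-16-le"
    else:
        raise ValueError(f"not a UTF-16 encoding: {encoding!r}")
    # These codecs keep a leading U+FEFF as an ordinary character.
    return data.decode(codec)


assert parse_uint(b"\x12\x34", "bigEndian") == 0x1234
assert parse_uint(b"\x12\x34", "littleEndian") == 0x3412
```

The same two bytes yield different values under the two settings, which is why a schema must state the byte order explicitly rather than rely on the host machine.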

encoding

Enum or DFDL Expression

Values are IANA charsets or CCSIDs.2

This property can be computed by way of an expression which returns the appropriate string. The expression must not contain forward references to elements which have not yet been processed.

Note that there is, deliberately, no concept of 'native' encoding.3

Conforming DFDL v1.0 processors must accept at least 'UTF-8', 'UTF-16', 'UTF-16BE', 'UTF-16LE', 'ASCII', and 'ISO-8859-1' as encoding names. Encoding names are case-insensitive, so 'utf-8' and 'UTF-8' are equivalent. The 'UTF-16' encoding requires that dfdl:byteOrder is defined.

Annotation: dfdl:element, dfdl:simpleType, dfdl:sequence, dfdl:choice, dfdl:group
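
The rules for encoding names can be sketched as a small validation routine. This is a minimal sketch under stated assumptions: check_encoding and REQUIRED_ENCODINGS are hypothetical names, and the routine accepts only the six names a conforming processor must support, whereas a real processor would typically accept further IANA charsets and CCSIDs.

```python
from typing import Optional

# The minimum set of names listed in the text above.
REQUIRED_ENCODINGS = {
    "UTF-8", "UTF-16", "UTF-16BE", "UTF-16LE", "ASCII", "ISO-8859-1",
}


def check_encoding(name: str, byte_order: Optional[str] = None) -> str:
    """Return the upper-cased encoding name, or raise if it cannot be used.

    Hypothetical helper: only the required minimum set is recognised here.
    """
    canonical = name.upper()  # encoding names are case-insensitive
    if canonical not in REQUIRED_ENCODINGS:
        raise LookupError(f"encoding not in the required minimum set: {name!r}")
    if canonical == "UTF-16" and byte_order is None:
        # Plain 'UTF-16' does not fix a byte order, so byteOrder must be defined.
        raise ValueError("encoding 'UTF-16' requires dfdl:byteOrder to be defined")
    return canonical


assert check_encoding("utf-8") == check_encoding("UTF-8")
```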

utf16Width

Enum

Valid values are 'fixed', 'variable'.

Applies only when encoding is 'UTF-16', 'UTF-16BE', 'UTF-16LE' or their CCSID equivalents.

Specifies whether the encoding 'UTF-16' should be treated as a fixed or variable width encoding. 'UTF-16' is a variable width encoding and 'UCS-2' is the fixed width subset. However, it is common for users to specify 'UTF-16' when they should be specifying 'UCS-2'. Setting this property to 'fixed' effectively converts 'UTF-16' to 'UCS-2'.

Annotation: dfdl:element, dfdl:simpleType, dfdl:sequence, dfdl:choice, dfdl:group
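
The difference between the two settings shows up when counting characters in data that contains a code point outside the Basic Multilingual Plane. The sketch below assumes that 'fixed' means every 2-byte code unit counts as one character (the UCS-2 view), which is an interpretation of the text above rather than something it spells out. The function name utf16_char_count is hypothetical, and big-endian data is assumed for brevity.

```python
def utf16_char_count(data: bytes, utf16_width: str) -> int:
    """Count characters in big-endian UTF-16 data under a utf16Width setting.

    Hypothetical helper; assumes 'fixed' treats each 2-byte unit as a character.
    """
    if len(data) % 2 != 0:
        raise ValueError("UTF-16 data must contain an even number of bytes")
    if utf16_width == "fixed":
        # UCS-2 view: one character per 16-bit code unit.
        return len(data) // 2
    if utf16_width == "variable":
        # UTF-16 view: a surrogate pair (two code units) is one character.
        return len(data.decode("utf-16-be"))
    raise ValueError(f"unsupported utf16Width: {utf16_width!r}")


# 'A' followed by U+1F600, which UTF-16 encodes as a surrogate pair.
sample = "A\U0001F600".encode("utf-16-be")
assert len(sample) == 6
assert utf16_char_count(sample, "fixed") == 3
assert utf16_char_count(sample, "variable") == 2
```

For data restricted to the Basic Multilingual Plane the two settings give the same count, which is why specifying 'UTF-16' where 'UCS-2' is meant often goes unnoticed.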

ignoreCase

Enum

Valid values are 'yes', 'no'.

Specifies whether mixed-case data is accepted when matching delimiters and data values, such as dfdl:textBooleanTrue, on input.

On unparsing, the delimiters or values are always written as specified.

Annotation: dfdl:element, dfdl:simpleType, dfdl:sequence, dfdl:choice, dfdl:group
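
The asymmetry between parsing and unparsing can be sketched as follows. The function names are hypothetical, and the use of str.lower() for the case-insensitive comparison is an assumption: the text above does not say which case-folding rules a processor applies.

```python
def matches_on_parse(data: str, specified: str, ignore_case: str) -> bool:
    """Decide whether input data matches a delimiter or value from the schema.

    Hypothetical helper; str.lower() is an assumed comparison rule.
    """
    if ignore_case == "yes":
        return data.lower() == specified.lower()
    if ignore_case == "no":
        return data == specified
    raise ValueError(f"unsupported ignoreCase: {ignore_case!r}")


def text_on_unparse(specified: str, ignore_case: str) -> str:
    """On output the delimiter or value is written exactly as specified."""
    if ignore_case not in ("yes", "no"):
        raise ValueError(f"unsupported ignoreCase: {ignore_case!r}")
    return specified  # ignoreCase has no effect when unparsing


assert matches_on_parse("True", "TRUE", "yes")
assert not matches_on_parse("True", "TRUE", "no")
assert text_on_unparse("TRUE", "yes") == "TRUE"
```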

1 The concept of native-endian is avoided in DFDL since a DFDL schema containing such a property binding does not contain a complete description of data, but rather an incomplete one which is parameterized by characteristics of the machine and implementation where the DFDL processor is executed. In DFDL this same behavior is achieved using variables or, for example, by use of external setting of pre-defined variables to set dfdl:byteOrder.
2 CCSID stands for Coded Character Set ID, a decimal number representation for a codepage specifier [CCSID].
3 The concept of native character encoding is avoided in DFDL since a DFDL schema containing such a property binding does not contain a complete description of data, but rather an incomplete one which is parameterized by characteristics of the operating environment where the DFDL processor executes. In DFDL this same behavior is achieved through use of true parameterization using variables or, for example, by use of external setting of pre-defined variables to set dfdl:encoding.

Copyright (C) Open Grid Forum (2005-2010). All Rights Reserved.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the OGF or other organizations, except as needed for the purpose of developing Grid Recommendations in which case the procedures for copyrights defined in the OGF Document process must be followed, or as required to translate it into languages other than English.