About delimited data

Delimited data is a collection of variable-length records or fields. Delimiters mark the beginning or end of each record or field, as shown in the following figure:

Figure 1. An example of delimited data in which the fields of each record (LastName, FirstName, and CustomerId) are delimited by commas, and the records of the file are delimited by the end of the line.
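
The layout described in Figure 1 can be illustrated with a minimal Python sketch. The function name `parse_delimited` and the sample customer values are hypothetical and are not taken from the figure; the sketch only assumes a comma as the field delimiter and a newline as the record delimiter.

```python
def parse_delimited(text, field_delimiter=",", record_delimiter="\n"):
    """Split text into records, then split each record into fields."""
    records = []
    for raw_record in text.split(record_delimiter):
        if raw_record == "":
            # Skip the empty piece left after a trailing record delimiter.
            continue
        records.append(raw_record.split(field_delimiter))
    return records


# Hypothetical sample data in the order LastName, FirstName, CustomerId.
sample = "Smith,Jane,1001\nNguyen,Bao,1002\n"
rows = parse_delimited(sample)
for last_name, first_name, customer_id in rows:
    print(customer_id, first_name, last_name)
```

Because the fields have variable lengths, the parser never counts characters; it relies only on the position of the delimiters.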

Records can also have identifying codes, known as record IDs. In the data in Figure 2, each record begins with a record ID and ends with a delimiter.

Figure 2. Records with identifying codes

As shown in Figure 2, records can repeat individually; the end of the repetition is marked by the record ID of the next record. Records can also repeat as a set of records; the end of the repetition is marked by the record ID of the next record that is not in the set.
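
As a rough illustration of how a repetition ends when the record ID changes, the following Python sketch groups consecutive records that share an ID. The record IDs `ORD` and `ITM`, the sample values, and the function names are hypothetical; the sketch also assumes that the record ID is the first token of each record and is followed by the field delimiter.

```python
from itertools import groupby


def split_records(text, record_delimiter="\n", field_delimiter=","):
    """Return (record_id, fields) pairs; the ID is assumed to be the first token."""
    pairs = []
    for raw in text.split(record_delimiter):
        if not raw:
            continue  # ignore the empty piece after the final record delimiter
        record_id, *fields = raw.split(field_delimiter)
        pairs.append((record_id, fields))
    return pairs


def group_repeats(pairs):
    """Group consecutive records with the same ID; a run ends when the next ID differs."""
    return [
        (record_id, [fields for _, fields in run])
        for record_id, run in groupby(pairs, key=lambda pair: pair[0])
    ]


# Hypothetical data: each ORD record is followed by one or more repeating ITM records.
sample = "ORD,1001\nITM,pen,2\nITM,ink,1\nORD,1002\nITM,pad,5\n"
groups = group_repeats(split_records(sample))
```

A repeating set of records (here, an `ORD` record together with its `ITM` records) would need one more pass that starts a new set each time the first record ID of the set reappears.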

Records contain fields; each field begins with a field delimiter. A field ends either at the delimiter that begins the next field or at the end of the record.

Fields can be complex, containing components that are separated by a component delimiter. Fields can also repeat, holding multiple data values separated by a repeat delimiter, as shown with the phone1 and phone2 fields.
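
The nesting of field, repeat, and component delimiters can be sketched in Python as follows. The delimiter characters (`|`, `~`, `^`), the record ID `CUS`, and the sample values are assumptions chosen for illustration and are not taken from the figure.

```python
def parse_field(raw, repeat_delimiter="~", component_delimiter="^"):
    """Split a field into its repeated values, and each value into components."""
    return [value.split(component_delimiter) for value in raw.split(repeat_delimiter)]


def parse_record(raw, field_delimiter="|"):
    """Split a record into its record ID and a list of parsed fields."""
    record_id, *fields = raw.split(field_delimiter)
    return record_id, [parse_field(field) for field in fields]


# Hypothetical record: a complex name field followed by a repeating phone field.
record_id, fields = parse_record("CUS|Smith^Jane|555-0100~555-0101")
name_field, phone_field = fields
```

In this sketch every field is returned as a list of values, and every value as a list of components, so simple, complex, and repeating fields share one shape.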

To implement a Flat File Schema for this sample data with record IDs, you would map each characteristic of the data as shown in the following table:

Flat File Characteristic | Flat File Schema Implementation
Delimited format for the highest level of structure in the data | For the root node, set the Structure property to Delimited
Record delimiter | For the root node or a record node, set the child delimiter
Record IDs | For the root node or a record node, set the record identifier property to the respective ID value, and set the record identifier offset to 1
Repeating records | For record nodes, set the maximum occurrence property to unbounded or to a specific value
Repeating sets of records | Create a group node for each set and make the records in the set the content of the group
Field delimiter | For record nodes, set the child delimiter property
Repeating fields | For record nodes, set the repeat delimiter property
Complex fields | Define each as a subrecord that is a child of the record containing the field
Components within complex fields | Define as field children in the subrecord
Component delimiter | For the record node of the complex field, set the child delimiter property





Last updated: Wednesday, 15 June 2016


http://pic.dhe.ibm.com/infocenter/wci/v7r0m0/topic/com.ibm.wci.doc/ref_About_Delimited_Data.html