ESQL field references

This topic describes ESQL field references.

The full syntax for field references is as shown below:
Note: You must use a name or an asterisk throughout.

[<] and [>] are legitimate. An example using < and > is shown in the following figure.

Note to users

For backward compatibility the LAST keyword is still supported, but its use is deprecated. LAST cannot be used as part of an index expression: [LAST] is valid, and is equivalent to [<], but [LAST3] is not valid.

The LAST keyword has been replaced by the following arrow syntax, which allows both a direction of search and index to be specified:
      Field [ > ]                   -- The first element, equivalent to [ 1 ]
      Field [ > (a + b) * 2 ]
      Field [ < ]                   -- The last element, equivalent to [ LAST ]
      Field [ < 1 ]                 -- The last element, equivalent to [ LAST ]
      Field [ < 2 ]                 -- The last but one element
      Field [ < (a + b) / 3 ]

A field reference consists of a correlation name, followed by zero or more path elements separated by periods (.). The correlation name identifies a well-known starting point in a message tree. The starting point must therefore be either a declared reference variable or one of the predefined start points, for example InputRoot. The path elements define a path from the start point to the desired field.

For example:
InputRoot.XML.Data.Invoice
starts the broker at the location InputRoot (that is, the root of the input message to a Compute node) and then performs a sequence of navigations. First, it navigates from the root to the first child field called XML, then to the first child field of the XML field called Data. Finally, the broker navigates to the first child field of the Data field called Invoice. Whenever this field reference occurs in an ESQL program, the Invoice field is accessed.
This form of field reference is simple, convenient, and the most commonly used. However, it does have two limitations:
  • Because the names used must be valid ESQL identifiers, you can use only names that conform to the rules of ESQL. That is, the names can contain only alphanumeric characters including underscore, the first character cannot be numeric, and names must be at least one character long. You can avoid these limitations by enclosing such names in double quotation marks. For example:
    InputRoot.XML."Customer Data".Invoice
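    If a field name itself contains double quotation marks, write each embedded quotation mark twice and enclose the whole name in double quotation marks. For example, the following reference selects a field whose name is "hello", including the quotation marks: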
    Body.Message."""hello"""

    Some identifiers are reserved as keywords, but you can use double quotation marks around these identifiers to indicate that they must not be interpreted as keywords. For example, SET is a keyword. If you have a message that contains a field named SET that you want to refer to, code Body."SET". Keywords are not case sensitive, therefore SET, Set, and all other combinations of uppercase and lowercase letters are recognized as keywords.

    The element Item in the Invoice message is an example of this. Item is a reserved keyword, so when you refer to this element, you must enclose it in double quotation marks. For example:

    InputBody.Invoice.Purchases."Item"[1].Author  
  • Because the names of the fields appear in the ESQL program, they must be known when the program is written. This limitation can be avoided by using the alternative syntax that uses braces ( { ... } ). This syntax allows you to use any expression that returns a non-null value of type character.
    For example:
    InputRoot.XML."Customer Data".{'Customer-' || 
    	CurrentCustomer}.Invoice
    in which the invoices are contained in a folder whose name is formed by concatenating the character literal Customer- with the value in CurrentCustomer (which must be a declared variable of type character).
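
As a minimal sketch of both techniques, the following statements could appear inside a Compute node. The output element names (Result, Invoice, Keyword), the input field named SET, and the value assigned to CurrentCustomer are hypothetical and chosen only for illustration.

```esql
-- Hypothetical value; in practice it might be taken from the message
DECLARE CurrentCustomer CHARACTER '0042';

-- "Customer Data" is quoted because the name contains a space;
-- the braces build the next path element name at run time
SET OutputRoot.XML.Result.Invoice =
    InputRoot.XML."Customer Data".{'Customer-' || CurrentCustomer}.Invoice;

-- "SET" is quoted so that it is read as a field name, not as the keyword
-- (assumes the input message contains a field with that name)
SET OutputRoot.XML.Result.Keyword = InputRoot.XML."Customer Data"."SET";
```

The expression inside the braces must not evaluate to NULL, so a real transform would probably check CurrentCustomer before using it in a path.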

Enclosing anything in double quotation marks in ESQL makes it an identifier; enclosing anything in single quotation marks makes it a character literal. You must enclose all character strings (CHARACTER, BLOB, or BIT) in single quotation marks.

Namespace support does not change the existing syntax for field references or the meaning of that syntax in any way. The main features of namespace support are:
  • Each element of each field reference that contains a name clause can also contain a namespace clause defining the namespace to which the specified name belongs.
  • Each namespace name can be defined by either a simple identifier or by an expression (enclosed in curly braces). If an identifier is the name of a declared namespace constant, the value of the constant is used. If an expression is used, it must return a non-null value of type character.
  • A namespace clause of * explicitly states that namespace information is to be ignored when locating elements in a tree.
  • A namespace clause with no identifier, expression, or * (that is, only the colon is present) explicitly targets the notarget namespace.
The meanings of the different combinations of type, namespace, name, and index clauses are as follows:
[..], *, *[..], (..), (..)[..], (..)*, (..)*[..]
None of these forms specifies a name or namespace. The target element is, therefore, located either by its type and index, or by its index only. These forms all existed prior to this change and their behavior has not changed in any way.
NameId, NameId[..], (..)NameId, (..)NameId[..]
All these forms specify a name but no namespace. The target element is located by namespace and name, and also by type and index where appropriate.

The namespace is taken to be the only namespace in the namespace path containing this name. The only namespace in the path is the notarget namespace.

These forms all existed before this change. Although their behavior has changed in that they now compare both name and namespace, existing transforms should see no change in their behavior because all existing transforms create their elements in the notarget namespace.

:*, :*[..], (..):*, (..):*[..]
All these forms specify the notarget namespace but no name. The target element is located by its namespace and also by type and index where appropriate.
:NameId, :NameId[..], (..):NameId, (..):NameId[..]
All these forms specify a name and the notarget namespace. The target element is located by namespace and name and also by type and index where appropriate.
*:*, *:*[..], (..)*:*, (..)*:*[..]
None of these forms specifies a name or a namespace. The target element is located either by its type and index, or by its index only.
*:NameId, *:NameId[..], (..)*:NameId, (..)*:NameId[..]
All these forms specify a name but no namespace. The target element is located by name and also by type and index where appropriate.
SpaceId:*, SpaceId:*[..], (..)SpaceId:*, (..)SpaceId:*[..]
All these forms specify a namespace but no name. The target element is located by namespace and also by type and index where appropriate.
SpaceId:NameId, SpaceId:NameId[..], (..)SpaceId:NameId, (..)SpaceId:NameId[..]
All these forms specify a namespace and name. The target element is located by namespace and name and also by type and index where appropriate.

In all the preceding cases a name, or namespace, provided by an expression contained in braces ({}) is equivalent to a name provided as an identifier.

By definition, the name of the notarget namespace is the empty string. The empty string can be selected by expressions which evaluate to the empty string, the empty identifier "", or by reference to a namespace constant defined as the empty string.
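
The following sketch shows how some of the combinations above might look in practice. It assumes a namespace-aware parser (the XMLNS domain is used here instead of XML), and the namespace URI, the constant sp1, and the element names Order, Id, and Result are all hypothetical.

```esql
-- Hypothetical namespace constant, for illustration only
DECLARE sp1 NAMESPACE 'http://www.example.com/orders';

-- SpaceId:NameId : locate by namespace and name
SET OutputRoot.XMLNS.Result.A = InputRoot.XMLNS.sp1:Order.sp1:Id;

-- *:NameId : ignore namespace information, locate by name only
SET OutputRoot.XMLNS.Result.B = InputRoot.XMLNS.*:Order.*:Id;

-- :NameId : locate by name in the notarget namespace
SET OutputRoot.XMLNS.Result.C = InputRoot.XMLNS.:Order.:Id;

-- {expression}:NameId : namespace supplied by a character expression
SET OutputRoot.XMLNS.Result.D =
    InputRoot.XMLNS.{'http://www.example.com/orders'}:Order.sp1:Id;
```

For any given input message only some of these references find an element. For example, the third statement finds nothing if Order exists only in the sp1 namespace.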

The use of field references usually implies searching for an existing element. However, if the required element does not exist, as is often the case for field references that are the targets of SET statements and those in the AS clauses of SELECT statements, it is created.

When an element is created in this way, there are a variety of circumstances in which the broker cannot tell what the required name or namespace is. In these situations the following general principles apply:
  • If the name clause is absent or does not specify a name, and the namespace clause is absent or does not specify or imply a namespace (that is, there is no name or namespace available), one of the following conditions applies:
    • If the assignment algorithm does not copy the name from some existing element, the new element has both its name and namespace set to the empty string and its Name flag is not set automatically.

      In the absence of a type specification, the element's type is not Name or NameValue, which effectively indicates that the new element is nameless.
    • Otherwise, if the assignment algorithm chooses to copy the name from some existing element, the new element has both its name and namespace copied from the existing element and its Name flag is set automatically.
  • If the name clause is present and specifies a name, but the namespace clause is absent or does not specify or imply a namespace (that is, a name is available but a namespace is not), the new element has its:
    • Name set to the given value
    • Namespace set to the empty string
    • Name flag set automatically
  • If the name clause is absent or does not specify a name, but the namespace clause is present and specifies or implies a namespace (that is, a namespace is available but a name is not), the new element has its:
    • Namespace set to the given value
    • Name set to the empty string
    • Name flag set automatically
  • If the name clause is present and specifies a name, and the namespace clause is present and specifies or implies a namespace, the new element has its:
    • Name set to the given value
    • Namespace set to the given value
    • Name flag set automatically
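
As an illustrative sketch of the last three cases, the statements below create elements through SET targets. The XMLNS domain, the namespace constant sp1, its URI, and the element names are assumptions made for this example. The comments restate the rules above and do not show verified output.

```esql
-- Hypothetical namespace constant, for illustration only
DECLARE sp1 NAMESPACE 'http://www.example.com/orders';

-- Name and namespace both available: each new element takes the given
-- name and the given namespace
SET OutputRoot.XMLNS.sp1:Order.sp1:Id = '42';

-- Name available, namespace not: Comment is created with its namespace
-- set to the empty string (the notarget namespace)
SET OutputRoot.XMLNS.sp1:Order.Comment = 'rush';
```
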
There are also cases where the broker creates elements in addition to those referenced by field references:
  • Tree copy: new elements are created by an algorithm that uses a source tree as a template. If the algorithm copies the name of a source element to a new element, its namespace is copied as well.
  • Anonymous select expressions: SELECT clauses are not obliged to have AS clauses; those that do not have them set the names of the newly created elements to default values (see SELECT function).

    These defaults can be derived from element names or column names, or can simply be manufactured sequence names. If the name is an element name, this is effectively a tree copy, and the namespace name is copied as above.

    Otherwise, the namespace of the newly created element is derived by searching the path; that is, the name is treated as the NameId syntax of a field reference.

Each element of a field reference can contain an index clause. This clause is denoted by brackets ( [ ... ] ) and accepts any expression that returns a non-null value of type integer. This clause identifies which of several fields with the same name is to be selected. Fields are numbered from the first, starting at one. If this clause is not present, it is assumed that the first field is required. Thus, the two examples below have exactly the same meaning:
InputRoot.XML.Data[1].Invoice
InputRoot.XML.Data.Invoice[1] 
This construct is most commonly used with an index variable, so that a loop steps through all such fields in sequence. For example:
-- count is assumed to be declared and initialized to 1 before the loop
WHILE count < 32 DO
     SET TOTAL = TOTAL + InputRoot.XML.Data.Invoice[count].Amount;
     SET count = count + 1;
END WHILE;
Use this kind of construct with care, because it implies that the broker must count the fields from the beginning each time round the loop. If the repeat count is large, performance is poor. In such cases, a better alternative is to use a field reference variable.
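
A minimal sketch of the reference-variable alternative follows, assuming the same message layout as the loop above. The module name SumInvoices, the variable names, and the output field Summary.Total are hypothetical, and the CAST assumes that Amount holds a numeric character value.

```esql
CREATE COMPUTE MODULE SumInvoices
  CREATE FUNCTION Main() RETURNS BOOLEAN
  BEGIN
    DECLARE total DECIMAL 0;
    -- Point the reference at the first Invoice field
    DECLARE invoiceRef REFERENCE TO InputRoot.XML.Data.Invoice[1];

    -- LASTMOVE returns FALSE when the reference could not be positioned,
    -- so the loop ends after the last Invoice (or at once if none exists)
    WHILE LASTMOVE(invoiceRef) DO
      SET total = total + CAST(invoiceRef.Amount AS DECIMAL);
      -- Step to the next sibling that has the same type and name
      MOVE invoiceRef NEXTSIBLING REPEAT TYPE NAME;
    END WHILE;

    SET OutputRoot.XML.Summary.Total = total;
    RETURN TRUE;
  END;
END MODULE;
```

Each MOVE advances from the current field instead of recounting from the first sibling, which is why this form tends to scale better for large repeat counts.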
Index expressions can optionally be preceded by a less-than sign ( < ), indicating that the required field is to be indexed from the last field, not the first. In this case, the index 1 refers to the last element and the index 2 refers to the penultimate element. For completeness, you can use a greater-than sign to indicate counting from the first field. The example below shows ESQL code that handles indexes where there are four fields called Invoice.
InputRoot.XML.Data.Invoice       -- Selects the first
InputRoot.XML.Data.Invoice[1]    -- Selects the first
InputRoot.XML.Data.Invoice[>]    -- Selects the first
InputRoot.XML.Data.Invoice[>1]   -- Selects the first
InputRoot.XML.Data.Invoice[>2]   -- Selects the second
InputRoot.XML.Data.Invoice[<]    -- Selects the fourth
InputRoot.XML.Data.Invoice[<1]   -- Selects the fourth
InputRoot.XML.Data.Invoice[<2]   -- Selects the third
InputRoot.XML.Data.Invoice[<3]   -- Selects the second 
An index clause can also consist of an empty pair of brackets ( [] ). This selects all fields with matching names. Use this construct with functions that expect lists (for example, SELECT or CARDINALITY).
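
For instance, the empty index might be used as in the sketch below, which counts the Invoice fields and sums their Amount values. The output field names under Summary and the correlation name I are hypothetical, and the CAST assumes that Amount holds a numeric character value.

```esql
-- Number of Invoice fields under Data
DECLARE invoiceCount INTEGER CARDINALITY(InputRoot.XML.Data.Invoice[]);
SET OutputRoot.XML.Summary.Count = invoiceCount;

-- Sum of Amount over every Invoice field
SET OutputRoot.XML.Summary.Total =
    (SELECT SUM(CAST(I.Amount AS DECIMAL))
     FROM InputRoot.XML.Data.Invoice[] AS I);
```
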

Each element of a field reference can contain a type clause. This clause is denoted by parentheses ( ( ... ) ) and accepts any expression that returns a non-null value of type integer. The presence of a type expression restricts the fields that are selected to those of the matching type. This construct is most commonly used with generic XML, where there are many element types and it is possible for one XML element to contain both attributes and further XML elements with the same name.

For example:
<Item Value = '1234' >
     <Value>5678</Value>
</Item>
Here, the XML element Item has two child elements, both called Value. The child elements can be distinguished by using type clauses: Item.(XML.Attribute)Value to select the attribute, and Item.(XML.Element)Value to select the element.
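
As a short sketch, assuming the Item element above sits directly under the root of a message parsed in the XML domain, the two values could be copied to separate output fields. The output names Result, FromAttribute, and FromElement are hypothetical.

```esql
-- Selects the attribute Value of Item
SET OutputRoot.XML.Result.FromAttribute = InputRoot.XML.Item.(XML.Attribute)Value;

-- Selects the child element Value of Item
SET OutputRoot.XML.Result.FromElement = InputRoot.XML.Item.(XML.Element)Value;
```
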

Related concepts
ESQL

Related tasks
Developing ESQL
Accessing known multiple occurrences of an element