XML

XML is a meta-markup language: it defines the rules for writing your own markup languages. Unlike HTML, XML-based markup languages are case-sensitive, and every document must be well-formed (more about this below). Well-formed XML-based markup can be parsed by generic parsers and processors regardless of the tags and attributes chosen for the application.

A tag is a piece of markup that delimits an element. A tag consists of an element name enclosed in angle brackets (< and >). For every opening tag there must be a closing tag. A closing tag looks like its opening tag except that a slash (/) precedes the tag name. The text between the two tags is the content of the element. For example, here is a NAME element defined using NAME tags:

<NAME>Joe Bloggs</NAME>

XML elements can be nested to define structure, and whitespace can be used to make the structure easier to identify:

<PERSON>
  <FIRST_NAME>Joe</FIRST_NAME>
  <SURNAME>Bloggs</SURNAME>
  <E_MAIL>jbloggs@acme.com</E_MAIL>
</PERSON>
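
As an illustration of how a generic parser sees this nesting, here is a minimal sketch that assumes Python and its standard-library xml.etree.ElementTree module (neither is prescribed by XML itself). The variable names doc, person and fields are chosen for this example only.

```python
import xml.etree.ElementTree as ET

doc = """<PERSON>
  <FIRST_NAME>Joe</FIRST_NAME>
  <SURNAME>Bloggs</SURNAME>
  <E_MAIL>jbloggs@acme.com</E_MAIL>
</PERSON>"""

# fromstring() parses the text and returns the outermost element.
person = ET.fromstring(doc)

# Iterating over an element yields its direct children in document order.
fields = {child.tag: child.text for child in person}

print(person.tag)
for tag, text in fields.items():
    print(f"  {tag}: {text}")
```

The whitespace used for indentation sits between the child elements, so it does not end up in the text of FIRST_NAME, SURNAME or E_MAIL.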

XML is deemed to be well-formed if:

a. Every element has an opening and a closing tag.

b. Elements do not overlap (i.e. the elements delimited by opening and closing tags nest properly within each other).

c. There is a single root element.

d. Case-sensitivity is respected (the names in an opening tag and its closing tag match exactly), and

e. The characters <, >, &, ' and " are escaped wherever they would otherwise be read as markup.
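
Some of these rules can be checked mechanically by handing a document to a parser. The sketch below assumes Python's standard-library xml.etree.ElementTree; the helper is_well_formed and the sample strings are hypothetical names and data made up for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts text as a complete XML document."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# Illustrative samples: each one exercises a single rule from the list.
samples = {
    "matched tags": "<NAME>Joe Bloggs</NAME>",
    "missing closing tag": "<NAME>Joe Bloggs",
    "case mismatch": "<NAME>Joe Bloggs</name>",
    "unescaped ampersand": "<NAME>Bloggs & Sons</NAME>",
    "escaped ampersand": "<NAME>Bloggs &amp; Sons</NAME>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {ok}")
```

Only the first and last samples are accepted; the other three break rules a, d and e respectively.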

The following XML is not well-formed because the elements overlap:

<BOLD>The quick brown <ITALICS>fox
jumps</BOLD> over the lazy dog.</ITALICS>
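
A parser rejects this markup as soon as it reaches the closing BOLD tag. The sketch below, which again assumes Python's xml.etree.ElementTree, shows one way to catch the error. The properly nested variant is a rewrite made up for this example and covers only the first half of the sentence.

```python
import xml.etree.ElementTree as ET

overlapping = """<BOLD>The quick brown <ITALICS>fox
jumps</BOLD> over the lazy dog.</ITALICS>"""

error = None
try:
    ET.fromstring(overlapping)
except ET.ParseError as exc:
    # The exception text describes the problem; exc.position holds
    # the (line, column) at which the parser gave up.
    error = exc
    print(f"rejected: {exc}")

# Closing ITALICS before BOLD makes the elements nest properly.
nested = ET.fromstring("<BOLD>The quick brown <ITALICS>fox jumps</ITALICS></BOLD>")
print(nested.tag, [child.tag for child in nested])
```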

Characters with a special meaning in XML are escaped using &amp; for &, &lt; for <, &gt; for >, &apos; for ', and &quot; for ". These five are XML's predefined entities.
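
Escaping is usually left to a library rather than done by hand. As a sketch, assuming Python's standard-library xml.sax.saxutils and xml.etree.ElementTree, the following escapes a made-up string and confirms that a parser turns it back into the original text. The sample string raw and the quotes mapping are illustrative assumptions.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, unescape

raw = 'Bloggs & Sons <"Joe\'s">'

# escape() handles &, < and > by default; the two quote characters
# are supplied explicitly through the optional mapping.
quotes = {"'": "&apos;", '"': "&quot;"}
escaped = escape(raw, quotes)
print(escaped)

# Once escaped, the text can be embedded in markup; the parser
# resolves the entities and hands back the original characters.
element = ET.fromstring(f"<NAME>{escaped}</NAME>")
print(element.text)

# unescape() reverses the process without involving a parser.
restored = unescape(escaped, {entity: char for char, entity in quotes.items()})
```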

The requirement for a root element means that this XML is not well-formed:

<NAME>Joe Bloggs</NAME>
<NAME>Jane Doe</NAME>

as no single element forms the root. The following is well-formed, however, as NAME_LIST forms the root element:

<NAME_LIST>
  <NAME>Joe Bloggs</NAME>
  <NAME>Jane Doe</NAME>
</NAME_LIST>
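
When a document is generated by a program, building it under one enclosing element satisfies the root requirement by construction. Here is a minimal sketch assuming Python's xml.etree.ElementTree; the names list and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

names = ["Joe Bloggs", "Jane Doe"]

# Two sibling NAME elements with no enclosing element are rejected.
fragment = "\n".join(f"<NAME>{name}</NAME>" for name in names)
try:
    ET.fromstring(fragment)
    fragment_ok = True
except ET.ParseError:
    fragment_ok = False

# Building the tree under a single NAME_LIST element gives it a root.
root = ET.Element("NAME_LIST")
for name in names:
    ET.SubElement(root, "NAME").text = name

serialised = ET.tostring(root, encoding="unicode")
print(fragment_ok)
print(serialised)
```

The serialised form contains no indentation, since whitespace between elements is optional.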

XML elements can have attributes. Attributes are specified as part of the opening tag, as NAME="value" pairs, and are typically used to hold metadata about the element, although XML does not prescribe how they are used.

<NAME_LIST ELEMENTS="4" RANGE="A-D">
  <NAME SEX="MALE">Hop Along</NAME>
  <NAME SEX="MALE">Joe Bloggs</NAME>
  <NAME SEX="MALE">P Cutter</NAME>
  <NAME SEX="FEMALE">Jane Doe</NAME>
</NAME_LIST>
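
To show how a program might read these attributes, here is a sketch that assumes Python's xml.etree.ElementTree. Treating ELEMENTS as a count of the NAME children is an assumption about what the example intends; XML itself attaches no meaning to the attribute.

```python
import xml.etree.ElementTree as ET

doc = """<NAME_LIST ELEMENTS="4" RANGE="A-D">
  <NAME SEX="MALE">Hop Along</NAME>
  <NAME SEX="MALE">Joe Bloggs</NAME>
  <NAME SEX="MALE">P Cutter</NAME>
  <NAME SEX="FEMALE">Jane Doe</NAME>
</NAME_LIST>"""

name_list = ET.fromstring(doc)

# Attribute values are always strings, so numbers need converting.
declared = int(name_list.get("ELEMENTS"))
actual = len(name_list.findall("NAME"))
print(f"declared {declared}, found {actual}")

# Attributes can be used to filter elements.
females = [n.text for n in name_list.findall("NAME") if n.get("SEX") == "FEMALE"]
print(females)
```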

XML supports empty-element tags. In these, the start tag and end tag are combined into one and the element has no content. Such a tag starts with < and ends with />. Any data is typically carried in attributes. For example, here is an empty PERSON element with NAME and SEX attributes:

<PERSON NAME="Joe Bloggs" SEX="MALE"/>
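
The combined form is shorthand for an opening tag followed immediately by its closing tag. The sketch below, assuming Python's xml.etree.ElementTree, parses both spellings and compares the results; short_form and long_form are names chosen for this example.

```python
import xml.etree.ElementTree as ET

short_form = ET.fromstring('<PERSON NAME="Joe Bloggs" SEX="MALE"/>')
long_form = ET.fromstring('<PERSON NAME="Joe Bloggs" SEX="MALE"></PERSON>')

for person in (short_form, long_form):
    # No text content and no child elements in either case.
    print(person.tag, person.attrib, person.text, len(person))

# Both spellings produce the same element, so they serialise identically.
same = ET.tostring(short_form) == ET.tostring(long_form)
print(same)
```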

Comments can be entered in an XML document between an opening <!-- delimiter and a closing --> delimiter. For example:

<!--This is an empty PERSON tag-->
<PERSON NAME="Joe Bloggs" SEX="MALE"/>
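
Comments are not part of the element structure. As a sketch, assuming Python's xml.etree.ElementTree (whose default parser discards comments), the following shows that a comment neither becomes the root nor appears among an element's children. The second document is a made-up example.

```python
import xml.etree.ElementTree as ET

doc = """<!--This is an empty PERSON tag-->
<PERSON NAME="Joe Bloggs" SEX="MALE"/>"""

# The comment before the root is skipped; PERSON is still the root.
person = ET.fromstring(doc)
print(person.tag)

# Markup inside a comment is not parsed as elements either.
listing = ET.fromstring(
    "<NAME_LIST><!--<NAME>Ignored</NAME>--><NAME>Joe Bloggs</NAME></NAME_LIST>"
)
children = [child.text for child in listing]
print(children)
```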

That was XML in a nutshell.