Extensible Markup Language (XML) is a framework for defining document markup languages and is predicted to become the primary approach to document exchange over the Internet. In simple terms, a document markup language is a set of elements (frequently called tags) that support one or more of the following document characteristics:
XML can provide the following benefits:
XML implementers can define their own tag sets to describe document content. The precision of those descriptions is left to the implementer. For example, one implementation might use a <name> tag and another might find it more useful to use a <city_name> tag.
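As a minimal sketch of that choice, the following Python uses the standard library's xml.etree.ElementTree to read the same fact from two documents that use different tag sets. The sample documents, the surrounding <address> element, and the helper name city_of are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents describing the same fact with different tag sets.
GENERIC = "<address><name>Austin</name></address>"
SPECIFIC = "<address><city_name>Austin</city_name></address>"


def city_of(xml_text, tag):
    """Return the text of the first child element called `tag`, or None."""
    root = ET.fromstring(xml_text)
    element = root.find(tag)
    return element.text if element is not None else None


print(city_of(GENERIC, "name"))        # found via the generic tag
print(city_of(SPECIFIC, "city_name"))  # found via the more precise tag
print(city_of(SPECIFIC, "name"))       # None: this tag set has no <name>
```

A consumer has to know which tag set a document uses; the more precise <city_name> tag tells it more about the content than the generic <name> tag does.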
The content and structure of an XML document are defined by its grammar. The Document Type Definition (DTD) is an example of such a grammar. Other XML-based schemas are evolving. Grammar is used in this documentation as a generic reference to such schemas.
The grammar describes the valid tags, attributes (characteristics of tags, such as identifiers), and other content for the XML document. Whether an XML document is created as a static file or dynamically generated, the author is responsible for ensuring compliance with the grammar.
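The idea of checking a document against its grammar can be sketched as follows. This is not DTD validation (Python's standard library does not validate against DTDs); it is a simplified stand-in in which a hypothetical grammar is written as a Python dictionary listing the child elements and attributes allowed for each tag. The grammar, the tag names, and the function grammar_violations are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical grammar: element name -> (allowed child elements, allowed attributes).
GRAMMAR = {
    "catalog": ({"part"}, set()),
    "part": ({"description", "price"}, {"id"}),
    "description": (set(), set()),
    "price": (set(), {"currency"}),
}


def grammar_violations(xml_text):
    """Return a list of human-readable problems; an empty list means compliant."""
    root = ET.fromstring(xml_text)
    problems = []
    for element in root.iter():
        if element.tag not in GRAMMAR:
            problems.append(f"unknown element <{element.tag}>")
            continue
        children, attributes = GRAMMAR[element.tag]
        for name in element.attrib:
            if name not in attributes:
                problems.append(f"<{element.tag}> has unexpected attribute '{name}'")
        for child in element:
            # Unknown children are reported when the loop reaches them.
            if child.tag in GRAMMAR and child.tag not in children:
                problems.append(f"<{child.tag}> is not allowed inside <{element.tag}>")
    return problems


VALID = ('<catalog><part id="p1"><description>Brake pad</description>'
         '<price currency="USD">19.99</price></part></catalog>')
INVALID = '<catalog><part sku="p1"><colour>red</colour></part></catalog>'

print(grammar_violations(VALID))
print(grammar_violations(INVALID))
```

A real grammar such as a DTD also constrains element order, cardinality, and attribute types, which this sketch ignores.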
Since the early days of computer networking, there's been a need to facilitate the exchange of information among users. The size of potential user populations expands with the use of large networks, such as the Internet, and with the increase in computer-based communication. XML is not the first common document format, but it has advantages over comparable document exchange formats.
XML is well suited as a format for source documents, because a single source can be delivered in whichever output format is most appropriate: presentation formats such as HTML, Portable Document Format (PDF), and PostScript, or application formats such as electronic data interchange (EDI).
The structure and meaning of XML document content are known (as defined in the document's grammar). The tag structure of XML enables unambiguous parsing of XML documents.
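One consequence of the strict tag structure is that a parser can either build a tree from a document or reject it; there is no guessing at the author's intent. A minimal sketch using Python's standard xml.etree.ElementTree (the sample strings and the helper name is_well_formed are assumptions for illustration):

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Properly nested tags parse into exactly one tree.
print(is_well_formed("<part><price>19.99</price></part>"))
# Overlapping tags are rejected rather than silently repaired.
print(is_well_formed("<part><price>19.99</part></price>"))
```

Note that this checks only well-formedness; compliance with a particular grammar is a separate, stricter test.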
Grammars define the structure and content of XML documents and can be used to perform more efficient searches. For example, user agents could support searching a collection of documents that were authored against a particular grammar. XML documents also enable searching by tag name, tag attribute, data content, and location within a document.
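The four search strategies named above can be sketched with the limited XPath support in Python's standard xml.etree.ElementTree. The catalog document, its tag names, and its attribute names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog used only to illustrate the search strategies.
CATALOG = """
<catalog>
  <part id="p1" category="brakes"><description>Brake pad</description><price>19.99</price></part>
  <part id="p2" category="engine"><description>Oil filter</description><price>8.50</price></part>
  <part id="p3" category="brakes"><description>Brake disc</description><price>42.00</price></part>
</catalog>
"""
root = ET.fromstring(CATALOG)

# Search by tag name: every <description> anywhere in the document.
by_tag = [e.text for e in root.iter("description")]

# Search by tag attribute: parts whose category attribute is "brakes".
by_attribute = [p.get("id") for p in root.findall(".//part[@category='brakes']")]

# Search by data content: parts whose <description> text is "Oil filter".
by_content = [p.get("id") for p in root.findall(".//part[description='Oil filter']")]

# Search by location: the second <part> child of the root element.
by_location = root.find("part[2]").get("id")

print(by_tag, by_attribute, by_content, by_location)
```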
Separating document content from document structure is especially important when the content for Web documents must be dynamically generated using programs. Such separation enables Web team members (Web page authors, business logic programmers, and graphics designers) to work in parallel with limited impact on one another's work.
XSL stylesheets can manipulate XML content in order to present different material to different users in different forms. For example, an auto parts catalog can be presented to a shopper as a view that includes the prices, descriptions, and order numbers for parts. The catalog view for the auto mechanic could include the information available to shoppers plus schematics that show the position of the installed part. The manufacturer's view could include information about subcomponents and materials.
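The catalog example can be sketched in Python as a stand-in for what a set of XSL stylesheets would do: one source document, several audience-specific views. This is not XSLT (Python's standard library has no XSLT processor); the part document, the VIEWS table, and the render_view function are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document holding everything any audience might need.
PART = """
<part id="p1">
  <description>Brake pad</description>
  <price>19.99</price>
  <schematic>brake-assembly.svg</schematic>
  <materials>ceramic compound</materials>
</part>
"""

# Hypothetical audiences and the child elements each one is shown.
VIEWS = {
    "shopper": ["description", "price"],
    "mechanic": ["description", "price", "schematic"],
    "manufacturer": ["description", "materials"],
}


def render_view(xml_text, audience):
    """Render a plain-text view of the part for the given audience."""
    part = ET.fromstring(xml_text)
    lines = [f"order number: {part.get('id')}"]
    for tag in VIEWS[audience]:
        lines.append(f"{tag}: {part.findtext(tag)}")
    return "\n".join(lines)


print(render_view(PART, "shopper"))
print(render_view(PART, "mechanic"))
```

The source document is never modified; only the selection rules differ between views, which is the role a stylesheet plays.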
In addition, XSL syntax supports the use of Cascading Style Sheets (CSS) to control the presentation of XML documents. Putting the presentation controls in a separate file from content enables XML implementers to create multiple views for an XML document without changing the document itself.
XML implementations can have the Web server send an XML document and its associated XSL stylesheets to the client once. Each stylesheet can provide a different view of portions or all of the document data. The user selects the stylesheet to apply. Changing from one view (stylesheet) to the next would not involve sending another request to the server.
XML applications support many document encodings. Every XML processor must accept the Unicode encodings UTF-8 (one to four bytes per character, with ASCII characters taking a single byte) and UTF-16 (two or four bytes per character). Therefore, XML documents can include virtually any language or script.
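A short sketch of the same multilingual document stored in two encodings and parsed back with Python's standard xml.etree.ElementTree. The <greeting> element and the sample text are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Sample text mixing Latin, Cyrillic, and Japanese scripts.
TEXT = "Grüße, Привет, こんにちは"
document = f"<greeting>{TEXT}</greeting>"

# The same document serialized in two Unicode encodings. Python's "utf-16"
# codec prepends a byte order mark, which the parser uses to detect the encoding.
utf8_bytes = ('<?xml version="1.0" encoding="UTF-8"?>' + document).encode("utf-8")
utf16_bytes = ('<?xml version="1.0" encoding="UTF-16"?>' + document).encode("utf-16")

from_utf8 = ET.fromstring(utf8_bytes).text
from_utf16 = ET.fromstring(utf16_bytes).text

# The byte sequences differ, but the parsed content is identical.
print(from_utf8 == from_utf16 == TEXT)
```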
In HTML, the <A> tag links a document to another document or to a target within the same document. Those links are unidirectional (from the source document to the target). The <A> tag includes the address (URL) of the target link and a text label for the link.
In contrast, XML supports two types of advanced linking: XLink and XPointer. The XLink and XPointer standards are evolving. With XLink, any tag can be a link. Optional XLink attributes provide additional information about the link itself and about the target document. Other attributes control how the link is activated and what happens when the link is activated. A single link (called an extended link) can even point to multiple targets.
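The following sketch shows what XLink markup can look like and reads the link attributes with Python's standard xml.etree.ElementTree. The namespace URI and the attribute names (type, href, title, show, actuate) come from the XLink specification; the element names, file paths, and the helper xlink are hypothetical. ElementTree does not follow or activate links; the code only inspects the attributes.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"  # XLink namespace URI

# Hypothetical document: <part> is a simple link, <related> an extended link
# whose <target> children each locate one remote resource.
DOC = f"""
<catalog xmlns:xlink="{XLINK}">
  <part xlink:type="simple" xlink:href="parts/p1.xml"
        xlink:title="Brake pad" xlink:show="new" xlink:actuate="onRequest">Brake pad</part>
  <related xlink:type="extended">
    <target xlink:type="locator" xlink:href="parts/p2.xml"/>
    <target xlink:type="locator" xlink:href="parts/p3.xml"/>
  </related>
</catalog>
"""


def xlink(element, name):
    """Read an attribute in the XLink namespace, or None if absent."""
    return element.get(f"{{{XLINK}}}{name}")


root = ET.fromstring(DOC)
simple = root.find("part")
extended = root.find("related")

# A single extended link pointing at multiple targets.
targets = [xlink(t, "href") for t in extended if xlink(t, "type") == "locator"]

print(xlink(simple, "href"), xlink(simple, "show"), xlink(simple, "actuate"))
print(targets)
```

Here xlink:title describes the link, while xlink:show and xlink:actuate express what should happen on activation and when activation occurs.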
In HTML, an <A> tag can point to a heading, paragraph, or list within a document. The target section must be a named anchor or a tag that permits the ID attribute. In XML, XPointer links can refer to any part of an XML document, even to parts that have no identifier. An XPointer addresses a location in the Document Object Model (DOM) tree that represents the XML document and consists of object references, such as root().child(2, address). XPointers can also point to ranges within a document.
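The reference root().child(2, address) reads as "the second address child of the root element". Python's standard library does not implement XPointer, so the sketch below only approximates that one reference with an ElementTree path expression. The sample document and the helper nth_child are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; note that no element carries an ID attribute.
DOC = """
<customer>
  <name>A. Example</name>
  <address>1 First Street</address>
  <address>2 Second Avenue</address>
</customer>
"""


def nth_child(root, n, tag):
    """Return the n-th (1-based) child of `root` named `tag`, or None."""
    return root.find(f"{tag}[{n}]")


root = ET.fromstring(DOC)
second_address = nth_child(root, 2, "address")
print(second_address.text)
```

The target is located purely by its position in the tree, which is what lets such a reference reach elements that were never given an identifier.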