W3C

XQuery 1.0 and XPath 2.0 Data Model

W3C Working Draft 12 November 2003

This version:
http://www.w3.org/TR/2003/WD-xpath-datamodel-20031112/
Latest version:
http://www.w3.org/TR/xpath-datamodel/
Previous version:
http://www.w3.org/TR/2003/WD-xpath-datamodel-20030502/
Editors:
Mary Fernández (XML Query WG), AT&T Labs <mff@research.att.com>
Ashok Malhotra (XML Query and XSL WGs), Microsoft <ashokma@microsoft.com>
Jonathan Marsh (XSL WG), Microsoft <jmarsh@microsoft.com>
Marton Nagy (XML Query WG), Science Applications International Corporation (SAIC) <marton.nagy@saic.com>
Norman Walsh (XSL WG), Sun Microsystems <Norman.Walsh@Sun.COM>

This document is also available in these non-normative formats: XML.


Abstract

This document defines the W3C XQuery 1.0 and XPath 2.0 Data Model, which is the data model of at least [XPath 2.0], [XSLT 2.0], and [XQuery], and any other specifications that reference it. This data model is based on the [XPath 1.0] data model and earlier work on an [XML Query Data Model]. This document is the result of joint work by the [XSL Working Group] and the [XML Query Working Group].

Status of this Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This is a Public Working Draft for review by W3C Members and other interested parties. Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

The XQuery 1.0 and XPath 2.0 Data Model has been defined jointly by the XML Query Working Group and the XSL Working Group (both part of the XML Activity).

This is a Last Call Working Draft which consolidates changes and editorial improvements undertaken in response to feedback received during the previous Last Call publication which began on 2 May 2003. A list of the Last Call issues addressed by the Working Groups is also available.

Comments on this document are due on 15 February 2004. Comments should be sent to the W3C mailing list public-qt-comments@w3.org (archived at http://lists.w3.org/Archives/Public/public-qt-comments/) with “[DM]” at the beginning of the subject field.

Patent disclosures relevant to this specification may be found on the XML Query Working Group's patent disclosure page at http://www.w3.org/2002/08/xmlquery-IPR-statements and the XSL Working Group's patent disclosure page at http://www.w3.org/Style/XSL/Disclosures.html.

Table of Contents

1 Introduction
2 Concepts
    2.1 Terminology
    2.2 Notation
        2.2.1 Prefix Bindings
    2.3 Node Identity
    2.4 Document Order
    2.5 Types
3 Data Model Construction
    3.1 Direct Construction
    3.2 Construction from an Infoset
    3.3 Construction from a PSVI
        3.3.1 Mapping PSVI Additions to Types
        3.3.2 Mapping xsi:nil on Element Nodes
        3.3.3 Storing xs:dateTime, xs:date, and xs:time Values in the Data Model
        3.3.4 Retrieving the Typed Value of xs:dateTime, xs:date, and xs:time Values
4 Data Model Serialization
5 Accessors
    5.1 base-uri Accessor
    5.2 node-kind Accessor
    5.3 node-name Accessor
    5.4 parent Accessor
    5.5 string-value Accessor
    5.6 typed-value Accessor
    5.7 type Accessor
    5.8 children Accessor
    5.9 attributes Accessor
    5.10 namespaces Accessor
    5.11 nilled Accessor
6 Nodes
    6.1 Document Nodes
        6.1.1 Overview
        6.1.2 Accessors
        6.1.3 Construction from an Infoset
        6.1.4 Construction from a PSVI
    6.2 Element Nodes
        6.2.1 Overview
        6.2.2 Accessors
        6.2.3 Construction from an Infoset
        6.2.4 Construction from a PSVI
    6.3 Attribute Nodes
        6.3.1 Overview
        6.3.2 Accessors
        6.3.3 Construction from an Infoset
        6.3.4 Construction from a PSVI
    6.4 Namespace Nodes
        6.4.1 Overview
        6.4.2 Accessors
        6.4.3 Construction from an Infoset
        6.4.4 Construction from a PSVI
    6.5 Processing Instruction Nodes
        6.5.1 Overview
        6.5.2 Accessors
        6.5.3 Processing Instruction Information Items
        6.5.4 Construction from a PSVI
    6.6 Comment Nodes
        6.6.1 Overview
        6.6.2 Accessors
        6.6.3 Comment Information Items
        6.6.4 Construction from a PSVI
    6.7 Text Nodes
        6.7.1 Overview
        6.7.2 Accessors
        6.7.3 Construction from an Infoset
        6.7.4 Construction from a PSVI
7 Atomic Values
    7.1 New Datatypes
        7.1.1 xdt:untypedAny
        7.1.2 xdt:untypedAtomic
        7.1.3 xdt:anyAtomicType
        7.1.4 xdt:dayTimeDuration
        7.1.5 xdt:yearMonthDuration
8 Sequences

Appendices

A XML Information Set Conformance
B References
    B.1 Normative References
    B.2 Other References
C Glossary (Non-Normative)
D Example (Non-Normative)
E Accessor Summary (Non-normative)
F Infoset Construction Summary (Non-normative)
G PSVI Construction Summary (Non-normative)


1 Introduction

This document defines the XQuery 1.0 and XPath 2.0 Data Model, which is the data model of [XPath 2.0], [XSLT 2.0] and [XQuery].

The XQuery 1.0 and XPath 2.0 Data Model (henceforth "data model") serves two purposes. First, it defines precisely the information contained in the input to an XSLT or XQuery processor. Second, it defines all permissible values of expressions in the XSLT, XQuery, and XPath languages. A language is closed with respect to a data model if the value of every expression in the language is guaranteed to be in the data model. XSLT 2.0, XQuery 1.0, and XPath 2.0 are all closed with respect to the data model.

The data model is based on the [Infoset] (henceforth "Infoset"), but it requires the following new features to meet the [XPath 2.0 Requirements] and [XML Query Requirements]:

As with the Infoset, the XQuery 1.0 and XPath 2.0 Data Model specifies what information in the documents is accessible, but it does not specify the programming-language interfaces or bindings used to represent or access the data.

Every value in the data model is a sequence of zero or more items.

Every node is one of the seven kinds defined in 6 Nodes. Connected nodes form a tree that consists of a root node plus all the nodes that are reachable directly or indirectly from the root node via the dm:children, dm:attributes, and dm:namespaces accessors. Every node belongs to exactly one tree, and every tree has exactly one root node. [Definition: A tree whose root node is a document node is referred to as a document.] [Definition: A tree whose root node is some other kind of node is referred to as a fragment.]
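The distinction between a document and a fragment can be illustrated with a minimal Python sketch. The `Node` class and the helper names `root` and `tree_kind` are hypothetical, introduced only for this illustration; the data model itself prescribes no programming-language representation.

```python
class Node:
    """Hypothetical node: a kind plus an optional parent."""

    def __init__(self, kind, parent=None):
        self.kind = kind
        self.parent = parent  # None stands for "no parent"


def root(node):
    """Follow parent links upward until a node without a parent is reached."""
    while node.parent is not None:
        node = node.parent
    return node


def tree_kind(node):
    """A tree rooted at a document node is a document; any other tree is a fragment."""
    return "document" if root(node).kind == "document" else "fragment"


doc = Node("document")
element = Node("element", doc)
text = Node("text", element)

orphan = Node("element")             # an element with no parent
attribute = Node("attribute", orphan)
```

Here `tree_kind(text)` yields "document", while `tree_kind(attribute)` yields "fragment" because the root of that tree is an element node.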

An atomic value encapsulates an XML Schema atomic type and a corresponding value of that type. Atomic values are defined in 7 Atomic Values. A sequence is an ordered collection of nodes, atomic values, or any mixture of nodes and atomic values. A sequence cannot be a member of a sequence. A single item appearing on its own is modeled as a sequence containing one item. Sequences are defined in 8 Sequences.
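The rule that sequences never nest can be sketched in Python as follows. Using tuples to stand for sequences, and the helper name `make_sequence`, are assumptions made for this illustration only.

```python
def make_sequence(*items):
    """Build a flat sequence: nested sequences are spliced in, never nested."""
    flat = []
    for item in items:
        if isinstance(item, tuple):
            # A sequence cannot be a member of a sequence, so splice its items in.
            flat.extend(make_sequence(*item))
        else:
            flat.append(item)
    return tuple(flat)


single = make_sequence(42)                     # one item is a singleton sequence
mixed = make_sequence(1, (2, ("a", ())), "b")  # nesting disappears
```

In this sketch `single` is the one-item sequence `(42,)` and `mixed` is the flat sequence `(1, 2, "a", "b")`; the empty inner sequence contributes nothing.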

Note:

In XPath 1.0, the data model only defines nodes. The primitive data types (number, boolean, string, node-set) are part of the expression language, not the data model.

The data model can represent various values including not only the input and the output of a stylesheet or query, but all values of expressions used during the intermediate calculations. Examples include the input document or document repository (represented as a document node or a sequence of document nodes), the result of a path expression (represented as a sequence of nodes), the result of an arithmetic or a logical expression (represented as an atomic value), a sequence expression resulting in a sequence of items, etc.

In this document, we provide a precise definition of the properties of nodes in the XQuery 1.0 and XPath 2.0 Data Model, how they are accessed, and how they relate to values in the Infoset. We note wherever the XQuery 1.0 and XPath 2.0 Data Model differs from that of XPath 1.0.

2 Concepts

This section outlines a number of general concepts that apply throughout this specification.

2.1 Terminology

For a full glossary of terms, see C Glossary.

In this specification the words must, must not, should, should not, may and recommended are to be interpreted as described in [RFC 2119].

[Definition: In this specification, the term implementation-defined refers to a feature where the implementation is allowed some flexibility, and where the choices made by the implementation should be described in the vendor's documentation.]

[Definition: The term implementation-dependent refers to a feature where the behavior may vary from one implementation to another, and where the vendor is not expected to provide a full specification of the behavior.] (This might apply, for example, to limits on the size of data models that can be constructed.)

In all cases where this specification leaves the behavior implementation-defined or implementation-dependent, the implementation has the option of providing mechanisms that allow the user to influence the behavior.

Paragraphs labeled as Notes or described as examples are non-normative.

2.2 Notation

In addition to prose, we define a set of accessor functions to explain the data model. The accessors defined by the data model are shown with the prefix dm:. The prefix is always shown in italics to emphasize that these functions are abstract; they exist to explain the interface between the data model and specifications that rely on the data model: they are not and cannot be made accessible directly from the host language.

The signature of accessors is shown using the same style as [Functions and Operators]. For example:

dm:typed-value($n as node()) as xdt:anyAtomicType*

Some accessors can accept or return sequences. The following notation is used to denote sequence values:

  • V* denotes a sequence of zero or more items of type V.

  • V? denotes a sequence of exactly zero or one items of type V.

  • V+ denotes a sequence of one or more items of type V.

In a sequence, V may be a node or an atomic value.
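As a rough illustration, the occurrence indicators can be read as constraints on the length of a sequence. The Python helper below is a hypothetical sketch; it additionally assumes that a type written without an indicator denotes exactly one item, as in the signature of dm:node-kind.

```python
def matches_occurrence(sequence, indicator):
    """Check the length of a sequence against an occurrence indicator.

    indicator is "" (exactly one item, an assumption of this sketch),
    "?" (zero or one), "*" (zero or more), or "+" (one or more).
    """
    n = len(sequence)
    if indicator == "":
        return n == 1
    if indicator == "?":
        return n <= 1
    if indicator == "*":
        return True
    if indicator == "+":
        return n >= 1
    raise ValueError(f"unknown occurrence indicator: {indicator!r}")
```

For example, the empty sequence satisfies `?` and `*` but not `+`, and a two-item sequence satisfies `*` and `+` but not `?`.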

There are some functions in the data model that are partial functions. We use the occurrence indicators ? or * when specifying the return type of such functions. For example, a node may have one parent node or no parent. If the node argument has a parent, the dm:parent accessor returns a singleton sequence. If the node argument does not have a parent, it returns the empty sequence. The signature of dm:parent specifies that it returns an empty sequence or a sequence containing one node:

dm:parent($n as node()) as node()?

This document relies on the [Infoset]. Information items and properties are indicated by the styles information item and [property], respectively.

This document frequently uses the term expanded-QName. [Definition: An expanded-QName is a pair of values consisting of a namespace URI and a local name. Expanded-QNames belong to the value space of the XML Schema type xs:QName. When this document refers to xs:QName we always mean the value space, i.e. a namespace URI, local name pair (and not the lexical space referring to constructs of the form prefix:local-name).]

2.2.1 Prefix Bindings

Several prefixes are used throughout this document for notational convenience. The following bindings are assumed.

  1. xs: bound to http://www.w3.org/2001/XMLSchema

  2. xsi: bound to http://www.w3.org/2001/XMLSchema-instance

  3. xdt: bound to http://www.w3.org/2003/11/xpath-datatypes

  4. fn: bound to http://www.w3.org/2003/05/xpath-functions

In practice, any prefix that is bound to the appropriate URI may be used.
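The following Python sketch illustrates why the choice of prefix is immaterial: two lexical QNames with different prefixes expand to the same (namespace URI, local name) pair when both prefixes are bound to the same URI. The `ExpandedQName` type and the `expand` helper are hypothetical names used only for this illustration.

```python
from typing import NamedTuple


class ExpandedQName(NamedTuple):
    """A namespace URI paired with a local name."""
    namespace_uri: str
    local_name: str


def expand(lexical_qname, bindings):
    """Resolve a prefixed lexical QName against prefix-to-URI bindings."""
    prefix, _, local = lexical_qname.partition(":")
    if not local:
        raise ValueError("this sketch only handles prefixed names")
    return ExpandedQName(bindings[prefix], local)


XS = "http://www.w3.org/2001/XMLSchema"
a = expand("xs:integer", {"xs": XS})
b = expand("schema:integer", {"schema": XS})
```

Both `a` and `b` are the same expanded-QName, because only the namespace URI and the local name take part in the comparison.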

2.3 Node Identity

Because XML documents are tree-structured, we define the data model using conventional terminology for trees. The data model is a node-labeled, directed graph, in which each node has a unique identity. Every node in the data model is unique: identical to itself, and not identical to any other node.

This concept should not be confused with the concept of a unique ID, which is a unique name assigned to an element by the author to represent references using ID/IDREF correlation.

2.4 Document Order

[Definition: A document order is defined among all the nodes used during a given query or transformation. Document order is a total ordering, although the relative order of some nodes is implementation-dependent. Informally, document order is the order returned by a pre-order, depth-first traversal of the data model.] Document order is stable, which means that the relative order of two nodes will not change during the processing of a given query or transformation, even if this order is implementation-dependent.

Within a tree, document order satisfies the following constraints:

  1. The root node is the first node.

  2. The relative order of siblings is determined by their order in the XML representation of the tree. A node N1 occurs before a node N2 in document order if and only if the start of N1 occurs before the start of N2 in the XML representation.

  3. Namespace nodes immediately follow the element node with which they are associated. The relative order of namespace nodes is stable but implementation-dependent.

  4. Attribute nodes immediately follow the namespace nodes of the element with which they are associated. The relative order of attribute nodes is stable but implementation-dependent.

  5. Element nodes occur before their children; children occur before following-siblings.

The relative order of nodes in distinct trees is stable but implementation-dependent, subject to the following constraint: If any node in tree T1 is before any node in tree T2, then all nodes in tree T1 are before all nodes in tree T2.
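The constraints that apply within a single tree can be sketched as a traversal that visits each node, then its namespace nodes, then its attribute nodes, and then its children in turn. The `Node` class and its field names are hypothetical. The list order used for namespace and attribute nodes is simply this sketch's choice, since the data model leaves that order implementation-dependent.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # equality by identity: every node is distinct
class Node:
    kind: str
    label: str = ""
    namespaces: list = field(default_factory=list)
    attributes: list = field(default_factory=list)
    children: list = field(default_factory=list)


def document_order(node):
    """Yield the nodes of one tree in document order."""
    yield node
    yield from node.namespaces  # immediately after their element
    yield from node.attributes  # immediately after the namespace nodes
    for child in node.children:
        # A child and its whole subtree precede its following siblings.
        yield from document_order(child)


tree = Node("document", "doc", children=[
    Node("element", "a",
         namespaces=[Node("namespace", "xml")],
         attributes=[Node("attribute", "id")],
         children=[Node("text", "hello"), Node("element", "b")]),
    Node("comment", "end"),
])
labels = [n.label for n in document_order(tree)]
```

For this tree, `labels` is `["doc", "a", "xml", "id", "hello", "b", "end"]`: the root comes first, and the namespace and attribute nodes of element `a` precede its children.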

2.5 Types

The data model uses expanded-QNames to represent the names of named types, which include both the built-in types defined by [Schema Part 2] and named user-defined types declared in a schema and imported by a stylesheet or query. Since named types in XML Schema are global, an expanded-QName uniquely identifies such a type. The namespace name of the expanded-QName is the [target namespace] property of the type definition, and its local name is the [name] property of the type definition.

For anonymous types, the processor must construct an anonymous type name that is distinct from the name of every named type and the name of every other anonymous type. [Definition: An anonymous type name is an implementation-defined, unique type name provided by the processor for every anonymous type declared in an imported schema.] Anonymous type names must be globally unique across all anonymous types that are accessible to the processor. In the formalism of this specification, we assume that the anonymous type names are xs:QNames, but in practice implementations are not required to use xs:QNames to represent the implementation-defined names of anonymous types.

The data model associates type information with element nodes, attribute nodes and atomic values. The item is guaranteed to be a valid instance of that type.

When no type information exists for an element or an attribute node we frequently use the terminology "element with unknown type" or "attribute with unknown simple type".

The data model does not represent element or attribute declaration schema components, but it supports various type-related operations. The semantics of other operations, for example, checking whether a particular instance of an element node has a given type, are defined in [Formal Semantics].

3 Data Model Construction

In this section, we describe the constraints on data model construction.

The data model supports well-formed XML documents conforming to [Namespaces in XML]. Documents that are not well-formed are, by definition, not XML. XML documents that do not conform to [Namespaces in XML] are not supported (nor are they supported by [Infoset]).

In other words, the data model supports the following classes of XML documents:

This document describes how to construct an instance of the data model from an [Infoset] or a Post Schema Validation Infoset (PSVI), the augmented infoset produced by an XML Schema validation episode.

An instance of the data model can also be constructed directly through application APIs, or from non-XML sources such as relational tables in a database.

The data model supports some kinds of values that are not supported by [Infoset]. Examples of these are well-formed document fragments and sequences of document nodes. The data model also supports values that are not nodes. Examples of these are atomic values, sequences of atomic values, or sequences mixing nodes and atomic values. These are necessary to be able to represent the results of intermediate expressions in the data model during expression processing.

3.1 Direct Construction

Although this document describes construction of a data model in terms of infoset properties, an infoset is not an absolutely necessary precondition for building an instance of the data model.

There are no constraints on how an instance of the data model may be constructed directly, save that the resulting data model instance must satisfy all of the constraints described in this document.

3.2 Construction from an Infoset

An instance of the data model can be constructed from an [Infoset]. A data model can only be constructed from infosets that satisfy the following general constraints:

  • All general and external parsed entities must be fully expanded. The Infoset must not contain any unexpanded entity reference information items.

  • The infoset must provide all of the properties identified as "required" in this document. The properties identified as "optional" may be used, if they are present. All other properties are ignored.

Constructing an instance of the data model from an information set must be consistent with the description provided for each node type.

3.3 Construction from a PSVI

An instance of the data model can be constructed from a PSVI that has been strictly, laxly, or skip validated, or validated using any combination of assessment modes. Constructing an instance of the data model from a PSVI must be consistent with the description provided in this section and with the description provided for each node type.

[Definition: An incompletely validated document is an XML document that has a corresponding schema but whose schema-validity assessment has resulted in one or more element or attribute information items being assigned values other than 'valid' for the [validity] property in the PSVI.]

The data model supports incompletely validated documents. Elements and attributes that are not valid are treated as having unknown types.

The most significant difference between Infoset construction and PSVI construction occurs in the area of type assignment. Other differences can also arise from schema processing: default attribute and element values may be provided, whitespace normalization of element content may occur, and the user-supplied lexical form of elements and attributes with atomic types may be lost.

3.3.1 Mapping PSVI Additions to Types

A PSVI element or attribute information item may have a [validity] property. The [validity] property may be "valid", "invalid", or "notKnown" and reflects the outcome of schema-validity assessment. The only information that can be inferred from an invalid or not known validity value is that the information item is well-formed. Therefore, we must associate very general type information with the element or attribute node if it is not known to be valid.

The precise definition of the type of an element or attribute information item depends on the properties of the Infoset or PSVI. In a PSVI, XML Schema only guarantees the existence of either the [type definition] property, or the [type definition namespace], [type definition name] and [type definition anonymous] properties. If the type definition refers to a union type, further properties are defined that refer to the type definition which actually validated the item's normalized value. These properties are either the [member type definition], or the [member type definition namespace], [member type definition name] and [member type definition anonymous] properties. If these are available, the type of an element or attribute will refer to the member type that actually validated the schema normalized value.

If the [validity] property exists and is "valid", the type of an element or attribute information item is represented by an expanded-QName whose namespace and local name correspond to the first applicable items in the following list:

  • If [member type definition] exists and its {name} property is present:

    • The {target namespace} and {name} properties of the [member type definition] property.

  • If the [type definition] property exists and its {name} property is present:

    • The {target namespace} and {name} properties of the [type definition] property.

  • If [member type definition anonymous] exists:

    • If it is false: the [member type definition namespace] and the [member type definition name].

    • Otherwise, the namespace and local name of the appropriate anonymous type name.

  • If [type definition anonymous] exists:

    • If it is false: the [type definition namespace] and the [type definition name].

    • Otherwise, the namespace and local name of the appropriate anonymous type name.

If the [validity] property does not exist or is not "valid", the type of an element is xdt:untypedAny and the type of an attribute is xdt:untypedAtomic.
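The selection rules above can be sketched as a Python function. Representing an information item as a dictionary keyed by PSVI property names, and the placeholder anonymous type name passed as `anonymous_name`, are assumptions made for this illustration only. A real processor would supply its own implementation-defined anonymous type names.

```python
XDT = "http://www.w3.org/2003/11/xpath-datatypes"


def type_name(item, kind, anonymous_name=("urn:example:anon", "anon-1")):
    """Return (namespace, local name) of the type of an element or attribute.

    item is a dict keyed by PSVI property names (a hypothetical encoding);
    kind is "element" or "attribute".
    """
    if item.get("validity") != "valid":
        # Not known to be valid: only very general type information applies.
        return (XDT, "untypedAny" if kind == "element" else "untypedAtomic")
    # First and second rules: a named [member type definition] wins,
    # then a named [type definition].
    for prop in ("member type definition", "type definition"):
        definition = item.get(prop)
        if definition is not None and definition.get("name") is not None:
            return (definition["target namespace"], definition["name"])
    # Third and fourth rules: fall back to the "anonymous" properties.
    for prefix in ("member type definition", "type definition"):
        anonymous = item.get(prefix + " anonymous")
        if anonymous is not None:
            if anonymous is False:
                return (item[prefix + " namespace"], item[prefix + " name"])
            return anonymous_name
    # The list above does not cover this case, so the sketch refuses it.
    raise ValueError("valid item carries no usable type definition properties")
```

Because the rules are tried in order, an item that carries both a named [member type definition] and a named [type definition] is given the member type.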

3.3.2 Mapping xsi:nil on Element Nodes

[Schema Part 1] introduced a mechanism for signaling that an element should be accepted as valid when it has no content despite a content type which does not require or even necessarily allow empty content. That mechanism is the xsi:nil attribute.

The data model exposes this special semantic in the nilled property.

If the [validity] property exists on an element node and is "valid", and the [nil] property exists and is true, then the nilled property is "true". In all other cases, including all cases where schema validity assessment was not attempted or did not succeed, the nilled property is "false".
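Expressed as a Python sketch, and assuming a hypothetical encoding in which an element information item is a dictionary keyed by PSVI property names, the rule is a conjunction of two tests:

```python
def nilled(item):
    """Return True only for a valid element whose [nil] property is true."""
    return item.get("validity") == "valid" and item.get("nil") is True
```

An element carrying a true [nil] property but lacking a "valid" [validity] property therefore still has a nilled property of false.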

3.3.3 Storing xs:dateTime, xs:date, and xs:time Values in the Data Model

[Schema Part 2] permits xs:dateTime, xs:date, and xs:time values both with and without timezones and therefore only specifies a partial ordering between date and time values. In the data model, it is necessary to preserve timezone information.

In order to achieve this goal, xs:dateTime, xs:date, and xs:time values must be stored with care. If the lexical representation of the value includes a timezone, it is converted to UTC as defined by [Schema Part 2] and the timezone in the lexical representation is converted to an xdt:dayTimeDuration value. Implementations must keep track of both these values for each xs:dateTime, xs:date, and xs:time stored.

Lexical representations that do not have a timezone are assumed to be in UTC for the purposes of normalization only. An empty sequence is used for their timezone.

Thus, for the purpose of validation, "2003-01-02T11:30:00-05:00" is converted to "2003-01-02T16:30:00Z", but in the data model it must be stored as "(2003-01-02T16:30:00Z, -PT5H0M)". The value "2003-01-02T16:30:00" is stored as "(2003-01-02T16:30:00Z, ())" because it has no timezone.

3.3.4 Retrieving the Typed Value of xs:dateTime, xs:date, and xs:time Values

For xs:dateTime, xs:date and xs:time, the typed value is the atomic value that is determined from its stored form as follows:

  • If the timezone component is not the empty sequence, then the value contains the time component, normalized to the timezone specified by the timezone component, as well as the timezone component. The stored values "(2003-01-02T16:30:00Z, -PT5H0M)" produce the value "2003-01-02T11:30:00-05:00".

  • If the timezone component is the empty sequence, then the value contains the time component without any indication of timezone. The stored values "(2003-01-02T16:30:00Z, ())" produce the value "2003-01-02T16:30:00".
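The storage and retrieval rules for xs:dateTime can be sketched with the Python standard library. The function names `store` and `typed_value` are hypothetical, a `timedelta` stands in for the xdt:dayTimeDuration timezone, and `None` stands in for the empty sequence. The sketch handles only xs:dateTime lexical forms with a numeric offset or no timezone at all.

```python
from datetime import datetime, timedelta, timezone


def store(lexical):
    """Return (time normalized to UTC, timezone offset or None)."""
    parsed = datetime.fromisoformat(lexical)
    offset = parsed.utcoffset()
    if offset is None:
        # No timezone: treated as UTC for normalization only;
        # the timezone component is empty.
        return (parsed, None)
    utc = parsed.astimezone(timezone.utc).replace(tzinfo=None)
    return (utc, offset)


def typed_value(stored):
    """Rebuild the lexical value from its stored form."""
    utc, offset = stored
    if offset is None:
        # No timezone component: emit the time without any timezone.
        return utc.isoformat()
    # Shift the UTC time back to the recorded timezone.
    local = utc.replace(tzinfo=timezone.utc).astimezone(timezone(offset))
    return local.isoformat()


with_zone = store("2003-01-02T11:30:00-05:00")
without_zone = store("2003-01-02T16:30:00")
```

Here `with_zone` holds 16:30 on 2 January 2003 together with an offset of minus five hours, and `typed_value(with_zone)` gives back "2003-01-02T11:30:00-05:00". The value without a timezone round-trips unchanged as "2003-01-02T16:30:00".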

4 Data Model Serialization

Constructing an Infoset from an instance of the data model, for example in order to perform schema validity assessment, is accomplished by serializing the document and parsing it. Implementations are not required to implement this process literally, but they must obtain the same result as if they had.

Serialization of the data model is governed by [Serialization].

5 Accessors

A set of accessors is defined on all seven kinds of nodes, see 6 Nodes. Some accessors return a constant empty sequence on certain node kinds. Some node kinds have additional accessors that are not summarized here.

In order for applications to be able to operate on instances of the data model, the model must expose properties of the items it contains. The data model does this by defining a family of accessor functions. These are not functions in the literal sense: they are not available for users or applications to call directly. Rather, they are descriptions of the interface that an implementation of the data model must expose to applications. Functions and operators available to end-users are described in [Functions and Operators].

5.1 base-uri Accessor

dm:base-uri($n as node()) as xs:anyURI?

The dm:base-uri accessor returns the base URI of a node as a sequence containing zero or one URI reference. For more information about base URIs, see [XML Base].

It is defined on all seven node types.

5.2 node-kind Accessor

dm:node-kind($n as node()) as xs:string

The dm:node-kind accessor returns a string identifying the kind of node. It will be one of “document”, “element”, “attribute”, “namespace”, “processing-instruction”, “comment”, or “text”.

It is defined on all seven node types.

5.3 node-name Accessor

dm:node-name($n as node()) as xs:QName?

The dm:node-name accessor returns the name of the node as a sequence of zero or one xs:QNames.

It is defined on all seven node types.

5.4 parent Accessor

dm:parent($n as node()) as node()?

The dm:parent accessor returns the parent of a node as a sequence containing zero or one nodes.

It is defined on all seven node types.

5.5 string-value Accessor

dm:string-value($n as node()) as xs:string

The dm:string-value accessor returns the string value of a node.

It is defined on all seven node types.

5.6 typed-value Accessor

dm:typed-value($n as node()) as xdt:anyAtomicType*

The dm:typed-value accessor returns the typed-value of the node as a sequence of zero or more atomic values.

It is defined on all seven node types.

5.7 type Accessor

dm:type($n as node()) as xs:QName?

The dm:type accessor returns the name of the type of a node as a sequence of zero or one xs:QNames.

It is defined on all seven node types.

5.8 children Accessor

dm:children($n as node()) as node()*

The dm:children accessor returns the children of a node as a sequence containing zero or more nodes.

It is defined on all seven node types.

5.9 attributes Accessor

dm:attributes($n as node()) as attribute()*

The dm:attributes accessor returns the attributes of a node as a sequence containing zero or more attribute nodes.

It is defined on all seven node types.

5.10 namespaces Accessor

dm:namespaces($n as node()) as namespace()*

The dm:namespaces accessor returns the namespaces associated with a node as a sequence containing zero or more namespace nodes.

It is defined on all seven node types.

5.11 nilled Accessor

dm:nilled($n as node()) as xs:boolean?

The dm:nilled accessor returns true if the node is "nilled", see 3.3.2 Mapping xsi:nil on Element Nodes.

It is defined on all seven node types, but always returns the empty sequence for all nodes except elements.

6 Nodes

[Definition: The category of Node values contains seven distinct kinds of nodes: document, element, attribute, text, namespace, processing instruction, and comment.] The seven kinds of nodes are defined in the following subsections.

6.1 Document Nodes

6.1.1 Overview

Document nodes encapsulate XML documents. Documents have the following properties:

  • base-uri, possibly empty.

  • children, possibly empty.

  • unparsed-entities, possibly empty.

  • document-uri, possibly empty.

Document nodes must satisfy the following constraints.

  1. Every document node must have a unique identity, distinct from all other nodes.

  2. The children property, if it is not empty, must consist exclusively of element, processing instruction, comment, and text nodes. Attribute, namespace, and document nodes can never appear as children.

  3. The sequence of nodes in the children property is ordered and must be in document order.

  4. The children property must not contain two consecutive text nodes.

  5. If a node N is a child of a document node D, then the parent of N must be D.

  6. If a node N has a parent document node D, then N must be among the children of D.

  7. The children property must not contain two nodes with the same identity.
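Several of these constraints lend themselves to a mechanical check. The Python sketch below covers the constraints on child kinds, consecutive text nodes, parent links, and repeated nodes. The `Node` class and the `violations` helper are hypothetical, and the kind strings follow the values returned by dm:node-kind.

```python
class Node:
    """Hypothetical node: a kind plus an optional parent."""

    def __init__(self, kind, parent=None):
        self.kind = kind
        self.parent = parent


def violations(document, children):
    """Return descriptions of violated constraints; an empty list if none."""
    found = []
    allowed = {"element", "processing-instruction", "comment", "text"}
    if any(child.kind not in allowed for child in children):
        found.append("disallowed child kind")
    if any(a.kind == b.kind == "text" for a, b in zip(children, children[1:])):
        found.append("consecutive text nodes")
    if any(child.parent is not document for child in children):
        found.append("child whose parent is not the document")
    if len({id(child) for child in children}) != len(children):
        found.append("same node appears twice")
    return found


doc = Node("document")
good = [Node("comment", doc), Node("element", doc), Node("text", doc)]
bad = [Node("text", doc), Node("text", doc), Node("attribute", doc)]
```

In this sketch `violations(doc, good)` is empty, while `violations(doc, bad)` reports both the attribute child and the adjacent text nodes.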

In the [Infoset], a document information item must have at least one child, its children must consist exclusively of element information items, processing-instruction information items and comment information items, and exactly one of the children must be an element information item. This data model is more permissive: a document node may be empty, it may have more than one element node as a child, and it also permits text nodes as children.

Implementations that support DTD processing and access to the unparsed entity accessors use the unparsed-entities property to associate information about an unordered collection of unparsed entities with a document node.

6.1.2 Accessors

dm:base-uri

Returns the value of the base-uri property if it exists and is not empty, otherwise returns ().

dm:node-kind

Returns "document".

dm:node-name

Returns ().

dm:parent

Returns ().

dm:string-value

Returns the concatenation of the string-values of all its text node descendants in document order.

dm:typed-value

Returns dm:string-value of the node as an xdt:untypedAtomic value.

dm:type

Returns ().

dm:children

Returns the value of the children property.

dm:attributes

Returns ().

dm:namespaces

Returns ().

dm:nilled

Returns ().
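As an illustration of the dm:string-value accessor on a document node, the following Python sketch concatenates the content of all text node descendants in document order. Encoding a node as a `(kind, payload)` tuple, where the payload is either a string or a list of children, is an assumption made for this sketch only.

```python
def string_value(node):
    """Concatenate the content of all text node descendants in document order."""
    kind, payload = node
    if kind == "text":
        return payload
    if kind in ("document", "element"):
        return "".join(string_value(child) for child in payload)
    # Comments and processing instructions are not text nodes,
    # so they contribute nothing to the string value of an ancestor.
    return ""


doc = ("document", [
    ("element", [
        ("text", "Hello, "),
        ("comment", "ignored"),
        ("element", [("text", "world")]),
    ]),
    ("text", "!"),
])
```

For this document, `string_value(doc)` is "Hello, world!"; the comment does not contribute.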

Three additional accessors are defined on document nodes:

dm:unparsed-entity-system-id($node as document(), $entityname as xs:string) as xs:string?

The dm:unparsed-entity-system-id accessor returns the system identifier of an unparsed external entity declared in the specified document. If no entity with the name specified in $entityname exists, or if the entity is not an external unparsed entity, the empty sequence is returned.

dm:unparsed-entity-public-id($node as document(), $entityname as xs:string) as xs:string?

The dm:unparsed-entity-public-id accessor returns the public identifier of an unparsed external entity declared in the specified document. If no entity with the name specified in $entityname exists, or if the entity is not an external unparsed entity, or if the entity has no public identifier, the empty sequence is returned.

dm:document-uri($node as document()) as xs:string?

The dm:document-uri accessor returns the absolute URI of the resource from which the document node was constructed, if the absolute URI is available. If there is no URI available, or if it cannot be made absolute when the data model is constructed, the empty sequence is returned.

For example, if a collection of documents is returned by the fn:collection function, the dm:document-uri may serve to distinguish between them even though each has the same dm:base-uri.

6.1.3 Construction from an Infoset

The document information item is required. A document node is constructed for each document information item.

The following infoset properties are required: [children] and [base URI].

The following infoset properties are optional: [unparsed entities].

Document node properties are derived from the infoset as follows:

base-uri

The value of the [base URI] property.

children

The sequence of nodes constructed from the information items found in the [children] property.

For each element, processing instruction, comment, and maximal sequence of adjacent character information items found in the [children] property, a corresponding element, processing instruction, comment, or text node is constructed and that sequence of nodes is used as the value of the children property.

If present among the [children], the [document type declaration] information item is ignored.

unparsed-entities

If the [unparsed entities] property is present and is not the empty set, the values of the unparsed entity information items must be used to support the dm:unparsed-entity-system-id and dm:unparsed-entity-public-id accessors.

The internal structure of the values of the unparsed-entities property is implementation defined.

6.1.4 Construction from a PSVI

Construction from a PSVI is identical to construction from the Infoset.

6.2 Element Nodes

6.2.1 Overview

Element nodes encapsulate XML elements. Elements have the following properties:

  • base-uri, possibly empty.

  • node-name

  • parent, possibly empty

  • type

  • children, possibly empty

  • attributes, possibly empty

  • namespaces, possibly empty

  • nilled

Element nodes must satisfy the following constraints.

  1. Every element node must have a unique identity, distinct from all other nodes.

  2. The children property, if it is not empty, must consist exclusively of element, processing instruction, comment, and text nodes. Attribute, namespace, and document nodes can never appear as children.

  3. The sequence of nodes in the children property is ordered and must be in document order.

  4. The children property must not contain two consecutive text nodes.

  5. The children property must not contain two nodes with the same identity.

  6. The attributes of an element must have distinct xs:QNames.

  7. The namespace nodes of an element must have distinct names. At most one of the namespace nodes of an element has no name (this is the default namespace).

  8. If a node N is a child of an element E, then the parent of N must be E.

  9. Exclusive of attribute and namespace nodes, if a node N has a parent element E, then N must be among the children of E. (Attribute and namespace nodes have a parent, but they do not appear among the children of their parent.)

    The data model permits element nodes without parents (to represent partial results during expression processing, for example). Such element nodes must not appear among the children of any other node.

  10. If an attribute node A has a parent element E, then A must be among the attributes of E.

    The data model permits attribute nodes without parents. Such attribute nodes must not appear among the attributes of any element node.

  11. If a namespace node N has a parent element E, then N must be among the namespaces of E.

    The data model permits namespace nodes without parents. Such namespace nodes must not appear among the namespaces of any element node.

The data model does not enforce a constraint that the namespaces of an element must include namespace nodes for each of the namespace URIs used in the element name and the names of its attributes, or of namespace URIs used in the content of elements and attributes of type xs:QName. Applications of the data model (such as XSLT and XQuery) may enforce such constraints in particular circumstances, but these constraints are not part of the data model.
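Several of the element constraints listed above are mechanical and can be checked directly. The sketch below is illustrative only: the `Node` class, its fields, and the `violations` function are hypothetical, plain strings stand in for xs:QName values, and Python object identity stands in for node identity. It covers constraints 2, 4, 5, 6, and 8 and leaves the others out.

```python
from dataclasses import dataclass, field
from typing import List, Optional

ALLOWED_CHILD_KINDS = {"element", "processing-instruction", "comment", "text"}


@dataclass(eq=False)  # eq=False keeps comparison identity-based
class Node:
    # Hypothetical node shape used only for this sketch.
    kind: str
    name: Optional[str] = None
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)
    attributes: List["Node"] = field(default_factory=list)


def violations(element: Node) -> List[str]:
    """Return a description of each checked constraint that the element breaks."""
    problems = []
    kids = element.children
    if any(c.kind not in ALLOWED_CHILD_KINDS for c in kids):
        problems.append("constraint 2: disallowed child kind")
    if any(a.kind == "text" and b.kind == "text" for a, b in zip(kids, kids[1:])):
        problems.append("constraint 4: consecutive text nodes")
    if len({id(c) for c in kids}) != len(kids):
        problems.append("constraint 5: repeated child identity")
    names = [a.name for a in element.attributes]
    if len(set(names)) != len(names):
        problems.append("constraint 6: duplicate attribute name")
    if any(c.parent is not element for c in kids):
        problems.append("constraint 8: child with a different parent")
    return problems
```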

6.2.2 Accessors

dm:base-uri

Returns the value of the base-uri property if it exists and is not empty. Otherwise, if the element has a parent, returns the value of the dm:base-uri of its parent; otherwise, returns ().

dm:node-kind

Returns "element".

dm:node-name

Returns the value of the node-name property.

dm:parent

Returns the value of the parent property.

dm:string-value

Returns the string value calculated as follows:

  • If the element has a type of xdt:untypedAny, a complex type with complex content, or a complex type with mixed content, returns the concatenation of the string-values of all its text node descendants in document order. It returns "" if the element has no text node descendants.

  • If the element has a complex type with empty content, returns "".

  • If the element has a simple type or a complex type with simple content:

    • If the element type is xs:string, or a type derived from xs:string, returns that string.

    • If the element type is xs:anyURI, returns the characters of the URI.

    • If the element type is xs:QName, returns the value calculated as follows:

      • If the value has no namespace URI and the in-scope namespaces map the default namespace to any namespace URI, then an error is raised ("default namespace is defined").

      • If the value has a namespace URI, then there must be at least one prefix mapped to that URI in the in-scope namespaces. If there is no such prefix, an error is raised ("no prefix defined for namespace"). If there is more than one such prefix, the one that is chosen is implementation dependent.

      If no error occurs, returns a string with the lexical form of an xs:QName using the prefix chosen as described above, and the local name of the value.

    • If the element type is xs:dateTime, xs:date, or xs:time, returns the original lexical representation of the typed value recovered as follows: if an explicit timezone was present, the normalized value is adjusted using the explicit timezone; if an explicit timezone was not present, the Z is dropped from the normalized value. The normalized value and the explicit timezone, if present, are converted separately to xs:string and concatenated to yield the string value.

    • In all other cases, returns the concatenation of the string-values of all its text node descendants in document order.
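The xs:dateTime case above can be illustrated with a small sketch. It rests on several assumptions that are not stated in this section: the normalized value is held as a naive UTC `datetime`, the explicit timezone is held as an offset in minutes (or `None` when absent), "adjusted using the explicit timezone" means adding that offset to recover the local time, and the timezone is always rendered as `±hh:mm`. The function name is hypothetical, and xs:date and xs:time are not covered.

```python
from datetime import datetime, timedelta
from typing import Optional


def datetime_string_value(normalized_utc: datetime,
                          tz_offset_minutes: Optional[int]) -> str:
    """Recover a lexical form from a normalized value and an optional timezone."""
    if tz_offset_minutes is None:
        # No explicit timezone: the normalized value without a trailing Z.
        return normalized_utc.isoformat()
    # Explicit timezone: shift the normalized value back to local time,
    # then convert the two parts separately and concatenate them.
    local = normalized_utc + timedelta(minutes=tz_offset_minutes)
    sign = "-" if tz_offset_minutes < 0 else "+"
    hours, minutes = divmod(abs(tz_offset_minutes), 60)
    return f"{local.isoformat()}{sign}{hours:02d}:{minutes:02d}"
```

For instance, a normalized value of 17:00 UTC with an explicit timezone of -05:00 is rendered with a local time of 12:00 followed by `-05:00`.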

dm:typed-value

Returns the typed value calculated as follows:

  • If the element has a type of xdt:untypedAny or a complex type with mixed content, returns the string value of the node as an instance of xdt:untypedAtomic.

  • If the element has a simple type or a complex type with simple content, returns a sequence of zero or more atomic values derived from the string-value of the node and its type in a way that is consistent with XML Schema validation.

    For xs:dateTime, xs:date and xs:time, the typed value is the atomic value that is determined from its tuple representation as described in 3.3.4 Retrieving the Typed Value of xs:dateTime, xs:date, and xs:time Values.

  • If the node has a complex type with empty content, returns ().

  • If the node has a complex type with complex content, raises a type error, which may be handled by the host language.
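The four typed-value cases above amount to a dispatch on the kind of content the element's type allows. The sketch below is a hedged illustration: the `content_kind` labels, the `UntypedAtomic` and `DataModelTypeError` classes, and the `parse` callback (which stands in for schema-consistent conversion of the string value) are all hypothetical, and a Python list stands in for a sequence.

```python
from typing import Callable, List, Optional


class UntypedAtomic(str):
    """Hypothetical stand-in for an xdt:untypedAtomic value."""


class DataModelTypeError(Exception):
    """Hypothetical error raised for elements with element-only content."""


def typed_value(content_kind: str, string_value: str,
                parse: Optional[Callable[[str], List[object]]] = None) -> List[object]:
    if content_kind in ("untyped", "mixed"):
        # xdt:untypedAny or mixed content: the string value as untyped atomic.
        return [UntypedAtomic(string_value)]
    if content_kind == "simple":
        # Simple type or simple content: zero or more atomic values.
        if parse is None:
            raise ValueError("a simple type needs a parse function")
        return parse(string_value)
    if content_kind == "empty":
        # Complex type with empty content: the empty sequence.
        return []
    if content_kind == "element-only":
        # Complex type with complex content: a type error.
        raise DataModelTypeError("element-only content has no typed value")
    raise ValueError(f"unknown content kind: {content_kind}")
```

A list-typed element could then be handled by passing a `parse` function that splits on white space and converts each token.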

dm:type

Returns the value of the type property.

dm:children

Returns the value of the children property.

dm:attributes

Returns the value of the attributes property. The order of attribute nodes is stable but implementation dependent.

dm:namespaces

Returns the value of the namespaces property. The order of namespace nodes is stable but implementation dependent.

dm:nilled

Returns the value of the nilled property.

6.2.3 Construction from an Infoset

The element information items are required. An Element Node is constructed for each element information item.

The following infoset properties are required: [namespace name], [local name], [children], [attributes], [in-scope namespaces], [base URI], and [parent].

Element node properties are derived from the infoset as follows:

base-uri

The value of the [base URI] property.

node-name

An xs:QName constructed from the [local name] property and the [namespace name] property.

parent

The node that corresponds to the value of the [parent] property.

type

All element nodes constructed from an infoset have the type xdt:untypedAny.

children

The sequence of nodes constructed from the information items found in the [children] property.

For each element, processing instruction, comment, and maximal sequence of adjacent character information items found in the [children] property, a corresponding Element, Processing Instruction, Comment, or Text node is constructed and that sequence of nodes is used as the value of the children property.

Because the data model requires that all general entities be expanded, there will never be unexpanded entity reference information item children.

attributes

A set of Attribute Nodes constructed from the attribute information items appearing in the [attributes] property. This includes all of the "special" attributes (xml:lang, xml:space, xsi:type, etc.) but does not include namespace declarations (because they are not attributes).

namespaces

A set of Namespace Nodes constructed from the namespace information items appearing in the [in-scope namespaces] property.

Implementations may ignore namespace information items for namespaces which do not appear in the expanded QName of any element or attribute information item. This can arise when xs:QNames are used in content.

nilled

All element nodes constructed from an infoset have a nilled property of "false".

6.2.4 Construction from a PSVI

The following Element Node properties are affected by PSVI properties.

type
  • If the [validity] property exists and is “valid” on this element and all of its ancestors, type is assigned as described in 3.3.1 Mapping PSVI Additions to Types.

  • Otherwise, xdt:untypedAny.

children
  • If the [schema normalized value] PSVI property exists, the processor may, depending on the implementation, use as the value of the children property a sequence of nodes containing the processing instruction and comment nodes corresponding to the processing instruction and comment information items found in the [children] property, plus a single text node whose string value is the [schema normalized value]. The order of these nodes is implementation defined.

  • Otherwise, the sequence of nodes constructed from the information items found in the [children] property.

    For each element, processing instruction, comment, and maximal sequence of adjacent character information items found in the [children] property, a corresponding element, processing instruction, comment, or text node is constructed and that sequence of nodes is used as the value of the children property.

Because the data model requires that all general entities be expanded, there will never be unexpanded entity reference information item children.

nilled

[Schema Part 1] introduced a mechanism for signaling that an element is valid even when it has no content despite a content type which does not allow empty content. The data model exposes this special semantic in the nilled property.

If the [validity] property exists on an element node and is "valid", and the [nil] property exists and is true, then the nilled property is "true". In all other cases, including all cases where schema validity assessment was not attempted or did not succeed, the nilled property is "false".

All other properties have values that are consistent with construction from an infoset.
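The nilled rule and the precondition for assigning a schema type in this section reduce to two small predicates. This is a sketch under stated assumptions: the PSVI [validity] property is passed as a string or `None` when absent, [nil] as a boolean or `None` when absent, and the function names are hypothetical. Booleans stand in for the "true" and "false" property values.

```python
from typing import List, Optional


def nilled(validity: Optional[str], nil: Optional[bool]) -> bool:
    # "true" only when the element is valid and [nil] exists and is true.
    return validity == "valid" and nil is True


def may_assign_schema_type(validities: List[Optional[str]]) -> bool:
    # validities holds the [validity] of the element and of each ancestor.
    # A single missing or non-"valid" entry means the type is xdt:untypedAny.
    return all(v == "valid" for v in validities)
```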

6.3 Attribute Nodes

6.3.1 Overview

Attribute nodes represent XML attributes. Attributes have the following properties:

  • node-name

  • string-value

  • parent, possibly empty

  • type

Attribute nodes must satisfy the following constraints.

  1. Every attribute node must have a unique identity, distinct from all other nodes.

  2. If an attribute node A has a parent element E, then A must be among the attributes of E.

    The data model permits attribute nodes without parents (to represent partial results during expression processing, for example). Such attributes must not appear among the attributes of any element node.

For convenience, the element node that owns this attribute is called its "parent" even though an attribute node is not a "child" of its parent element.

6.3.2 Accessors

dm:base-uri

If the attribute has a parent, returns the value of the dm:base-uri of its parent; otherwise it returns ().

dm:node-kind

Returns "attribute".

dm:node-name

Returns the value of the node-name property.

dm:parent

Returns the value of the parent property.

dm:string-value

Returns the value calculated as follows:

  • If the attribute type is xdt:untypedAtomic, xs:string, or a type derived from xs:string, returns that string.

  • If the attribute type is xs:anyURI, returns the characters of the URI.

  • If the attribute type is xs:QName, returns the value calculated as follows:

    • If the value has no namespace URI, then an error is raised ("default namespace is defined") if the in-scope namespaces map the default namespace to any namespace URI.

    • If the value has a namespace URI, then there must be at least one prefix mapped to that URI in the in-scope namespaces. If there is no such prefix, an error is raised ("no prefix defined for namespace"). If there is more than one such prefix, the one that is chosen is implementation dependent.

    If no error occurs, returns a string with the lexical form of an xs:QName using the prefix chosen as described above, and the local name of the value.

  • If the attribute type is xs:dateTime, xs:date, or xs:time, returns the original lexical representation recovered as follows: if an explicit timezone was present, the normalized value is adjusted using the explicit timezone; if an explicit timezone was not present, the Z is dropped from the normalized value. The normalized value and the explicit timezone, if present, are converted separately to xs:string and concatenated to yield the string value.
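The xs:QName case above can be sketched as a prefix lookup against the in-scope namespaces. The sketch makes assumptions that the text leaves open: the in-scope namespaces are a dictionary from prefix to namespace URI with the empty string as the key for the default namespace, the default namespace binding is not counted as a "prefix", and, since the choice among several prefixes is implementation dependent, the alphabetically first one is used. The function and exception names are hypothetical.

```python
from typing import Dict, Optional


class QNameSerializationError(Exception):
    """Hypothetical error type for the two failure cases."""


def qname_string_value(namespace_uri: Optional[str], local_name: str,
                       in_scope: Dict[str, str]) -> str:
    if not namespace_uri:
        # A value in no namespace cannot be written while a default
        # namespace is in scope, because the unprefixed form would bind to it.
        if in_scope.get(""):
            raise QNameSerializationError("default namespace is defined")
        return local_name
    # Collect every non-empty prefix bound to the value's namespace URI.
    prefixes = sorted(p for p, uri in in_scope.items() if p and uri == namespace_uri)
    if not prefixes:
        raise QNameSerializationError("no prefix defined for namespace")
    # Implementation-dependent choice: here, the alphabetically first prefix.
    return f"{prefixes[0]}:{local_name}"
```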

dm:typed-value

Returns the value calculated as follows:

  • If the attribute has a type of xdt:untypedAtomic, returns the string value of the node as an instance of xdt:untypedAtomic.

  • If the attribute has a simple type, returns a sequence of zero or more atomic values derived from the string-value of the node and its type in a way that is consistent with XML Schema validation.

    For xs:dateTime, xs:date and xs:time, the typed value is the atomic value that is determined from its tuple representation as described in 3.3.4 Retrieving the Typed Value of xs:dateTime, xs:date, and xs:time Values.

dm:type

Returns the value of the type property.

dm:children

Returns ().

dm:attributes

Returns ().

dm:namespaces

Returns ().

dm:nilled

Returns ().

6.3.3 Construction from an Infoset

The attribute information items are required. An attribute node is constructed for each attribute information item.

The following infoset properties are required: [namespace name], [local name], [normalized value], [attribute type], and [owner element].

Attribute node properties are derived from the infoset as follows:

node-name

An xs:QName constructed from the [local name] property and the [namespace name] property.

string-value

The [normalized value] property.

parent

The element node that corresponds to the value of the [owner element] property.

type
  • If the [attribute type] property has one of the following values: ID, IDREF, IDREFS, ENTITY, ENTITIES, NMTOKEN, or NMTOKENS, an xs:QName with the [attribute type] as the local name and "http://www.w3.org/2001/XMLSchema" as the namespace name.

  • Otherwise, xdt:untypedAtomic.
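The type rule above is a simple table lookup. In this sketch a QName is represented as a hypothetical `(namespace, local name)` tuple and the function name is illustrative. The xdt namespace URI is the one this document gives for xdt:untypedAtomic.

```python
from typing import Optional, Tuple

XS_NAMESPACE = "http://www.w3.org/2001/XMLSchema"
XDT_NAMESPACE = "http://www.w3.org/2003/11/xpath-datatypes"

# DTD attribute types that map directly to schema types of the same name.
MAPPED_DTD_TYPES = {"ID", "IDREF", "IDREFS", "ENTITY", "ENTITIES", "NMTOKEN", "NMTOKENS"}


def attribute_type_from_infoset(attribute_type: Optional[str]) -> Tuple[str, str]:
    """Return the type QName for an attribute constructed from an infoset."""
    if attribute_type in MAPPED_DTD_TYPES:
        return (XS_NAMESPACE, attribute_type)
    # CDATA, enumerations, NOTATION, or no declaration at all.
    return (XDT_NAMESPACE, "untypedAtomic")
```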

6.3.4 Construction from a PSVI

The following Attribute Node properties are affected by PSVI properties.

string-value
  • The [schema normalized value] PSVI property if that exists, otherwise

  • the [normalized value] property.

type
  • If the [validity] property does not exist on this node or any of its ancestors, Infoset processing applies.

    Note that this processing is only performed if no part of the subtree that contains the node was schema validated. In particular, Infoset-only processing does not apply to subtrees that are "skip" validated in a document.

  • If the [validity] property exists and is "valid", type is assigned as described in 3.3.1 Mapping PSVI Additions to Types.

  • Otherwise, xdt:untypedAtomic.

All other properties have values that are consistent with construction from an infoset.

Note: attributes from the XML Schema instance namespace, "http://www.w3.org/2001/XMLSchema-instance", (xsi:schemaLocation, xsi:type, etc.) appear as ordinary attributes in the data model. They will be validated appropriately by schema processors and will simply appear as attributes of type xdt:untypedAtomic if they haven't been schema validated.

6.4 Namespace Nodes

6.4.1 Overview

Namespace nodes encapsulate XML namespaces. Namespaces have the following properties:

  • prefix, possibly empty.

  • uri

  • parent, possibly empty

Namespace nodes must satisfy the following constraints.

  1. Every namespace node must have a unique identity, distinct from all other nodes.

  2. If a namespace node N has a parent element E, then N must be among the namespaces of E.

    The data model permits namespace nodes without parents, see below.

In XPath 1.0, namespace nodes were directly accessible by applications, by means of the namespace axis. In XPath 2.0 the namespace axis is deprecated, and it is not available at all in XQuery 1.0. XPath 2.0 implementations are not required to expose the namespace axis, though they may do so if they wish to offer backwards compatibility.

The information held in namespace nodes is instead made available to applications using functions defined in [Functions and Operators]. Some properties of namespace nodes are not exposed by these functions: in particular, properties related to the identity of namespace nodes, their parentage, and their position in document order. Implementations that do not expose the namespace axis can therefore avoid the overhead of maintaining this information.

Implementations that expose the namespace axis must provide unique namespace nodes for each element. Each element has an associated set of namespace nodes, one for each distinct namespace prefix that is in scope for the element (including the xml prefix, which is implicitly declared by [Namespaces in XML]) and one for the default namespace if one is in scope for the element. The element is the parent of each of these namespace nodes; however, a namespace node is not a child of its parent element. Elements never share namespace nodes.
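The requirement that elements never share namespace nodes can be illustrated by creating a fresh set of nodes for every element. This is a sketch only: `NamespaceNode` and `namespace_nodes` are hypothetical names, the in-scope namespaces are assumed to arrive as a dictionary from prefix to URI with the empty string as the key for the default namespace, and Python object identity stands in for node identity.

```python
from dataclasses import dataclass
from typing import Dict, List

XML_NAMESPACE = "http://www.w3.org/XML/1998/namespace"


@dataclass(eq=False)  # eq=False keeps each node's identity distinct
class NamespaceNode:
    prefix: str   # "" for the default namespace
    uri: str
    parent: object


def namespace_nodes(element: object, in_scope: Dict[str, str]) -> List[NamespaceNode]:
    # The xml prefix is always in scope, even when it is never declared.
    bindings = {"xml": XML_NAMESPACE, **in_scope}
    return [
        NamespaceNode(prefix, uri, element)
        for prefix, uri in bindings.items()
        # Skip an undeclared default namespace (empty prefix, empty URI).
        if prefix or uri
    ]
```

Calling the function twice with the same bindings yields equal prefixes and URIs but distinct node objects, one set per element.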

6.4.2 Accessors

dm:base-uri

Returns ().

dm:node-kind

Returns "namespace".

dm:node-name

If the implementation preserves information about the prefixes declared, returns an xs:QName with the value of the prefix property in the local-name and an empty namespace name, otherwise returns ().

dm:parent

Returns the value of the parent property.

dm:string-value

Returns the value of the uri property.

dm:typed-value

Returns the dm:string-value of the node as an xdt:untypedAtomic value.

dm:type

Returns ().

dm:children

Returns ().

dm:attributes

Returns ().

dm:namespaces

Returns ().

dm:nilled

Returns ().

6.4.3 Construction from an Infoset

The namespace information items are required.

The following infoset properties are required: [prefix], [namespace name].

Namespace node properties are derived from the infoset as follows:

prefix

The [prefix] property.

uri

The [namespace name] property.

parent

The element to which this Namespace Node applies, if the implementation exposes any mechanism for accessing the dm:parent accessor of Namespace Nodes.

6.4.4 Construction from a PSVI

Construction from a PSVI is identical to construction from the Infoset.

6.5 Processing Instruction Nodes

6.5.1 Overview

Processing instruction nodes encapsulate XML processing instructions. Processing instructions have the following properties:

  • target

  • content

  • base-uri, possibly empty

  • parent, possibly empty

Processing instruction nodes must satisfy the following constraints.

  1. Every processing instruction node must have a unique identity, distinct from all other nodes.

  2. The target must be an NCName.

6.5.2 Accessors

dm:base-uri

Returns the value of the base-uri property if it exists and is not empty. Otherwise, if the processing instruction has a parent, returns the value of the dm:base-uri of its parent; otherwise, returns ().

dm:node-kind

Returns "processing-instruction".

dm:node-name

Returns an xs:QName with the value of the target property in the local-name and an empty namespace name.

dm:parent

Returns the value of the parent property.

dm:string-value

Returns the value of the content property.

dm:typed-value

Returns the dm:string-value of the processing instruction as an xs:string value.

dm:type

Returns ().

dm:children

Returns ().

dm:attributes

Returns ().

dm:namespaces

Returns ().

dm:nilled

Returns ().
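The dm:base-uri accessor for processing instructions, like the one for elements, falls back to the parent when the node's own property is missing or empty. A minimal sketch of that fallback, assuming a hypothetical `Node` class with `base_uri` and `parent` fields and `None` standing in for the empty sequence:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class Node:
    # Hypothetical node shape; only the two fields the accessor needs.
    base_uri: Optional[str] = None
    parent: Optional["Node"] = None


def base_uri(node: Optional[Node]) -> Optional[str]:
    # Walk towards the root until a non-empty base-uri property is found.
    while node is not None:
        if node.base_uri:
            return node.base_uri
        node = node.parent
    return None  # the empty sequence
```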

6.5.3 Processing Instruction Information Items

The processing instruction information items are optional.

Although the data model is able to represent processing instructions, it may be unnecessary or even onerous for some applications to do so. Applications should construct nodes in the data model to represent processing instructions. The decision whether or not to represent processing instructions is considered outside the scope of the data model; consequently, the data model makes no attempt to control or identify whether any or all processing instructions are ignored.

A Processing Instruction Node is constructed for each processing instruction information item that is not ignored.

The following infoset properties are required: [target], [content], [base URI], and [parent].

Processing instruction node properties are derived from the infoset as follows:

target

The value of the [target] property.

content

The value of the [content] property.

base-uri

The value of the [base URI] property.

parent

The node corresponding to the value of the [parent] property.

There are no processing instruction nodes for processing instructions that are children of a document type declaration information item.

6.5.4 Construction from a PSVI

Construction from a PSVI is identical to construction from the Infoset.

6.6 Comment Nodes

6.6.1 Overview

Comment nodes encapsulate XML comments. Comments have the following properties:

  • content

  • parent

Comment nodes must satisfy the following constraints.

  1. Every comment node must have a unique identity, distinct from all other nodes.

  2. The string "--" must not occur within the content.

6.6.2 Accessors

dm:base-uri

If the comment has a parent, returns the value of the dm:base-uri of its parent; otherwise, returns ().

dm:node-kind

Returns "comment".

dm:node-name

Returns ().

dm:parent

Returns the value of the parent property.

dm:string-value

Returns the value of the content property.

dm:typed-value

Returns the dm:string-value of the comment as an xs:string.

dm:type

Returns ().

dm:children

Returns ().

dm:attributes

Returns ().

dm:namespaces

Returns ().

dm:nilled

Returns ().

6.6.3 Comment Information Items

The comment information items are optional.

Although the data model is able to represent comments, it may be unnecessary or even onerous for some applications to do so. Applications should construct nodes in the data model to represent comments. The decision whether or not to represent comments is considered outside the scope of the data model; consequently, the data model makes no attempt to control or identify whether any or all comments are ignored.

A Comment Node is constructed for each comment information item that is not ignored.

The following infoset properties are required: [content] and [parent].

Comment node properties are derived from the infoset as follows:

content

The value of the [content] property.

parent

The node corresponding to the value of the [parent] property.

There are no comment nodes for comments that are children of a document type declaration information item.

6.6.4 Construction from a PSVI

Construction from a PSVI is identical to construction from the Infoset.

6.7 Text Nodes

6.7.1 Overview

Text nodes encapsulate XML character content. Text has the following properties:

  • content

  • parent

Text nodes must satisfy the following constraint:

  1. A text node must not contain the empty string as its content.

In addition, document and element nodes impose the constraint that two text nodes can never occur as adjacent siblings.

6.7.2 Accessors

dm:base-uri

If the text node has a parent, returns the value of the dm:base-uri of its parent; otherwise, returns ().

dm:node-kind

Returns "text".

dm:node-name

Returns ().

dm:parent

Returns the value of the parent property.

dm:string-value

Returns the value of the content property.

dm:typed-value

Returns the dm:string-value of the node as an xdt:untypedAtomic value.

dm:type

Returns xdt:untypedAtomic.

dm:children

Returns ().

dm:attributes

Returns ().

dm:namespaces

Returns ().

dm:nilled

Returns ().

6.7.3 Construction from an Infoset

The character information items are required. A Text Node is constructed for each maximal sequence of character information items.

The following infoset properties are required: [character code] and [parent].

The following infoset properties are optional: [element content white space].

A sequence of character information items is maximal if it satisfies the following constraints:

  1. All of the information items in the sequence have the same parent.

  2. The sequence consists of adjacent character information items uninterrupted by other types of information item.

  3. No other such sequence exists that contains any of the same character information items and is longer.

Text node properties are derived from the infoset as follows:

content

A string comprised of characters that correspond to the [character code] properties of each of the character information items. Applications may ignore character information items for which [element content white space] exists and is "true".

Applications may construct text nodes in the data model to represent insignificant white space. This decision is considered outside the scope of the data model; consequently, the data model makes no attempt to control or identify whether any or all insignificant white space is ignored.

If insignificant white space is not ignored, it is treated exactly like any other character information item. After the data model has been constructed, there is no accessor which directly identifies insignificant white space. (Applications with access to type information may be able to determine if white space is significant or not.)

Regardless of insignificant white space handling, the content of the text node is not necessarily W3C normalized as described in the [Character Model]. It is the responsibility of data producers to provide appropriately normalized text, and the responsibility of programmers to make sure that operations do not de-normalize text.

parent

The node corresponding to the value of the [parent] property.
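The grouping of character information items into maximal sequences, together with the optional dropping of element content white space, can be sketched as a single pass over the [children] of one parent. The item shape `(kind, payload, element_content_white_space)` and the function name are hypothetical, and nodes are represented as plain `(kind, value)` tuples.

```python
from typing import List, Tuple

# Hypothetical information item: (kind, payload, element content white space flag)
Item = Tuple[str, str, bool]


def build_children(items: List[Item], ignore_ws: bool = False) -> List[Tuple[str, str]]:
    result: List[Tuple[str, str]] = []
    buffer: List[str] = []

    def flush() -> None:
        # Emit one text node per maximal run; never emit an empty text node.
        if buffer:
            result.append(("text", "".join(buffer)))
            buffer.clear()

    for kind, payload, is_element_content_ws in items:
        if kind == "character":
            if ignore_ws and is_element_content_ws:
                continue  # the application chose to ignore this item
            buffer.append(payload)
        else:
            flush()  # any other item kind ends the current run
            result.append((kind, payload))
    flush()
    return result
```

Because a run is flushed only when another kind of item intervenes, the result never contains two adjacent text nodes or a text node with empty content.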

6.7.4 Construction from a PSVI

Construction from a PSVI is identical to construction from the Infoset.

7 Atomic Values

[Definition: An atomic value is a value in the value space of an atomic type labeled with that atomic type.] [Definition: An atomic type is a primitive simple type or a type derived by restriction from another atomic simple type. Types derived by list or union are not atomic.]

There are 21 primitive atomic types, the 19 defined in Section 3.2 Primitive datatypes of [Schema Part 2] and xdt:untypedAtomic and xdt:anyAtomicType. The “xdt:” types, including the atomic types xdt:dayTimeDuration and xdt:yearMonthDuration, derived by restriction from xs:duration, are described below.

The value space of the atomic values is the union of the value spaces of the atomic types. This value space clearly includes those atomic values whose type is primitive, but it also includes those whose type is derived by restriction, as derivation by restriction always limits the value space.

Implementors may extend the set of types available. The value space of those types, as well as the behavior of those types when used in expressions, is implementation defined.

An XML Schema simple type definition has a [variety] which may be atomic, union, or list.

An atomic value can be constructed from the value's lexical representation. Given a string and an atomic type, the atomic value is constructed in such a way as to be consistent with validation. If the string does not represent a valid value of the type, an error is raised. When xdt:untypedAtomic is specified as the type, no validation takes place. The details of the construction are described in Section 5 Constructor Functions and the related Section 17 Casting of [Functions and Operators].

A string value can be constructed from an atomic value. Such a value is constructed by converting the atomic value to its string representation as described in Section 17 Casting of [Functions and Operators]. Using the canonical lexical representation for atomic values may not always be compatible with XPath 1.0. These and other backwards incompatibilities are described in Section H Backwards Compatibility with XPath 1.0 (Non-Normative) of [XPath 2.0].
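The two directions described above, from a lexical representation to an atomic value and back to a string, can be sketched for a handful of types. This is not the construction defined in [Functions and Operators]; it is a simplified illustration in which an atomic value is a hypothetical `(type name, value)` tuple, only xdt:untypedAtomic, xs:string, xs:integer, and xs:boolean are handled, and the function names are invented for the example.

```python
import re
from typing import Tuple

XML_WHITESPACE = " \t\r\n"

AtomicValue = Tuple[str, object]  # hypothetical: (type label, value)


def construct_atomic(lexical: str, type_name: str) -> AtomicValue:
    if type_name in ("xdt:untypedAtomic", "xs:string"):
        # No validation for untyped atomic; every string is a valid xs:string.
        return (type_name, lexical)
    text = lexical.strip(XML_WHITESPACE)  # white space is collapsed first
    if type_name == "xs:integer":
        if not re.fullmatch(r"[+-]?[0-9]+", text):
            raise ValueError(f"not a valid xs:integer: {lexical!r}")
        return (type_name, int(text))
    if type_name == "xs:boolean":
        if text in ("true", "1"):
            return (type_name, True)
        if text in ("false", "0"):
            return (type_name, False)
        raise ValueError(f"not a valid xs:boolean: {lexical!r}")
    raise NotImplementedError(type_name)


def to_string(atomic: AtomicValue) -> str:
    type_name, value = atomic
    if type_name == "xs:boolean":
        return "true" if value else "false"
    return str(value)
```

Note that the round trip is not the identity: the lexical form ` 042 ` becomes the integer 42, whose string form is `42`.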

7.1 New Datatypes

7.1.1 xdt:untypedAny

The abstract complex type xdt:untypedAny is a subtype of xs:anyType and serves as a special type annotation to indicate elements that have not been validated by an XML Schema or a DTD or that have received a type annotation of xs:anyType in the PSVI.

This datatype cannot be used in [Schema Part 1] element declarations, nor can it be used as a base for user-defined complex types. It can be used in the [XPath 2.0] SequenceType production (for example in a function signature) to specify the required type of an element. This datatype resides in the namespace http://www.w3.org/2003/11/xpath-datatypes.

7.1.2 xdt:untypedAtomic

The abstract atomic type xdt:untypedAtomic is a subtype of xdt:anyAtomicType and serves as a special type annotation to indicate elements or attributes that have not been validated by an XML Schema or DTD, or that have received a type annotation of xs:anySimpleType in the PSVI. It is also used as the type of the typed value of such nodes, and of text nodes.

This datatype cannot be used in [Schema Part 1] element or attribute declarations, nor can it be used as a base for user-defined atomic types, an item type for user-defined list types, or a member type of user-defined union types. It can be used in the [XPath 2.0] SequenceType production to define a required type (for example in a function signature), to indicate that only an untyped atomic value is acceptable. This datatype resides in the namespace http://www.w3.org/2003/11/xpath-datatypes.

7.1.3 xdt:anyAtomicType

The abstract simple type xdt:anyAtomicType is a subtype of xs:anySimpleType and is the base type for all the primitive atomic types described in [Schema Part 2], and for xdt:untypedAtomic.

This datatype cannot be used in [Schema Part 1] element or attribute declarations, nor can it be used as a base for user-defined atomic types, an item type for user-defined list types, or a member type of user-defined union types. It can be used in the [XPath 2.0] SequenceType production to define a required type (for example in a function signature), to indicate that any atomic value is acceptable. This datatype resides in the namespace http://www.w3.org/2003/11/xpath-datatypes.

7.1.4 xdt:dayTimeDuration

The datatype xdt:dayTimeDuration is a subtype of xs:duration whose lexical representation contains only day, hour, minute, and second components.

This datatype resides in the namespace http://www.w3.org/2003/11/xpath-datatypes.

7.1.5 xdt:yearMonthDuration

The datatype xdt:yearMonthDuration is a subtype of xs:duration whose lexical representation contains only year and month components.

This datatype resides in the namespace http://www.w3.org/2003/11/xpath-datatypes.
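The two duration subtypes restrict which components may appear in the lexical representation. The regular expressions below are a simplified sketch of that restriction, assuming the usual xs:duration lexical shape (`P`, optional date components, then `T` and time components); they are not taken from this document and are not a normative definition. The function names are hypothetical.

```python
import re

# Day, hour, minute, and second components only; at least one must be present,
# and a "T" must be followed by at least one time component.
DAY_TIME = re.compile(
    r"-?P(?=[0-9]|T)([0-9]+D)?(T(?=[0-9])([0-9]+H)?([0-9]+M)?([0-9]+(\.[0-9]+)?S)?)?"
)

# Year and month components only; at least one must be present.
YEAR_MONTH = re.compile(r"-?P(?=[0-9])([0-9]+Y)?([0-9]+M)?")


def is_day_time_duration(lexical: str) -> bool:
    return DAY_TIME.fullmatch(lexical) is not None


def is_year_month_duration(lexical: str) -> bool:
    return YEAR_MONTH.fullmatch(lexical) is not None
```

Under these patterns a value such as `P1Y2M` is accepted only as a year-month duration, and `P3DT4H30M` only as a day-time duration.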

8 Sequences

[Definition: A sequence is an ordered collection of zero or more items.] [Definition: An item may be a node or an atomic value], i.e. a sequence may contain nodes, atomic values, or any mixture of nodes and atomic values. When a node is added to a sequence its identity remains the same. Consequently a node may occur in more than one sequence and a sequence may contain duplicate items.

An important characteristic of the data model is that there is no distinction between an item (a node or an atomic value) and a singleton sequence containing that item. An item is equivalent to a singleton sequence containing that item and vice versa.

Sequences never contain other sequences; if sequences are combined, the result is always a “flattened” sequence. In other words, appending “(d e)” to “(a b c)” produces a sequence of length 5: “(a b c d e)”. It does not produce a sequence of length 4: “(a b c (d e))”; such a nested sequence never occurs.
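
The flattening rule, together with the equivalence of an item and a singleton sequence, can be sketched in Python as follows. This is a non-normative illustration: Python lists stand in for sequences, and the names to_sequence and concat are hypothetical.

```python
from typing import Any, List


def to_sequence(value: Any) -> List[Any]:
    """Treat a bare item as a singleton sequence; a list is already a sequence."""
    return list(value) if isinstance(value, list) else [value]


def concat(*operands: Any) -> List[Any]:
    """Combine operands into one flat sequence; nesting is never preserved."""
    result: List[Any] = []
    for operand in operands:
        for member in to_sequence(operand):
            if isinstance(member, list):
                # A nested list is flattened recursively instead of being kept.
                result.extend(concat(member))
            else:
                result.append(member)
    return result
```

With these definitions, concat(["a", "b", "c"], ["d", "e"]) has length 5, and concat("a") yields the same sequence as concat(["a"]).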

Note:

Sequences replace node-sets from XPath 1.0. In XPath 1.0, node-sets do not contain duplicates. In generalizing node-sets to sequences in XPath 2.0, duplicate removal is provided by functions on node sequences.
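
As a non-normative sketch of what duplicate removal on a node sequence involves, the following Python function keeps the first occurrence of each node, where "the same node" is decided by identity rather than by content. The Node class and the function name are hypothetical, and the sketch deliberately does not reorder the result into document order.

```python
class Node:
    """Stand-in for a data model node; identity is Python object identity."""

    def __init__(self, name):
        self.name = name


def distinct_nodes(sequence):
    """Return the nodes of the sequence with identity-duplicates removed."""
    seen = set()
    result = []
    for node in sequence:
        if id(node) not in seen:  # compare by identity, not by name or content
            seen.add(id(node))
            result.append(node)
    return result
```

Two distinct nodes with the same name both survive, whereas a node that occurs twice in the input appears once in the result.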

A collection of documents is represented in the data model as a sequence of document nodes.

Equality comparison of sequences is performed by comparing the items of the sequences. Whereas you can compare the identity of two nodes, you cannot compare the identity of two sequences; you can only determine whether or not they contain the same members.
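
The distinction can be sketched as follows. This Python fragment is non-normative and rests on two assumptions that the text above does not spell out: members are compared position by position, and nodes are compared by identity while atomic values are compared by value. The names Node and same_members are hypothetical.

```python
class Node:
    """Stand-in for a data model node; identity is Python object identity."""

    def __init__(self, name):
        self.name = name


def same_members(left, right):
    """Compare two sequences member by member; the sequences have no identity."""
    if len(left) != len(right):
        return False
    for a, b in zip(left, right):
        if isinstance(a, Node) or isinstance(b, Node):
            if a is not b:  # nodes: same node, not merely a similar one
                return False
        elif a != b:  # atomic values: compared by value
            return False
    return True
```

Two separately constructed sequences that hold the same node and the same atomic values compare as having the same members, even though they are distinct Python objects.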

A XML Information Set Conformance

This specification conforms to the XML Information Set [Infoset]. The following information items must be exposed by the infoset producer to construct a data model:

Other information items and properties made available by the Infoset processor are ignored. In addition to the properties above, the following properties from the PSV Infoset are required:

B References

B.1 Normative References

Infoset
XML Information Set, John Cowan and Richard Tobin, Editors. World Wide Web Consortium, 24 Oct 2001. This version is http://www.w3.org/TR/2001/REC-xml-infoset-20011024/. The latest version is available at http://www.w3.org/TR/xml-infoset.
Namespaces in XML
Namespaces in XML, Tim Bray, Dave Hollander, and Andrew Layman, Editors. World Wide Web Consortium, 14 Jan 1999. This version is http://www.w3.org/TR/1999/REC-xml-names-19990114. The latest version is available at http://www.w3.org/TR/REC-xml-names.
Functions and Operators
XQuery 1.0 and XPath 2.0 Functions and Operators, Ashok Malhotra, Jim Melton, and Norman Walsh, Editors. World Wide Web Consortium, 14 Nov 2003. This version is http://www.w3.org/TR/2003/WD-xpath-functions-20030502/. The latest version is available at http://www.w3.org/TR/xpath-functions/.
Schema Part 1
XML Schema Part 1: Structures, Henry S. Thompson, David Beech, Murray Maloney, and Noah Mendelsohn, Editors. World Wide Web Consortium, 02 May 2001. This version is http://www.w3.org/TR/2001/REC-xmlschema-1-20010502/. The latest version is available at http://www.w3.org/TR/xmlschema-1/.
Schema Part 2
XML Schema Part 2: Datatypes, Paul V. Biron and Ashok Malhotra, Editors. World Wide Web Consortium, 02 May 2001. This version is http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/. The latest version is available at http://www.w3.org/TR/xmlschema-2/.
Serialization
XSLT 2.0 and XQuery 1.0 Serialization, Michael Kay and Norman Walsh, Editors. World Wide Web Consortium, 02 May 2003. This version is http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20030502/. The latest version is available at http://www.w3.org/TR/xslt-xquery-serialization/.
RFC 2119
Key words for use in RFCs to Indicate Requirement Levels, S. Bradner. Network Working Group, IETF, Mar 1997.
Character Model
Character Model for the World Wide Web 1.0, Tex Texin, Martin J. Dürst, François Yergeau, et al., Editors. World Wide Web Consortium, 22 Aug 2003. This version is http://www.w3.org/TR/2003/WD-charmod-20030822/. The latest version is available at http://www.w3.org/TR/charmod/.

B.2 Other References

XML Query Data Model
XML Query Data Model, Mary Fernández and Jonathan Robie, Editors. World Wide Web Consortium, 15 Feb 2001.
XML Base
XML Base, Jonathan Marsh, Editor. World Wide Web Consortium, 27 Jun 2001. This version is http://www.w3.org/TR/2001/REC-xmlbase-20010627/. The latest version is available at http://www.w3.org/TR/xmlbase/.
XPath 1.0
XML Path Language (XPath) Version 1.0, James Clark and Steven DeRose, Editors. World Wide Web Consortium, 16 Nov 1999. This version is http://www.w3.org/TR/1999/REC-xpath-19991116. The latest version is available at http://www.w3.org/TR/xpath.
XPath 2.0 Requirements
XPath Requirements Version 2.0, Mark Scardina and Mary Fernández, Editors. World Wide Web Consortium, 22 Aug 2003. This version is http://www.w3.org/TR/2003/WD-xpath20req-20030822. The latest version is available at http://www.w3.org/TR/xpath20req.
XPath 2.0
XML Path Language (XPath) 2.0, Mary F. Fernández, Michael Kay, Jonathan Robie, et al., Editors. World Wide Web Consortium, 22 Aug 2003. This version is http://www.w3.org/TR/2003/WD-xpath20-20030822. The latest version is available at http://www.w3.org/TR/xpath20.
XSLT 2.0
XSL Transformations (XSLT) Version 2.0, Michael Kay, Editor. World Wide Web Consortium, 14 Nov 2003. This version is http://www.w3.org/TR/2003/WD-xslt20-20030502/. The latest version is available at http://www.w3.org/TR/xslt20.
Formal Semantics
XQuery 1.0 and XPath 2.0 Formal Semantics, Ashok Malhotra, Kristoffer Rose, Michael Rys, et al., Editors. World Wide Web Consortium, 14 Nov 2003. This version is http://www.w3.org/TR/2003/WD-xquery-semantics-20030822/. The latest version is available at http://www.w3.org/TR/xquery-semantics/.
XML Query Working Group
XML Query Working Group, World Wide Web Consortium. Home page: http://www.w3.org/XML/Query
XSL Working Group
XSL Working Group, World Wide Web Consortium. Home page: http://www.w3.org/Style/XSL/
XQuery
XQuery 1.0: An XML Query Language, Daniela Florescu, Jonathan Robie, Jérôme Siméon, et al., Editors. World Wide Web Consortium, 14 Nov 2003. This version is http://www.w3.org/TR/2003/WD-xquery-20030822/. The latest version is available at http://www.w3.org/TR/xquery.
XML Query Requirements
XML Query (XQuery) Requirements, Don Chamberlin, Peter Fankhauser, Massimo Marchiori, and Jonathan Robie, Editors. World Wide Web Consortium, 14 Nov 2003. This version is http://www.w3.org/TR/2003/WD-xquery-requirements-20030627. The latest version is available at http://www.w3.org/TR/xquery-requirements.

C Glossary (Non-Normative)

anonymous type name

An anonymous type name is an implementation-defined, unique type name provided by the processor for every anonymous type declared in an imported schema.

atomic type

An atomic type is a primitive simple type or a type derived by restriction from another atomic simple type. Types derived by list or union are not atomic.

atomic value

An atomic value is a value in the value space of an atomic type labeled with that atomic type.

document

A tree whose root node is a document node is referred to as a document.

document order

A document order is defined among all the nodes used during a given query or transformation. Document order is a total ordering, although the relative order of some nodes is implementation-dependent. Informally, document order is the order returned by an in-order, depth-first traversal of the data model.
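
The informal traversal can be sketched in Python as follows. The sketch is non-normative and assumes the intended reading is that a node is visited before its children, which are then visited left to right; the dictionary-based tree and the function name are hypothetical, and the placement of attribute and namespace nodes is omitted.

```python
def document_order(node):
    """Yield node labels depth-first: a node first, then its children in order."""
    yield node["label"]
    for child in node.get("children", []):
        yield from document_order(child)


# A small hypothetical tree: a document node with one element and its content.
tree = {
    "label": "doc",
    "children": [
        {
            "label": "root",
            "children": [
                {"label": "first", "children": [{"label": "first-text"}]},
                {"label": "second"},
            ],
        }
    ],
}
```

Calling list(document_order(tree)) lists each node before any of its descendants and each subtree before its following siblings.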

expanded-QName

An expanded-QName is a pair of values consisting of a namespace URI and a local name. They belong to the value space of the XML Schema type xs:QName. When this document refers to xs:QName we always mean the value space, i.e. a namespace URI, local name pair (and not the lexical space referring to constructs of the form prefix:local-name).
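
The difference between the value space and the lexical space can be sketched as follows. This Python fragment is non-normative; the names ExpandedQName and expand, and the use_default parameter, are hypothetical, and the treatment of unprefixed names is an assumption made for the illustration.

```python
from typing import Dict, NamedTuple


class ExpandedQName(NamedTuple):
    """A value in the xs:QName value space: namespace URI plus local name."""

    namespace_uri: str
    local_name: str


def expand(lexical: str, in_scope: Dict[str, str], use_default: bool = True) -> ExpandedQName:
    """Map a lexical prefix:local-name form to an expanded-QName.

    in_scope maps prefixes to namespace URIs; the key "" is the default
    namespace. An undeclared prefix raises KeyError.
    """
    prefix, separator, local = lexical.partition(":")
    if not separator:
        # Unprefixed name: assume the default namespace applies only on request.
        uri = in_scope.get("", "") if use_default else ""
        return ExpandedQName(uri, lexical)
    return ExpandedQName(in_scope[prefix], local)
```

Two lexical forms that use different prefixes bound to the same namespace URI expand to equal values, which is why the prefix is not part of the expanded-QName.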

fragment

A tree whose root node is some other kind of node is referred to as a fragment.

implementation-defined

In this specification, the term implementation-defined refers to a feature where the implementation is allowed some flexibility, and where the choices made by the implementation should be described in the vendor's documentation.

implementation-dependent

The term implementation-dependent refers to a feature where the behavior may vary from one implementation to another, and where the vendor is not expected to provide a full specification of the behavior.

incompletely validated

An incompletely validated document is an XML document that has a corresponding schema but whose schema-validity assessment has resulted in one or more element or attribute information items being assigned values other than 'valid' for the [validity] property in the PSVI.

item

An item may be a node or an atomic value.

Node

The category of Node values contains seven distinct kinds of nodes: document, element, attribute, text, namespace, processing instruction, and comment.

sequence

A sequence is an ordered collection of zero or more items.

D Example (Non-Normative)

We use the following XML document to illustrate the information contained in a data model:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="dm-example.xsl"?>
<catalog xmlns="http://www.example.com/catalog"
         xmlns:html="http://www.w3.org/1999/xhtml"
         xmlns:xlink="http://www.w3.org/1999/xlink"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://www.example.com/catalog
                             dm-example.xsd"
         xml:lang="en"
         version="0.1">

<!-- This example is for data model illustration only.
     It does not demonstrate good schema design. -->

<tshirt code="T1534017" label=" Staind : Been Awhile "
        xlink:href="http://example.com/0,,1655091,00.html"
        sizes="M L XL">
  <title> Staind: Been Awhile Tee Black (1-sided) </title>
  <description>
    <html:p>
      Lyrics from the hit song 'It's Been Awhile'
      are shown in white, beneath the large
      'Flock &amp; Weld' Staind logo.
    </html:p>
  </description>
  <price> 25.00 </price>
</tshirt>

<album code="A1481344" label=" Staind : Its Been A While "
       formats="CD">
  <title> It's Been A While </title>
  <description xsi:nil="true" />
  <price currency="USD"> 10.99 </price>
  <artist> Staind </artist>
</album>

</catalog>

The document is associated with the URI "http://www.example.com/catalog.xml", and is valid with respect to the following XML schema:

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:cat="http://www.example.com/catalog"
           xmlns:xlink="http://www.w3.org/1999/xlink"
           targetNamespace="http://www.example.com/catalog"
           elementFormDefault="qualified">

<xs:import namespace="http://www.w3.org/XML/1998/namespace"
           schemaLocation="http://www.w3.org/2001/xml.xsd" />

<xs:import namespace="http://www.w3.org/1999/xlink"
           schemaLocation="http://www.cs.rpi.edu/~puninj/XGMML/xlinks-2001.xsd" />

<xs:element name="catalog">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="cat:_item" maxOccurs="unbounded" />
    </xs:sequence>
    <xs:attribute name="version" type="xs:string" fixed="0.1" use="required" />
    <xs:attribute ref="xml:base" />
    <xs:attribute ref="xml:lang" />
  </xs:complexType>
</xs:element>

<xs:element name="_item" type="cat:itemType" abstract="true" />

<xs:complexType name="itemType">
  <xs:sequence>
    <xs:element name="title" type="xs:token" />
    <xs:element name="description" type="cat:description" nillable="true" />
    <xs:element name="price" type="cat:price" maxOccurs="unbounded" />
  </xs:sequence>
  <xs:attribute name="label" type="xs:token" />
  <xs:attribute name="code" type="xs:ID" use="required" />
  <xs:attributeGroup ref="xlink:simpleLink" />
</xs:complexType>

<xs:element name="tshirt" type="cat:tshirtType" substitutionGroup="cat:_item" />

<xs:complexType name="tshirtType">
  <xs:complexContent>
    <xs:extension base="cat:itemType">
      <xs:attribute name="sizes" type="cat:clothesSizes" use="required" />
    </xs:extension>
  </xs:complexContent>
  <xs:attribute ref="xml:lang" />
</xs:complexType>

<xs:simpleType name="clothesSizes">
  <xs:union memberTypes="cat:sizeList">
    <xs:simpleType>
      <xs:restriction base="xs:token">
        <xs:enumeration value="oneSize" />
      </xs:restriction>
    </xs:simpleType>
  </xs:union>
</xs:simpleType>

<xs:simpleType name="sizeList">
  <xs:restriction>
    <xs:simpleType>
      <xs:list itemType="cat:clothesSize" />
    </xs:simpleType>
    <xs:minLength value="1" />
  </xs:restriction>
</xs:simpleType>

<xs:simpleType name="clothesSize">
  <xs:union memberTypes="cat:numberedSize cat:categorySize" />
</xs:simpleType>

<xs:simpleType name="numberedSize">
  <xs:restriction base="xs:integer">
    <xs:enumeration value="4" />
    <xs:enumeration value="6" />
    <xs:enumeration value="8" />
    <xs:enumeration value="10" />
    <xs:enumeration value="12" />
    <xs:enumeration value="14" />
    <xs:enumeration value="16" />
    <xs:enumeration value="18" />
    <xs:enumeration value="20" />
    <xs:enumeration value="22" />
  </xs:restriction>
</xs:simpleType>

<xs:simpleType name="categorySize">
  <xs:restriction base="xs:token">
    <xs:enumeration value="XS" />
    <xs:enumeration value="S" />
    <xs:enumeration value="M" />
    <xs:enumeration value="L" />
    <xs:enumeration value="XL" />
    <xs:enumeration value="XXL" />
  </xs:restriction>
</xs:simpleType>

<xs:element name="album" type="cat:albumType" substitutionGroup="cat:_item" />

<xs:complexType name="albumType">
  <xs:complexContent>
    <xs:extension base="cat:itemType">
      <xs:sequence>
        <xs:element name="artist" type="xs:string" />
      </xs:sequence>
      <xs:attribute name="formats" type="cat:formatsType" use="required" />
    </xs:extension>
  </xs:complexContent>
  <xs:attribute ref="xml:lang" />
</xs:complexType>

<xs:simpleType name="formatsType">
  <xs:list itemType="cat:formatType" />
</xs:simpleType>

<xs:simpleType name="formatType">
  <xs:restriction base="xs:token">
    <xs:enumeration value="CD" />
    <xs:enumeration value="MiniDisc" />
    <xs:enumeration value="tape" />
    <xs:enumeration value="vinyl" />
  </xs:restriction>
</xs:simpleType>

<xs:complexType name="description" mixed="true">
  <xs:sequence>
    <xs:any namespace="http://www.w3.org/1999/xhtml" processContents="lax"
            minOccurs="0" maxOccurs="unbounded" />
  </xs:sequence>
  <xs:attribute ref="xml:lang" />
</xs:complexType>

<xs:complexType name="price">
  <xs:simpleContent>
    <xs:extension base="cat:monetaryAmount">
      <xs:attribute name="currency" type="cat:currencyType" default="USD" />
    </xs:extension>
  </xs:simpleContent>
</xs:complexType>

<xs:simpleType name="currencyType">
  <xs:restriction base="xs:token">
    <xs:pattern value="[A-Z]{3}" />
  </xs:restriction>
</xs:simpleType>

<xs:simpleType name="monetaryAmount">
  <xs:restriction base="xs:decimal">
    <xs:fractionDigits value="2" />
    <xs:pattern value="\d+\.\d{2}" />
  </xs:restriction>
</xs:simpleType>

</xs:schema>
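
To illustrate how the cat:monetaryAmount facets constrain the content of the price elements, the following non-normative Python sketch collapses whitespace and then applies the pattern from the schema above. The function name is hypothetical, and the use of Python's re and decimal modules in place of a schema processor is an assumption of the sketch.

```python
import re
from decimal import Decimal

# The pattern facet from cat:monetaryAmount; schema patterns match the whole value.
MONETARY_PATTERN = re.compile(r"\d+\.\d{2}")


def parse_monetary_amount(text: str) -> Decimal:
    """Collapse whitespace, check the pattern facet, and return the decimal value."""
    collapsed = " ".join(text.split())
    if MONETARY_PATTERN.fullmatch(collapsed) is None:
        raise ValueError(f"not a valid monetary amount: {text!r}")
    return Decimal(collapsed)
```

The content " 25.00 " of the first price element is accepted once the surrounding whitespace is collapsed, whereas a value such as "10.9" is rejected because it lacks the second fraction digit.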

This example exposes the data model for a document that has an associated schema and has been validated successfully against it. In general, an XML Schema is not required, that is, the data model can represent a schemaless, well-formed XML document with the rules described in 2.5 Types.

The XML document is represented by the nodes described below. The value D1 represents a document node; the values E1, E2, etc. represent element nodes; the values A1, A2, etc. represent attribute nodes; the values N1, N2, etc. represent namespace nodes; the values P1, P2, etc. represent processing-instruction nodes; the value C1 represents a comment node; the values T1, T2, etc. represent text nodes.

For brevity:

// Document node D1
dm:base-uri(D1) xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(D1) "document"
dm:string-value(D1) = "  Staind:  Been  Awhile  Tee  Black  (1-sided)  \n            Lyrics  from  the  hit  song  'It's  Been  Awhile'\n            are  shown  in  white,  beneath  the  large\n            'Flock  &  Weld'  Staind  logo.\n          25.00    It's  Been  A  While    10.99    Staind  "
dm:children(D1) ([E1])
 
// Namespace node N1
dm:node-kind(N1) "namespace"
dm:node-name(N1) xs:QName("", "xml")
dm:string-value(N1) = "http://www.w3.org/XML/1998/namespace"
 
// Namespace node N2
dm:node-kind(N2) "namespace"
dm:node-name(N2) ()
dm:string-value(N2) = "http://www.example.com/catalog"
 
// Namespace node N3
dm:node-kind(N3) "namespace"
dm:node-name(N3) xs:QName("", "html")
dm:string-value(N3) = "http://www.w3.org/1999/xhtml"
 
// Namespace node N4
dm:node-kind(N4) "namespace"
dm:node-name(N4) xs:QName("", "xlink")
dm:string-value(N4) = "http://www.w3.org/1999/xlink"
 
// Namespace node N5
dm:node-kind(N5) "namespace"
dm:node-name(N5) xs:QName("", "xsi")
dm:string-value(N5) = "http://www.w3.org/2001/XMLSchema-instance"
 
// Processing Instruction node P1
dm:base-uri(P1) xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(P1) "processing-instruction"
dm:node-name(P1) xs:QName("", "xml-stylesheet")
dm:string-value(P1) = "type="text/xsl"  href="dm-example.xsl""
dm:parent(P1) ([D1])
 
// Element node E1
dm:base-uri(E1) xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E1) "element"
dm:node-name(E1) xs:QName("http://www.example.com/catalog", "catalog")
dm:string-value(E1) = "  Staind:  Been  Awhile  Tee  Black  (1-sided)  \n            Lyrics  from  the  hit  song  'It's  Been  Awhile'\n            are  shown  in  white,  beneath  the  large\n            'Flock  &  Weld'  Staind  logo.\n          25.00    It's  Been  A  While    10.99    Staind  "
dm:typed-value(E1) fn:error()
// xs:anyType because of the anonymous type definition
dm:type(E1) xs:anyType
dm:parent(E1) ([D1])
dm:children(E1) ([E2], [E7])
dm:attributes(E1) ([A1], [A2], [A3])
dm:namespaces(E1) ([N1], [N2], [N3], [N4], [N5])
 
// Attribute node A1
dm:node-kind(A1) "attribute"
dm:node-name(A1) xs:QName("http://www.w3.org/2001/XMLSchema-instance", "xsi:schemaLocation")
dm:string-value(A1) = "http://www.example.com/catalog                                                            dm-example.xsd"
dm:typed-value(A1) (xs:anyURI("http://www.example.com/catalog"), xs:anyURI("dm-example.xsd"))
dm:type(A1) xs:anySimpleType
dm:parent(A1) ([E1])
 
// Attribute node A2
dm:node-kind(A2) "attribute"
dm:node-name(A2) xs:QName("http://www.w3.org/XML/1998/namespace", "xml:lang")
dm:string-value(A2) = "en"
dm:typed-value(A2) "en"
dm:type(A2) xs:NMTOKEN
dm:parent(A2) ([E1])
 
// Attribute node A3
dm:node-kind(A3) "attribute"
dm:node-name(A3) xs:QName("", "version")
dm:string-value(A3) = "0.1"
dm:typed-value(A3) "0.1"
dm:type(A3) xs:string
dm:parent(A3) ([E1])
 
// Comment node C1
dm:base-uri(C1) xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(C1) "comment"
dm:string-value(C1) = "  This  example  is  for  data  model  illustration  only.\n          It  does  not  demonstrate  good  schema  design.  "
dm:typed-value(C1)
dm:parent(C1) ([E1])
 
// Element node E2
dm:base-uri(E2) xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E2) "element"
dm:node-name(E2) xs:QName("http://www.example.com/catalog", "tshirt")
dm:string-value(E2) = "  Staind:  Been  Awhile  Tee  Black  (1-sided)  \n            Lyrics  from  the  hit  song  'It's  Been  Awhile'\n            are  shown  in  white,  beneath  the  large\n            'Flock  &  Weld'  Staind  logo.\n          25.00  "
dm:typed-value(E2) fn:error()
dm:type(E2) cat:tshirtType
dm:parent(E2) ([E1])
dm:children(E2) ([E3], [E4], [E6])
dm:attributes(E2) ([A4], [A5], [A6], [A7])
dm:namespaces(E2) ([N1], [N2], [N3], [N4], [N5])
 
// Attribute node A4
dm:node-kind(A4) "attribute"
dm:node-name(A4) xs:QName("", "code")
dm:string-value(A4) = "T1534017"
dm:typed-value(A4) xs:ID("T1534017")
dm:type(A4) xs:ID
dm:parent(A4) ([E2])
 
// Attribute node A5
dm:node-kind(A5) "attribute"
dm:node-name(A5) xs:QName("", "label")
dm:string-value(A5) = "Staind  :  Been  Awhile"
dm:typed-value(A5) xs:token("Staind : Been Awhile")
dm:type(A5) xs:token
dm:parent(A5) ([E2])
 
// Attribute node A6
dm:node-kind(A6) "attribute"
dm:node-name(A6) xs:QName("http://www.w3.org/1999/xlink", "xlink:href")
dm:string-value(A6) = "http://example.com/0,,1655091,00.html"
dm:typed-value(A6) xs:anyURI("http://example.com/0,,1655091,00.html")
dm:type(A6) xs:anyURI
dm:parent(A6) ([E2])
 
// Attribute node A7
dm:node-kind(A7) "attribute"
dm:node-name(A7) xs:QName("", "sizes")
dm:string-value(A7) = "M  L  XL"
dm:typed-value(A7) (xs:anySimpleType("M"), xs:anySimpleType("L"), xs:anySimpleType("XL"))
dm:type(A7) cat:sizeList
dm:parent(A7) ([E2])
 
// Element node E3
dm:base-uri(E3) xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E3) "element"
dm:node-name(E3) xs:QName("http://www.example.com/catalog", "title")
dm:string-value(E3) = "Staind:  Been  Awhile  Tee  Black  (1-sided)"
dm:typed-value(E3) xs:token("Staind: Been Awhile Tee Black (1-sided)")
dm:type(E3) xs:token
dm:parent(E3) ([E2])
dm:children(E3) ()
dm:attributes(E3) ()
dm:namespaces(E3) ([N1], [N2], [N3], [N4], [N5])
 
// Text node T1
dm:base-uri(T1) xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(T1) "text"
dm:string-value(T1) = "Staind:  Been  Awhile  Tee  Black  (1-sided)"
dm:typed-value(T1) xs:anySimpleType("Staind:  Been  Awhile  Tee  Black  (1-sided)")
dm:type(T1) xs:anySimpleType
dm:parent(T1) ([E3])
 
// Element node E4
dm:base-uri(E4) xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E4) "element"
dm:node-name(E4) xs:QName("http://www.example.com/catalog", "description")
dm:string-value(E4) = "\n            Lyrics  from  the  hit  song  'It's  Been  Awhile'\n            are  shown  in  white,  beneath  the  large\n            'Flock  &  Weld'  Staind  logo.\n        "
dm:typed-value(E4) fn:error()
dm:type(E4) cat:description
dm:parent(E4) ([E2])
dm:children(E4) ([E5])
dm:attributes(E4) ()
dm:namespaces(E4) ([N1], [N2], [N3], [N4], [N5])
 
// Element node E5
dm:base-uri(E5) xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E5) "element"
dm:node-name(E5) xs:QName("http://www.w3.org/1999/xhtml", "html:p")
dm:string-value(E5) = "\n            Lyrics  from  the  hit  song  'It's  Been  Awhile'\n            are  shown  in  white,  beneath  the  large\n            'Flock  &  Weld'  Staind  logo.\n        "
dm:typed-value(E5) xdt:untypedAtomic(dm:string-value())
dm:type(E5) xs:anyType
dm:parent(E5) ([E4])
dm:children(E5) ()
dm:attributes(E5) ()
dm:namespaces(E5) ([N1], [N2], [N3], [N4], [N5])
 
// Text node T2
dm:base-uri(T2) xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(T2) "text"
dm:string-value(T2) = "\n            Lyrics  from  the  hit  song  'It's  Been  Awhile'\n            are  shown  in  white,  beneath  the  large\n            'Flock  &  Weld'  Staind  logo.\n        "
dm:typed-value(T2) xs:anySimpleType("\n            Lyrics  from  the  hit  song  'It's  Been  Awhile'\n            are  shown  in  white,  beneath  the  large\n            'Flock  &  Weld'  Staind  logo.\n        ")
dm:type(T2) xs:anySimpleType
dm:parent(T2) ([E5])
 
// Element node E6
dm:base-uri(E6) xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E6) "element"
dm:node-name(E6) xs:QName("http://www.example.com/catalog", "price")
dm:string-value(E6) = "25.00"
// The typed-value is based on the content type of the complex type for the element
dm:typed-value(E6) cat:monetaryAmount(25.0)
dm:type(E6) cat:price
dm:parent(E6) ([E2])
dm:children(E6) ()
dm:attributes(E6) ()
dm:namespaces(E6) ([N1], [N2], [N3], [N4], [N5])
 
// Text node T3
dm:base-uri(T3) xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(T3) "text"
dm:string-value(T3) = "25.00"
dm:typed-value(T3) xs:anySimpleType("25.00")
dm:type(T3) xs:anySimpleType
dm:parent(T3) ([E6])
 
// Element node E7
dm:base-uri(E7) xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E7) "element"
dm:node-name(E7) xs:QName("http://www.example.com/catalog", "album")
dm:string-value(E7) = "  It's  Been  A  While    10.99    Staind  "
dm:typed-value(E7) fn:error()
dm:type(E7) cat:albumType
dm:parent(E7) ([E1])
dm:children(E7) ([E8], [E9], [E10], [E11])
dm:attributes(E7) ([A8], [A9], [A10])
dm:namespaces(E7) ([N1], [N2], [N3], [N4], [N5])
 
// Attribute node A8
dm:node-kind(A8) "attribute"
dm:node-name(A8) xs:QName("", "code")
dm:string-value(A8) = "A1481344"
dm:typed-value(A8) xs:ID("A1481344")
dm:type(A8) xs:ID
dm:parent(A8) ([E7])
 
// Attribute node A9
dm:node-kind(A9) "attribute"
dm:node-name(A9) xs:QName("", "label")
dm:string-value(A9) = "Staind  :  Its  Been  A  While"
dm:typed-value(A9) xs:token("Staind : Its Been A While")
dm:type(A9) xs:token
dm:parent(A9) ([E7])
 
// Attribute node A10
dm:node-kind(A10) "attribute"
dm:node-name(A10) xs:QName("", "formats")
dm:string-value(A10) = "CD"
dm:typed-value(A10) cat:formatType("CD")
dm:type(A10) cat:formatType
dm:parent(A10) ([E7])
 
// Element node E8
dm:base-uri(E8) xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E8) "element"
dm:node-name(E8) xs:QName("http://www.example.com/catalog", "title")
dm:string-value(E8) = "It's  Been  A  While"
dm:typed-value(E8) xs:token("It's Been A While")
dm:type(E8) xs:token
dm:parent(E8) ([E7])
dm:children(E8) ()
dm:attributes(E8) ()
dm:namespaces(E8) ([N1], [N2], [N3], [N4], [N5])
 
// Text node T4
dm:base-uri(T4) xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(T4) "text"
dm:string-value(T4) = "It's  Been  A  While"
dm:typed-value(T4) xs:anySimpleType("It's  Been  A  While")
dm:type(T4) xs:anySimpleType
dm:parent(T4) ([E8])
 
// Element node E9
dm:base-uri(E9) xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E9) "element"
dm:node-name(E9) xs:QName("http://www.example.com/catalog", "description")
dm:string-value(E9) = ""
// xsi:nil is true so the typed value is the empty sequence
dm:typed-value(E9) ()
dm:type(E9) cat:description
dm:parent(E9) ([E7])
dm:children(E9) ()
dm:attributes(E9) ([A11])
dm:namespaces(E9) ([N1], [N2], [N3], [N4], [N5])
 
// Attribute node A11
dm:node-kind(A11) "attribute"
dm:node-name(A11) xs:QName("http://www.w3.org/2001/XMLSchema-instance", "xsi:nil")
dm:string-value(A11) = "true"
dm:typed-value(A11) xs:boolean("true")
dm:type(A11) xs:boolean
dm:parent(A11) ([E9])
 
// Element node E10
dm:base-uri(E10) xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E10) "element"
dm:node-name(E10) xs:QName("http://www.example.com/catalog", "price")
dm:string-value(E10) = "10.99"
dm:typed-value(E10) cat:monetaryAmount(10.99)
dm:type(E10) cat:price
dm:parent(E10) ([E7])
dm:children(E10) ()
dm:attributes(E10) ([A12])
dm:namespaces(E10) ([N1], [N2], [N3], [N4], [N5])
 
// Attribute node A12
dm:node-kind(A12) "attribute"
dm:node-name(A12) xs:QName("", "currency")
dm:string-value(A12) = "USD"
dm:typed-value(A12) cat:currencyType("USD")
dm:type(A12) cat:currencyType
dm:parent(A12) ([E10])
 
// Text node T5
dm:base-uri(T5) xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(T5) "text"
dm:string-value(T5) = "10.99"
dm:typed-value(T5) xs:anySimpleType("10.99")
dm:type(T5) xs:anySimpleType
dm:parent(T5) ([E10])
 
// Element node E11
dm:base-uri(E11) xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E11) "element"
dm:node-name(E11) xs:QName("http://www.example.com/catalog", "artist")
dm:string-value(E11) = "  Staind  "
dm:typed-value(E11) " Staind "
dm:type(E11) xs:string
dm:parent(E11) ([E7])
dm:children(E11) ()
dm:attributes(E11) ()
dm:namespaces(E11) ([N1], [N2], [N3], [N4], [N5])
 
// Text node T6
dm:base-uri(T6) xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(T6) "text"
dm:string-value(T6) = "  Staind  "
dm:typed-value(T6) xs:anySimpleType("  Staind  ")
dm:type(T6) xs:anySimpleType
dm:parent(T6) ([E11])
 

A graphical representation of the data model for the preceding example is shown below. Document order in this representation can be found by following the traditional in-order, left-to-right, depth-first traversal; however, because the image has been rotated for easier presentation, this appears to be in-order, bottom-to-top, depth-first order.

Figure: Graphic representation of the data model.

E Accessor Summary (Non-Normative)

This section summarizes the return values of each accessor by node type.

E.1 dm:base-uri Accessor

Document Nodes

Returns the value of the base-uri property if it exists and is not empty, otherwise returns ().

Element Nodes

Returns the value of the base-uri property if it exists and is not empty. Otherwise, if the element has a parent, returns the value of the dm:base-uri of its parent; otherwise, returns ().

Attribute Nodes

If the attribute has a parent, returns the value of the dm:base-uri of its parent; otherwise it returns ().

Namespace Nodes

Returns ().

Processing Instruction Nodes

Returns the value of the base-uri property if it exists and is not empty. Otherwise, if the processing instruction has a parent, returns the value of the dm:base-uri of its parent; otherwise, returns ().

Comment Nodes

If the comment has a parent, returns the value of the dm:base-uri of its parent; otherwise, returns ().

Text Nodes

If the text node has a parent, returns the value of the dm:base-uri of its parent; otherwise, returns ().
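
The rules above can be summarized in a short non-normative Python sketch. The Node class and the function name dm_base_uri are hypothetical, and None stands in for the empty sequence ().

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    kind: str                         # e.g. "document", "element", "text"
    base_uri: Optional[str] = None    # the node's own base-uri property, if any
    parent: Optional["Node"] = None


# Node kinds whose own base-uri property is consulted first.
HAS_OWN_BASE_URI = {"document", "element", "processing-instruction"}


def dm_base_uri(node: Node) -> Optional[str]:
    """Return the base URI of a node, or None for the empty sequence."""
    if node.kind == "namespace":
        return None
    if node.kind in HAS_OWN_BASE_URI and node.base_uri:
        return node.base_uri
    if node.kind == "document" or node.parent is None:
        return None
    # Every other case falls back to the base URI of the parent.
    return dm_base_uri(node.parent)
```

A text node inside an element that has no base-uri property of its own therefore reports the base URI of the nearest ancestor that has one.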

E.2 dm:node-kind Accessor

Document Nodes

Returns "document".

Element Nodes

Returns "element".

Attribute Nodes

Returns "attribute".

Namespace Nodes

Returns "namespace".

Processing Instruction Nodes

Returns "processing-instruction".

Comment Nodes

Returns "comment".

Text Nodes

Returns "text".

E.3 dm:node-name Accessor

Document Nodes

Returns ().

Element Nodes

Returns the value of the node-name property.

Attribute Nodes

Returns the value of the node-name property.

Namespace Nodes

If the implementation preserves information about the prefixes declared, returns an xs:QName with the value of the prefix property in the local-name and an empty namespace name, otherwise returns ().

Processing Instruction Nodes

Returns an xs:QName with the value of the target property in the local-name and an empty namespace name.

Comment Nodes

Returns ().

Text Nodes

Returns ().

E.4 dm:parent Accessor

Document Nodes

Returns ().

Element Nodes

Returns the value of the parent property.

Attribute Nodes

Returns the value of the parent property.

Namespace Nodes

Returns the value of the parent property.

Processing Instruction Nodes

Returns the value of the parent property.

Comment Nodes

Returns the value of the parent property.

Text Nodes

Returns the value of the parent property.

E.5 dm:string-value Accessor

Document Nodes

Returns the concatenation of the string-values of all its text node descendants in document order.
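
The concatenation can be approximated with Python's standard xml.etree.ElementTree module, as in the following non-normative sketch. ElementTree does not model text nodes as separate objects, so itertext() is used here as a stand-in for the text node descendants in document order; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def document_string_value(xml_text: str) -> str:
    """Concatenate all character data of the document in document order."""
    root = ET.fromstring(xml_text)
    # The default parser discards comments and processing instructions,
    # so only character data contributes to the result.
    return "".join(root.itertext())
```

For a small document such as "<a> x <b>y</b><!-- c --> z </a>", the comment contributes nothing and the whitespace around the text is preserved.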

Element Nodes

Returns the string value calculated as follows:

  • If the element has a type of xdt:untypedAny, a complex type with complex content, or a complex type with mixed content, returns the concatenation of the string-values of all its text node descendants in document order. It returns "" if the element has no text node descendants.

  • If the element has a complex type with empty content, returns "".

  • If the element has a simple type or a complex type with simple content:

    • If the element type is xs:string, or a type derived from xs:string, returns that string.

    • If the element type is xs:anyURI, returns the characters of the URI.

    • If the element type is xs:QName returns the value calculated as follows:

      • If the value has no namespace URI and the in-scope namespaces map the default namespace to any namespace URI, then an error is raised ("default namespace is defined").

      • If the value has a namespace URI, then there must be at least one prefix mapped to that URI in the in-scope namespaces. If there is no such prefix, an error is raised ("no prefix defined for namespace"). If there is more than one such prefix, the one that is chosen is implementation dependent.

      If no error occurs, returns a string with the lexical form of an xs:QName using the prefix chosen as described above, and the local name of the value.

    • If the element type is xs:dateTime, xs:date, or xs:time, returns the original lexical representation of the typed value recovered as follows: if an explicit timezone was present, the normalized value is adjusted using the explicit timezone; if an explicit timezone was not present, the Z is dropped from the normalized value. The normalized value and the explicit timezone, if present, are converted separately to xs:string and concatenated to yield the string value.

    • In all other cases, returns the concatenation of the string-values of all its text node descendants in document order.
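The xs:QName case above might be sketched as follows. This is a non-normative illustration under several assumptions: the in-scope namespaces are modelled as a dictionary from prefix to namespace URI, the default namespace is stored under the empty prefix, the default namespace is not counted as a usable prefix, and the alphabetically first prefix is chosen where several qualify (the choice is implementation dependent). The names qname_string_value and QNameSerializationError are hypothetical.

```python
from typing import Dict, Optional


class QNameSerializationError(Exception):
    """Hypothetical error type standing in for the errors named in the text."""


def qname_string_value(namespace_uri: Optional[str], local_name: str,
                       in_scope: Dict[str, str]) -> str:
    # in_scope maps prefix -> namespace URI; "" is the default namespace (assumption).
    if namespace_uri is None:
        if in_scope.get(""):
            raise QNameSerializationError("default namespace is defined")
        return local_name
    prefixes = sorted(p for p, uri in in_scope.items() if p and uri == namespace_uri)
    if not prefixes:
        raise QNameSerializationError("no prefix defined for namespace")
    # The choice among several prefixes is implementation dependent;
    # this sketch arbitrarily takes the alphabetically first one.
    return f"{prefixes[0]}:{local_name}"
```

A real implementation is free to choose any qualifying prefix; sorting is used here only to make the sketch deterministic.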

Attribute Nodes

Returns the value calculated as follows:

  • If the attribute type is xdt:untypedAtomic, xs:string, or a type derived from xs:string, returns that string.

  • If the attribute type is xs:anyURI, returns the characters of the URI.

  • If the attribute type is xs:QName, returns the value calculated as follows:

    • If the value has no namespace URI and the in-scope namespaces map the default namespace to any namespace URI, then an error is raised ("default namespace is defined").

    • If the value has a namespace URI, then there must be at least one prefix mapped to that URI in the in-scope namespaces. If there is no such prefix, an error is raised ("no prefix defined for namespace"). If there is more than one such prefix, the one that is chosen is implementation dependent.

    If no error occurs, returns a string with the lexical form of an xs:QName using the prefix chosen as described above, and the local name of the value.

  • If the attribute type is xs:dateTime, xs:date, or xs:time, returns the original lexical representation recovered as follows: if an explicit timezone was present, the normalized value is adjusted using the explicit timezone; if an explicit timezone was not present, the Z is dropped from the normalized value. The normalized value and the explicit timezone, if present, are converted separately to xs:string and concatenated to yield the string value.
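The recovery of the lexical form for xs:dateTime might be sketched as follows. This is a non-normative illustration: it assumes the normalized value is available as a naive Python datetime expressed in UTC and the explicit timezone, if any, as a timedelta offset. The function name datetime_string_value is hypothetical, and the sketch does not attempt to distinguish an original "Z" from "+00:00".

```python
from datetime import datetime, timedelta
from typing import Optional


def datetime_string_value(normalized_utc: datetime,
                          explicit_tz: Optional[timedelta]) -> str:
    if explicit_tz is None:
        # No explicit timezone: the normalized value is used with the Z dropped.
        return normalized_utc.isoformat()
    # Adjust the normalized value using the explicit timezone.
    local = normalized_utc + explicit_tz
    # Convert the timezone separately to a string of the form +hh:mm or -hh:mm.
    total_minutes = int(explicit_tz.total_seconds() // 60)
    sign = "-" if total_minutes < 0 else "+"
    hours, minutes = divmod(abs(total_minutes), 60)
    # Concatenate the two strings to yield the string value.
    return f"{local.isoformat()}{sign}{hours:02d}:{minutes:02d}"
```

For example, a normalized value of 2003-11-12T15:00:00 with an explicit timezone of minus five hours gives "2003-11-12T10:00:00-05:00".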

Namespace Nodes

Returns the value of the uri property.

Processing Instruction Nodes

Returns the value of the content property.

Comment Nodes

Returns the value of the content property.

Text Nodes

Returns the value of the content property.

E.6 dm:typed-value Accessor

Document Nodes

Returns the dm:string-value of the node as an xdt:untypedAtomic value.

Element Nodes

Returns the typed value calculated as follows:

  • If the element has a type of xdt:untypedAny or a complex type with mixed content, returns the string value of the node as an instance of xdt:untypedAtomic.

  • If the element has a simple type or a complex type with simple content, returns a sequence of zero or more atomic values derived from the string-value of the node and its type in a way that is consistent with XML Schema validation.

    For xs:dateTime, xs:date, and xs:time, the typed value is the atomic value that is determined from its tuple representation as described in 3.3.4 Retrieving the Typed Value of xs:dateTime, xs:date, and xs:time Values.

  • If the node has a complex type with empty content, returns ().

  • If the node has a complex type with complex content, raises a type error, which may be handled by the host language.
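The four cases for element nodes can be read as a dispatch on the kind of content the element's type allows. The following non-normative Python sketch illustrates this; ContentKind, UntypedAtomic, and element_typed_value are hypothetical names, and the to_atomic callback merely stands in for a conversion that is consistent with XML Schema validation.

```python
from enum import Enum, auto
from typing import Callable, List, Optional, Sequence


class ContentKind(Enum):
    UNTYPED = auto()        # xdt:untypedAny
    MIXED = auto()          # complex type with mixed content
    SIMPLE = auto()         # simple type, or complex type with simple content
    EMPTY = auto()          # complex type with empty content
    ELEMENT_ONLY = auto()   # complex type with complex content


class UntypedAtomic(str):
    """Stand-in for an xdt:untypedAtomic value."""


def element_typed_value(
    kind: ContentKind,
    string_value: str,
    to_atomic: Optional[Callable[[str], Sequence[object]]] = None,
) -> List[object]:
    if kind in (ContentKind.UNTYPED, ContentKind.MIXED):
        return [UntypedAtomic(string_value)]
    if kind is ContentKind.SIMPLE:
        if to_atomic is None:
            raise ValueError("a conversion consistent with the schema type is required")
        return list(to_atomic(string_value))
    if kind is ContentKind.EMPTY:
        return []  # the empty sequence ()
    # Complex content: a type error, which the host language may handle.
    raise TypeError("typed value is undefined for complex content")
```

For an element whose type is a list of integers, a caller might pass a callback such as lambda s: [int(t) for t in s.split()].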

Attribute Nodes

Returns the value calculated as follows:

  • If the attribute has a type of xdt:untypedAtomic, returns the string value of the node as an instance of xdt:untypedAtomic.

  • If the attribute has a simple type, returns a sequence of zero or more atomic values derived from the string-value of the node and its type in a way that is consistent with XML Schema validation.

    For xs:dateTime, xs:date, and xs:time, the typed value is the atomic value that is determined from its tuple representation as described in 3.3.4 Retrieving the Typed Value of xs:dateTime, xs:date, and xs:time Values.

Namespace Nodes

Returns the dm:string-value of the node as an xdt:untypedAtomic value.

Processing Instruction Nodes

Returns the dm:string-value of the processing instruction as an xs:string value.

Comment Nodes

Returns the dm:string-value of the comment as an xs:string.

Text Nodes

Returns the dm:string-value of the node as an xdt:untypedAtomic value.

E.7 dm:type Accessor

Document Nodes

Returns ().

Element Nodes

Returns the value of the type property.

Attribute Nodes

Returns the value of the type property.

Namespace Nodes

Returns ().

Processing Instruction Nodes

Returns ().

Comment Nodes

Returns ().

Text Nodes

Returns xdt:untypedAtomic.

E.8 dm:children Accessor

Document Nodes

Returns the value of the children property.

Element Nodes

Returns the value of the children property.

Attribute Nodes

Returns ().

Namespace Nodes

Returns ().

Processing Instruction Nodes

Returns ().

Comment Nodes

Returns ().

Text Nodes

Returns ().

E.9 dm:attributes Accessor

Document Nodes

Returns ().

Element Nodes

Returns the value of the attributes property. The order of attribute nodes is stable but implementation dependent.

Attribute Nodes

Returns ().

Namespace Nodes

Returns ().

Processing Instruction Nodes

Returns ().

Comment Nodes

Returns ().

Text Nodes

Returns ().

E.10 dm:namespaces Accessor

Document Nodes

Returns ().

Element Nodes

Returns the value of the namespaces property. The order of namespace nodes is stable but implementation dependent.

Attribute Nodes

Returns ().

Namespace Nodes

Returns ().

Processing Instruction Nodes

Returns ().

Comment Nodes

Returns ().

Text Nodes

Returns ().

E.11 dm:nilled Accessor

Document Nodes

Returns ().

Element Nodes

Returns the value of the nilled property.

Attribute Nodes

Returns ().

Namespace Nodes

Returns ().

Processing Instruction Nodes

Returns ().

Comment Nodes

Returns ().

Text Nodes

Returns ().
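The per-kind behaviour of dm:parent, dm:type, dm:children, dm:attributes, dm:namespaces, and dm:nilled listed above can be condensed into a small lookup table. The following non-normative Python sketch assumes nodes are plain dictionaries with a "kind" entry and uses the empty tuple to stand for the empty sequence; the names PROPERTY_BACKED and access are hypothetical.

```python
from typing import Any, Dict

EMPTY = ()  # stands in for the empty sequence ()

# Node kinds for which each accessor returns the like-named property.
PROPERTY_BACKED = {
    "parent": {"element", "attribute", "namespace",
               "processing-instruction", "comment", "text"},
    "type": {"element", "attribute"},
    "children": {"document", "element"},
    "attributes": {"element"},
    "namespaces": {"element"},
    "nilled": {"element"},
}


def access(node: Dict[str, Any], accessor: str) -> Any:
    kind = node["kind"]
    if accessor == "type" and kind == "text":
        # The one listed case that is neither a stored property nor ().
        return "xdt:untypedAtomic"
    if kind in PROPERTY_BACKED[accessor]:
        return node[accessor]
    return EMPTY
```

For instance, access applied to a comment node with the "children" accessor returns the empty tuple, mirroring the "Returns ()." entries above.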

F Infoset Construction Summary (Non-Normative)

This section summarizes data model construction from an Infoset for each kind of information item. General notes occur elsewhere.

F.1 Document Nodes Information Items

The document information item is required. A Document Node is constructed for each document information item.

The following infoset properties are required: [children] and [base URI].

The following infoset properties are optional: [unparsed entities].

Document node properties are derived from the infoset as follows:

base-uri

The value of the [base URI] property.

children

The sequence of nodes constructed from the information items found in the [children] property.

For each element, processing instruction, comment, and maximal sequence of adjacent character information items found in the [children] property, a corresponding element, processing instruction, comment, or text node is constructed and that sequence of nodes is used as the value of the children property.

If present among the [children], the [document type declaration] information item is ignored.

unparsed-entities

If the [unparsed entities] property is present and is not the empty set, the values of the unparsed entity information items must be used to support the dm:unparsed-entity-system-id and dm:unparsed-entity-public-id accessors.

The internal structure of the values of the unparsed-entities property is implementation defined.

F.2 Element Nodes Information Items

The element information items are required. An Element Node is constructed for each element information item.

The following infoset properties are required: [namespace name], [local name], [children], [attributes], [in-scope namespaces], [base URI], and [parent].

Element node properties are derived from the infoset as follows:

base-uri

The value of the [base URI] property.

node-name

An xs:QName constructed from the [local name] property and the [namespace name] property.

parent

The node that corresponds to the value of the [parent] property.

type

All element nodes constructed from an infoset have the type xdt:untypedAny.

children

The sequence of nodes constructed from the information items found in the [children] property.

For each element, processing instruction, comment, and maximal sequence of adjacent character information items found in the [children] property, a corresponding Element, Processing Instruction, Comment, or Text node is constructed and that sequence of nodes is used as the value of the children property.

Because the data model requires that all general entities be expanded, there will never be unexpanded entity reference information item children.

attributes

A set of Attribute Nodes constructed from the attribute information items appearing in the [attributes] property. This includes all of the "special" attributes (xml:lang, xml:space, xsi:type, etc.) but does not include namespace declarations (because they are not attributes).

namespaces

A set of Namespace Nodes constructed from the namespace information items appearing in the [in-scope namespaces] property.

Implementations may ignore namespace information items for namespaces which do not appear in the expanded QName of any element or attribute information item. This can arise when xs:QNames are used in content.

nilled

All element nodes constructed from an infoset have a nilled property of "false".

F.3 Attribute Nodes Information Items

The attribute information items are required. An attribute node is constructed for each attribute information item.

The following infoset properties are required: [namespace name], [local name], [normalized value], [attribute type], and [owner element].

Attribute node properties are derived from the infoset as follows:

node-name

An xs:QName constructed from the [local name] property and the [namespace name] property.

string-value

The [normalized value] property.

parent

The element node that corresponds to the value of the [owner element] property.

type

  • If the [attribute type] property has one of the following values: ID, IDREF, IDREFS, ENTITY, ENTITIES, NMTOKEN, or NMTOKENS, an xs:QName with the [attribute type] as the local name and "http://www.w3.org/2001/XMLSchema" as the namespace name.

  • Otherwise, xdt:untypedAtomic.
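The mapping from the [attribute type] infoset property to the type property might be sketched as follows. This is a non-normative illustration: QName and infoset_attribute_type are hypothetical names, and XDT_NS is only a placeholder because the namespace URI bound to the xdt prefix is not stated in this section.

```python
from typing import NamedTuple, Optional


class QName(NamedTuple):
    namespace: str
    local_name: str


XS_NS = "http://www.w3.org/2001/XMLSchema"
# Placeholder only: the namespace URI bound to the xdt prefix is not given here.
XDT_NS = "xdt"

DTD_TYPES = {"ID", "IDREF", "IDREFS", "ENTITY", "ENTITIES", "NMTOKEN", "NMTOKENS"}


def infoset_attribute_type(attribute_type: Optional[str]) -> QName:
    if attribute_type in DTD_TYPES:
        # The [attribute type] value becomes the local name in the XML Schema namespace.
        return QName(XS_NS, attribute_type)
    # Every other value, including an absent one, maps to xdt:untypedAtomic.
    return QName(XDT_NS, "untypedAtomic")
```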

F.4 Namespace Nodes Information Items

The namespace information items are required.

The following infoset properties are required: [prefix], [namespace name].

Namespace node properties are derived from the infoset as follows:

prefix

The [prefix] property.

uri

The [namespace name] property.

parent

The element to which this Namespace Node applies, if the implementation exposes any mechanism for accessing the dm:parent accessor of Namespace Nodes.

F.5 Text Nodes Information Items

The character information items are required. A Text Node is constructed for each maximal sequence of character information items.

The following infoset properties are required: [character code] and [parent].

The following infoset properties are optional: [element content white space].

A sequence of character information items is maximal if it satisfies the following constraints:

  1. All of the information items in the sequence have the same parent.

  2. The sequence consists of adjacent character information items uninterrupted by other types of information item.

  3. No other such sequence exists that contains any of the same character information items and is longer.
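Grouping the children of a single parent into maximal runs of character information items might be sketched as follows. This is a non-normative illustration that assumes each information item is a (kind, value) pair and that the input list already holds the [children] of one parent, so the first constraint holds by construction; the name build_children is hypothetical.

```python
from itertools import groupby
from typing import List, Tuple

Item = Tuple[str, str]  # (kind, value); for "character" items the value is one character


def build_children(items: List[Item]) -> List[Item]:
    nodes: List[Item] = []
    # groupby yields the longest runs of adjacent items that share a key,
    # which is exactly the "maximal sequence" condition for character items.
    for is_char, run in groupby(items, key=lambda item: item[0] == "character"):
        if is_char:
            nodes.append(("text", "".join(value for _, value in run)))
        else:
            nodes.extend(run)
    return nodes
```

Given two adjacent characters, a comment, and another character, the result is a text node, the comment, and a second text node; no two text nodes are ever adjacent.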

Text node properties are derived from the infoset as follows:

content

A string composed of characters that correspond to the [character code] properties of each of the character information items. Applications may ignore character information items for which [element content white space] exists and is "true".

Applications may construct text nodes in the data model to represent insignificant white space. This decision is considered outside the scope of the data model; consequently, the data model makes no attempt to control or identify whether any or all insignificant white space is ignored.

If insignificant white space is not ignored, it is treated exactly like any other character information item. After the data model has been constructed, there is no accessor which directly identifies insignificant white space. (Applications with access to type information may be able to determine if white space is significant or not.)

Regardless of insignificant white space handling, the content of the text node is not necessarily W3C normalized as described in the [Character Model]. It is the responsibility of data producers to provide appropriately normalized text, and the responsibility of programmers to make sure that operations do not de-normalize text.

parent

The node corresponding to the value of the [parent] property.

G PSVI Construction Summary (Non-Normative)

This section summarizes data model construction from a PSVI for each kind of information item. General notes occur elsewhere.

G.1 Document Nodes Information Items

Construction from a PSVI is identical to construction from the Infoset.

G.2 Element Nodes Information Items

The following Element Node properties are affected by PSVI properties.

type

  • If the [validity] property exists and is "valid" on this element and all of its ancestors, type is assigned as described in 3.3.1 Mapping PSVI Additions to Types.

  • Otherwise, xdt:untypedAny.
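The rule for the type property of an element might be sketched as a walk up the ancestor chain. This is a non-normative illustration: ElementItem and element_type are hypothetical names, None models a [validity] property that does not exist, and the schema_type field merely stands in for the result of the mapping described in 3.3.1.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ElementItem:
    validity: Optional[str] = None       # None: the [validity] property does not exist
    schema_type: Optional[str] = None    # stand-in for the 3.3.1 mapping result
    parent: Optional["ElementItem"] = None


def element_type(item: ElementItem) -> Optional[str]:
    current: Optional[ElementItem] = item
    while current is not None:
        # The element itself and every ancestor must be "valid".
        if current.validity != "valid":
            return "xdt:untypedAny"
        current = current.parent
    return item.schema_type
```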

children

  • If the [schema normalized value] PSVI property exists, the processor may, depending on the implementation, use as the value of the children property a sequence of nodes containing the processing instruction and comment nodes corresponding to the processing instruction and comment information items found in the [children] property, plus a single text node whose string value is the [schema normalized value]. The order of these nodes is implementation defined.

  • Otherwise, the sequence of nodes constructed from the information items found in the [children] property.

    For each element, processing instruction, comment, and maximal sequence of adjacent character information items found in the [children] property, a corresponding element, processing instruction, comment, or text node is constructed and that sequence of nodes is used as the value of the children property.

Because the data model requires that all general entities be expanded, there will never be unexpanded entity reference information item children.

nilled

[Schema Part 2] introduced a mechanism for signaling that an element is valid even when it has no content despite a content type which does not allow empty content. The data model exposes this special semantic in the nilled property.

If the [validity] property exists on an element node and is "valid", and the [nil] property exists and is true, then the nilled property is "true". In all other cases, including all cases where schema validity assessment was not attempted or did not succeed, the nilled property is "false".
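The nilled rule reduces to a conjunction of two conditions, as the following non-normative Python sketch shows. The function name nilled_property is hypothetical, and None is assumed to model a PSVI property that does not exist.

```python
from typing import Optional


def nilled_property(validity: Optional[str], nil: Optional[bool]) -> bool:
    # True only when the element is valid and [nil] exists and is true;
    # every other combination, including absent properties, yields False.
    return validity == "valid" and nil is True
```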

All other properties have values that are consistent with construction from an infoset.

G.3 Attribute Nodes Information Items

The following Attribute Node properties are affected by PSVI properties.

string-value

  • The [schema normalized value] PSVI property, if it exists.

  • Otherwise, the [normalized value] property.

type

  • If the [validity] property does not exist on this node or any of its ancestors, Infoset processing applies.

    Note that this processing is only performed if no part of the subtree that contains the node was schema validated. In particular, Infoset-only processing does not apply to subtrees that are "skip" validated in a document.

  • If the [validity] property exists and is "valid", type is assigned as described in 3.3.1 Mapping PSVI Additions to Types.

  • Otherwise, xdt:untypedAtomic.
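The three-way rule for the type property of an attribute might be sketched as follows. This is a non-normative illustration: psvi_attribute_type is a hypothetical name, None models a [validity] property that does not exist, and the schema_type and infoset_type arguments stand in for the results of the 3.3.1 mapping and of Infoset processing respectively.

```python
from typing import Iterable, Optional


def psvi_attribute_type(validity: Optional[str],
                        ancestor_validities: Iterable[Optional[str]],
                        schema_type: str,
                        infoset_type: str) -> str:
    # No [validity] on the attribute or on any ancestor: Infoset processing applies.
    if validity is None and all(v is None for v in ancestor_validities):
        return infoset_type
    # Validated and valid: the type comes from the PSVI mapping.
    if validity == "valid":
        return schema_type
    # Any other outcome, including "skip" validated subtrees.
    return "xdt:untypedAtomic"
```

Note that an attribute without a [validity] property inside a validated subtree falls through to the last case rather than to Infoset processing.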

All other properties have values that are consistent with construction from an infoset.

Note: attributes from the XML Schema instance namespace, "http://www.w3.org/2001/XMLSchema-instance" (xsi:schemaLocation, xsi:type, etc.), appear as ordinary attributes in the data model. They will be validated appropriately by schema processors and will simply appear as attributes of type xs:anySimpleType if they have not been schema validated.

G.4 Namespace Nodes Information Items

Construction from a PSVI is identical to construction from the Infoset.

G.5 Processing Instruction Nodes Information Items

Construction from a PSVI is identical to construction from the Infoset.

G.6 Comment Nodes Information Items

Construction from a PSVI is identical to construction from the Infoset.

G.7 Text Nodes Information Items

Construction from a PSVI is identical to construction from the Infoset.