RELAX NG Compact Syntax

Working Draft 22 August 2002

This version:
Working Draft: 22 August 2002
James Clark

Copyright © The Organization for the Advancement of Structured Information Standards [OASIS] 2002. All Rights Reserved.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to OASIS, except as needed for the purpose of developing OASIS specifications, in which case the procedures for copyrights defined in the OASIS Intellectual Property Rights document must be followed, or as required to translate it into languages other than English.

The limited permissions granted above are perpetual and will not be revoked by OASIS or its successors or assigns.



This document specifies a compact, non-XML syntax for [RELAX NG].

Status of this Document

This is a working draft constructed by the editor. It is not an official committee work product and may not reflect the consensus opinion of the committee. Comments on this document may be sent to

Table of Contents

1 Introduction
2 Syntax
3 Lexical structure
4 Declarations
5 Annotations
5.1 Initial annotations
5.2 Documentation shorthand
5.3 Following annotations
5.4 Grammar annotations
6 Conformance
6.1 Validator
6.2 Structure preserving translator
6.3 Non-structure preserving translator


A Formal description
A.1 Syntax
A.2 Lexical structure
A.2.1 Character encoding
A.2.2 BOM stripping
A.2.3 Newline normalization
A.2.4 Escape interpretation
A.2.5 Tokenization
A.2.6 Literal concatenation
B Compact syntax RELAX NG schema for RELAX NG (Non-Normative)

1. Introduction

This specification describes a compact, non-XML syntax for [RELAX NG].

The goals of this syntax are:

  • maximize readability;
  • support all features of RELAX NG; it must be possible to translate a schema from the XML syntax to the compact syntax and back without losing significant information;
  • support separate translation; a RELAX NG schema may be spread amongst multiple files; it must be possible to represent each of the files separately in the compact syntax; the representation of each file must not depend on the other files.

The syntax has similarities to [XQuery Formal Semantics], to [XDuce] and to the DTD syntax of [XML 1.0].

The body of this document contains an informal description of the syntax and how it maps onto the XML syntax. Developers should consult Appendix A. Formal description for a complete, rigorous description.

2. Syntax

The following is a summary of the syntax in EBNF. The reader may find it helpful to compare this with the syntax in Section 3 of [RELAX NG]. The start symbol is topLevel.

topLevel  ::=  decl* (pattern | grammarContent*)
decl  ::=  "namespace" identifierOrKeyword "=" namespaceURILiteral
| "default" "namespace" [identifierOrKeyword] "=" namespaceURILiteral
| "datatypes" identifierOrKeyword "=" literal
pattern  ::=  "element" nameClass "{" pattern "}"
| "attribute" nameClass "{" pattern "}"
| pattern ("," pattern)+
| pattern ("&" pattern)+
| pattern ("|" pattern)+
| pattern "?"
| pattern "*"
| pattern "+"
| "list" "{" pattern "}"
| "mixed" "{" pattern "}"
| identifier
| "parent" identifier
| "empty"
| "text"
| [datatypeName] datatypeValue
| datatypeName ["{" param* "}"] [exceptPattern]
| "notAllowed"
| "external" anyURILiteral [inherit]
| "grammar" "{" grammarContent* "}"
| "(" pattern ")"
param  ::=  identifierOrKeyword "=" literal
exceptPattern  ::=  "-" pattern
grammarContent  ::=  start
| define
| "div" "{" grammarContent* "}"
| "include" anyURILiteral [inherit] ["{" includeContent* "}"]
includeContent  ::=  define
| start
| "div" "{" includeContent* "}"
start  ::=  "start" assignMethod pattern
define  ::=  identifier assignMethod pattern
assignMethod  ::=  "="
| "|="
| "&="
nameClass  ::=  name
| nsName [exceptNameClass]
| anyName [exceptNameClass]
| nameClass "|" nameClass
| "(" nameClass ")"
name  ::=  identifierOrKeyword
| CName
exceptNameClass  ::=  "-" nameClass
datatypeName  ::=  CName
| "string"
| "token"
datatypeValue  ::=  literal
anyURILiteral  ::=  literal
namespaceURILiteral  ::=  literal
| "inherit"
inherit  ::=  "inherit" "=" identifierOrKeyword
identifierOrKeyword  ::=  identifier
| keyword
identifier  ::=  (NCName - keyword)
| quotedIdentifier
quotedIdentifier  ::=  "\" NCName
CName  ::=  NCName ":" NCName
nsName  ::=  NCName ":*"
anyName  ::=  "*"
literal  ::=  literalSegment+
literalSegment  ::=  '"' (Char - '"')* '"'
| "'" (Char - "'")* "'"
keyword  ::=  "attribute"
| "default"
| "datatypes"
| "div"
| "element"
| "empty"
| "external"
| "grammar"
| "include"
| "inherit"
| "list"
| "mixed"
| "namespace"
| "notAllowed"
| "parent"
| "start"
| "string"
| "text"
| "token"

NCName is defined in [XML Namespaces]. Char is defined in [XML 1.0].

In order to use a keyword as an identifier, it must be quoted with \. It is not necessary to quote a keyword that is used as the name of an element or attribute or as the name of a datatype parameter.

The value of a literal is the concatenation of the values of its constituent literalSegments. The value of a literal segment consists of the characters between the opening and closing quote. The way to get a literal whose value contains both a single and a double quote is to divide the literal into multiple literalSegments so that the single and double quote are in separate literalSegments.
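
As an illustration of the concatenation rule, the following Python sketch computes the value of a literal from its constituent segments. The function name literal_value is hypothetical and not part of this specification. The sketch assumes that segments are separated only by whitespace; it does not handle comments or escape sequences.

```python
import re

# One literalSegment: double-quoted or single-quoted, per the EBNF above.
_SEGMENT = re.compile(r'"([^"]*)"|\'([^\']*)\'')


def literal_value(source: str) -> str:
    """Return the concatenated value of all literalSegments in `source`."""
    text = source.strip()
    parts = []
    pos = 0
    while pos < len(text):
        match = _SEGMENT.match(text, pos)
        if match is None:
            raise ValueError(f"not a literalSegment at offset {pos}")
        # Exactly one of the two groups participates in the match.
        double, single = match.group(1), match.group(2)
        parts.append(double if double is not None else single)
        pos = match.end()
        # Skip whitespace between adjacent segments.
        while pos < len(text) and text[pos].isspace():
            pos += 1
    if not parts:
        raise ValueError("a literal needs at least one literalSegment")
    return "".join(parts)
```

For instance, a value containing both kinds of quote can be written as a double-quoted segment followed by a single-quoted segment, and literal_value joins the two.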

Annotations can be specified as described in Section 5.

There is no notion of operator precedence. It is an error for patterns to combine the |, &, , and - operators without using parentheses to make the grouping explicit. For example, foo | bar, baz is not allowed; instead, either (foo | bar), baz or foo | (bar, baz) must be used. A similar restriction applies to name classes and the use of the | and - operators. These restrictions are not expressed in the above EBNF but they are made explicit in the BNF in Section A.1.
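
The grouping restriction can be checked mechanically. The following Python sketch is one possible approach, not the method prescribed by this specification: it records which of the |, &, "," and - operators occur at each nesting level of a pattern and reports a violation when two different operators share a level. The function name mixes_operators and the simplified tokenizer (ASCII-style names, no comments, no annotations) are assumptions made for illustration. Name classes following element or attribute are skipped rather than checked.

```python
import re

# Simplified tokenizer: whitespace, literal segments, names, punctuation.
_TOKEN = re.compile(r'''
    \s+
  | "[^"]*" | '[^']*'
  | \\?[A-Za-z_][\w.\-]*(?::(?:\*|[A-Za-z_][\w.\-]*))?
  | [|&,\-(){}*+?=]
''', re.VERBOSE)


def mixes_operators(pattern: str) -> bool:
    """Return True if two different binary operators share a nesting level."""
    groups = [set()]          # operators seen at each open nesting level
    in_name_class = False     # True between element/attribute and its "{"
    pos = 0
    while pos < len(pattern):
        match = _TOKEN.match(pattern, pos)
        if match is None:
            raise ValueError(f"unrecognized input at offset {pos}")
        token = match.group(0)
        pos = match.end()
        if in_name_class:
            # Ignore the name class; the "{" starts the content pattern.
            if token == "{":
                in_name_class = False
                groups.append(set())
            continue
        if token in ("element", "attribute"):
            in_name_class = True
        elif token in ("(", "{"):
            groups.append(set())
        elif token in (")", "}"):
            if len(groups) == 1:
                raise ValueError("unbalanced closing delimiter")
            groups.pop()
        elif token in ("|", "&", ",", "-"):
            groups[-1].add(token)
            if len(groups[-1]) > 1:
                return True
    return False
```

Applied to the examples above, the sketch reports a violation for foo | bar, baz and none for either parenthesized form.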

The value of an anyURILiteral specified with include or external is a URI reference to a grammar in the compact syntax.

3. Lexical structure

Whitespace is allowed between tokens. Tokens are the quoted terminals appearing in the EBNF in Section 2, except that literalSegment, nsName, CName, identifier and quotedIdentifier are single tokens.

Comments are also allowed between tokens. Comments start with a # and continue to the end of the line. Comments starting with ## are treated specially; see Section 5.

A Unicode character with hex code N can be represented by the escape sequence \x{N}. Using such an escape sequence is completely equivalent to entering the corresponding character directly. For example,

element \x{66}\x{6f}\x{6f} { empty }

is equivalent to

element foo { empty }

4. Declarations

A datatypes declaration declares a prefix used in a QName identifying a datatype. For example,

datatypes xsd = ""
element height { xsd:double }

In fact, in the above example, the datatypes declaration is not required: the xsd prefix is predeclared to the above URI.

A namespace declaration declares a prefix used in a QName specifying the name of an element or attribute. For example,

namespace rng = ""
element rng:text { empty }

As in XML, the xml prefix is predeclared.

A default namespace declaration declares the namespace used for unprefixed names specifying the name of an element (but not of an attribute). For example,

default namespace = ""
element foo { attribute bar { string } }

is equivalent to

namespace ex = ""
element ex:foo { attribute bar { string } }
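
The asymmetry between element and attribute names can be made concrete with a small Python sketch. The function resolve_name and its parameters are hypothetical and only illustrate the rule stated above; the namespace URI used in the usage note is a placeholder.

```python
def resolve_name(name, is_element, prefixes, default_ns):
    """Return (namespace URI, local name) for a name in a schema.

    prefixes   -- dict mapping declared namespace prefixes to URIs
    default_ns -- URI from the default namespace declaration
    """
    prefix, colon, local = name.partition(":")
    if colon:
        # Prefixed names always use the namespace bound to the prefix.
        return (prefixes[prefix], local)
    # Unprefixed names: only element names pick up the default namespace.
    return (default_ns if is_element else "", name)
```

With a default namespace of "http://example.org/ns" (a placeholder), the element name foo resolves to that namespace, while the attribute name bar resolves to the absent namespace, represented here by the empty string.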

A default namespace declaration may have a prefix as well. For example,

default namespace ex = ""

is equivalent to

default namespace = ""
namespace ex = ""

The URI may be empty. This makes the prefix stand for the absent namespace URI. This is necessary for specifying a name class that matches any name with an absent namespace URI. For example:

namespace local = ""
element foo { attribute * - local:* { string }* }

is equivalent to

<element xmlns="""
	  <nsName ns=""/>
      <data type="string"/>

RELAX NG has the feature that if a file does not specify an ns attribute then the ns attribute can be inherited from the including file. To support this feature, the keyword inherit can be specified in place of the namespace URI in a namespace declaration. For example,

default namespace this = inherit
element foo { element * - this:* { string }* }

is equivalent to

<element xmlns="""
      <data type="string"/>

In addition, the include and external patterns can specify inherit = prefix to specify the namespace to be inherited by the referenced file. For example,

namespace x = ""
external "foo.rng" inherit = x

is equivalent to

<externalRef href="foo.rng"

In the absence of an inherit parameter on include or external, the default namespace will be inherited by the referenced file.

In the absence of a default namespace declaration, a declaration of

default namespace = inherit

is assumed.

5. Annotations

5.1. Initial annotations

An annotation in square brackets can be inserted immediately before a pattern, param, nameClass, grammarContent or includeContent. It has the following syntax:

annotation  ::=  "[" annotationAttribute* annotationElement* "]"
annotationAttribute  ::=  name "=" literal
annotationElement  ::=  name "[" annotationAttribute* (annotationElement | literal)* "]"

Each of the annotationAttributes will turn into attributes on the corresponding RELAX NG element. Each of the annotationElements will turn into initial children of the corresponding RELAX NG element, except in the case where the RELAX NG element cannot have children, in which case they will turn into following elements.
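
A minimal Python sketch of this mapping, using the standard xml.etree.ElementTree module, is shown below. The function name apply_annotations echoes the primitive of the same name in Appendix A but the code is only an illustration: it covers the common case where the RELAX NG element can have children, and it ignores the case where annotation elements become following elements. The namespace URI in the usage note is a placeholder.

```python
import xml.etree.ElementTree as ET


def apply_annotations(attributes, elements, target):
    """Return a copy of `target` with annotations applied.

    attributes -- dict of annotation attribute names to values
    elements   -- list of annotation elements (become initial children)
    target     -- the RELAX NG element being annotated
    """
    # Attributes are the union of the annotation and original attributes.
    result = ET.Element(target.tag, {**attributes, **target.attrib})
    result.text = target.text
    # Annotation elements come first, then the original children.
    result.extend(elements)
    result.extend(list(target))
    return result
```

For example, annotating an element pattern that contains <empty/> with one annotation element from a placeholder namespace such as "http://example.org/ann" yields an element whose first child is the annotation and whose second child is <empty/>.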

5.2. Documentation shorthand

Comments starting with ## are used to specify documentation elements from the namespace as described in [Compatibility]. For example,

## Represents a language
element lang { 
  ## English
  "en" |
  ## Japanese

turns into

<element name="lang"
  <a:documentation>Represents a language</a:documentation>

## comments can only be used immediately before a pattern, nameClass, grammarContent or includeContent. Multiple ## comments are allowed. Multiple adjacent ## comments without any intervening blank lines are merged into a single documentation element. Any ## comments must precede any annotation in square brackets.
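
The merging rule can be sketched in Python as follows. The function merge_documentation is hypothetical. Two details are assumptions rather than statements of this section: merged lines are joined with a newline, and a single space following ## is dropped (suggested by the stripFirstSpace primitive in Appendix A).

```python
def merge_documentation(lines):
    """Group leading ## comment lines into documentation strings.

    Adjacent ## lines form one string; a blank line starts a new group.
    Scanning stops at the first line that is neither blank nor a ## comment.
    """
    docs = []
    current = None
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("##"):
            text = stripped[2:]
            if text.startswith(" "):
                text = text[1:]          # assumed: drop one leading space
            if current is None:
                current = [text]
            else:
                current.append(text)
            continue
        if current is not None:
            docs.append("\n".join(current))  # assumed: newline as separator
            current = None
        if stripped:
            break                        # reached the annotated construct
    if current is not None:
        docs.append("\n".join(current))
    return docs
```

Each string returned would become the content of one documentation element preceding the annotated construct.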

5.3. Following annotations

A pattern or nameClass may be followed by any number of followAnnotations with the following syntax:

followAnnotation  ::=  ">>" annotationElement

Each such annotationElement turns into a following sibling of the RELAX NG element representing the pattern or nameClass.

5.4. Grammar annotations

An annotationElement may be used in any place where grammarContent or includeContent is allowed. For example,

namespace x = ""

start = foo

x:notation [ name="jpeg" systemId="" ]

foo = element foo { empty }

turns into

<grammar xmlns:x="" 
    <ref name="foo"/>
  <x:notation name="jpeg" systemId=""/>
  <define name="foo">
    <element name="foo">

If the name of such an element is a keyword, then it must be escaped.

6. Conformance

There are three kinds of conformant implementation.

6.1. Validator

A validator conforming to this specification must be able to determine whether a textual object is a correct RELAX NG Compact Syntax schema as specified in Appendix A. Formal description. It must also be able to determine for any XML document and for any correct RELAX NG Compact Syntax schema whether the document is valid (as defined in [RELAX NG]) with respect to the translation of the schema into XML syntax. It need not be able to output a representation of the translation of the schema into XML syntax.

6.2. Structure preserving translator

A structure preserving translator must be able to translate any correct RELAX NG Compact Syntax schema into an XML document whose data model is strictly equivalent to the translation specified in Appendix A. Formal description. For this purpose, two instances of the data model (as specified in Section 2 of [RELAX NG]) are considered strictly equivalent if they are identical after applying the simplifications specified in Sections 4.2, 4.3, 4.4, 4.8, 4.9 and 4.10 of [RELAX NG]. When comparing two include or externalRef patterns for strict equivalence, the values of the href attributes are not compared; instead the referenced XML documents are compared for strict equivalence.

6.3. Non-structure preserving translator

A non-structure preserving translator must be able to translate any correct RELAX NG Compact Syntax schema into an XML document whose data model is loosely equivalent to the translation specified in Appendix A. Formal description. For this purpose, two instances of the data model (as specified in Section 2 of [RELAX NG]) are considered loosely equivalent if they are identical after applying all the simplifications specified in Section 4 of [RELAX NG].

A. Formal description

A.1. Syntax

The compact syntax is specified by a grammar in BNF. The translation into the XML syntax is specified by annotations in the grammar.

The start symbol is topLevel.

The BNF description consists of a set of production rules. Each production rule has a left-hand side and right-hand side separated by ::=. The left-hand side specifies the name of a non-terminal. The right-hand side specifies a list of one or more alternatives separated by |. Each alternative consists of a sequence of terminals and non-terminals. A non-terminal is specified by a name in italics. A terminal is either a literal string in quotes or a named terminal specified by a name in bold italics. An alternative can also be specified as ε, which denotes an empty sequence of tokens.

Each alternative may be followed by references to one or more named constraints that apply to that alternative.

The translation into XML syntax is specified by associating a value with each terminal and non-terminal in the derivation. Each alternative in the BNF may be followed by an expression in curly braces, which specifies how to compute the value associated with the left-hand side non-terminal. Each terminal and non-terminal on the right-hand side can be labelled with a subscript specifying a variable name. When that variable name is used within the curly braces, it refers to the value associated with that terminal or non-terminal. If an alternative consists of a single terminal or non-terminal, then the expression in curly braces can be omitted; in this case the value of the left-hand side is the value of that terminal or non-terminal.

The result of the translation is not a string containing the XML representation of a RELAX NG schema, but rather is an instance of the data model described in Section 2 of [RELAX NG]; this instance will match the RELAX NG schema for RELAX NG.

A textual object is a correct RELAX NG Compact Syntax schema if:

  • it matches the grammar specified in this section,
  • it satisfies all the constraints specified in this section, and
  • the result of the translation is a correct RELAX NG schema.

The computation of the value of a non-terminal may make use of one or more arguments. The name of such a non-terminal is always followed by a list of arguments. When the name of the non-terminal occurs on the left-hand side, this list declares the formal arguments for the non-terminal. When the name occurs on the right-hand side of a production, the list specifies the actual arguments which will be bound to the formal arguments during the computation of the value of the non-terminal. The expressions in curly braces on the right-hand side can refer to the formal arguments declared on the left-hand side. For example, see simpleNameClass.

In addition to explicit arguments, every non-terminal implicitly has an argument that specifies a context for the interpretation of a pattern. Normally the implicit context argument to each non-terminal is the same as its parent; an expression followed by a period followed by a non-terminal references that non-terminal with the context argument changed to be the value of that expression. For example, see topLevel and preamble. In the initial context used for the start symbol, xml is bound as a namespace prefix to, and xsd is bound as a datatype prefix to

Expressions use the following notation:

  • x denotes the value of the variable named x;
  • { } denotes an empty set;
  • ( ) denotes an empty sequence;
  • (x, y) denotes the concatenation of the sequences x and y;
  • context denotes the value of the implicit context argument;
  • true denotes boolean true;
  • false denotes boolean false;
  • inherit denotes a distinct constant used to indicate that a namespace URI should be inherited from the referencing schema;
  • "xyzzy" denotes a string consisting of the characters xyzzy;
  • foo(x, y) denotes the value of the function foo applied to the arguments x and y; the following primitive functions are used:
    • qName(x, y) returns a qualified-name with prefix x and local part y;
    • prefix(x) returns the prefix of the qualified-name x;
    • localPart(x) returns the local-part of the qualified-name x;
    • union(x, y) returns the union of the sets x and y;
    • name(x, y) returns a name with namespace URI x and local name y;
    • attribute(x, y) returns an attribute with name x and value y;
    • element(x, y, z) returns an element with name x, attributes y and children z;
    • bindPrefix(x, y, z) returns a context that is the same as x except that it has the prefix y bound to z;
    • bindDefault(x, y) returns a context that is the same as x except that it has the default namespace y;
    • bindDatatypePrefix(x, y, z) returns a context that is the same as x except that it has y bound as a prefix for datatypes to the URI z;
    • lookupPrefix(x, y) returns the binding in the context x for the prefix y; it is an error if there is no applicable binding;
    • lookupDefault(x) returns the default namespace of the context x; if no default has been bound, returns inherit;
    • lookupDatatypePrefix(x, y) returns the binding as a datatype prefix in the context x for the prefix y; it is an error if there is no applicable binding;
    • mapSchemaRef(x) returns a URI; x is a URI reference of a resource containing a schema in the syntax described by this specification; the returned URI is the URI of a resource containing the translation of this schema into RELAX NG XML syntax;
    • makeNsAttribute(x) returns an empty set if x is inherit, and otherwise returns an attribute whose namespace URI is the empty string, whose local name is ns and whose value is x;
    • pair(x, y) returns a pair whose first member is x and whose second member is y;
    • emptyAnnotations() returns a pair whose first member is an empty set and whose second member is an empty sequence;
    • applyAnnotations(x, y) returns an element whose name is the name of y, whose attributes are the union of the first member of x and the attributes of y, and whose children are the concatenation of the second member of x and the children of y;
    • applyAnnotationsGroup(x, y) is equivalent to applyAnnotations(x, <group> y </group>) unless x is equal to emptyAnnotations(), in which case it is equivalent to y;
    • applyAnnotationsChoice(x, y) is equivalent to applyAnnotations(x, <choice> y </choice>) unless x is equal to emptyAnnotations(), in which case it is equivalent to y;
    • stringConcat(x, y) returns a string that is the concatenation of the strings x and y;
    • stripFirstSpace(x);
    • datatypeAttributes(x, y) returns a set of two attributes; both attributes have the empty string as their namespace URI; one attribute has local name datatypeLibrary and value x; the other attribute has local name type and value y;
    • documentationElementName() returns the name of the documentation element defined in [Compatibility], that is, the name with namespace URI and local name documentation;
  • x ? y : z is a conditional expression, which denotes y if x is true and z if x is false;
  • <foo x> y </foo> denotes an element from the RELAX NG namespace with local name foo, attributes x and content y.
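
The context-manipulating primitives in the list above can be modelled as pure functions over an immutable value. The following Python sketch is one possible reading, not a normative definition: the class Context, the constant INHERIT and the snake_case function names are hypothetical, the initial bindings for xml and xsd are omitted, and the URI in the usage note is a placeholder.

```python
from dataclasses import dataclass, field

INHERIT = object()  # stands for the distinct constant "inherit"


@dataclass(frozen=True)
class Context:
    prefixes: dict = field(default_factory=dict)           # namespace prefixes
    datatype_prefixes: dict = field(default_factory=dict)  # datatype prefixes
    default: object = None                                 # None: no default bound


def bind_prefix(ctx, prefix, uri):
    """Same context as ctx, with prefix bound to uri."""
    return Context({**ctx.prefixes, prefix: uri}, ctx.datatype_prefixes, ctx.default)


def bind_default(ctx, uri):
    """Same context as ctx, with default namespace uri."""
    return Context(ctx.prefixes, ctx.datatype_prefixes, uri)


def bind_datatype_prefix(ctx, prefix, uri):
    """Same context as ctx, with prefix bound as a datatype prefix to uri."""
    return Context(ctx.prefixes, {**ctx.datatype_prefixes, prefix: uri}, ctx.default)


def lookup_prefix(ctx, prefix):
    """Binding for a namespace prefix; an error if there is none."""
    if prefix not in ctx.prefixes:
        raise LookupError(f"undeclared namespace prefix: {prefix}")
    return ctx.prefixes[prefix]


def lookup_default(ctx):
    """Default namespace, or INHERIT if no default has been bound."""
    return INHERIT if ctx.default is None else ctx.default


def lookup_datatype_prefix(ctx, prefix):
    """Binding for a datatype prefix; an error if there is none."""
    if prefix not in ctx.datatype_prefixes:
        raise LookupError(f"undeclared datatypes prefix: {prefix}")
    return ctx.datatype_prefixes[prefix]
```

Under this reading, a declaration of the form default namespace ex = "http://example.org/ns" corresponds to bind_default(bind_prefix(ctx, "ex", uri), uri), mirroring the expression used for that alternative of decl.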

topLevel  ::=
    preamblec  c.topLevelBodyx
        { x }

preamble  ::=
        { context }
    |  declc  c.preambled
        { d }

decl  ::=
    "namespace"  namespacePrefixx  "="  namespaceURILiteraly
        Constraint: xml prefix
        Constraint: xml namespace URI
        Constraint: duplicate declaration
        { bindPrefix(context, x, y) }
    |  "default"  "namespace"  "="  namespaceURILiteralx
        Constraint: xml namespace URI
        Constraint: duplicate declaration
        { bindDefault(context, x) }
    |  "default"  "namespace"  namespacePrefixx  "="  namespaceURILiteraly
        Constraint: xml prefix
        Constraint: xml namespace URI
        Constraint: duplicate declaration
        { bindDefault(bindPrefix(context, x, y), y) }
    |  "datatypes"  datatypePrefixx  "="  literaly
        Constraint: xsd prefix
        Constraint: datatypes URI
        Constraint: duplicate declaration
        { bindDatatypePrefix(context, x, y) }

namespacePrefix  ::=
        Constraint: valid prefix

datatypePrefix  ::=

namespaceURILiteral  ::=
    |  "inherit"
        { inherit }

topLevelBody  ::=
    |  grammarx
        { <grammar> x </grammar> }

grammar  ::=
        { ( ) }
    |  memberx  grammary
        { (x, y) }

member  ::=
    |  annotationElementNotKeyword

annotatedComponent  ::=
    annotationsx  componenty
        { applyAnnotations(x, y) }

component  ::=
    start
    |  define
    |  include
    |  div

start  ::=
    "start"  assignOpx  patterny
        { <start x> y </start> }

define  ::=
    identifierx  assignOpy  patternz
        { <define name=x y> z </define> }

assignOp  ::=
        { { } }
    |  "|="
        { attribute(name("", "combine"), "choice") }
    |  "&="
        { attribute(name("", "combine"), "interleave") }

include  ::=
    "include"  anyURILiteralx  optInherity  optIncludeBodyz
        { <include href=mapSchemaRef(x) y> z </include> }

anyURILiteral  ::=
        Constraint: any URI

optInherit  ::=
        { makeNsAttribute(lookupDefault(context)) }
    |  "inherit"  "="  identifierOrKeywordx
        { makeNsAttribute(lookupPrefix(context, x)) }

optIncludeBody  ::=
        { ( ) }
    |  "{"  includeBodyx  "}"
        { x }

includeBody  ::=
        { ( ) }
    |  includeMemberx  includeBodyy
        { (x, y) }

includeMember  ::=
    |  annotationElementNotKeyword

annotatedIncludeComponent  ::=
    annotationsx  includeComponenty
        { applyAnnotations(x, y) }

includeComponent  ::=
    start
    |  define
    |  includeDiv

div  ::=
    "div"  "{"  grammarx  "}"
        { <div> x </div> }

includeDiv  ::=
    "div"  "{"  includeBodyx  "}"
        { <div> x </div> }

pattern  ::=

innerPattern(anno)  ::=
    |  particleChoicex
        { applyAnnotations(anno, <choice> x </choice>) }
    |  particleGroupx
        { applyAnnotations(anno, <group> x </group>) }
    |  particleInterleavex
        { applyAnnotations(anno, <interleave> x </interleave>) }
    |  annotatedDataExceptx
        { applyAnnotationsGroup(anno, x) }

particleChoice  ::=
    particlex  "|"  particley
        { (x, y) }
    |  particlex  "|"  particleChoicey
        { (x, y) }

particleGroup  ::=
    particlex  ","  particley
        { (x, y) }
    |  particlex  ","  particleGroupy
        { (x, y) }

particleInterleave  ::=
    particlex  "&"  particley
        { (x, y) }
    |  particlex  "&"  particleInterleavey
        { (x, y) }

particle  ::=

innerParticle(anno)  ::=
        { applyAnnotationsGroup(anno, x) }
    |  repeatedPrimaryx  followAnnotationsy
        { (applyAnnotations(anno, x), y) }

repeatedPrimary  ::=
    annotatedPrimaryx  "*"
        { <zeroOrMore> x </zeroOrMore> }
    |  annotatedPrimaryx  "+"
        { <oneOrMore> x </oneOrMore> }
    |  annotatedPrimaryx  "?"
        { <optional> x </optional> }

annotatedPrimary  ::=
    leadAnnotatedPrimaryx  followAnnotationsy
        { (x, y) }

annotatedDataExcept  ::=
    leadAnnotatedDataExceptx  followAnnotationsy
        { (x, y) }

leadAnnotatedDataExcept  ::=
    annotationsx  dataExcepty
        { applyAnnotations(x, y) }

leadAnnotatedPrimary  ::=
    annotationsx  primaryy
        { applyAnnotations(x, y) }
    |  annotationsx  "("  innerPattern(x)y  ")"
        { y }

primary  ::=
    "element"  nameClass(true)x  "{"  patterny  "}"
        { <element> x y </element> }
    |  "attribute"  nameClass(false)x  "{"  patterny  "}"
        { <attribute> x y </attribute> }
    |  "mixed"  "{"  patternx  "}"
        { <mixed> x </mixed> }
    |  "list"  "{"  patternx  "}"
        { <list> x </list> }
    |  datatypeNamex  optParamsy
        { <data x> y </data> }
    |  datatypeNamex  datatypeValuey
        { <value x> y </value> }
    |  datatypeValuex
        { <value> x </value> }
    |  "empty"
        { <empty/> }
    |  "notAllowed"
        { <notAllowed/> }
    |  "text"
        { <text/> }
    |  refx
        { <ref name=x/> }
    |  "parent"  refx
        { <parentRef name=x/> }
    |  "grammar"  "{"  grammarx  "}"
        { <grammar> x </grammar> }
    |  "external"  anyURILiteralx  optInherity
        { <externalRef href=mapSchemaRef(x) y/> }

dataExcept  ::=
    datatypeNamex  optParamsy  "-"  leadAnnotatedPrimaryz
        { <data x> y <except> z </except> </data> }

ref  ::=

datatypeName  ::=
        { datatypeAttributes(lookupDatatypePrefix(context, prefix(x)), localPart(x)) }
    |  "string"
        { datatypeAttributes("", "string") }
    |  "token"
        { datatypeAttributes("", "token") }

datatypeValue  ::=

optParams  ::=
        { ( ) }
    |  "{"  paramsx  "}"
        { x }

params  ::=
        { ( ) }
    |  paramx  paramsy
        { (x, y) }

param  ::=
    annotationsx  identifierOrKeywordy  "="  literalz
        { applyAnnotations(x, <param name=y> z </param>) }

nameClass(elem)  ::=
    innerNameClass(elem, emptyAnnotations())

innerNameClass(elem, anno)  ::=
        { applyAnnotationsChoice(anno, x) }
    |  nameClassChoice(elem)x
        { applyAnnotations(anno, <choice> x </choice>) }
    |  annotatedExceptNameClass(elem)x
        { applyAnnotationsChoice(anno, x) }

nameClassChoice(elem)  ::=
    annotatedSimpleNameClass(elem)x  "|"  annotatedSimpleNameClass(elem)y
        { (x, y) }
    |  annotatedSimpleNameClass(elem)x  "|"  nameClassChoice(elem)y
        { (x, y) }

annotatedExceptNameClass(elem)  ::=
    leadAnnotatedExceptNameClass(elem)x  followAnnotations(elem)y
        { (x, y) }

leadAnnotatedExceptNameClass(elem)  ::=
    annotations(elem)x  exceptNameClass(elem)y
        { applyAnnotations(x, y) }

annotatedSimpleNameClass(elem)  ::=
    leadAnnotatedSimpleNameClass(elem)x  followAnnotations(elem)y
        { (x, y) }

leadAnnotatedSimpleNameClass(elem)  ::=
    annotationsx  simpleNameClass(elem)y
        { applyAnnotations(x, y) }
    |  annotationsx  "("  innerNameClass(elem, x)y  ")"
        { y }

exceptNameClass(elem)  ::=
    nsNamex  "-"  leadAnnotatedSimpleNameClass(elem)y
        Constraint: name class except
        { <nsName makeNsAttribute(lookupPrefix(context, x))> <except> y </except> </nsName> }
    |  "*"  "-"  leadAnnotatedSimpleNameClass(elem)x
        Constraint: name class except
        { <anyName> <except> x </except> </anyName> }

simpleNameClass(elem)  ::=
        { <name makeNsAttribute(elem ? lookupDefault(context) : "")> x </name> }
    |  CNamex
        { <name makeNsAttribute(lookupPrefix(context, prefix(x)))> localPart(x) </name> }
    |  nsNamex
        { <nsName makeNsAttribute(lookupPrefix(context, x))/> }
    |  "*"
        { <anyName/> }

followAnnotations  ::=
        { ( ) }
    |  ">>"  annotationElementx  followAnnotationsy
        { (x, y) }

annotations  ::=
        { pair({ }, x) }
    |  documentationsx  "["  prefixedAnnotationAttributesy  annotationElementsz  "]"
        { pair(y, (x, z)) }

prefixedAnnotationAttributes  ::=
        { ( ) }
    |  prefixedAnnotationAttributex  prefixedAnnotationAttributesy
        Constraint: duplicate attributes
        Constraint: unqualified name
        { (x, y) }

annotationElements  ::=
        { ( ) }
    |  annotationElementx  annotationElementsy
        { (x, y) }

annotationElement  ::=
    identifierOrKeywordx  "["  annotationAttributesy  annotationContentz  "]"
        { element(name("", x), y, z) }
    |  colonAnnotationElement

annotationElementNotKeyword  ::=
    identifierx  "["  annotationAttributesy  annotationContentz  "]"
        { element(name("", x), y, z) }
    |  colonAnnotationElement

colonAnnotationElement  ::=
    prefixedNamex  "["  annotationAttributesy  annotationContentz  "]"
        { element(x, y, z) }

annotationContent  ::=
        { ( ) }
    |  annotationElementx  annotationContenty
        { (x, y) }
    |  literalx  annotationContenty
        { (x, y) }

annotationAttributes  ::=
        { ( ) }
    |  annotationAttributex  annotationAttributesy
        Constraint: duplicate attributes
        { (x, y) }

annotationAttribute  ::=
    |  unprefixedAnnotationAttribute

prefixedAnnotationAttribute  ::=
    prefixedNamex  "="  literaly
        Constraint: xmlns namespace URI
        { attribute(x, y) }

prefixedName  ::=
        Constraint: annotation inherit
        { name(lookupPrefix(context, prefix(x)), localPart(x)) }

unprefixedAnnotationAttribute  ::=
    identifierOrKeywordx  "="  literaly
        { attribute(name("", x), y) }

documentations  ::=
        { ( ) }
    |  documentationx  documentationsy
        { (element(documentationElementName(), { }, x), y) }

identifierOrKeyword  ::=
    identifier
    |  keyword

keyword  ::=
    "attribute"
    |  "default"
    |  "datatypes"
    |  "div"
    |  "element"
    |  "empty"
    |  "external"
    |  "grammar"
    |  "include"
    |  "inherit"
    |  "list"
    |  "mixed"
    |  "namespace"
    |  "notAllowed"
    |  "parent"
    |  "start"
    |  "string"
    |  "text"
    |  "token"

Constraint: valid prefix

It is an error if the value of a namespacePrefix is xmlns.

Constraint: xml prefix

It is an error if the value of namespacePrefix is xml and the value of the namespaceURILiteral is not

Constraint: xml namespace URI

It is an error if the value of the namespaceURILiteral is and the value of the namespacePrefix is not xml.

Constraint: xsd prefix

It is an error if the value of datatypePrefix is xsd and the value of the literal is not

Constraint: datatypes URI

It is an error if the value of the literal in a datatypes declaration is not a syntactically legal value for a datatypeLibrary as specified in Section 3 of [RELAX NG].

Constraint: duplicate declaration

It is an error if there is more than one namespace declaration of a particular prefix, more than one default namespace declaration or more than one declaration of a particular datatypes prefix.

Constraint: name class except

It is an error if the value of exceptNameClass is such that it violates the constraint in the second paragraph of Section 4.16 of [RELAX NG]: "An except element that is a child of an anyName element must not have any anyName descendant elements. An except element that is a child of an nsName element must not have any nsName or anyName descendant elements."

Constraint: unqualified name

It is an error if the namespace URI of a prefixedName in a prefixedAnnotationAttributes is the empty string.

Constraint: xmlns namespace URI

It is an error if the namespace URI of a prefixedName in a prefixedAnnotationAttribute is

Constraint: duplicate attributes

It is an error if a prefixedAnnotationAttributes or an annotationAttributes contains two attributes with the same namespace URI and local name.

Constraint: annotation inherit

It is an error if the namespace URI in the value of a prefixedName is inherit.

Constraint: any URI

It is an error if the value of the literal used with an external or include declaration does not meet the requirements for the anyURI symbol specified in Section 3 of [RELAX NG].

A.2. Lexical structure

This section describes how to transform the textual representation of a RELAX NG schema in compact syntax into a sequence of tokens, which can be parsed using the grammar specified in Section A.1.

There are six distinct stages, which are logically consecutive; the result of each stage is the input to the following stage.

A.2.1. Character encoding

The textual representation of the RELAX NG schema in compact syntax may be either a sequence of Unicode characters or a sequence of bytes. In the latter case, the first stage is to transform the sequence of bytes to the sequence of characters. The sequence of bytes may have associated metadata specifying the encoding. One example of such metadata is the charset parameter in a MIME media type. If there is such metadata, then the specified encoding is used. Otherwise, the first two bytes of the sequence are examined. If these are #xFF followed by #xFE or #xFE followed by #xFF, then an encoding of UTF-16 [Unicode] will be used, little-endian in the former case, big-endian in the latter case. Otherwise an encoding of UTF-8 [Unicode] is used. It is an error if the sequence of bytes is not a legal sequence in the selected encoding.
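The encoding selection described above can be sketched in Python as follows. The function name decode_schema and its declared_encoding parameter (standing in for external metadata such as a charset parameter) are assumptions of this example. Note that the byte order mark is deliberately left in the decoded text, since removing it is the job of the next stage.

```python
from typing import Optional

def decode_schema(data: bytes, declared_encoding: Optional[str] = None) -> str:
    """Stage 1: transform a sequence of bytes into a sequence of characters."""
    if declared_encoding is not None:
        # External metadata takes precedence over inspection of the bytes.
        encoding = declared_encoding
    elif data[:2] == b"\xff\xfe":
        encoding = "utf-16-le"
    elif data[:2] == b"\xfe\xff":
        encoding = "utf-16-be"
    else:
        encoding = "utf-8"
    # Strict decoding raises UnicodeDecodeError for an illegal byte sequence.
    return data.decode(encoding, errors="strict")
```

The explicit "utf-16-le" and "utf-16-be" codecs are used instead of "utf-16" so that the byte order mark survives decoding as the character #xFEFF.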

A.2.2. BOM stripping

If the first character of the sequence is a byte order mark (#xFEFF), then it is removed.

A.2.3. Newline normalization

Representations of newlines are normalized to #xA in a similar way to [XML 1.0]. Specifically, each occurrence of a #xD character that is not followed by a #xA character or of a #xD, #xA character pair is transformed to #xA.
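A minimal Python sketch of this stage, together with the BOM stripping of the preceding stage, is shown below. The function names are chosen for this example only.

```python
import re

def strip_bom(text: str) -> str:
    """Stage 2: remove a byte order mark if it is the first character."""
    return text[1:] if text.startswith("\ufeff") else text

def normalize_newlines(text: str) -> str:
    """Stage 3: map each #xD #xA pair and each lone #xD to #xA."""
    # The optional "\n" is consumed greedily, so a #xD #xA pair
    # becomes a single #xA rather than two.
    return re.sub(r"\r\n?", "\n", text)
```

A lone #xA is left untouched, and a byte order mark that is not the first character is not removed.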

A.2.4. Escape interpretation

In this stage, each escape sequence of the form \x{n}, where n is a hexadecimal number, is replaced by the character with Unicode code n. The escape sequence must match the production escapeSequence; the value computed in the BNF is the Unicode code of the replacement character. It is an error if the replacement character does not match the Char production of [XML 1.0]. It is an error if the input character sequence contains a character sequence escapeOpen that does not start an escapeSequence. After an escape sequence has been replaced, scanning for escape sequences continues following the replacement character; thus \x{5C}x{5C} is transformed to \x{5C} not to \.
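The scanning behaviour described above can be sketched in Python as follows. The names interpret_escapes and is_xml_char are assumptions of this example; the character ranges are those of the Char production of [XML 1.0].

```python
import re

ESCAPE_OPEN = re.compile(r"\\x+\{")
ESCAPE_SEQUENCE = re.compile(r"\\x+\{([0-9A-Fa-f]+)\}")

def is_xml_char(code: int) -> bool:
    """Return True if the code point matches Char in XML 1.0."""
    return (code in (0x9, 0xA, 0xD)
            or 0x20 <= code <= 0xD7FF
            or 0xE000 <= code <= 0xFFFD
            or 0x10000 <= code <= 0x10FFFF)

def interpret_escapes(text: str) -> str:
    """Stage 4: replace each escape sequence by the character it denotes."""
    out = []
    pos = 0
    while pos < len(text):
        if ESCAPE_OPEN.match(text, pos):
            match = ESCAPE_SEQUENCE.match(text, pos)
            if match is None:
                raise ValueError("escapeOpen does not start an escapeSequence")
            code = int(match.group(1), 16)
            if not is_xml_char(code):
                raise ValueError("replacement character is not an XML Char")
            out.append(chr(code))
            # Scanning resumes after the replacement, so the replacement
            # character can never become part of another escape sequence.
            pos = match.end()
        else:
            out.append(text[pos])
            pos += 1
    return "".join(out)
```

Because the replacement character is appended to the output and never rescanned, the input \x{5C}x{5C} yields \x{5C} and not \.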


The \ character that opens an escape sequence may be followed by more than one x. This makes possible a reversible transformation that maps a schema to a form containing only ASCII characters; the transformation adds an extra x to each existing escape sequence, and replaces every non-ASCII character by an escape sequence with exactly one x.
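The reversible transformation can be illustrated with the following Python sketch. The names to_ascii and from_ascii are assumptions of this example, and the sketch assumes that every escape in the input is well formed.

```python
import re

ESCAPE = re.compile(r"\\(x+)\{([0-9A-Fa-f]+)\}")

def to_ascii(schema: str) -> str:
    """Map a schema to an equivalent form containing only ASCII characters."""
    # Step 1: add an extra x to each existing escape sequence.
    bumped = ESCAPE.sub(lambda m: "\\x" + m.group(0)[1:], schema)
    # Step 2: replace each non-ASCII character by a single-x escape.
    return "".join(c if ord(c) < 0x80 else "\\x{%X}" % ord(c) for c in bumped)

def from_ascii(text: str) -> str:
    """Invert to_ascii."""
    def restore(match):
        xs, digits = match.group(1), match.group(2)
        if len(xs) == 1:
            # A single-x escape stood for a non-ASCII character.
            return chr(int(digits, 16))
        # Any other escape was present originally with one x fewer.
        return "\\" + xs[1:] + "{" + digits + "}"
    return ESCAPE.sub(restore, text)
```

The number of x characters is what makes the inverse unambiguous: escapes introduced by the transformation always have exactly one x, while escapes that were already present always have at least two.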

escapeSequence  ::=
    escapeOpen  hexNumber_x  escapeClose
        { x }

escapeOpen  ::=
    "\"  xs  "{"

xs  ::=
    "x"
    |  "x"  xs

escapeClose  ::=
    "}"

hexNumber  ::=
    hexDigit_x
        { x }
    |  hexNumber_x  hexDigit_y
        { (x * 16) + y }

hexDigit  ::=
    "0"
        { 0 }
    |  "1"
        { 1 }
    |  "2"
        { 2 }
    |  "3"
        { 3 }
    |  "4"
        { 4 }
    |  "5"
        { 5 }
    |  "6"
        { 6 }
    |  "7"
        { 7 }
    |  "8"
        { 8 }
    |  "9"
        { 9 }
    |  [Aa]
        { 10 }
    |  [Bb]
        { 11 }
    |  [Cc]
        { 12 }
    |  [Dd]
        { 13 }
    |  [Ee]
        { 14 }
    |  [Ff]
        { 15 }
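The value computed by the hexNumber production is a left-to-right accumulation over the digits. The Python sketch below mirrors that computation; the function names are assumptions of this example.

```python
def hex_digit_value(digit: str) -> int:
    """Value of a single hexDigit; both letter cases are accepted."""
    if len(digit) != 1:
        raise ValueError("expected exactly one character")
    value = "0123456789ABCDEF".find(digit.upper())
    if value < 0:
        raise ValueError("not a hexadecimal digit: %r" % digit)
    return value

def hex_number_value(digits: str) -> int:
    """Value of a hexNumber: each further digit contributes (x * 16) + y."""
    if not digits:
        raise ValueError("a hexNumber contains at least one digit")
    value = hex_digit_value(digits[0])
    for digit in digits[1:]:
        value = (value * 16) + hex_digit_value(digit)
    return value
```

Leading zeros do not change the value, so 0041 and 41 denote the same character code.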

A.2.5. Tokenization

In this stage, the sequence of characters is tokenized: it is transformed into a sequence of tokens, where each token corresponds to a non-terminal in the grammar in Section A.1, except that the token sequence contains literalSegment tokens instead of literal tokens.

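A sequence of characters is tokenized by first finding the longest initial subsequence that either is one of the literal strings occurring in the grammar in Section A.1 or matches one of the non-terminals identifier, CName, nsName, literalSegment, documentation or separator specified below.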

If the longest such initial subsequence matches separator, this subsequence is discarded. Otherwise, a single non-terminal is produced from this initial subsequence. In either case, the tokenization proceeds with the rest of the character sequence. It is an error if there is no such initial subsequence.
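The longest-match rule can be sketched in Python as follows. This is an illustration under several assumptions that are not part of this specification: NCName is approximated by an ASCII-only pattern, only a small hypothetical subset of the literal strings of Section A.1 is listed, whitespace is assumed to match separator, and the kind label "terminal" is invented for this example.

```python
import re

# Assumption: ASCII-only approximation of NCName from [XML Namespaces].
NCNAME = r"[A-Za-z_][A-Za-z0-9_.\-]*"

# Assumption: an illustrative subset of the literal strings in Section A.1.
LITERAL_STRINGS = ["element", "empty", "start", "text",
                   "=", "{", "}", "|", ",", "*"]

# Literal strings come first so that a keyword wins a tie against
# identifier, which excludes keywords.
CANDIDATES = [("terminal", re.compile(re.escape(s))) for s in LITERAL_STRINGS]
CANDIDATES += [
    ("CName", re.compile(NCNAME + ":" + NCNAME)),
    ("nsName", re.compile(NCNAME + r":\*")),
    ("identifier", re.compile(r"\\?" + NCNAME)),
    ("literalSegment", re.compile(r'"[^"]*"')),
    ("literalSegment", re.compile(r"'[^']*'")),
    ("documentation", re.compile(r"##[^\n]*(?:\n[\t ]*##[^\n]*)*")),
    ("separator", re.compile(r"[\t\n ]|#(?:[^\n#][^\n]*)?")),
]

def tokenize(text):
    """Return (kind, text) pairs, discarding separators."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_kind, best_end = None, pos
        for kind, pattern in CANDIDATES:
            match = pattern.match(text, pos)
            # Strictly longer matches replace the current best.
            if match and match.end() > best_end:
                best_kind, best_end = kind, match.end()
        if best_kind is None:
            raise ValueError("no token matches at offset %d" % pos)
        if best_kind != "separator":
            tokens.append((best_kind, text[pos:best_end]))
        pos = best_end
    return tokens
```

With these definitions, elements is tokenized as a single identifier rather than as the keyword element followed by s, because the identifier match is longer.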

The production rules below use some additional notation. Square brackets enclose a character class. A character class of the form [^chars] specifies any legal XML character that does not occur in chars. A legal XML character is a character that matches the Char production of [XML 1.0]. A character class of the form [chars], where chars does not begin with ^, specifies any single character that occurs in chars. XML hexadecimal character references are used to denote a single character, as in XML. NCName is defined in [XML Namespaces].

identifier  ::=
    NCName_x - keyword
        { x }
    |  "\"  NCName_x
        { x }

CName  ::=
    NCName_x  ":"  NCName_y
        { qName(x, y) }

nsName  ::=
    NCName_x  ":*"
        { x }

literalSegment  ::=
    """  stringNoQuot_x  """
        { x }
    |  "'"  stringNoApos_x  "'"
        { x }

stringNoQuot  ::=
        { "" }
    |  [^"]_x  stringNoQuot_y
        { stringConcat(x, y) }

stringNoApos  ::=
        { "" }
    |  [^']_x  stringNoApos_y
        { stringConcat(x, y) }

documentation  ::=
    documentationLine_x
        { x }
    |  documentation_x  documentationContinuation_y
        { stringConcat(x, y) }

documentationLine  ::=
    "##"  restOfLine_x
        { stripFirstSpace(x) }

documentationContinuation  ::=
    [&#xA;]_x  indent  documentationLine_y
        { stringConcat(x, y) }

indent  ::=
        { "" }
    |  [&#x9;&#x20;]_x  indent_y
        { stringConcat(x, y) }

restOfLine  ::=
        { "" }
    |  [^&#xA;]_x  restOfLine_y
        { stringConcat(x, y) }

separator  ::=
    [&#x9;&#xA;&#x20;]
    |  "#"  [^&#xA;#]  restOfLine
    |  "#"
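The value of a documentation token can be computed line by line, as the following Python sketch suggests. The function name documentation_value is an assumption of this example, and stripFirstSpace is assumed to remove a single leading space if one is present.

```python
def documentation_value(token_text: str) -> str:
    """Compute the value of a documentation token from its source text."""
    lines = []
    for raw in token_text.split("\n"):
        # Drop the indent of a continuation line, then the leading "##".
        rest = raw.lstrip("\t ")[2:]
        # Assumed behaviour of stripFirstSpace: remove one leading space.
        lines.append(rest[1:] if rest.startswith(" ") else rest)
    # Each continuation contributes its newline followed by its line value.
    return "\n".join(lines)
```

Only one space is removed after the ## marker, so deeper indentation inside the comment text is preserved.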

A.2.6. Literal concatenation

In this stage, each maximal sequence of consecutive literalSegment tokens is concatenated into a literal token.

literal  ::=
    literalSegment_x
        { x }
    |  literalSegment_x  literal_y
        { stringConcat(x, y) }
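A minimal Python sketch of this stage is shown below. It assumes tokens are represented as (kind, value) pairs in which a literalSegment already carries its string value; that representation and the function name are choices made for this example.

```python
def concatenate_literals(tokens):
    """Stage 6: merge each maximal run of literalSegment tokens."""
    result = []
    in_run = False
    for kind, value in tokens:
        if kind == "literalSegment":
            if in_run:
                # Extend the literal started by the previous segment.
                result[-1] = ("literal", result[-1][1] + value)
            else:
                result.append(("literal", value))
            in_run = True
        else:
            result.append((kind, value))
            in_run = False
    return result
```

Any token other than a literalSegment ends the current run, so segments separated by another token produce separate literal tokens.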

B. Compact syntax RELAX NG schema for RELAX NG (Non-Normative)

# RELAX NG XML syntax specified in compact syntax.

default namespace rng = ""
namespace local = ""
datatypes xsd = ""

start = pattern

pattern =
  element element { (nameQName | nameClass), (common & pattern+) }
  | element attribute { (nameQName | nameClass), (common & pattern?) }
  | element group|interleave|choice|optional
            |zeroOrMore|oneOrMore|list|mixed { common & pattern+ }
  | element ref|parentRef { nameNCName, common }
  | element empty|notAllowed|text { common }
  | element data { type, param*, (common & exceptPattern?) }
  | element value { commonAttributes, type?, xsd:string }
  | element externalRef { href, common }
  | element grammar { common & grammarContent* }

param = element param { commonAttributes, nameNCName, xsd:string }

exceptPattern = element except { common & pattern+ }

grammarContent = 
  definition
  | element div { common & grammarContent* }
  | element include { href, (common & includeContent*) }

includeContent =
  definition
  | element div { common & includeContent* }

definition =
  element start { combine?, (common & pattern+) }
  | element define { nameNCName, combine?, (common & pattern+) }

combine = attribute combine { "choice" | "interleave" }

nameClass = 
  element name { commonAttributes, xsd:QName }
  | element anyName { common & exceptNameClass? }
  | element nsName { common & exceptNameClass? }
  | element choice { common & nameClass+ }

exceptNameClass = element except { common & nameClass+ }

nameQName = attribute name { xsd:QName }
nameNCName = attribute name { xsd:NCName }
href = attribute href { xsd:anyURI }
type = attribute type { xsd:NCName }

common = commonAttributes, foreignElement*

commonAttributes = 
  attribute ns { xsd:string }?,
  attribute datatypeLibrary { xsd:anyURI }?,
  foreignAttribute*

foreignElement = element * - rng:* { (anyAttribute | text | anyElement)* }
foreignAttribute = attribute * - (rng:*|local:*) { text }
anyElement = element * { (anyAttribute | text | anyElement)* }
anyAttribute = attribute * { text }



Compatibility
James Clark, Makoto MURATA, editors. RELAX NG DTD Compatibility. OASIS, 2001.
RELAX NG
James Clark, Makoto MURATA, editors. RELAX NG Specification. OASIS, 2001.
Unicode
The Unicode Consortium. The Unicode Standard, Version 3.2 or later.
XML 1.0
Tim Bray, Jean Paoli, and C. M. Sperberg-McQueen, Eve Maler, editors. Extensible Markup Language (XML) 1.0 Second Edition. W3C (World Wide Web Consortium), 2000.
XML Namespaces
Tim Bray, Dave Hollander, and Andrew Layman, editors. Namespaces in XML. W3C (World Wide Web Consortium), 1999.


James Clark, Kohsuke KAWAGUCHI, editors. Guidelines for using W3C XML Schema Datatypes with RELAX NG. OASIS, 2001.
W3C XML Schema Datatypes
Paul V. Biron, Ashok Malhotra, editors. XML Schema Part 2: Datatypes. W3C (World Wide Web Consortium), 2001.
Haruo Hosoya. Regular Expression Types for XML. PhD Thesis. The University of Tokyo, 2000.
XQuery Formal Semantics
Peter Fankhauser et al., editors. XQuery 1.0 Formal Semantics. W3C Working Draft 07 June 2001. W3C (World Wide Web Consortium), 2001.