B Content Markup Validation Grammar

Overview: Mathematical Markup Language (MathML) Version 2.0 (2nd Edition)
Previous: A Parsing MathML
Next: C Content Element Definitions

This appendix presents an informal EBNF grammar that can be used to validate the structure of content markup.

It defines the valid expression trees in content markup. It does not define the rules for attribute validation. That must be done separately.
The non-terminal Presentation_tags is a placeholder for a valid presentation element start tag or end tag.
The string #PCDATA denotes XML parsed character data.
Symbols beginning with '_' (for example _mmlarg) are internal symbols. A recursive grammar is usually required for their recognition.
Symbols that are entirely lowercase (for example 'ci') are terminal symbols representing MathML content elements.
Symbols beginning with uppercase letters are terminals representing other tokens.
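
The naming conventions above can be sketched as a small classifier. This is only an illustration: the function name `classify_symbol` and the returned labels are hypothetical and are not part of the grammar.

```python
def classify_symbol(name: str) -> str:
    """Classify a grammar symbol by the naming conventions of this appendix."""
    if name.startswith("_"):
        return "internal"          # e.g. _mmlarg
    if name[:1].isupper():
        return "token"             # e.g. Space, Char
    if name[:1].islower() and name == name.lower():
        return "content element"   # e.g. ci, annotation-xml
    return "other"                 # e.g. #PCDATA


print(classify_symbol("_mmlarg"))
print(classify_symbol("ci"))
print(classify_symbol("Space"))
```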

whitespace definitions including Presentation_tags
[1]    Presentation_tags    ::=    "presentation" /* placeholder */
[2]    Space    ::=    #x09 | #x0A | #x0D | #x20 /* tab, lf, cr, space characters */
[3]    S    ::=    (Space | Presentation_tags)* /* treat presentation as space */
characters (for content validation only)
[4]    Char    ::=    #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF] /* valid XML chars */

start(\%x) returns a valid start tag for the element \%x
end(\%x) returns a valid end tag for the element \%x
empty(\%x) returns a valid empty tag for the element \%x

 start(ci)    ::= "<ci>"
 end(cn)      ::= "</cn>"
 empty(plus)  ::= "<plus/>"

The reason for doing this is to avoid writing a grammar for all the attributes. The model below is not complete for all possible attribute values.

start and end tag functions
[5]    _start(\%x)    ::=    "<\%x" (Char - '>')* ">" /* returns a valid start tag for the element \%x */
[6]    _end(\%x)    ::=    "</\%x" Space* ">" /* returns a valid end tag for the element \%x */
[7]    _empty(\%x)    ::=    "<\%x" (Char - '>')* "/>" /* returns a valid empty tag for the element \%x */
[8]    _sg(\%x)    ::=    S _start(\%x) /* start tag preceded by optional whitespace */
[9]    _eg(\%x)    ::=    _end(\%x) S /* end tag followed by optional whitespace */
[10]    _ey(\%x)    ::=    S _empty(\%x) S /* empty tag preceded and followed by optional whitespace */
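
As a rough illustration, the tag functions in rules [5] to [7] can be approximated with regular expressions. The sketch below is not part of the grammar: the helper names `start_tag`, `end_tag` and `empty_tag` are hypothetical, and end tags are taken to begin with "</", as in the end(cn) example above.

```python
import re

# Rule [2]: Space ::= #x09 | #x0A | #x0D | #x20
SPACE = r"[\t\n\r ]"
# Rule [4] (Char) with '>' (U+003E) excluded, as used in rules [5] and [7].
CHAR_NOT_GT = r"[\t\n\r\x20-\x3D\x3F-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"


def start_tag(name: str) -> re.Pattern:
    """Rule [5]: '<name', any characters except '>', then '>'."""
    return re.compile("<" + re.escape(name) + CHAR_NOT_GT + "*>")


def end_tag(name: str) -> re.Pattern:
    """Rule [6]: '</name', optional space characters, then '>'."""
    return re.compile("</" + re.escape(name) + SPACE + "*>")


def empty_tag(name: str) -> re.Pattern:
    """Rule [7]: '<name', any characters except '>', then '/>'."""
    return re.compile("<" + re.escape(name) + CHAR_NOT_GT + "*/>")


print(bool(start_tag("ci").fullmatch('<ci type="vector">')))
print(bool(end_tag("cn").fullmatch("</cn>")))
print(bool(empty_tag("plus").fullmatch("<plus/>")))
```

Because the rules accept any characters other than '>' after the element name, these patterns are deliberately loose: attributes are not checked, and a string such as "<cifoo>" also matches the start-tag pattern for ci.
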
semantics, annotation, etc.
[11]    semantics    ::=    _sg(semantics) _mmlarg _annot* _eg(semantics)
[12]    annotation    ::=    _sg(annotation) #PCDATA _eg(annotation)
[13]    annotation-xml    ::=    _sg(annotation-xml) _ANY _eg(annotation-xml)
[14]    _ANY    ::=    "AnyXML" /* placeholder for wellformed XML Fragment (not Mixed Content) */
[15]    _annot    ::=    annotation | annotation-xml
mathml content constructs
[17]    _mmlarg    ::=    _container | _token | _operator | _relation
[18]    _container    ::=    _special | _constructor
[19]    _token    ::=    ci | cn | csymbol | _constantsym
[20]    _special    ::=    apply | lambda | reln | fn | semantics
[21]    _constructor    ::=    interval | list | matrix | matrixrow | set | vector | piecewise | piece | otherwise
[23]    _qualifier    ::=    lowlimit | uplimit | degree | logbase | domainofapplication | momentabout | condition /* interval is both a qualifier and a constructor */
[24]    _constantsym    ::=    integers | rationals | reals | naturalnumbers | complexes | primes | exponentiale | imaginaryi | notanumber | true | false | pi | eulergamma | infinity
relations
[25]    _relation    ::=    _genrel | _setrel | _seqrel2ary
[26]    _genrel    ::=    _genrel2ary | _genrelnary
[27]    _genrel2ary    ::=    ne
[28]    _genrelnary    ::=    eq | leq | lt | geq | gt
[29]    _setrel    ::=    _setrel2ary | _setrelnary
[30]    _setrel2ary    ::=    in | notin | notsubset | notprsubset
[31]    _setrelnary    ::=    subset | prsubset
[32]    _seqrel2ary    ::=    tendsto
operators
[33]    _operator    ::=    _funcop | _arithop | _calcop | _vcalcop | _seqop | _trigop | _classop | _statop | _lalgop | _logicop | _setop
functional operators
[34]    _funcop    ::=    _funcop1ary | _funcopnary
[35]    _funcop1ary    ::=    inverse | ident | domain | codomain | image
[36]    _funcopnary    ::=    fn | compose /* general user-defined function is n-ary */

(note minus is both 1ary and 2ary)

arithmetic operators
[37]    _arithop    ::=    _arithop1ary | _arithop2ary | _arithopnary | root
[38]    _arithop1ary    ::=    abs | conjugate | factorial | minus | arg | real | imaginary | floor | ceiling
[39]    _arithop2ary    ::=    quotient | divide | minus | power | rem
[40]    _arithopnary    ::=    plus | times | max | min | gcd | lcm
calculus and vector calculus
[41]    _calcop    ::=    int | diff | partialdiff
[42]    _vcalcop    ::=    divergence | grad | curl | laplacian
sequences and series
[43]    _seqop    ::=    sum | product | limit
elementary classical functions and trigonometry
[44]    _classop    ::=    exp | ln | log
[45]    _trigop    ::=    sin | cos | tan | sec | csc | cot | sinh | cosh | tanh | sech | csch | coth | arcsin | arccos | arctan
statistics operators
[46]    _statop    ::=    _statopnary | moment
[47]    _statopnary    ::=    mean | sdev | variance | median | mode
linear algebra operators
[48]    _lalgop    ::=    _lalgop1ary | _lalgop2ary | _lalgopnary
[49]    _lalgop1ary    ::=    determinant | transpose
[50]    _lalgop2ary    ::=    vectorproduct | scalarproduct | outerproduct
[51]    _lalgopnary    ::=    selector
logical operators
[52]    _logicop    ::=    _logicop1ary | _logicopnary | _logicop2ary | _logicopquant
[53]    _logicop1ary    ::=    not
[54]    _logicop2ary    ::=    implies | equivalent | approx | factorof
[55]    _logicopnary    ::=    and | or | xor
[56]    _logicopquant    ::=    forall | exists
set theoretic operators
[57]    _setop    ::=    _setop1ary | _setop2ary | _setopnary
[58]    _setop1ary    ::=    card
[59]    _setop2ary    ::=    setdiff
[60]    _setopnary    ::=    union | intersect | cartesianproduct
operator groups
[61]    _unaryop    ::=    _funcop1ary | _arithop1ary | _trigop | _classop | _calcop | _vcalcop | _logicop1ary | _lalgop1ary | _setop1ary
[62]    _binaryop    ::=    _arithop2ary | _setop2ary | _logicop2ary | _lalgop2ary
[63]    _naryop    ::=    _arithopnary | _statopnary | _logicopnary | _lalgopnary | _setopnary | _funcopnary
[63a]    _specialop    ::=    _special | ci | csymbol
[64]    _ispop    ::=    int | sum | product
[65]    _diffop    ::=    diff | partialdiff
[66]    _binaryrel    ::=    _genrel2ary | _setrel2ary | _seqrel2ary
[67]    _naryrel    ::=    _genrelnary | _setrelnary
separator
[68]    sep    ::=    _ey(sep)
leaf tokens and data content of leaf elements
[69]    _mdatai    ::=    (#PCDATA | Presentation_tags)* /* note _mdatai includes Presentation constructs here. */
[70]    _mdatan    ::=    (#PCDATA | sep | Presentation_tags)* /* note _mdatan includes Presentation constructs here. */
[71]    ci    ::=    _sg(ci) _mdatai _eg(ci)
[72]    cn    ::=    _sg(cn) _mdatan _eg(cn)
[73]    csymbol    ::=    _sg(csymbol) _mdatai _eg(csymbol)

condition - constraints. A condition contains either a single reln (relation), an apply holding a logical combination of relations, or a set (over which the operator should be applied).

condition
[74]    condition    ::=    _sg(condition) (reln | apply | set) _eg(condition)
domains for integral, sum, product, and specials
[75a]    _domainofapp    ::=    domainofapplication | _domainabbrev
[75b]    _domainabbrev    ::=    (lowlimit uplimit?) | uplimit | interval | condition

Note that in MathML 2.0 apply is used in place of the deprecated reln for relational operators, as well as for arithmetic, algebraic and other operators.

apply construct
[76]    apply    ::=    _sg(apply) (_applybody | _relnbody) _eg(apply)
[77]    _applybody    ::=    ( _unaryop _mmlarg ) /* 1-ary ops */
| (_binaryop _mmlarg _mmlarg) /* 2-ary ops */
| (_naryop _mmlarg*) /* n-ary ops, enumerated arguments */
| (_naryop bvar* _domainofapp? _mmlarg) /* n-ary ops, over domain of application */
| (_specialop _mmlarg*) /* special ops can be applied to anything */
| (_specialop bvar* _domainofapp? _mmlarg) /* special ops, over domain of application */
| (_ispop bvar* _domainofapp? _mmlarg) /* integral, sum, product */
| (_diffop bvar* _mmlarg) /* differential ops */
| (log logbase? _mmlarg) /* logs */
| (moment degree? momentabout? _mmlarg*) /* statistical moment */
| (root degree? _mmlarg) /* radicals - default is square-root */
| (limit bvar* lowlimit? condition? _mmlarg) /* limits */
| (_logicopquant bvar* _domainofapp _mmlarg) /* quantifier with explicit bound variables */
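
The first three alternatives of rule [77] (1-ary, 2-ary and n-ary operators with enumerated arguments) can be sketched as a simple arity check. This is a minimal illustration rather than a validator: the function name `apply_arity_ok` is hypothetical, the input is assumed to be a namespace-free apply element, the operator sets are expanded by hand from rules [61] to [63], and qualifiers, bound variables, special operators and relations are ignored.

```python
import xml.etree.ElementTree as ET

# Rule [61] (_unaryop), expanded to element names.
UNARY = set("""inverse ident domain codomain image
abs conjugate factorial minus arg real imaginary floor ceiling
sin cos tan sec csc cot sinh cosh tanh sech csch coth arcsin arccos arctan
exp ln log int diff partialdiff divergence grad curl laplacian
not determinant transpose card""".split())

# Rule [62] (_binaryop), expanded to element names.
BINARY = set("""quotient divide minus power rem setdiff
implies equivalent approx factorof
vectorproduct scalarproduct outerproduct""".split())

# Rule [63] (_naryop), expanded to element names.
NARY = set("""plus times max min gcd lcm mean sdev variance median mode
and or xor selector union intersect cartesianproduct fn compose""".split())


def apply_arity_ok(apply_xml: str) -> bool:
    """Check operator arity for an <apply> with enumerated arguments only."""
    elem = ET.fromstring(apply_xml)
    if elem.tag != "apply" or len(elem) == 0:
        return False
    op = elem[0].tag          # first child is the operator
    nargs = len(elem) - 1     # remaining children are treated as arguments
    return (
        (op in UNARY and nargs == 1)
        or (op in BINARY and nargs == 2)
        or op in NARY         # _mmlarg* allows any number of arguments
    )


print(apply_arity_ok("<apply><minus/><ci>x</ci></apply>"))
print(apply_arity_ok("<apply><divide/><cn>1</cn></apply>"))
```

Note that minus appears in both the unary and binary sets, so it is accepted with one or two arguments but not with three.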

equations and relations - reln uses a Lisp-like syntax (like apply). The bvar and condition elements are used to construct a "such that" or "where" constraint on the relation. Note that reln is deprecated but still valid in MathML 2.0.

equations and relations
[78]    reln    ::=    _sg(reln) _relnbody _eg(reln)
[79]    _relnbody    ::=    ( _binaryrel bvar* condition? _mmlarg _mmlarg ) | ( _naryrel bvar* condition? _mmlarg* )
fn construct (note that fn is deprecated but still valid in MathML 2.0)
[80]    fn    ::=    _sg(fn) _fnbody _eg(fn)
[81]    _fnbody    ::=    Presentation_tags | _mmlarg
lambda construct
[82]    lambda    ::=    _sg(lambda) _lambdabody _eg(lambda)
[83]    _lambdabody    ::=    bvar* _domainofapp? _mmlarg /* multivariate lambda calculus */
declare construct
[84]    declare    ::=    _sg(declare) _declarebody _eg(declare)
[85]    _declarebody    ::=    ci (fn | _constructor)?
constructors
[86]    interval    ::=    _sg(interval) _mmlarg _mmlarg _eg(interval) /* start, end define interval */
[87]    set    ::=    _sg(set) _lsbody _eg(set)
[88]    list    ::=    _sg(list) _lsbody _eg(list)
[89]    _lsbody    ::=    _mmlarg* /* enumerated arguments */
| (bvar* _domainofapp _mmlarg) /* generated arguments */
[90]    matrix    ::=    _sg(matrix) matrixrow* _eg(matrix)
| _sg(matrix) bvar* _domainofapp? _mmlarg _eg(matrix) /* matrices over domain of application */
[91]    matrixrow    ::=    _sg(matrixrow) _mmlarg* _eg(matrixrow) /* allows matrix of operators */
[92]    vector    ::=    _sg(vector) _mmlarg* _eg(vector)
| _sg(vector) bvar* _domainofapp? _mmlarg _eg(vector) /* vectors over domain of application */
[93]    piecewise    ::=    _sg(piecewise) piece* otherwise? _eg(piecewise)
[94]    piece    ::=    _sg(piece) _mmlarg _mmlarg _eg(piece) /* used by piecewise */
[95]    otherwise    ::=    _sg(otherwise) _mmlarg _eg(otherwise) /* used by piecewise */
bound variables
[95a]    _cisemantics    ::=    _sg(semantics) _citoken _annot* _eg(semantics)
[95b]    _citoken    ::=    ci | _cisemantics
[96]    bvar    ::=    _sg(bvar) _citoken degree? _eg(bvar)
[97]    degree    ::=    _sg(degree) _mmlarg _eg(degree)
other qualifiers - note the contained _mmlarg could be a reln
[98]    lowlimit    ::=    _sg(lowlimit) _mmlarg _eg(lowlimit)
[99]    uplimit    ::=    _sg(uplimit) _mmlarg _eg(uplimit)
[100]    logbase    ::=    _sg(logbase) _mmlarg _eg(logbase)
[101]    domainofapplication    ::=    _sg(domainofapplication) _mmlarg _eg(domainofapplication)
[102]    momentabout    ::=    _sg(momentabout) _mmlarg _eg(momentabout)

The top-level math element. declare is allowed only at the head of a math element.

math
[106]    math    ::=    _sg(math) declare* _mmlarg* _eg(math)
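
The ordering constraint in rule [106], that declare elements come before any other content, can be sketched as follows. The function name `declares_at_head` is hypothetical, and the input is assumed to be a namespace-free math element; the children themselves are not validated.

```python
import xml.etree.ElementTree as ET


def declares_at_head(math_xml: str) -> bool:
    """Return True if every <declare> child precedes all other children."""
    root = ET.fromstring(math_xml)
    if root.tag != "math":
        return False
    seen_other = False
    for child in root:
        if child.tag == "declare":
            if seen_other:
                return False   # a declare after other content breaks rule [106]
        else:
            seen_other = True
    return True


print(declares_at_head("<math><declare><ci>f</ci></declare><ci>x</ci></math>"))
print(declares_at_head("<math><ci>x</ci><declare><ci>f</ci></declare></math>"))
```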