PRD
From RIF
- Document title:
- RIF Production Rule Dialect
- Editors
- Christian de Sainte Marie, ILOG
- Adrian Paschke, Free University Berlin
- Gary Hallmark, Oracle
- Abstract
-
This document specifies RIF-PRD, a Rule Interchange Format (RIF) dialect to enable the interchange of production rules.
- Status of this Document
- @@update This is an automatically generated Mediawiki page, made from some sort of W3C-style spec.
Copyright © 2008 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply.
Contents
|
1 Overview
This document specifies the production rule dialect of the W3C rule interchange format (RIF-PRD), a standard XML serialization format for many production rule languages.
Production rules are rule statements defined in terms of both individual facts or objects, and groups of facts or classes of objects. They have an if part, or condition, and a then part, or action. The condition is like the condition part of logic rules (as covered by the basic logic dialect of the W3C rule interchange format, RIF-BLD). The then part contains actions, which is different to the conclusion part of logic rules which contains only a logical statement. Actions can add, delete, or modify facts in the knowledge base, and have other side-effects.
Example 1.1. «A customer becomes a "Gold" customer as soon as his cumulative purchases during the current year top $5000»; «Customers that become "Gold" customers must be notified immediately, and a golden customer card will be printed and sent to them within one week»; «For shopping carts worth more than $1000, "Gold" customers receive an additional discount of 10% of the total amount», are all examples of production rules. ☐
As a production rule interchange format, RIF-PRD specifies an abstract syntax that shares most features with many concrete production rule languages, and it associates each abstract construct with normative semantics and a normative XML concrete syntax.
Production rules are statements of programming logic that specify the execution of one or more actions in the case that their conditions are satisfied. Production rules therefore have an operational semantic (formalizing state changes, e.g., on the basis of a state transition system formalism). The OMG Production Rule Representation specification [PRR] summarizes it as follows:
- Match: the rules are instantiated based on the definition of the rule conditions and the current state of the data source;
- Conflict resolution: a decision algorithm, often called the conflict resolution strategy, is applied to select the rule instances to be executed;
- Act: the state of the data source is changed, by executing the selected rule instances’ actions. If a terminal state has not been reached, the control loops back to the first step (Match).
In the section Operational semantics of rules and rule sets, the semantics for rules and rule sets is specified, accordingly, as a labeled terminal transition system (PLO04), where state stransitions result from executing the action part of instantiated rules. When several rules are deemed able to be executed during the rule execution process, a conflict resolution strategy is used to determine the order of rules to execute Sub-section Instance Selection specifies how an intended conflict resolution strategy can be attached to a rule set interchanged with RIF-PRD, and defines a default conflict resolution strategy.
However, as a RIF dialect, RIF-PRD has also been designed to allow interoperability between rule languages over the World Wide Web. In RIF, this is achieved by sharing the same syntax for constructs that have the same semantics across multiple dialects. As a consequence, RIF-PRD shares most of the syntax for rule conditions with RIF-BLD [RIF-BLD], and the semantics associated to the syntactic constructs used for representing the condition part of rules in RIF-PRD is specified, in Section Semantics of condition formulas, in terms of a model theory, as it is in the specification of RIF-BLD as well. In addition to exploiting similarities between the two dialects, it allows them to share the same RIF definitions for data types and built-ins [RIF-DTB].
In the section Operational semantics of actions, the semantics associated with the constructs used to represent the action part of rules in RIF-PRD is specified in terms of a transition relation between successive states of the data source, as defined by the condition formulas that they entail, thus making the link between the model-theoretic semantics of conditions and the operational semantics of rules and rulesets.
The abstract syntax is specified in mathematical english, and the abstract syntactic constructs defined in the sections Abstract Syntax of Conditions, Abstract Syntax of Actions and Abstract Syntax of Rules and Rulesets, are mapped one to one onto the concrete XML syntax in Section XML syntax. A lightweight notation is also defined along with the abstract syntax, to allow for a human-friendlier specification of the semantics. A more complete presentation syntax is specified using an EBNF in Section Presentation Syntax. However, only the XML syntax and the associated semantics are normative. A normative XML schema will also be provided in future versions of the document.
Example 1.2. In RIF-PRD presentation syntax, the first rule in example 1.1. might be represented as follows:
Prefix(ex1 http://example.com/2008/prd#) (* ex1:rule_1 *) Forall ?customer ?purchasesYTD ( If And( ?customer#ex1:Customer ?customer[ex1:purchasesYTD->?purchasesYTD] External(pred:numeric-greater-than(?purchasesYTD 5000))) Then ex1:Gold(?customer) )
The condition languages of RIF-PRD and RIF-BLD have much in common, including much of their semantics. Although their abstract syntax and rule semantics are different, due to the operational nature of the actions in production rules, there is a subset for which they are equivalent: essentially, rules with no negation and no uninterpreted functions in the condition, and with only assertions in the action part. For that subset, the XML syntax is the same, so many XML documents are valid in both dialects and have the same meaning. The correspondence between RIF-PRD and RIF-BLD is detailed in Appendix Compatibility with RIF-BLD.
This document is mostly intended for the designers and developers of RIF-PRD implementations, that is, applications that serialize production rules as RIF-PRD XML (producer applications) and/or that deserialize RIF-PRD XML documents into production rules (consumer applications).
2 Conditions
This section specifies the language of the rule conditions that can be serialized using RIF-PRD, by specifying:
- the abstract syntax that all production rule languages interchanging rules using RIF-PRD must have in common for expressing conditions;
- and the intended semantics of the condition formulas in a RIF-PRD document.
Note to the reader: this section depends on Section Constants, Symbol Spaces, and Datatypes of RIF data types and builtins [RIF-DTB].
2.1 Abstract syntax
For a production rule language to be able to interchange rules using RIF-PRD, its alphabet for expressing the condition parts of a rule must, at the abstract syntax level, consist of:
- a countably infinite set of constant symbols Const;
- a countably infinite set of variable symbols Var (disjoint from Const);
- a countably infinite set of argument names, ArgNames (disjoint from Const and Var);
- syntactic constructs to denote:
- Function calls;
- Relations, including equality, class membership and subclass relations;
- conjunction, disjunction and negation;
- existential conditions.
For the sake of readibility and simplicity, this specification introduces a notation for these constructs. That notation is not intended to be a concrete syntax, so it leaves out many details: the only concrete syntax for RIF-PRD is the XML syntax.
Notice that the production rule systems for which RIF-PRD aims to provide a common XML serialization use only externally specified functions, e.g. builtins. RIF-BLD specifies, in addition, a construct to denote uninterpreted function symbols, which RIF-PRD does not require: this is one of two differences between the alphabets used in the condition languages of RIF-PRD and RIF-BLD.
The second point of difference is that RIF-PRD does support a form of negation. RIF-BLD does not support negation because logic rule languages use many different kinds of negations, none of them prevalent enough to justify inclusion in the basic logic dialect of RIF (see also the RIF framework for logic dialects).
2.1.1 Terms
The most basic construct that can be serialized using RIF-PRD is the term. RIF-PRD provides for the representation and interchange of several kinds of terms: constants, variables, positional terms and terms with named arguments
Definition (Term).
- Constants and variables. If t ∈ Const or t ∈ Var then t is a simple term.
- Positional terms. If t ∈ Const and t1, ..., tn, n≥0, are terms then t(t1 ... tn) is a positional term.
Here, the constant t represents a function and t1, ..., tn represent argument values. - Terms with named arguments. A term with named arguments is of the form t(s1->v1 ... sn->vn), where n≥0, t ∈ Const and v1, ..., vn are terms and s1, ..., sn are pairwise distinct symbols from the set ArgNames.
The constant t here represents a function; s1, ..., sn represent argument names; and v1, ..., vn represent argument values. The argument names, s1, ..., sn, are required to be pairwise distinct. Terms with named arguments are like positional terms except that the arguments are named and their order is immaterial. Note that a term of the form f() is, trivially, both a positional term and a term with named arguments. ☐
2.1.2 Atomic formulas
The atomic truth-valued constructs that can be serialized using RIF-PRD are called atomic formulas.
Definition (Atomic formula). An atomic formula can have several different forms and is defined as follows:
- Positional atomic formulas. If t ∈ Const and t1, ..., tn, n≥0, are terms then t(t1 ... tn) is a positional atomic formula (or simply a positional atom).
- Atomic formulas with named arguments. An atomic formula with named arguments (or simply a atom with named arguments) is of the form t(s1->v1 ... sn->vn), where n≥0, t ∈ Const and v1, ..., vn are terms and s1, ..., sn are pairwise distinct symbols from the set ArgNames.
The constant t here represents a predicate; s1, ..., sn represent argument names; and v1, ..., vn represent argument values. The argument names, s1, ..., sn, are required to be pairwise distinct. Atoms with named arguments are like positional Atoms except that the arguments are named and their order is immaterial. Note that an atom of the form t() is, trivially, both a positional atom and an atom with named arguments. - Equality atomic formulas. t = s is an equality atomic formula (or, simply, an equality), if t and s are terms.
- Class membership atomic formulas (or just membership). t#s is a membership atomic formula if t and s are terms. the term t is the object and the term s is the class.
- Subclass atomic formulas. t##s is a subclass atomic formula if t and s are terms.
- Frame atomic formulas. t[p1->v1 ... pn->vn] is a frame atomic formula (or simply a frame) if t, p1, ..., pn, v1, ..., vn, n ≥ 0, are terms. The term t is the object of the frame; the pi are the property or attribute names; and the vi are the property or attribute values. In this document, an attribute/value pair is sometimes called a slot.
Membership, subclass, and frame atomic formulas are used to describe objects, classifications and class hierarchies. - Externally defined atomic formulas. If t is a positional, named-argument, or a frame atomic formula then External(t) is an externally defined atomic formula. Such atomic formulas are used for representing built-in predicates. ☐
Note that not only predicates, but also frame atomic formulas can be externally defined. Therefore, external information sources can be modeled in an object-oriented way via frames.
Editor's Note: Objects are commonly used in PR systems. In this draft, we reuse frame, membership, and subclass formulas (from RIF-BLD) to model objects. We are aware of current limits, such as difficulty expressing datatype and cardinality constraints. Future drafts will address that problem. We are interested in feedback on the merits and limitations of this approach.
Observe that the argument names of frames, p1, ..., pn, are terms and so, as a special case, can be variables. In contrast, atoms with named arguments can use only the symbols from ArgNames to represent their argument names, which can neither be constants from Const nor variables from Var.
Note that atomic formulas are sometimes also called terms, e.g. in the realm of logic languages: the specification of RIF-BLD, in particular, follows that usage. The abstract syntactic elements that are called terms in this specification, are called basic terms in the specification of RIF-BLD.
2.1.3 Formulas
Composite truth-valued constructs that can be serialized using RIF-PRD are called formulas.
Note that terms (constants, variables and functions) are not formulas.
More general formulas are constructed out of the atomic formulas with the help of logical connectives.
Definition (Condition formula). A condition formula can have several different forms and is defined as follows:
- Atomic formula: If φ is an atomic formula then it is also a condition formula.
- Conjunction: If φ1, ..., φn, n ≥ 0, are condition formulas then so is And(φ1 ... φn), called a conjunctive formula. As a special case, And() is allowed and is treated as a tautology, i.e., a formula that is always true.
- Disjunction: If φ1, ..., φn, n ≥ 0, are condition formulas then so is Or(φ1 ... φn), called a disjunctive formula. As a special case, Or() is permitted and is treated as a contradiction, i.e., a formula that is always false.
- Negation: If φ is a condition formula, then so is Not(φ), called a negative formula.
- Existentials: If φ is a condition formula and ?V1, ..., ?Vn, n>0, are variables then Exists ?V1 ... ?Vn(φ) is an existential formula. ☐
In the definition of a formula, the component formulas φ and φi are said to be subformulas of the respective condition formulas that are built using these components.
The function Var(e) that maps a term, atomic formula or formula e to the set of its free variables is defined as follows:
- if e ∈ Const, then Var(e) = {};
- if e ∈ Var, then Var(e) = {e};
- if p and argi, i = 0...n, are terms, then, Var(p(argi) = Var(p) ∪i=0...n Var(argi);
- if p and argi, i = 0...n, are terms, then, Var(External(p(argi)) = Var(p) ∪i=0...n Var(argi);
- if t1 and t2 are terms, then Var(t1 [=|#|##] t2) Var(t1) ∪ Var(t2);
- if o', ki, i = 1...n, and vi, i = 1...n, are terms, then Var(o[k1->v1 ... kn->vn]) Var(o) ∪i=1...n Var(ki) ∪i=1...n Var(vi);
- if fi, i = 0...n, are condition formulas, then Var([AND|OR|NOT](fi)) ∪i=0...n Var(fi);
- if f is a condition formula and xi ∈ Var for i = 1...n, then, Var(Exists x1 ... xn (f)) = Var(f) - {xi | i = 1...n}.
Definition (Ground formula). A condition formula φ is a ground formula if and only if Varφ = {} and φ does not contain any existential subformula. ☐
In other words, a ground formula does not contain any variable term.
2.1.4 Well-formed formulas
The specification of RIF-PRD does not assign a standard meaning to all the formulas that can be serialized using its concrete XML syntax: formulas that can be meaningfully serialized are called well-formed. Not all formulas are well-formed with respect to RIF-PRD: it is required that no constant appear in more than one context. What this means precisely is explained below.
The set of all constant symbols, Const, is partitioned into several subsets as follows:
- A subset of individuals.
The symbols in Const that belong to the primitive datatypes are required to be individuals. - A number of subsets for predicate symbols such that there is one subset per symbol arity (defined below) for externally defined predicates and one for non-external predicates.
Note that this implies that symbols used for external predicate names cannot be used for other predicates. Also, the definition of arity, below, implies that the arities for positional predicate symbols and for predicate symbols with named arguments are distinct even if the numbers of arguments are the same. Therefore, symbols that are used for positional predicates cannot be used for predicates with named arguments, and vice versa. - A number of subsets of function symbols (only externally defined function can be serialized using RIF-PRD). As with predicate symbols, there are separate subsets for symbols with different arities; function symbols with named arguments are in their own subsets. The only exception is the case of nullary symbols, which take zero arguments as in f(), since they are considered to be both positional and named-argument symbols.
Each predicate and function symbol that take at least one argument has precisely one arity. For positional predicate and function symbols, an arity is a non-negative integer that tells how many arguments the symbol can take. For symbols that take named arguments, an arity is a set {s1 ... sk} of argument names (si ∈ ArgNames) that are allowed for that symbol. Nullary symbols (which take zero arguments) are said to have the arity 0.
An important point is that neither the above partitioning of constant symbols nor the arity are specified explicitly. Instead, the arity of a symbol and its type is determined by the context in which the symbol is used.
Definition (Context of a symbol).
The context of an occurrence of a symbol, s∈Const, in a formula, φ, is determined as follows:
- If s occurs as a function symbol in a term of the form s(...) with arity α then s occurs in the context of an external function symbol with arity α (or simply the context of a function symbol with arity α, since RIF-PRD knows only external functions);
- If s occurs as a predicate in an atomic subformula of the form s(...) with arity α then s occurs in the context of a predicate symbol with arity α;
- If s occurs as a predicate in an atomic subformula External(s(...)) with arity α then s occurs in the context of an external predicate symbol with arity α;
- If s occurs in any other context (in a frame: s[...], ...[s->...], or ...[...->s]; or in a positional/named argument term: p(...s...), q(...->s...)), it is said to occur as an individual. ☐
Definition (Well-formed formula). A formula φ is well-formed iff:
- every constant symbol mentioned in φ occurs in exactly one context.
- Whenever a formula contains a function term t or an external atomic formula External(t), t must be an instance of the coherent set of external schemas (Section Schemas for Externally Defined Terms of RIF data types and builtins [RIF-DTB]) associated with the language of RIF-PRD.
- If t is an instance of the coherent set of external schemas associated with the language then t can occur only as a function term t or as an external atomic formula External(t), i.e., as an external term or atomic formula. ☐
Definition (RIF-PRD condition language). The RIF-PRD condition language consists of the set of all well-formed formulas. ☐
2.2 Semantics of condition formulas
This section specifies the intended semantics of the condition formulas in a RIF-PRD document.
For compatibility with other RIF specifications (in particular, RIF data types and builtins), and to make the interoperability with RIF logic dialects (in particular RIF Core [RIF-Core] and RIF-BLD), the intended semantics for RIF-PRD condition formulas is specified in terms of a model theory.
2.2.1 Semantic structures
Definition (Semantic structure). A semantic structure, I, is a tuple of the form <TV, DTS, D, Dind, Dfunc, IC, IV, IF, Iframe, INF, Isub, Iisa, I=, Iexternal, Itruth>. Here D is a non-empty set of elements called the Herbrand domain of 'I, i.e., the set of all ground terms which can be formed by using the elements of Const. Dind, Dfunc are nonempty subsets of D. Dind is used to interpret the elements of Const that are individuals and Dfunc is used to interpret the elements of Const that are function symbols. Const denotes the set of all constant symbols and Var the set of all variable symbols. TV denotes the set of truth values that the semantic structure uses and DTS is a set of identifiers for primitive datatypes (please refer to Section Datatypes of RIF data types and builtins [RIF-DTB] for the semantics of datatypes). The set of all ground (positional|named|frame|external) formulas which can be formed by using the function symbols with the ground terms in the Herbrand domain is the Herbrand base, HB. A semantic structure I is a Herbrand interpretation, IH, if the corresponding subset of HB is the set of all ground formulas which are true with respect to I. ☐
As far as the assignment of a standard meaning to formulas in the RIF-PRD condition language is concerned, the set TV of truth values consists of just two values, t and f.
The other components of I are total mappings defined as follows:
-
IC maps Const to D.
This mapping interprets constant symbols. In addition:
- If a constant, c ∈ Const, is an individual then it is required that IC(c) ∈ Dind.
- If c ∈ Const, is a function symbol (positional or with named arguments) then it is required that IC(c) ∈ Dfunc.
-
IV maps Var to Dind.
This mapping interprets variable symbols.
-
IF maps D to functions D*ind → D (here D*ind is a set of all sequences of any finite length over the domain Dind).
This mapping interprets positional terms and gives meaning to positional predicate function.
- INF maps D to the set of total functions of the form SetOfFiniteSets(ArgNames × Dind) → D.
This mapping interprets function symbols with named arguments and gives meaning to named argument functions. This is analogous to the interpretation of positional terms with two differences:
- Each pair <s,v> ∈ ArgNames × Dind represents an argument/value pair instead of just a value in the case of a positional term.
- The arguments of a term with named arguments constitute a finite set of argument/value pairs rather than a finite ordered sequence of simple elements. So, the order of the arguments does not matter.
- Iframe maps Dind to total functions of the form SetOfFiniteBags(Dind × Dind) → D.
This mapping interprets frame terms and gives meaning to frame functions. An argument, d ∈ Dind, to Iframe represents an object and the finite bag {<a1,v1>, ..., <ak,vk>} represents a bag of attribute-value pairs for d. Iframe is used to determine the truth valuation of frame terms.
Bags (multi-sets) are used here because the order of the attribute/value pairs in a frame is immaterial and pairs may repeat: o[a->b a->b]. Such repetitions arise naturally when variables are instantiated with constants. For instance, o[?A->?B ?C->?D] becomes o[a->b a->b] if variables ?A and ?C are instantiated with the symbol a and ?B, ?D with b.
- Isub gives meaning to the subclass relationship. It is a mapping of the form Dind × Dind → D.
The operator ## is required to be transitive, i.e., c1 ## c2 and c2 ## c3 must imply c1 ## c3. TThis is ensured by a restriction in Section Interpretation of condition formulas;
- Iisa gives meaning to class membership. It is a mapping of the form Dind × Dind → D.
The relationships # and ## are required to have the usual property that all members of a subclass are also members of the superclass, i.e., o # cl and cl ## scl must imply o # scl. This is ensured by a restriction in Section Interpretation of condition formulas;
- I= is a mapping of the form Dind × Dind → D.
It gives meaning to the equality operator.
-
Itruth is a mapping of the form D → TV.
It is used to define truth valuation for formulas.
-
Iexternal is a mapping from the coherent set of schemas for externally defined functions to total functions D* → D. For each external schema σ = (?X1 ... ?Xn; τ) in the coherent set of external schemas associated with the language, Iexternal(σ) is a function of the form Dn → D.
For every external schema, σ, associated with the language, Iexternal(σ) is assumed to be specified externally in some document (hence the name external schema). In particular, if σ is a schema of a RIF built-in predicate or function, Iexternal(σ) is specified in [RIF-DTB] so that:
- If σ is a schema of a built-in function then Iexternal(σ) must be the function defined in the aforesaid document.
- If σ is a schema of a built-in predicate then Itruth ο (Iexternal(σ)) (the composition of Itruth and Iexternal(σ), a truth-valued function) must be as specified in [RIF-DTB].
For convenience, we also define the following mapping I from terms to D:
- I(k) = IC(k), if k is a symbol in Const;
- I(?v) = IV(?v), if ?v is a variable in Var;
- I(p(t1 ... tn)) = IP(I(p))(I(t1),...,I(tn));
- I(p(s1->v1 ... sn->vn)) = INF(I(p))({<s1,I(v1)>,...,<sn,I(vn)>})
Here we use {...} to denote a set of argument/value pairs; - I(o[a1->v1 ... ak->vk]) = Iframe(I(o))({<I(a1),I(v1)>, ..., <I(an),I(vn)>})
Here {...} denotes a bag of attribute/value pairs. - I(c1##c2) = Isub(I(c1), I(c2));
- I(o#c) = Iisa(I(o), I(c));
- I(x=y) = I=(I(x), I(y));
- I(External(t)) = Iexternal(σ)(I(s1), ..., I(sn)), if t is an instance of the external schema σ = (?X1 ... ?Xn; τ) by substitution ?X1/s1 ... ?Xn/s1.
Note that, by definition, External(t) is well-formed only if t is an instance of an external schema. Furthermore, by the definition of coherent sets of external schemas, t can be an instance of at most one such schema, so I(External(t)) is well-defined.
The effect of datatypes. The set DTS must include the datatypes described in Section Primitive Datatypes of RIF data types and builtins [RIF-DTB].
The datatype identifiers in DTS impose the following restrictions. Given dt ∈ DTS, let LSdt denote the lexical space of dt, VSdt denote its value space, and Ldt: LSdt → VSdt the lexical-to-value-space mapping (for the definitions of these concepts, see Section Primitive Datatypes of RIF data types and builtins [RIF-DTB]. Then the following must hold:
- VSdt ⊆ Dind; and
- For each constant "lit"^^dt such that lit ∈ LSdt, IC("lit"^^dt) = Ldt(lit).
That is, IC must map the constants of a datatype dt in accordance with Ldt.
RIF-PRD does not impose restrictions on IC for constants in symbol spaces that are not datatypes included in DTS.
2.2.2 Interpretation of condition formulas
This section defines how a semantic structure, I, determines the truth value TValI(φ) of a condition formula, φ. In PRD a semantic structure is represented as a Herbrand interpretation.
We define a mapping, TValI, from the set of all condition formulas to TV. Note that the definition implies that TValI(φ) is defined only if the set DTS of the datatypes of I includes all the datatypes mentioned in φ and Iexternal is defined on all externally defined functions and predicates in φ.
Definition (Truth valuation). Truth valuation for well-formed condition formulas in RIF-PRD is determined using the following function, denoted TValI:
- Positional atomic formulas: TValI(r(t1 ... tn)) = Itruth(I(r(t1 ... tn)));
- Atomic formulas with named arguments: TValI(p(s1->v1 ... sk->vk)) = Itruth(I(p(s1->v1 ... sk->vk)));
- Equality: TValI(x = y) = Itruth(I(x = y)).
To ensure that equality has precisely the expected properties, it is required that:- Itruth(I(x = y)) = t if I(x) = I(y) and that Itruth(I(x = y)) = f otherwise. This is tantamount to saying that TValI(x = y) = t iff I(x) = I(y);
- Subclass: TValI(sc ## cl) = Itruth(I(sc ## cl)).
To ensure that the operator ## is transitive, i.e., c1 ## c2 and c2 ## c3 imply c1 ## c3, the following is required:- For all c1, c2, c3 ∈ D, if TValI(c1 ## c2) = TValI(c2 ## c3) = t then TValI(c1 ## c3) = t;
- Membership: TValI(o # cl) = Itruth(I(o # cl)).
To ensure that all members of a subclass are also members of the superclass, i.e., o # cl and cl ## scl implies o # scl, the following is required:- For all o, cl, scl ∈ D, if TValI(o # cl) = TValI(cl ## scl) = t then TValI(o # scl) = t;
- Frame: TValI(o[a1->v1 ... ak->vk]) = Itruth(I(o[a1->v1 ... ak->vk])).
Since the bag of attribute/value pairs represents the conjunctions of all the pairs, the following is required, if k > 0:- TValI(o[a1->v1 ... ak->vk]) = t if and only if TValI(o[a1->v1]) = ... = TValI(o[ak->vk]) = t;
- Externally defined atomic formula: TValI(External(t)) = Itruth(Iexternal(σ)(I(s1), ..., I(sn))), if t is an atomic formula that is an instance of the external schema σ = (?X1 ... ?Xn; τ) by substitution ?X1/s1 ... ?Xn/s1.
Note that, by definition, External(t) is well-formed only if t is an instance of an external schema. Furthermore, by the definition of coherent sets of external schemas, t can be an instance of at most one such schema, so I(External(t)) is well-defined; - Conjunction: TValI(And(c1 ... cn)) = t if and only if TValI(c1) = ... = TValI(cn) = t. Otherwise, TValI(And(c1 ... cn)) = f.
The empty conjunction is treated as a tautology, so TValI(And()) = t; - Disjunction: TValI(Or(c1 ... cn)) = f if and only if TValI(c1) = ... = TValI(cn) = f. Otherwise, TValI(Or(c1 ... cn)) = t.
The empty disjunction is treated as a contradiction, so TValI(Or()) = f; - Negation: TValI(Not(c)) = f if and only if TValI(c) = t. Otherwise, TValI(Not(c)) = t;
- Existence: TValI(Exists ?v1 ... ?vn (φ)) = t if and only if for some I*, described below, TValI*(φ) = t.
Here I* is a semantic structure of the form <TV, DTS, D, Dind, Dpred, IC, I*V, IP, Iframe, INP, Isub, Iisa, I=, Iexternal, Itruth>, which is exactly like I, except that the mapping I*V, is used instead of IV. I*V is defined to coincide with IV on all variables except, possibly, on ?v1,...,?vn. ☐
2.2.3 Satisfaction of a condition
We now define what it means for a set of ground formulas to satisfy a condition formula. The satisfaction of condition formulas by a set of ground formulas provides formal underpinning to the operational semantics of rulesets interchanged using RIF-PRD.
Definition (State).
A state S is a Herbrand Interpretation IH. ☐
Definition (Condition Satisfaction). A condition formula, φ is satisfied under variable assignment σ in a state S, written as S |= φ[σ], iff TValS(φ[σ]) = t. ☐
2.2.4 Matching substitution
At the syntactic level, the interpretation of the variables by a valuation function IV is realized by a substitution. The matching substitution of constants to variables, as defined below, provides the formal link between the model-theoretic semantics of condition formulas and the operational semantics of rule sets in RIF-PRD.
Let Term be the set of the terms in the RIF-PRD condition language (as defined in section Terms).
Definition (Substitution). A substitution is a finitely non-identical assignment of terms to variables; i.e., a function σ from Var to Term such that the set {x ∈ Var | x ≠ σ(x)} is finite. This set is called the domain of σ and denoted by Dom(σ). Such a substitution is also written as a set such as σ = {ti/xi}i=0..n where Dom(σ) = {xi}i=0..n and σ(xi) = ti, i = 0..,n. ☐
Definition (Ground Substitution). A ground substitution is a substitution σ that assigns only ground terms to the variables in Dom(σ): ∀ x ∈ Dom(σ), Var(σ(x)) = ∅ ☐
Notice that since RIF-PRD covers only externally defined interpreted functions, a ground term can always be replaced by a constant. In the remainder of this document, it will always be assumed that a ground substitution assigns only constants to the variables in its domain.
Definition (Matching Substitution). Let ψ be a condition formula, and φ be a set of ground formulas that satisfies ψ. We say that ψ matches φ with substitution σ : Var -> Terms if and only if there is a syntactic interpretation I such that for all ?xi in Var(σ), I(?xi) = I(σ(?xi)). ☐
3 Actions
This section specifies the action part of the rules that can be serialized using RIF-PRD (the conclusion of a production rule is often called the action part, or, simply, the action; the then part, with reference to the if-then form of a rule statement; or the right-hand side, or RHS. In the latter case, the condition is usually called the left-hand side of the rule, or LHS). Specifically, this section specifies:
- the abstract syntax that all production rule languages interchanging rules using RIF-PRD must have in common for expressing actions;
- and the intended semantics of the individual action formulas in a RIF-PRD document.
In production rule systems, the action part of the rules is used, in particular, to add, delete or modify facts in the data source with respect to which the condition of rules are evaluated and the rules instantiated. As a rule interchange format, RIF-PRD does not make any assumption regarding the nature of the data source that the producer or the consumer of a RIF-PRD document uses (e.g. a rule engine's working memory, an external data base, etc). As a consequence, the syntax of the actions that RIF-PRD supports are defined with respect to the RIF-PRD condition formulas that represents the facts that the actions are intended to affect. In the same way, the semantics of the actions is specified in terms of how the effects of their execution are intended to affect the evaluation of rule condition.
Editor's Note: This version of the draft specifies only a very limited set of basic atomic actions. Future draft will extend that set, in particular to support actions whose effect is not, or not only, to modify the fact base.
3.1 Abstract syntax
For a production rule language to be able to interchange rules using RIF-PRD, its alphabet for expressing the action part of a rule must, at the abstract syntax level, consist of syntactic constructs to denote:
- the assertion of a positional atom, an atom with named arguments, or a frame, membership, or subclass atomic formula;
- the retraction of a positional atom, an atom with named arguments, or a frame;
- the addition of a new frame object;
- the removal of a frame object and the retraction of the corresponding frame and class atomic formulas;
- a sequence of these actions, including local variables and a mechanism to bind a local variable to a frame slot value or a new frame object.
Editor's Note: These actions may seem foreign to a reader who is familiar with typical production rule language actions, such as assert an object, modify/update a field of an object, and retract/remove an object. As noted in Section Atomic formulas, in this draft, objects are modeled using frame, membership, and subclass formulas that are re-used from RIF-BLD and RIF-Core. Therefore, the object-oriented actions are defined to act upon frame, membership, and subclass relations. Mappings from a typical Java-like object model to RIF-PRD maps "instanceof" to membership, "extends" and "implements" to subclass, and object properties/fields to frame slots. To assert an object requires asserting both its class membership and its frame slots. To modify a slot, e.g. change color from red to blue, requires retracting the old frame slot and asserting the new frame slot. Indeed, frame slots are multi-valued and, therefore, merely asserting a frame slot does not overwrite a prior value, it adds to the set of values. Frame slots do not inherently constrain either the datatype or the cardinality of their values. Future drafts will address the issue of object representation in RIF-PRD. We are open to suggestions.
3.1.1 Atomic actions
Atomic action constructs take constructs from the RIF-PRD condition language as their arguments.
Definition (Atomic action). An atomic action can have several different forms and is defined as follows:
- Assert: If φ is a positional atom, an atom with named arguments, a frame, a membership atomic formula, or a subclass atomic formula in the RIF-PRD condition language, then Assert(φ) is an atomic action. φ is called the target of the action.
- Retract: If φ is a positional atom, an atom with named arguments, or a frame in the RIF-PRD condition language, then Retract(φ) is an atomic action. φ is called the target of the action.
- Retract object: If t is a term in the RIF-PRD condition language, then Retract(t) is an atomic action. t is called the target of the action. ☐
Editor's Note: Whether and under what restrictions, if any, membership and subclass atomic formulas are allowable targets for atomic assert actions is still under discussion in the working group. We welcome feedback on that issue.
Definition (Ground atomic action). An atomic action with target t is a ground atomic action if and only if Var(t) = ∅. ☐
3.1.2 Action blocks
The action block is the top level construct to represent the conclusions of the production rules that can be serialized using RIF-PRD. An action block contains a non-empty sequence of atomic actions. It may also include action variable declarations.
The action variable declaration construct is used to declare variables that are local to the action block, called action variables, and to assign them a value within the action block.
Editor's Note: This version of RIF-PRD supports only a limited mechanism to initialize local action variables. Action variables may be bound to newly created frame objects or to slot values of frames. Future versions may support different or more elaborate mechanisms.
Definition (Action variable declaration). An action variable declaration is a pair, (v p) made of an action variable, v, and an action variable binding (or, simply, binding), p, where p has one of two forms:
- frame object declaration: if the action variable, v, is to be assigned the identifier of a new frame, then the action variable binding is a frame object declaration: New(v). In that case, the notation for the action variable declaration is: (?o New(?o));
- frame slot value: if the action variable, v, is to be assigned the value of a slot of a ground frame, then the action variable binding is a frame: p = o[s->v], where o is a term that represents the identifier of the ground frame and s is a term that represents the name of the slot. The associaed notation is: (?value o[s->?value]). ☐
Definition (Action block). If (v1 p1), ..., (vn pn), n ≥ 0, are action variable declarations, and if a1, ..., am, m ≥ 1, are atomic actions, then
Do((v1 p1), ..., (vn pn) a1 ... al) denotes an action block. ☐
3.1.3 Well-formed action blocks
The specification fo RIF-PRD does not assign a standard meaning to all the action blocks that can be standardized using its concrete XML syntax. Action blocks that can be meaningfully serialized are called well-formed. The notion of well-formedness, already defined for condition formulas, is extended to atomic actions, action variable declarations and action blocks.
The main restrictions are that one and only one action variable bindings can assign a value to each action variable binding, and that the assertion of a membership atomic formula is meaningful only if for a new frame object.
Definition (Well-formed atomic action). An atomic action is well-formed if and only if one of the following is true:
- it is an Assert and its target is a well-formed atom (positional or with named arguments), or a well-formed frame, membership or subclass atomic formula;
- it is a Retract and its target is a well-formed term or a well-formed atom (positional or with named arguments), or a well-formed frame atomic formula. ☐
Definition (Well-formed action variable declaration). An action variable declaration (?v p) is well-formed if and only if one of the following is true:
- the action variable binding, p, is the declaration of a new frame object: p = New(?v), and its argument is the action variable that is declared in the same action variable declaration, ?v;
- the action variable binding, p, is a well formed frame atomic formula, p = o[a1->t1...an->tn], n ≥ 1, and the action variable, v occurs in the position of a slot value, and nowhere else, that is: v ∉ Var(o) ∪ Var(a1) ∪ ... ∪ Var(an) and v ∈ {t1 ... tn}. ☐
For the definition of a well-formed action block, the function Var(f), that has been defined for condition formulas, is extended to atomic actions and frame object declarations as follows:
- if f is an atomic action with target t, then Var(f) = Var(t);
- if f is a frame object declaration, New(?v), then Var(f) = {?v}.
Definition (Well-formed action block). An action block is well-formed if and only if all of the following is true:
- all the action variable declarations, if any, are well-formed;
- each action variable, if any, is assigned a value by one and only one action binding, that is: if b1 = (v1 p1) and b2 = (v2 p2) are two action variable declarations in the action block, then v2 ∉ Var(p1) if v1 ∈ Var(p2), and, reciprocally, v1 ∉ Var(p2) if v2 ∈ Var(p1);
- all the actions in the action block are well-formed atomic actions;
- if an atomic action in the action block, a, asserts a membership atomic formula, a = Assert(t1 # t2), then the object term in the membership atomic formula, t1, is an action variable that is declared in the action block and the action variable binding is the declaration of a new frame object. ☐
Definition (RIF-PRD action language). The RIF-PRD action language consists of the set of all the well-formed action blocks. ☐
3.2 Operational semantics of atomic actions
This section specifies the intended semantics of the atomic actions in a RIF-PRD document.
The effect intended of the ground atomic actions in the RIF-PRD action language is to modify the state of the fact base, in such a way that it changes the set of conditions that are satisfied before and after each atomic action is performed.
As a consequence, the intended semantics of the ground atomic actions in the RIF-PRD action language determines a relation, called the RIF-PRD transition relation: →RIF-PRD ⊆ W × L × W, where W denotes the set of all the states of the fact base, and where L denotes the set of all the ground atomic actions in the RIF-PRD action language.
Since the satisfaction of condition formulas is defined with respect to the Herbrand interpretation of ground formulas (Section Satisfaction of a condition), we will assume in the following that the states of the fact base are represented by such sets, for the purpose of specifying the intended operational semantics of atomic actions, or rules and of rule sets serialized using RIF-PRD.
Definition (RIF-PRD transition relation). The intended semantics of RIF-PRD atomic actions is completely specified by the transition relation →RIF-PRD ⊆ W × L × W. (w, α, w') ∈ →RIF-PRD if and only if w ∈ W, w' ∈ W, α is a ground atomic action, and one of the following is true:
- α is Assert(φ), where φ is a ground atomic formula, and w' = w + φ;
- α is Retract(φ), where φ is a ground atomic formula, and w' = w - φ;
- α is Retract(o), where o is a constant, and w' = w - {o[s->v] | for all the values of terms s and v} - {o#c | for all the values of term c}. ☐
Rule 1 says that all the condition formulas that were satisfied before an assertion will be satisfied after, and that, in addition, the condition formulas that are satisfied by the asserted ground formula will be satisfied after the assertion.
Rule 2 says that all the condition formulas that were satisfied before a retraction will be satisfied after, except if they are satisfied only by the retracted fact.
Rule 3 says that all the condition formulas that were satisfied before the removal of a frame object will be satisfied after, except if they are satisfied only by one of the frame or membership formulas about the removed object or a conjunction of such formulas.
4 Production rules and rulesets
This section specifies the rules and rulesets that can be serialized using RIF-PRD, by specifying:
- the abstract syntax that all production rule languages interchanging rules using RIF-PRD must have in common for rules and rule sets;
- and the intended semantics of the rules and ruleset in a RIF-PRD document.
4.1 Abstract syntax
For a production rule language to be able to interchange rules using RIF-PRD, in addition to the RIF-PRD condition and action languages, its alphabet must, at the abstract syntax level, contain syntactic constructs:
- to associate a condition and an action block into a rule;
- to declare the variables that are free in a rule, to specify their bindings, and to associate them with that rule into a rule with less free variables;
- to group rules and to associate specific operational semantics to groups of rules.
4.1.1 Rules
Definition (Rule). A rule can be either:
- an unconditional action block;
- a conditional action block: if condition is a formula in the RIF-PRD condition language, and if action is a well-formed action block, then If condition, Then action is a conditional action;
- a rule with bound variables: if ?v1 ... ?vn, n > 0, are variables; p1 ... pm, m ≥ 0, are condition formulas (called binding patterns), and rule is a rule, then Forall ?v1...?vn (p1...pm) (rule) is a rule. ☐
RIF-BLD compatibility. A rule If condition, Then action can be equivalently written action :- condition, that is, using RIF-BLD notation. Indeed, the normative XML syntax is the same for a conditional assertion in RIF-BLD and for a conditional action in RIF-PRD. The use of RIF-BLD notation is especially useful if the condition formula, condition, contains no negation, and if the action block, action, contains only Assert atomic actions. The use of the same notation emphasizes that such a rule has the same semantics in RIF-PRD and RIF-BLD.
Editor's Note: At this stage, the above assertion regarding the equivalence of a specific fragment of RIF-PRD and RIF-BLD is mostly the statement of an objective of the working group. That issue will be addressed more completely in a future version of this document.
To emphasize that equivalence even further, an action block can be written as simply And(φ1 ... φn), if it contains only atomic assert actions: Do(Assert(φ1) ... Assert(φn)), n ≥ 1. If the action block consists of a single atomic assert action: Do(Assert(φ)), then it can be written as simply φ.
Notice that the notation for a rule with bound variables uses the keyword Forall for the same reasons, that is, to emphasize the overlap with RIF-BLD. Indeed, Forall does not indicate the universal quantification of the declared variables, in RIF-PRD, but merely that the execution of the rule must be considered for all their bindings as constrained by the binding patterns. However, when no negation is used in the conditions and only assertions in the actions, the XML serialization of a RIF-PRD rule with bound variables is exactly the same as the XML serialization of a RIF-BLD universally quantified rule, and their semantics coincide.
4.1.2 Groups
As was already mentioned in Section Overview, production rules have an operational semantics that can be described in terms of matching rules against states of the fact base, selecting rule instances to be executed, and executing rule instances' actions to transition to new states of the fact base.
When production rules are interchanged, the intended rule instance selection strategy, often called the conflict resolution strategy, need be interchanged along with the rules : in RIF-PRD, the group is the construct that is used to group sets of rules and to associate them with a conflict resolution strategy. Many production rule systems use priorities associated with rules as part of their conflict resolution strategy: in RIF-PRD, the group is also used to carry the priority information that may be associated to the interchanged rules.
Definition (Group). If strategy is an IRI that denotes a conflict resolution strategy, if priority is an integer, and if each rgj, 0 ≤ j ≤ n, is either a rule or a group, then any of the following is a group:
- Group rg0 ... rgn, n ≥ 0;
- Group strategy rg0 ... rgn, n ≥ 0;
- Group priority rg0 ... rgn, n ≥ 0;
- Group strategy priority rg0 ... rgn, n ≥ 0. ☐
4.1.3 Well-formed rules and groups
The function Var(f), that has been defined for condition formulas and extended to actions, is further extended to rules, as follows:
- if f is an action block that declares action variables ?v1 ... ?vn, n ≥ 0, and that contains actions a1 ... am, m ≥ 1, then Var(f) = U1 ≤ i ≤ m Var(ai) - {?v1 ... ?vn};
- if f is an conditional action where c is the condition formula and a is the action, then Var(f) = Var(c) ∪ Var(a);
- if f is a quantified rule where ?v1 ... ?vn, n > 0, are the declared variables; p1 ... pm, m ≥ 0, are the binding patterns, and r is the rule, then Var(f) = (Var(r) ∪ Var(p1) ∪ ... ∪ Var(pm)) - {?v1 ... ?vn}.
Definition (Well-formed rule). A rule, r, is a well-formed rule if and only if it contains no free variable, that is, Var(r) = ∅, and either:
- it is an unconditional well-formed action block, a;
- or it is a conditional action where the condition formula, c, is a well-formed condition formula, and the action block, a, as a well-formed action block, and no atomic action in a has a subclass atomic formula as its target;
- or it is a quantified rule, Forall V (P) (r), and the quantified rule, r is a well-formed rule, and each of the declared variables in V = {?vi}0 ≤ i ≤ n is free in some of the binding patterns in P = {pj}0 ≤ j ≤ m or in the quantified rule, r; that is, V ⊆ Var(r) ∪ Var(p1) ∪ ... ∪ Var(pm), m ≥ 0. ☐
Definition (Well-formed group). A well-formed group is either a group that contains only well-formed rules and well-formed groups, or a group that contains no rule or group (an empty group). ☐
The set of the well-formed groups contains all the production rulesets that can be meaningfully interchanged using RIF-PRD.
4.2 Operational semantics of rules and rule sets
4.2.1 Motivation and example
As already mentioned in Section Overview, the description of a production rule system as a transition system can be used to specify the intended semantics that is associated with production rules and rulesets interchanged using RIF-PRD.
The intuition of describing a production rule system as a transition system is that, given a set of production rules RS and a fact base w0, the rules in RS that are satisfied, in some sense, in w0 determine an action a1, whose execution results in a new fact base w1; the rules in RS that are satisfied in w1 determine an action a2 to execute in w1, and so on, until the system reaches a final state and stops. The result is the fact base wn when the system stops.
Example 3.1. Judicael, a chicken and potato farmer, uses a rule based system to decide on the daily grain allowance for each of her chicken. Currently, Judicael's rule base contains one single rule, the chicken and mashed potatoes rule:
(* ex:ChickenAndMashedPotatoes *)
Forall ?chicken ?potato ?weight
(And(?chicken#ex:Chicken
(Exists ?age
And(?chicken[ex:age->?age]
External(pred:numeric-greater-than(?age, 8)))))
And(?potato#ex:Potato
ex:owns(?chicken ?potato)
(Exists ?weight
And(?potato[ex:weight->?weight]
External(pred:numeric-greater-than(?weight External(func:numeric-divide(?age 2)))))))
If And(External(pred:string-not-equals(External(ex:today()), "Tuesday"))
Not(External(ex:foxAlarm())))
Then Do( (?allowance ?chicken[ex:allowance->?allowance])
Execute(ex:mash(?potato))
Retract(?potato)
Retract(ex:owns(?chicken ?potato))
Retract(?chicken[ex:allowance->?allowance])
Assert(?chicken[ex:allowance->External(func:numeric-multiply(?allowance 1.1))]))
Judicael has four chickens, Jim, Jack, Joe and Julia, that own three potatoes (BigPotato, SmallPotato, UglyPotato) among them:
- Jim (daily grain allowance = 10) is 12 months old, Jack (daily grain allowance = 12) is 9 months old, Joe (daily grain allowance = 6) is 6 months old and Julia (daily grain allowance = 14) is 10 months old;
- BigPotato weights 70g, SmallPotato weights 10g, UglyPotato weights 50g;
- Jim owns BigPotato, Jack and Woof own SmallPotato jointly (Woof is the farm's dog. It inherited joint ownership of SmallPotato from its aunt Georgette) and Joe owns UglyPotato.
That is the initial set of facts w0.
When the rule is applied to w0:
- the first pattern selects {Jim/?chicken, Jack/?chicken, Julia/?chicken} as possible values for variable ?chicken (Joe is too young);
- the second pattern selects {(Jim/?chicken, BigPotato/?potato)} as the only possible substitution for the variables ?chicken and ?potato (UglyPotato does not belong to either Joe, Jack or Julia and SmallPotato is too small);
Suppose that Judicael's implementation of today() returns Monday and that the foxAlarm() is false when the Chicken and mashed potatoes rule is applied: the condition is satisfied, and the actions in the conclusion are executed with BigPotato substituted for ?potato, Jim substituted for ?chicken, and 10 substituted for ?allowance. This results in the following changes in the set of facts:
- BigPotato is mashed (and removed from the list of potatoes to be considered in future applications of the Chicken and mashed potatoes rule);
- the daily grain allowance of Jim is now 11;
- Jim does not own a potato anymore.
The resulting set of facts w1 is thus:
- Jim (daily grain allowance = 11) is 12 months old, Jack (daily grain allowance = 12) is 9 months old, Joe (daily grain allowance = 6) is 6 months old and Julia (daily grain allowance = 14) is 10 months old;
- SmallPotato weights 10g, UglyPotato weights 50g;
- Jack and Woof own SmallPotato jointly and Joe owns UglyPotato.
When the Chicken and mashed potatoes rule in applied to w1, the first pattern still selects {Jim/?chicken, Jack/?chicken, Julia/?chicken} as possible values for variable ?chicken, but the second pattern does not select any possible substitution for the couple (?chicken, ?potato) anymore: the rule cannot be satisfied, and the system, having detected a final state, stops.
The result of the execution of the system is w1. ☐
4.2.2 Definitions and notational conventions
More precisely, a production rule system is defined as a labeled terminal transition system (e.g. PLO04), for the purpose of specifying the intended semantics of a RIF-PRD rule or group of rules.
Definition (labeled terminal transition system): A labeled terminal transition system is a structure {C, L, →, T}, where
- C is a set of elements, c, called configurations, or states;
- L is a set of elements, a, called labels, or actions;
- → ⊆ C × L × C is the transition relation, that is: (c, a, c' ) ∈ → iff there is a transition labeled a from the state c to the state c' or, more appropriately in the case of a production rule system, the execution of action a in the state c causes a transition to state c' ;
- T ⊆ C is the set of final states, that is, the set of all the states c from which there are no transitions: T = {c ∈ C | ∀ a ∈ L, ∀ c' ∈ C, (c, a, c') ∉ →}. ☐
For many purposes, a representation of the states of the fact base is an appropriate representation of the states a production rule system seen as a transition system. However, the most widely used conflict resolution strategies require information about the history of the system, in particular with respect to the rule instances that have been selected for execution in previous states. Therefore, each state of the transition system used to represent a production rule system must keep a memory of the previous states and the rule instances that where selected and triggered the transition in those states.
To avoid the confusion between the states of the fact base and the states of the transition system, the latter will be called production rule system states.
Definition (Production rule system state). A production rule system state (or, simply, a system state), s, is characterized by
- a state of the fact base, facts(s);
- if s is not the initial state: a previous system state, previous(s), such that, given two system states s1 and s2, s1 = previous(s2) if and only if the production rule system the sequential execution of the action parts of the rule instances in picked(s1) transitioned the system from system state s1 to system state s2;
- if s is not the current state: the ordered set of rule instances, picked(s), that the conflict resolution strategy picked, among the all the rule instances that matched facts(s). ☐
In the following, we will write previous(s) = NIL to denote that a system state s is the initial state.
Here, a rule instance is defined as the result of the substitution of constants for all the rule variables in a rule.
Let R denote the set of all the rules in the rule language under consideration.
Definition (Rule instance). Given a rule, r ∈ R and a ground substitution, σ, such that Var(r) ⊆ Dom(σ), where Var(r) denotes the set of the rule variables in r, the result, ri = σ(r), of the substitution of the constant σ(?x) for each variable ?x ∈ Var(r) is a rule instance (or, simply, an instance) of r. ☐
Given a rule instance ri, let rule(ri) identify the rule from which ri is derived by substitution of constants for the rule variables, and let substitution(ri) denote the substitution by which ri is derived from rule(ri).
In the following, two rule instances ri1 and ri2 of a same rule r will be considered different if and only if substitution(ri1) and substitution(ri2) substitute a different constant for at least one of the rule variables in Var(r).
In the definition of a production rule system state, a rule instance, ri, is said to match a state of a fact base, w, if its defining substitution, substitution(ri), matches the RIF-PRD condition formula that represents the condition of the instantiated rule, rule(ri), to the ground formula that represents the state of facts w.
Let W denote the set of all the possible states of a fact base.
Definition (Matching rule instance). Given a rule, ri, and a state of the fact base, w ∈ W, ri is said to match w if and only if one of the following is true:
- rule(ri) is an unconditional action block;
- rule(ri) is a conditional action block: If condition, Then action, and substitution(ri) matches the condition formula condition to the ground condition formula that represents w;
- rule(ri) is a rule with bound variables: Forall ?v1...?vn (p1...pn) (r'), n ≥ 0, m ≥ 0, and substitution(ri) matches each of the condition formulas pi, 0 ≤ i ≤ m, to the ground condition formula that represents w, and the rule instance ri' matches w, where ri' is the instance of rule r' such that substitution(ri') = substitution(ri). ☐
Definition (Conflict set). Given a rule set, RS ⊆ R, and a system, s, the set, conflictSet(RS, s) of all the different instances of the rules in RS that match the state of the fact base, facts(s) ∈ W is called the conflict set determined by RS in s. ☐
In each non-final state, s, of a production rule system, a subset, picked(s), of the rule instances in the conflict set is selected and ordered; their action parts are instantiated, and they are executed. This is sometimes called: firing the selected instances.
Definition (Action instance). Given a system state, s; given a rule instance, ri, of a rule in a rule set, RS; and given the action block in the action part of the rule rule(ri): Do((v1 p1)...(vn pn) a1...am), n ≥ 0, m ≥ 1, where the (v1 p1), 0 ≤ i ≤ n, represent the action variable declarations and the aj, 1 ≤ j ≤ m, represent the sequence of atomic actions in the action block; if ri is a matching instance in the conflict set determined by RS in system state s: ri ∈ conflictSet(RS, s), the substitution σ = substitution(ri) is extended to the action variables v1...vn, n ≥ 0, in the following way:
- if vi is assigned the identifier of a new frame by the action variable declaration: (vi New(vi), then σ(vi) = cnew, where cnew is a constant of type rif:IRI that does not occur in any subformula of the ground formula that represents the state of the fact base that is associated to s, facts(s);
- if vi is assigned the value of a frame's slot by the action variable declaration: (vi o[s->vi]), then σ(vi) is a constant such that the frame formula o[s->vi]) matches the state of the fact base facts(s) with subtitution σ.
The sequence of ground atomic actions that is the result of substituting a constant for each variable in the atomic actions of the action block of the rule instance, ri, according to the extended substitution, is the action instance associated to ri. ☐
Let actions(ri) denote the action instance that is associated to a rule instance ri. By extension, given an ordered set of rule instances, ori, actions(ori) denotes the sequence of ground atomic actions that is the concatenation, preserving the order in ori, of the action instances associated to the rule instances in ori.
4.2.3 Operational semantics of a production rule system
All the elements that are required to define a production rule system as a labeled terminal transition system have now been defined.
Definition (RIF-PRD Production Rule System). A RIF-PRD production rule system is defined as a labeled terminal transition system PRS = {S, A, →PRS, T}, where :
- S is a set of system states;
- A is a set of transition labels, where each transition label is a sequence of ground RIF-PRD atomic actions;
- The transition relation →PRS ⊆ S × A × S, is defined as follows:
∀ (s, a, s' ) ∈ S × A × S, (s, a, s' ) ∈ →PRS if and only if all of the following hold:- (facts(s), a, facts(s')) ∈ →*RIF-PRD, where →*RIF-PRD denotes the transitive closure of the transition relation →RIF-PRD that is determined by the specification of the semantics of the atomic actions supported by RIF-PRD;
- a = actions(picked(s));
- T ⊆ S, a set of final system states. ☐
Intuitively, the first condition in the definition of the transition relation →PRS states that a production rule system can transition from one system state to another only if the state of facts in the latter system state can be reached from the state of facts in the former by performing a sequence of ground atomic actions supported by RIF-PRD, according to the semantics of the atomic actions.
The second condition states that the allowed paths out of any given system state are determined only by how rules instances are picked from the conflict set for execution by the conflict resolution strategy.
Given a ruleset RS ⊆ R, the associated conflict resolution strategy LS, and an initial state of the fact base, w ∈ W, the input function to a RIF-PRD production rule system is defined as:Therefore, the exact behavior of a RIF-PRD production rule system depends on:
- the conflict resolution strategy, that is, how rule instances are, precisely, selected for execution from the rule instances that match a given state of the fact base;
- and how the set T of final system states is, precisely, defined.
4.2.4 Conflict resolution
The process of selecting one or more rule instances from the conflict set for firing is often called: conflict resolution.
In RIF-PRD the conflict resolution algorithm (or conflict resolution strategy) that is intended for a set of rules is denoted by a keyword or a set of keywords that is attached to the rule set. In this version of the RIF-PRD specification, a single conflict resolution strategy is specified normatively: it is denoted by the keyword rif:forwardChaining (a constant of type rif:IRI), for it accounts for a common conflict resolution strategy used in most forward-chaining production rule systems.
Future versions of the RIF-PRD specification may specify normatively the intended conflict resolution strategies to be attached to additional keywords. In addition, RIF-PRD documents may include non-standard keywords: it is the responsability of the producers and consumers of such document to agree on the intended conflict resolution strategies that are denoted by such non-standard keywords.
Conflict resolution strategy: rif:forwardChaining
Most existing production rule systems implement conflict resolution algorithms that are a combination of the following elements (under these or other, idiosyncratic names; and possibly combined with additional, idiosyncratic rules):
- Refraction. The essential idea of refraction is that a given instance of a rule must not be fired more than once as long as the reasons that made it eligible for firing hold. In other terms, if an instance has been fired in a given state of the system, it is no longer eligible for firing as long as it satisfies the states of facts associated to all the subsequent system states;
- Priority. The rule instances are ordered by priority of the instantiated rules, and only the rule instances with the highest priority are eligible for firing;
- Recency. the rule instances are ordered by how long a rule instance has been continuously satisfied in the states of facts associated to previous system states, and only the most recent ones are eligible for firing.
The RIF-PRD keyword rif:forwardChaining denotes the common conflict resolution strategy that can be summarized as follows: given a conflict set
- Refraction is applied to the conflict set, that is, all the refracted rule instances are removed from the conflict set;
- The remaining rule instances are ordered by decreasing priority, and only the rule instances with the highest priority are kept in the conflict set;
- The remaining rule instances are ordered by decreasing recency, and only the most recent rule instances are kept in the conflict set;
- Any remaining tie is broken arbitrarily, and a single rule instance is kept for firing.
As specified earlier, picked(s) denotes the ordered list of the rule instances that were picked in a system state, s. Under the conflict resolution strategy denoted by rif:forwardChaining, the list denoted by picked(s) contains a single rule instance, for any given system state, s.
Given a system state, s, a rule set, RS, and a rule instance, ri ∈ conflictSet(RS, s), let recency(ri, s) denote the number of system states before s, in which ri has been continuously a matching instance: if s is the current system state, recency(ri, s) provides a measure of the recency of the rule instance ri. recency(ri, s) is specified recursively as follows:
- if previous(s) = NIL, then recency(ri, s) = 1;
- else if ri ∈ conflictSet(RS, previous(s)), then recency(ri, s) = 1 + recency(ri, previous(s));
- else, recency(ri, s) = 1.
In the same way, given an rule instance, ri, and a system state, s, let lastPicked(ri, s) denote the number of system states before s, since ri has been last fired. lastPicked(ri, s) is specified recursively as follows:
- if previous(s) = NIL, then lastPicked(ri, s) = 1;
- else if ri ∈ picked(previous(s)), then lastPicked(ri, s) = 1;
- else, lastPicked(ri, s) = 1 + lastPicked(ri, previous(s)).
Finally, given a rule instance, ri, let priority(ri) denote the priority that is associated to rule(ri), or zero, if no priority is associated to rule(ri). If rule(ri) is inside nested Groups, priority(ri) denotes the priority that is associated with the innermost Group to which a priority is explicitely associated, or zero.
Given a conflict set, cs, the conflict resolution strategy rif:forwardChaining can now be described with the help of four rules, where ri and ri' are rule instances:
- Refraction rule: if ri ∈ cs and lastPicked(ri, s) ≤ recency(ri, s), then cs = cs - ri;
- Priority rule: if ri ∈ cs and ri' ∈ cs and priority(ri) < priority(ri'), then cs = cs - ri;
- Recency rule: if ri ∈ cs and ri' ∈ cs and recency(ri, s) > recency(ri', s), then cs = cs - ri;
- Tie-break rule: if ri ∈ cs, then cs = {ri}.
The refraction rule removes the instances that have been in the conflict set in all the system states at least since they were last fired, that is, it removes the refracted instances from the current conflict set; the priority rule removes the instances such that there is at least one instance with a higher priority; the recency rule removes the instances such that there is at least one instance that is more recent; and the tie-break rule keeps one rule from the set, arbitrarily.
To select the singleton rule instance, picked(s), to be fired in a system state, s, given a rule set, RS, the conflict resolution strategy denoted by the keyword rif:forwardChaining consists in the following sequence of steps:
- start with the conflict set, cs, that a rule set RS determines in a system state s: cs = conflictSet(RS, s);
- apply the refraction rule to all the rule instances in cs;
- then apply the priority rule to all the remaining instances in cs;
- then apply the recency rule to all the remaining instances in cs;
- then apply the tie-break rule.
4.2.5 Halting test
Editor's Note: This section is still under discussion (see ISSUE-65). This version specifies a single, default halting test: future version of this draft may specify additional halting tests, and/or a different default. The Working Group seeks feedback on which halting tests and which combinations of tests should be supported by RIF-PRD and/or required from RIF-PRD implementations; and which halting test should be the default, if any.
By default, a system state is final, given a rule set, RS, and a conflict resolution strategy, LS, if there is no rule instance available for firing after application of the conflict resolution strategy.
For the conflict resolution strategy identified by the RIF-PRD keyword rif:forwardChaining, a system state, s, is final given a rule set, RS if and only if the remaining conflict set is empty after application of the refraction rule to all the rule instances in conflictSet(RS, s). In particular, all the system states, s, such that conflictSet(RS, s) = ∅ are final.
5 XML Syntax
This section specifies a common concrete XML syntax to serialize any production rule set written in a language that share the abstract syntax speicifed in section 4.1, provided that its intended semantics agrees with the semantics that is described in section 4.2.
In the following, after the notational conventions are introduced, we specify the RIF-PRD XML constructs that carry a normative semantics with respect to the intended interpretation of the interchanged rules. They are specified with respect to the abstract syntax, and their specification is structured according to the specification of the abstract syntax in sections 2.1, 3.1 and 4.1.
The root element of any RIF XML document, Document and other XML constructs that do not carry a normative semantics with respect to the intended interpretation of the interchanged rules are specified in the last sub-section.
5.1 Notational conventions
5.1.1 Namespaces
Throughout this document, the xsd: prefix stands for the XML Schema namespace URI http://www.w3.org/2001/XMLSchema#, the rdf: prefix stands for http://www.w3.org/1999/02/22-rdf-syntax-ns#, and rif: stands for the URI of the RIF namespace, http://www.w3.org/2007/rif#.
Syntax such as xsd:string should be understood as a compact URI (CURIE) -- a macro that expands to a concatenation of the character sequence denoted by the prefix xsd and the string string. The compact URI notation is used for brevity only, and xsd:string should be understood, in this document, as an abbreviation for http://www.w3.org/2001/XMLSchema#string.
5.1.2 BNF pseudo-schemas
The XML syntax of RIF-PRD is specified for each component as a pseudo-schema, as part of the description of the component. The pseudo-schemas use BNF-style conventions for attributes and elements: "?" denotes optionality (i.e. zero or one occurrences), "*" denotes zero or more occurrences, "+" one or more occurrences, "[" and "]" are used to form groups, and "|" represents choice. Attributes are conventionally assigned a value which corresponds to their type, as defined in the normative schema. Elements are conventionally assigned a value which is the name of the syntactic class of their content, as defined in the normative schema.
<!-- sample pseudo-schema -->
<defined_element
required_attribute_of_type_string="xs:string"
optional_attribute_of_type_int="xs:int"? >
<required_element />
<optional_element />?
<one_or_more_of_these_elements />+
[ <choice_1 /> | <choice_2 /> ]*
</defined_element>
5.1.3 Syntactic components
Three kinds of syntactic components are used to specify RIF-PRD:
- Abstract classes are defined only by their subclasses: they not visible in the XML markup and can be thought of as extension points. In this document, abstract constructs will be denoted with all-uppercase names;
- Concrete classes have a concrete definition and they are associated with specific XML markup. In this document, concrete constructs will be denoted with CamelCase names with leading capital letter;
- Properties, or roles, define how two classes relate to each other. They have concrete definitions and are associated with specific XML markup. In this document, properties will be denoted with camelCase names with leading smallcase letter.
5.2 Conditions
This section specifies the XML constructs that are used in RIF-PRD to serialize condition formulas.
5.2.1 TERM
The TERM class of constructs is used to serialize terms, be they simple terms, that is, constants and variables; or positional terms or terms with named arguments, both being, per the definition of a well-formed formula, representations of externally defined functions.
As an abstract class, TERM is not associated with specific XML markup in RIF-PRD instance documents.
[ Const | Var | External ]
5.2.1.1 Const
In RIF, the Const element is used to serialize a constant.
The Const element has a required type attribute and an optional xml:lang attribute:
- The value of the type attribute is the identifier of the Const symbol space. It must belong to the type xsd:anyURI. In the RIF data types and builtins document, the section about [DTB#Constants.2C_Symbol_Spaces.2C_and_Datatypes|Constants, Symbol spaces and Datatypes]] lists the builtin symbol spaces and data types that all implementations of RIF-PRD must support. Rule sets that are exchanged through RIF-PRD can use additional, user-defined symbol spaces;
- The xml:lang attribute, as defined by 2.12 Language Identification of XML 1.0 or its successor specifications in the W3C recommendation track, is optionally used to identify the language for the presentation of the Const to the user. It is allowed only in association with constants of the type rif:text. A compliant implementation MUST ignore the xml:lang attribute if the type of the Const is not rif:text.
The content of the Const element is the constant's litteral, which can be any Unicode character string.
<Const type=xsd:anyURI [xml:lang=xsd:language]? >
Any Unicode string
</Const>
\{\{EdNote|text=The case of non-standard data types, that is, of constants that do not belong or cannot be cast in one of RIF builtin types for interchange purposes, is still under discussion in the WG. The WG seeks feedback on whether they should be allowed and why.\}\}
Example 2.1. In each of the examples below, a constant is first described, followed by its serialization in RIF-PRD XML syntax.
a. A constant with builtin type xsd:integer and value 123:
<Const type="xsd:integer">123</Const>
b. A constant which symbol today is defined in Joe the Hen Public's namespace http://example.com/2008/joe#. The type of the constant is rif:iri:
<Const type="rif:iri"> http://example.com/2008/joe#today </Const>
c. A constant with symbol BigPotato that is local to the set of rules where i