% GBNF specification of FOL to supersume RuleML Lite V0.8 and SWRL % version 0.1 by Benjamin Grosof and Said Tabet created 6/4/04 GBNF specification of FOL Spirit is to err in the direction of parseability with precision, with the attitude to put more in the design of the markup so as to reduce the amount of extra parsing effort that each new implementation effort has to undertake. %%%% /* TODO's: CLEAN UP COMMENTS overall. ISSUE: should we define lvariable in manner of RuleML V0.8/V0.85 or in manner of SWRL? ISSUE: should conjunction and disjunction be generalized syntactically to permit more than two conjuncts or disjuncts respectively? It would be convenient, but involve perhaps some more syntactic hair. ISSUE: we need somehow to formally specify as a constraint, in universalsentence and in existentialsentence, that the variables free in the fowff all appear in the lvariablecollection, so that the universal/existential sentence expression is properly closed syntactically. (SCL also does not seem to specify this in the abstract syntax BNF directly.) ISSUE: should equality be treated specially in the syntax, as opposed to being treated simply as one particular predicate? ISSUE: shorter (e.g., acronymic) names for the XML markup, e.g., "fose" for "fosentence", "mise" for "Material Implication SEntence", etc. ISSUE: be consistent throughout wrt use of "_", e.g., cf. RuleML design convention style that bridges RDF and XML data models ISSUE: elaborate the specification somewhat in order to maximize compatibility with RuleML V0.8+ ISSUE: is it OK philosophically that there is some redundancy of expressiveness between axiom vs. {rule or fact}, just as there is in RuleML LP between rule vs. fact? We assumed the answer to be yes here. ISSUE: should we introduce @direction? One reason to is to supersume RuleML LP where it is useful. ISSUE: should we introduce _rbschema? One reason to is to supersume general-purpose RuleML where it is useful. issue for _slab: should axiom labels be permitted to have free variables? Note this is related to compatibility with RuleML LP. issue for _kbaselab: should a label cterm restricted to be ground? issue for this and for RuleML V0.9: should top-level of argument collection be presumed to be by default an ordered collection. Benjamin and Said think no. Harold in past thought yes. */ knowledgebase : _kbaselab? , _statements ; _kbaselab : ind | cterm ; _statements = (axiom | imp | fact)* ; /* Compared to hornlog: more kinds of statements: axiom for general FOL formulas. Compared to SCLP: fewer kinds of statements -- lacks: mutex for Courteous features, sensor and effector for Situated features. */ /* NB: Built-in sensors cf. SWRL V0.6 are indicated via special predicate. */ axiom : _slab? , fosentence ; /* "fo" stands acronymically for "First-Order logic". */ fosentence = (atomicsentence | conjunctivesentence | disjunctivesentence | negationsentence | implicationsentence | equivalencesentence | universalsentence | existentialsentence) /* Note that a special case of atomicsentence is an equalitysentence, because the equality predicate is a special case of a predicate. */ atomicsentence = groundatomicfowff | swrlV06axiom ; /* "fowff" stands acronymically for First-Order logic Well Formed Formula. */ groundatomicfowff : (rel , _groundargs) ; swrlV06axiom = ... /* see below; might want some renaming there, e.g., might want to make "axiom" be more general-purpose */ %%% /* SWRL V0.6 abstract syntax, for reference, is: axiom ::= rule rule ::= 'Implies(' [ URIreference ] { annotation } antecedent consequent ')' antecedent ::= 'Antecedent(' { atom } ')' consequent ::= 'Consequent(' { atom } ')' atom ::= description '(' i-object ')' | dataRange '(' d-object ')' | individualvaluedPropertyID '(' i-object i-object ')' | datavaluedPropertyID '(' i-object d-object ')' | sameAs '(' i-object i-object ')' | differentFrom '(' i-object i-object ')' | builtIn '(' builtinID { d-object } ')' i-object ::= i-variable | individualID d-object ::= d-variable | dataLiteral i-variable ::= 'I-variable(' URIreference ')' d-variable ::= 'D-variable(' URIreference ')' */ %%% _groundargs = (groundslottedargumentcollection | groundsequencedargumentcollection) ; groundslottedargumentcollection = (groundslottedargument)* ; groundslottedargument : (slotname , groundterm) ; groundterm = ind | groundcterm ; groundcterm : (ctor , _groundargs) ; groundsequencedargumentcollection = (groundterm)\* ; /* NOTATION: "\*" indicates a sequenced collection, i.e., one in which the order in which the members are listed is significant. */ conjunctivesentence : (andconnective , fosentence , fosentence) ; /* NB: for compatibility, beware that there are other versions of "and" in RuleML V0.8+ LP, e.g., andb, andh, ando ; likewise for versions of "or". */ disjunctivesentence : (orconnective , fosentence , fosentence) ; negationsentence : (classicalnegationconnective , fosentence) ; implicationsentence : (materialimplicationconnective , _antecedentfosentence , _consequentfosentence); antecedentfosentence = fosentence ; consequentfosentence = fosentence ; equivalencesentence : (equivalenceconnective, _lefthandfosentence , _righthandfosentence); lefthandfosentence = fosentence ; righthandfosentence = fosentence ; universalsentence : (forall , lvariablecollection , fowff ) ; /* NB: we need somehow to formally specify as a constraint that the variables free in fowff all appear in the lvariablecollection here, so that the universal sentence expression is properly closed syntactically. Likewise for existentialsentence below. */ lvariablecollection = (lvariable)* ; lvariable = ... /* TODO: define this as either lvar cf. RuleML, and/or define it as i-variable / d-variable cf. SWRL V0.6 */ existentialsentence : (exists , lvariablecollection , fowff ) ; /* "fowff" stands for "First Order logic Well Formed Formula"; it may have free variables */ fowff = (atomicfowff | conjunctivefowff | disjunctivefowff | negationfowff | implicationfowff | equivalencefowff | universalfowff | existentialfowff) /* Note that a special case of atomicsentence is an equalitysentence, because the equality predicate is a special case of a predicate. */ atomicfowff : (rel , _args) ; /* basically same as "atom" in RuleML V0.8/V0.85 */ _args = (slottedargumentcollection | sequencedargumentcollection) ; slottedargumentcollection = (slottedargument)* ; slottedargument : (slotname , term) ; term = ind | cterm ; sequencedargumentcollection = (term)\* ; /* REMINDER: NOTATION: "\*" indicates a sequenced collection, i.e., one in which the order in which the members are listed is significant. */ conjunctivefowff : (andconnective , fowff , fowff) ; disjunctivefowff : (orconnective , fowff, fowff) ; negationfowff : (classicalnegationconnective , fowff) ; implicationfowff : (materialimplicationconnective , _antecedentfowff , _consequentfowff); antecedentfowff = fowff ; consequentfowff = fowff ; equivalencefowff : (equivalenceconnective , _lefthandfowff , _righthandfowff); lefthandfowff = fowff ; righthandfowff = fowff ; universalfowff : (forall , lvariablecollection , fowff ) ; /* NB: somewhat different from in universalfosentence, in that we do NOT necessarily need somehow to formally specify as a constraint that the variables free in fowff all appear in the lvariablecollection here, so that the universal fowff expression is properly closed syntactically. Likewise for existentialfowff below. */ existentialfowff : (exists , lvariablecollection , fowff ) ; /* MORE STUFF IS BELOW, will get worked out in next major iteration of drafting. */ %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%% TODO's in EDITING: 1. make sure that defined below are: - ind - ctor - slotname -- similar to arole below, and similar to _slot in RuleML V0.85 - lvariable 2. use appropriate parts of the following previously existing stuff from RuleML V0.8 abstract syntax. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% /* PREVIOUSLY EXISTING STUFF TO MINE AS APPROPRIATE FOR USEFUL LOWER-LEVEL PARTS OR EXTENSION FEATURES: the following is pasted from the RuleML V0.8 abstract syntax spec of spring 2003 by Benjamin Grosof (posted then to Joint Committee etc.). */ imp : _slab? , _head , _body ; _slab = ind | cterm ; /* _slab for STATEMENT label here generalizes _rlab for RULE label in RuleML V0.8. It is used for general-formula axioms not just rules and facts. */ fact : (_rlab? , _head ; /* issue: should a fact _head be restricted to be ground? ; some systems like to, others like not to */ /* usage issue: a fact can be represented as a axiom with empty body */ _axlab : ind | cterm ; /* note for default reasoning cf. courteous LP, it is convenient to permit the cterm to be non-ground, mentioning variables that also appear in the head and/or body of the rule */ _head : classicalliteral | andheadexpr ; /* compared to hornlog: AND (conjunction) is permitted in the head */ _body : literal | complexbodyexpr ; literal = classicalliteral | nafliteral ; /* compared to hornlog: literals are generalized to permit two kinds of negation sign: Negation As Failure (NAF) and a limited form of classical negation ("CNEG"), better viewed as "strong" negation. NAF may appear in the body. CNEG may appear in the head or the body. */ classicalliteral = cslit | atom ; /* ... | atom is for down-compatibility */ nafliteral = fclit | flit ; /* ... | flit is for down-compatibility */ complexbodyexpr = andbodyexpr | orbodyexpr ; /* compared to basic hornlog: body expressions are generalized to permit OR (disjunction), moreover nestedly with AND's, in the body */ andbodyexpr = andb | and ; /* ... | and is for down-compatibility */ orbodyexpr = orb ; andb : (literal | complexbodyexpr)* ; and : (atom)* ; /* recommend in revised design to eliminate and, replacing it by andb to improve elegance and up/down -compatibility */ /* actually, in SCLP dtd-v13 (SweetRules V1) the rhs is (atom | and)* which was motivated by down-compatibility */ orb : (literal | complexbodyexpr)+ ; /* + rather than * since empty OR should logically be false rather than true, which would be inconvenient */ /* actually, in SCLP dtd-v13 (SweetRules V1) it's +2 rather than + */ andheadexpr = andh ; andh : (literal | cslit | atom | andh)+ ; /* actually, in SCLP dtd-v13 (SweetRules V1) it's +2 rather than + */ atom : _opr , _args ; cslit : _opr , _args , @cneg? ; /* actually, in SCLP dtd-v13, cslit was named clit */ fclit : _opr , _args , @cneg?, @fneg? ; flit : _opr , _args , @fneg? ; @cneg : bool ; @cneg $ NO ; @fneg : bool ; @fneg $ NO ; /* actually, in SCLP dtd-v13, @cneg and @fneg did not have a default value */ bool = YES | NO ; _opr : rel ; /* recommend in revised design to make this = not : and recommend likewise similarly for _opc */ _args = (logicalterm | argumentcollection)\* ; /* implicitly top-level is an ordered tuple */ /* recommend in revised design that rhs be simply argumentcollection */ /* actually, in SCLP dtd-v13, rhs was simply (logicalterm)\* */ logicalterm = ind | var | cterm ; argumentcollection = tup | roli ; rel : constname ; constname = PCDATA , @href? ; /* recommend in revised design that this be instead: constname = PCDATA | (@href , nick?) ; nick : PCDATA ; when there's a href, this would clarify meaning of the PCDATA as nickname for the URI and recommend likewise similarly for ind, ctor, arole, meth, clas, path, and any other constants */ @href : URI ; URI = CDATA ; var : PCDATA ; ind : constname ; cterm : _opc , _args ; _opc : ctor ; ctor : constname ; tup : (logicalterm | argumentcollection)\* ; /* implicitly at top-level is an ordered tuple. permits nested tup's and roli's, in spirit of complex data structures / Object-Oriented / Lisp S-expressions. */ roli : (_arv)* ; _arv : arole, (logicalterm | argumentcollection) ; arole : constname ; /* recommend in revised design: consider XML Schemas or OWL ontologies for the substructure of argument collections and complex terms */ /* syntax for COURTEOUS follows. compared to hornlog: this is all added. */ mutex : _oppo , _mgiv? ; _oppo : ando ; /* recommend in revised design that this be = not : */ _mgiv : literal | complexbodyexpr ; /* recommend in revised design that this be = not : */ /* actually, in SCLP dtd-v13, rhs was (fclit | andb | orb), so this is more back-compatible */ ando : classicalliteral , classicalliteral ; /* note this requires cardinality 2, which is unusual, and thus requires care if/when doing a concrete syntax in OWL-Lite */ /* actually, in SCLP dtd-v13, rhs is simply (cslit, cslit) , which suffices for down-compatibility with hornlog */ /* plus: overrides is a syntactically reserved predicate that specifies the prioritization (partial ordering) among (default) rules. */ /* syntax for SITUATED follows. compared to hornlog: this is all added. */ sens : _opr , _aproc, _modli? ; effe : _opr , _aproc ; /* recommend in revised design that sens and effe perhaps be enhanced to permit a cneg sign so that cslit's can be directly sensed or effected. Upside would include syntactic convenience in writing rules. Downside would include a bit of added complexity in implementation. */ _aproc : jproc | uproc ; /* recommend in revised design that this be = not : */ uproc : constname ; /* recommend in revised design that web services (HTTP, SOAP, WSDL) be supported as a way to access and invoke attached procedures */ jproc : clas, meth, path? ; path : constname ; clas : constname ; meth : constname ; _modli : (bmode | bindingmodecollection)\* ; /* recommend in revised design that rhs be simply bindingmodecollection, similar to recommended revised treatment of argument collections in literals (i.e., of _args) */ /* actually, in SCLP dtd-v13, rhs was simply (bmode)\* */ /* usage note: if need to specify multiple binding patterns for the same (rel, aproc) pair, then make multiple sens statements, each with a different _modli . */ bindingmodecollection = bmtup | bmroli ; bmtup : (bmode | bindingmodecollection)\* ; bmroli : (_arbm)* ; _arbm : arole, (bmode | bindingmodecollection) ; bmode : @bval ; @bval : bind ; /* actually, in SCLP dtd-v13, @bval was named @val */ @bval $ FREE ; bind = BOUND | FREE ; /* Explanation of Abbreviations: knowledgebase = knowledge base of first-order formulas. direction = intended direction of inferencing. _kbaselab = knowledgebase label. Name of a knowledgebase. imp = implication rule. (Note it does not employ *material* implication cf. classical logic.) fact: Can be viewed logically as an implication rule that has an empty body. _head = head of a rule. A.k.a. the "consequent" or "then" part of the rule. _body = body of a rule. A.k.a. the "antecedent" or "if" part of the rule. _axlab = axiom label. Name of a axiom. andb = AND'd (i.e., conjunctive) expression permitted in the body. and = AND'd (i.e., conjunctive) expression - a particular kind that is permitted in the body. The "and" of some body atoms. Similar to andb, but simpler. Included for down-compatibility / back-compatibility, esp. with early versions of the Datalog and Hornlog sublanguages. atom = logical atom. An expression formed from a predicate applied to a collection of its (logical) arguments. _opr = relational operator expression. (This is for sake of upward expressive extensibility.) rel = relation. A logical predicate. var = variable. A logical variable. ind = individual. A logical individual. (Can be viewed logically as logical function whose arity is zero.) "Lloyd-Topor And-Or" ("LTAO") feature syntax follows: Lloyd-Topor transformations permit more expressive heads and bodies in LP, by expressively reducing them (with tractable computational effort) to the simpler, more basic LP expressive form in which a head consists of a single literal and a body consists of a conjunction of literals. In particular, it is straightforward to permit "OR" (disjunction) expressions in the body, and to permit "AND" (conjunction) expressions in the head. We call this the "Lloyd-Topor And-Or" ("LTAO") feature. orb = OR'd (i.e., disjunctive) expression permitted in the body. Note that andb's and orb's can be nested. andh = AND'd (i.e., conjunctive) expression permitted in the head. Negation-As-Failure (a.k.a. "Normal" or "Ordinary" LP) feature syntax follows: flit = negation-as-failure literal. A literal with negation-as-failure sign (fneg). fneg = negation-as-failure sign. The sign is either positive or negative. Negative ("YES" value of attribute) means negated. Positive ("NO" value of attribute) means unnegated, i.e., the same as if the negation symbol does not appear. bool = boolean. "Extended" LP feature syntax follows: Note SCLP permits literals to be classically negated, in the manner of "extended" LP originated by Gelfond & Lifschitz. This is a limited sense of classical negation; it is actually expressively inessential. Let "cneg" stand for classical negation. For every predicate P, " cneg P " is essentially treated as if it were rewritten " P' ", where " P' " is a newly introduced predicate symbol. cslit = classically signed literal, a.k.a. classical literal. A literal with classical-negation sign (cneg). fclit = negation-as-failure classically signed literal. A literal with both classical-negation sign (cneg) and negation-as-failure sign (fneg). cneg = classical negation sign. The sign is either positive or negative. Negative ("YES" value of attribute) means negated. Positive ("NO" value of attribute) means unnegated, i.e., the same as if the negation symbol does not appear. URI constants feature syntax follows: constname = (logical-) constant name. In this version, permits a URI. Complex (constructor) terms feature syntax follows: cterm = complex term. A logical term of the form "f(...)" where f is a ctor. _opc = constructor operator expression. (Similar in spirit to _opr.) ctor = constructor. A logical function. Object-oriented argument collections feature syntax follows: tup = tuple of arguments. An ordered collection of arguments. roli = role'd list of arguments. An unordered collection of arguments, with each member of the collecion being distinguished by an argument role name (arole). _arv = argument role-value pair. An argument (value) together with its indicating argument role (arole). arole = argument role. (See roli and _arv above.) "Query element" feature syntax follows: query = stored query specification. An EXPERIMENTAL feature. (If "query" element is indeed included in this version.) Note that in this DTD version, tup's and roli's can be nested. This is an EXPERIMENTAL aspect of the object-oriented argument collections feature. Courteous feature syntax follows: In Courteous LP, axioms are treated as defaults, and prioritized conflict handling can be represented. Priorities between rules are represented via a binary predicate "overrides" which takes rule labels as arguments. "overrides" is syntactically reserved, but is otherwise an ordinary predicate - it can appear in general-form rules and be reasoned about, thus "overrides" facts can be inferred. Higher priority rules defeat lower priority rules. The prioritization ordering is a partial ordering. mutex = mutual exclusion statement (a.k.a. "mutex"). A mutex is a kind of integrity constraint, used to specify the scope of conflict in Courteous logic programs. It contains an opposer part and a given-condition part. The opposer part consists of two (classical) literals. The given-condition part is similar to a rule body. The mutex specifies overall that it is a prohibited contradiction for the two literals to be concluded if the given-condition part is concluded (i.e., holds). The semantics of Courteous LP enforces consistency with respect to this integrity constraint. A typical usage of a mutex is to specify that a predicate is (partial-)functional, e.g., that one should not conclude two different values for the price of the same item. OWL (and Description Logic) has the same notion of (partial-) functionality as something one can specify for properties. _oppo = opposer part of mutex. (See mutex above.) _mgiv = given part of mutex. (See mutex above.) ando = AND'd (i.e., conjuctive) expression permitted in the opposer part of a mutex. (See mutex above.) Situated feature syntax follow: sens = sensor link statement. Sensing overall is the obtaining of facts from external attached procedures, during the testing of rules that contain particular kinds of literals. "External" here means external to the LP inferencing engine. A sensor link statement specifies an association of a predicate P to an attached procedure A. "A" is also known as a "sensor procedure". Essentially, A can be viewed as a queryable virtual knowledge base of facts about P. Let R be a rule in which literal L appears in the body, where L's predicate is P. During (situated LP) inferencing, when L's body is tested, A is invoked. A sensor link statement also specifies an optional binding restriction pattern: i.e., for each of P's arguments, it can restrict that argument to be bound (rather than a free variable or cterm containing a free variable) when A is invoked (i.e., queried). effe = effector link statement. Effecting overall is the performing of side-effectful actions, via invoking external attached procedures, triggered by the drawing of particular kinds of conclusions. "External" here means external to the LP inferencing engine. An effector link statement specifies an association of a predicate P to an attached procedure A. "A" is also known as an "effector procedure". Let literal L appear in the head of some rul(s), where L's predicate is P. During (situated LP) inferencing, suppose P(U) is concluded, where U is an instantiation of L's argument terms. Then A is invoked with instantiation U. Note in this version, the sensed or effected literal might not actually be permitted to be classically-negated (i.e., if no cneg is specified in the sens or effe statement). However, such a restriction is expressively inessential. _aproc = attached procedure. (See sens and effe above.) jproc = Java (attached) procedure. In the current EXPERIMENTAL design, a Java attached procedure is specified by its class, method, and (optional) path. uproc = "universal" (attached) procedure. I.e., a procedure specified via some kind of stringname or URI. This is an EXPERIMENTAL design. It is intended to support extensibility towards CGI's and Web services. _modli = binding mode list. A binding restriction pattern for a sensor. (See sens above.) bmode = binding mode. Indicates whether a particular argument is restricted to be bound vs. permitted to be a free variable or contain a free variable. (See sens above.) bval = binding value. (See bmode above.) bmtup = tuple (tup) of binding modes. bmroli = role'd list (roli) of binding modes. */ /* Comment: This was based on earlier RuleML V0.8 spec of SCLP (i.e., Situated Courteous Logic Programs), with the URI constants feature ("ur"), and with the object-oriented arguments collection feature ("roli's"), but without the experimental feature of a query statement ("query") element that is defined in some other versions of RuleML V0.8 Hornlog; with also disjunction permitted in rule bodies, and conjunction permitted in rule heads (the "ltao" feature). Note SCLP permits literals to be classically negated, in the manner of "extended" LP originated by Gelfond & Lifschitz. This is a limited sense of classical negation; it is actually expressively inessential. This version corresponds to the monolith V0.8 DTD for SCLP with ur and roli's. It updates the SCLP DTD (known as "SCLP dtd-v13" or "sclp-v13") used in the first versions of SweetRules and SweetJess. This version is called "SCLP dtd-v14" or "sclp-v14". */ /* ISSUE: Note that one aspect of Courteous LP not represented here in the GBNF is the syntactically reserved status of the "overrides" predicate that specifies the prioritization (partial ordering) among (default) rules. This should probably be identified via use of the RuleML namespace, e.g., an instance rulebase should use "ruleml:overrides" and define the ruleml namespace as "http://www.ruleml.org". */ /* Overall, the main substantial difference in this version from SCLP dtd-v13 is that argument collections are supported, i.e., there are roli's and explicit tuples (i.e., tup's) */ Comment: Let "cneg" stand for classical negation. For every predicate P, " cneg P " is essentially treated as if it were rewritten " P' ", where " P' " is a newly introduced predicate symbol. /* This version is somewhat sparse wrt design comments. But it includes comments that: - compare it to Hornlog V0.8 (file spec-hornlog+ur+roli-v08-monolith-v11.txt) - compare it to SCLP dtd-v13. - recommend design revisions and flag issues. */ /* Upper case denotes terminals. URI could be viewed as terminal, or one could add a macro equating/expanding it to CDATA. As in XML, CDATA is a string, and PCDATA is a string or markup. The rest (YES, NO, FORWARD, BACKWARD, BIDIRECTIONAL, BOUND, FREE) are tokens; each can be viewed as a fixed string. */ /* Note that left implicit below is a subtlety wrt restricting interleaving of sequencing. A macro symbol appearing in a containment rhs can be viewed as implicitly having an o-block (i.e., "[...]") around it. (See the description of GBNF for more discussion of interleaving and o-blocks.) E.g., consider the containment "atom : _opr , _args". "_args" here is a macro that may expand into multiple elements (e.g., multiple ind's). The element "_opr" is restricted to not be interleaved among the midst of these multiple elements; rather, it must appear either before all of them or after all of them. */