AI3:::Adaptive Information (Mike Bergman)

Mike Bergman on the semantic Web and structured Web

KBpedia Relations, Part IV: The Detailed Relations Hierarchy

Tue, 06/27/2017 - 19:25


Completing the Schema for the Relations Addition

In the previous part of this series I discussed the relations model that we are adding to the KBpedia knowledge structure for its upcoming version 1.50. This article continues that discussion by fleshing out the concept hierarchy for this relations addition. These changes are taking place within the upper structure of KBpedia, what we call the KBpedia Knowledge Ontology, or KKO. The addition of this relations model affects both the class hierarchy of KKO and its properties structure. Both aspects are discussed below.

The previous parts of this series introducing the new KBpedia v 1.50 provide the rationale for adding a relations or predications structure to KBpedia, as well as the rationale for grounding this structure in the universal categories and logic of relations developed by Charles Sanders Peirce. Let me expand on those arguments a bit further here.

Recall from the earlier parts in this discussion that we want to adopt a schema for relations because we are interested in assertions and propositions in our knowledge structures. As Peirce notes [1], “The unity to which the understanding reduces impressions is the unity of a proposition. This unity consists in the connection of the predicate with the subject; and, therefore, that which is implied in the copula, or the conception of being, is that which completes the work of conceptions of reducing the manifold to unity.” (CP 1.548). The previous part discussed the rationale for three main classes of relationship types and how they related to Peirce’s views on logic and the theory of signs (semiosis). It is now time to expand on those distinctions.

The Second Level of the Relations Hierarchy

At the top level, relations are defined in KBpedia as belonging to one of three main types: Attributes, which are how we describe and characterize individual things (using the shorthand of A:A); External Relations, which are how an individual thing may relate, interact or situate with regard to external things (using the shorthand of A:B); or Representations [3], which are how we may name or define, indicate or reference, or provide supporting information about an individual thing (using the shorthand of re:A). These three main categories capture every conceivable form of relation. Note that an individual thing may also include classes or types, as well as individuals. Through this means, we can talk about and refer to categories, concepts, classes or types when we need to consider them as things unto themselves.

These three main categories correspond to Peirce’s universal categories of Firstness, Secondness and Thirdness [2]. (Where appropriate, I also sometimes footnote some Peirce quotes relevant to the category at hand.) Using the same trichotomous approach to categorization [2], we can now expand these three main relational categories into a second level of nine categories:

Mid-level KBpedia Relations

For Attributes (A:A), the split is into these three categories:

  • Intrinsic — innate characteristics or essences of single entities or events (particulars). Example concepts include oneness, qualities, feelings, inherent, negation, is, has, intensional, naturalness, internal, innateness
  • Adjunctual — events that may occur to single entities or events (particulars) and that help characterize them. Example concepts include birth, death, marriage, events, accidents, surprises, happenings, extrinsic, adjunctual
  • Contextual — circumstances or placements of single entities or events (particulars) that help characterize them. Example concepts include space, time, continuity, contiguous, smooth, otherness, ratings, level, situational (w/ respect to A), sensible, all placements thereto, derivative, classificatory, rankings.

For External Relations (A:B), the split is into these three categories:

  • Direct — a simple, direct relationship (no intermediaries) between two different objects (entities, events, or their types, considered as instances). Example concepts include is a, simple without parts, part of, members in types or classes, genealogical roles (parent, child, brother), identity, extensional
  • Copulative — relationships of combination, membership, quantity, action, or circumstance. Example concepts include accidental, real, place, time, situation, quantity, facets, aspects, conjunctive, one-to-many, many-to-one, sum of, contextual, verbs
  • Mediative — true, triadic external relations, such as “A gives B to C”; relationships of relevance, meaning or explanation – namely, thirdness – about subjects and types. Example concepts include concepts, generalities, similarity, genres, aspects, comparison, performance, thought, triadic, agreement/difference, placement in space/time (contiguity), conditional, reasoning, classification.

And, for Representations (re:A), the split is into these three categories:

  • Denotatives — icons or symbols that name or describe the subject. Example concepts include names, labels, images, descriptions, definitions, denotations, icons, designations, proper nouns
  • Indexes — indirect references or pointers that help situate or draw attention to the subject. Example concepts include URIs, identifiers, keys, indices, references, semes, propositions (w/o objects), codes, selections, directional, citations, pronouns [4]
  • Associatives — a situational and contextual assertion of proximity, affiliation or adjacency of the subject with regard to any contiguity. Example concepts include see also, lists, links (incoming + outgoing), associations, likenesses, resemblances [5].

The Third Level of the Relations Hierarchy

For the next level, we can continue with our process of categorization based on the universal categories applied to the nine categories listed above. Here is the schema that results from this approach:

Third Level of the KBpedia Relations Hierarchy

For the mid-level category of Intrinsic, the first of three categories under Attributes, here are the three subsidiary categorizations (representing the Firstness, Secondness and Thirdness, respectively):

  • Qualities — an internal characteristic or aspect of an object; collectively these define intensionally the kind of thing to which the object belongs, though that relationship is not intrinsic
  • Elementals — a contributing part of or integral input or aspect that adds to the understanding about the subject (A)
  • Configurations (forms) — forms or arrangements that are of the nature of, or perceivable in, the subject (A).

For the mid-level category of Adjunctual, here are the three subsidiary categorizations:

  • Quantities — a characteristic of a subject (A) that is expressed as a numeric quantity
  • Eventuals — chance, accidental or planned occurrences that directly involve subject (A)
  • Extrinsics — external events or circumstances that directly involve subject (A) or help define the nature or reality of subject (A).

For the mid-level category of Contextual, here are the three subsidiary categorizations:

  • Situants — attributes or characteristics that help situate, or place in a locational context, the subject (A)
  • Ratings — an assigned value or characterization that orders subject (A) in relation to other subjects for a given attribute
  • Classifications — a characterization of subject (A) that involves evaluating subject (A) and providing a multi-factor typing, coding or value in relation to a given attribute or set of attributes.

For the mid-level category of Direct, the first of three categories under External Relations, here are the three subsidiary categorizations:

  • Equivalences — a simple, direct relationship between a subject (A) and an object (B) that asserts equalness or sameness
  • Parts — a simple, direct relationship where the object (B) is a part of or component of subject (A), including the idea of ‘whole’
  • Descendants — a simple, direct relationship where object (B) is a direct child or parent or subsumption (hyponym) or supersumption (hypernym) to subject (A).

For the mid-level category of Copulative, here are the three subsidiary categorizations:

  • Typings (is B) — this simple relation is for all of the is-a relations to types (B) for subject (A). Identities and common names fit into this category
  • ActionTypes — simple relations of energetics, perception or thought of subject (A) to some other object (B)
  • Conjoins — relations that involve the joining of a subject (A) to an object (B) via an intermediate object.

For the mid-level category of Mediative, here are the three subsidiary categorizations:

  • Comparisons — relations that compare, contrast or size up similarities or differences or overlaps between subject (A) and object (B)
  • Performances — relations of quantity or rank for how subject (A) performed in relation to object (B)
  • Circumstances — relations of subject (A) to external circumstances, situations or contexts.

For the mid-level category of Denotatives, the first of three categories under Representations, here are the three subsidiary categorizations:

  • Media — iconic images or sounds that invoke the identification with a given object or representation. Media in this sense draws attention to the object [6]
  • Labels — symbolic text strings that help to name or draw attention to a particular object
  • Descriptions — text strings that may be longer than labels and provide additional or contextual information or specify attributes about the object, beyond drawing attention.

For the mid-level category of Indexes, here are the three subsidiary categorizations:

  • Pointers — physical or symbolic indicators of a given thing and which draw attention to it
  • Identifiers — generally (unique) symbols or strings that provide a key to the given subject, often within some conventional scheme for generating and recognizing the token assigned
  • Codes — an assigned symbolic token or string that groups the object with similar items; the generation and interpretation of the token is (often) done in relation to an understood schema. Indexes are included in this category.

And, lastly, for the mid-level category of Associatives, here are the three subsidiary categorizations:

  • Lists — an aggregation, either ordered or unordered, of objects similar to one another with respect to given characters or types
  • Relateds (see also) — an indicator of some nature to other objects similar or related to the given object; the criteria and degree or strength of relationship between the items are indeterminate
  • Augments — an indicator to an external factor in relation to the object, which factor itself leads to still further explanations.

These categories, then, complete the three levels underneath the KBpedia Knowledge Ontology’s Predications branch (itself a Secondness under the basic Thirdness of Generals). We have established these as concepts within KKO such that we may reason over the ideas of these categories, and do other conceptual work with them. This conceptualization is helpful in order to consider the ideas of predicates and the relationships between them. However, for mapping properties from external sources, we need a parallel structure in terms of KKO properties.
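For readers who prefer to see the hierarchy as data, here is a minimal Python sketch that transcribes the three levels listed above into a nested dictionary. The dictionary layout and the helper names `count_categories` and `path_to` are illustrative assumptions, not part of KKO; only the category names come from the lists above.

```python
# Three levels of the Predications branch, transcribed from the lists above.
# The nested-dict layout is an illustrative assumption, not the KKO encoding.
RELATIONS = {
    "Attributes": {
        "Intrinsic": ["Qualities", "Elementals", "Configurations"],
        "Adjunctual": ["Quantities", "Eventuals", "Extrinsics"],
        "Contextual": ["Situants", "Ratings", "Classifications"],
    },
    "External Relations": {
        "Direct": ["Equivalences", "Parts", "Descendants"],
        "Copulative": ["Typings", "ActionTypes", "Conjoins"],
        "Mediative": ["Comparisons", "Performances", "Circumstances"],
    },
    "Representations": {
        "Denotatives": ["Media", "Labels", "Descriptions"],
        "Indexes": ["Pointers", "Identifiers", "Codes"],
        "Associatives": ["Lists", "Relateds", "Augments"],
    },
}


def count_categories(tree):
    """Return the number of categories at the top, middle and third levels."""
    top = len(tree)
    mid = sum(len(mids) for mids in tree.values())
    third = sum(len(leaves) for mids in tree.values() for leaves in mids.values())
    return top, mid, third


def path_to(name, tree=RELATIONS):
    """Return the chain of categories from the top level down to `name`."""
    for top, mids in tree.items():
        if name == top:
            return [top]
        for mid, leaves in mids.items():
            if name == mid:
                return [top, mid]
            if name in leaves:
                return [top, mid, name]
    raise KeyError(name)


if __name__ == "__main__":
    print(count_categories(RELATIONS))
    print(" > ".join(path_to("Labels")))
```

Each level triples the one above it, so the counts multiply out from three top-level branches to nine mid-level and twenty-seven third-level categories.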

An Analogous Structure for Properties

We thus created an analogous structure for properties within KBpedia. One useful addition that is forthcoming in the KBpedia v 1.50 release is a mapping of 2500 Wikidata properties to this property schema, which represents more than 95% of all property assignments to the 25 million+ entities within Wikidata. This is one of the first tangible expressions of the benefits of having a fully rational predicate structure within KBpedia.

Attributes, External Relations and Representations are all expressed as OWL properties. In general, Attributes correspond to OWL datatype properties; External Relations to OWL object properties; and Representations to OWL annotation properties. These specific OWL terms are not used in our KKO grammar, however, because some attributes may be drawn from controlled vocabularies, such as colors or shapes, that can be represented as one of a list of attribute choices. In these cases, such attributes are defined as object properties. Nonetheless, the mapping of KKO’s grammar to existing OWL properties is quite close. In all cases, our mappings to external sources are done via the subPropertyOf relation in OWL.
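The correspondence just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the `ex:` and `kko:` property names are hypothetical placeholders, not actual KKO or Wikidata identifiers. What is standard is the three OWL property kinds and `rdfs:subPropertyOf`, which is the RDFS term behind the subPropertyOf relation.

```python
from enum import Enum


class Branch(Enum):
    ATTRIBUTES = "Attributes"
    EXTERNAL_RELATIONS = "External Relations"
    REPRESENTATIONS = "Representations"


def owl_property_kind(branch, value_is_iri):
    """Pick the OWL property kind for a property in the given branch.

    Representations are always annotation properties. For the other two
    branches the choice depends on whether the value is an object (IRI),
    as with a controlled vocabulary entry, or a literal data value.
    """
    if branch is Branch.REPRESENTATIONS:
        return "owl:AnnotationProperty"
    return "owl:ObjectProperty" if value_is_iri else "owl:DatatypeProperty"


def sub_property_statement(external_property, kko_property):
    """Render a mapping of an external property as a Turtle statement."""
    return f"{external_property} rdfs:subPropertyOf {kko_property} ."


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(owl_property_kind(Branch.ATTRIBUTES, value_is_iri=False))
    print(owl_property_kind(Branch.ATTRIBUTES, value_is_iri=True))
    print(sub_property_statement("ex:eyeColor", "kko:qualities"))
```

The second call shows the controlled-vocabulary case: an attribute whose value is one of a list of defined choices ends up as an object property rather than a datatype property.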

Object and Data Properties

Because a given external non-annotation property (that is, all properties except Representations) may either refer to another object (IRI) or to a string or data value, we created parallel listings of object and datatype properties within KKO. Here is the listing for the object properties group as shown by a screen capture from Protégé:

A Matching KBpedia Object Properties Hierarchy

Note that Protégé lists its items alphabetically, rather than our preferred ordering of Firstness, Secondness and Thirdness. You also should note the parallelism to the concept hierarchy discussed above. Not shown is the similar datatype properties structure.

Annotation Properties

As noted, Representations correspond to annotation properties, and that listing is shown in this screen capture:

A Matching KBpedia Annotation Properties Hierarchy

In OWL2, annotations may be organized in a subPropertyOf manner, even though inferencing is not possible over this structure. The organization, though, is helpful to understand relationships (and they can independently be reasoned over with the matching concept structure) and may be used in search, SPARQL or external analytic scripts.

We also organize the SKOS and Dublin Core annotation properties used by KKO into the categorical structure of these Representations, as shown by the screen capture above. This organization helps elucidate the structure under the Representations branch using these commonly applied properties.
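Because a reasoner will not infer over the annotation hierarchy, an external script has to walk it explicitly. The Python sketch below shows one way that might look. The parent table is a hypothetical fragment: the category names come from the Representations branch above, but the placement of `skos:prefLabel` and `dcterms:identifier` under particular categories is an assumption made for illustration, not a statement of how KKO assigns them.

```python
# child -> parent, mimicking subPropertyOf links among annotation properties.
# The placement of the SKOS and Dublin Core terms is assumed for illustration.
PARENT = {
    "skos:prefLabel": "Labels",
    "Labels": "Denotatives",
    "Denotatives": "Representations",
    "dcterms:identifier": "Identifiers",
    "Identifiers": "Indexes",
    "Indexes": "Representations",
}


def ancestors(prop, parent=PARENT):
    """Return the chain of parents above `prop`, nearest first."""
    chain = []
    seen = {prop}
    while prop in parent:
        prop = parent[prop]
        if prop in seen:
            raise ValueError(f"cycle detected at {prop}")
        seen.add(prop)
        chain.append(prop)
    return chain


def falls_under(prop, category, parent=PARENT):
    """True if `category` appears anywhere above `prop` in the hierarchy."""
    return category in ancestors(prop, parent)


if __name__ == "__main__":
    print(ancestors("skos:prefLabel"))
    print(falls_under("dcterms:identifier", "Indexes"))
```

A search or analytic script could use a lookup like `falls_under` to gather, say, every label-like annotation regardless of which vocabulary supplied it.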

Some Caveats and Next Part

This completes the hierarchical specification of the Predications branch within KKO. These 39 categories (3 + 9 + 27), following the categorization approach suggested by Peirce’s universal categories, now complete our additions to this next version of KBpedia.

This addition has been under intense and active development for the past year, though it has been in the works for much longer than that. Over that time, the names of categories and, indeed, even the splits themselves, have changed frequently and evolved. Peirce’s Logic of Relatives [7], as one of the two main sources, is itself laid out in a dichotomous structure, though elsewhere in Peirce’s writings — as we showed in the previous parts — these are often expressed in Peirce’s more standard trichotomous structure. I am sure with use and more experience that we will see further refinements to the KBpedia relations schema as time goes on. Please do not be surprised if you see further changes.

We are now nearly complete with the advance material leading up to the KBpedia v 1.50 release. We will have one more part in our series, recapitulating the terminology and grammar for this new version. The announcement following that will finally release version 1.50.

This series on KBpedia relations covers topics from background, to grammar, to design, and then to implications from explicitly representing relations in accordance with the principles put forth through the universal categories by Charles Sanders Peirce. Relations are an essential complement to entities and concepts in order to extract the maximum information from knowledge bases. This series accompanies the next release of KBpedia (v 1.50), which includes the relations enhancements discussed.

[1] Peirce citation schemes tend to use an abbreviation for source, followed by volume number using Arabic numerals followed by section number (such as CP 1.208) or page, depending on the source. For CP, see the electronic edition of The Collected Papers of Charles Sanders Peirce, reproducing Vols. I-VI, Charles Hartshorne and Paul Weiss, eds., 1931-1935, Harvard University Press, Cambridge, Mass., and Arthur W. Burks, ed., 1958, Vols. VII-VIII, Harvard University Press, Cambridge, Mass.

[2] M.K. Bergman, 2016. “A Foundational Mindset: Firstness, Secondness, Thirdness,” AI3:::Adaptive Information blog, March 21, 2016.

[3] “I confine the word representation to the operation of a sign or its relation to the object for the interpreter of the representation.” (CP 1.540) “A very broad and important class of triadic characters [consists of] representations. A representation is that character of a thing by virtue of which, for the production of a certain mental effect, it may stand in place of another thing.” (CP 1.564) “as my analysis makes it to be, that a percept contains only two kinds of elements, those of firstness and those of secondness, then the great overshadowing point of difference is that the perceptual judgment professes to represent something, and thereby does represent something, whether truly or falsely. This is a very important difference, since the idea of representation is essentially what may be termed an element of “Thirdness,” that is, involves the idea of determining one thing to refer to another.” (CP 7.630)

[4] “Indices may be distinguished from other signs, or representations, by three characteristic marks: first, that they have no significant resemblance to their objects; second, that they refer to individuals, single units, single collections of units, or single continua; third, that they direct the attention to their objects by blind compulsion. But it would be difficult if not impossible, to instance an absolutely pure index, or to find any sign absolutely devoid of the indexical quality. Psychologically, the action of indices depends upon association by contiguity, and not upon association by resemblance or upon intellectual operations.” (CP 2.306)

[5] “Association is the only force which exists within the intellect, and whatever power of controlling the thoughts there may be can be exercised only by utilizing these forces; indeed, the power, and even the wish, to control ourselves can come about only by the action of the same principles. Still, the force of association in its native strength and wildness is seen best in persons whose understandings are so little developed that they can hardly be said to reason at all. Believing one thing puts it into their heads to believe in another thing; but they know not how they come by their beliefs, and can exercise no control over the inferential process. These unconscious and uncontrolled reasonings hardly merit that name; although they are very often truer than if they were regulated by an imperfect logic, showing in this the usual superiority of instinct over reason, and of practice over theory.” (CP 7.453)

[6] “The only way of directly communicating an idea is by means of an icon; and every indirect method of communicating an idea must depend for its establishment upon the use of an icon. Hence, every assertion must contain an icon or set of icons, or else must contain signs whose meaning is only explicable by icons. The idea which the set of icons (or the equivalent of a set of icons) contained in an assertion signifies may be termed the predicate of the assertion.” (CP 2.278)

[7] C.S. Peirce, 1897. “Logic of Relatives,” The Monist Vol VII, No 2, pp. 161-217. See https://ia801703.us.archive.org/12/items/jstor-27897407/27897407.pdf

KBpedia Relations, Part III: A Three-Relations Model

Wed, 05/24/2017 - 16:15


Attributes, External Relations and Representations Form the Trichotomy

The forthcoming release of KBpedia version 1.50 deals primarily with the addition of a relations schema to the knowledge structure. In the previous part of this series, I discussed the event-action model at the heart of the schema. Actions are external relations between two objects, including parts, for which we use the variable shorthand of A and B. In terms of the universal categories of Charles Sanders Peirce [1], these dyadic relations are a Secondness that we formally term External Relations, or A:B.

But external relations do not constitute the complete family of relations. External relations are but one form of predicate we require. We need relations that cover the full range of language usage, as well as the entire scope of OWL properties. KBpedia is written in OWL2, which is the semantic ontology language extension of RDF. We need to capture the three types of properties in OWL2, namely object properties, datatype properties, and annotation properties.

In Peircean terms, we thus need relations that characterize the subject itself (A:A), which are mostly datatype properties, as well as statements about the subject (re:A), which are annotation properties. Most external relations are represented by object properties in OWL2, but sometimes datatype properties are also used [2].

We call relations of subject characterizations Attributes, and these are a Firstness within Peirce’s universal categories. We call relations about the subject Representations, and these are a Thirdness within the universal categories. The purpose of this part in our series is to introduce and define these three main branches of relations — Attributes, External Relations, and Representations — within the KBpedia schema.

Nature of the Trichotomy

In accordance with the design basis of KBpedia, we use Peirce’s universal categories and his writings on logic and semiosis (signs) to provide the intellectual coherence for the design. For the analysis of relations, two additional Peirce manuscripts were closely studied. The first manuscript is Peirce’s first on the logic of relations, from 1870, which goes by the shorthand of DNLR [3]. The second manuscript was from nearly 30 years later, known simply as the “Logic of Relatives” [4]. These two manuscripts deal with the ideas of internal and external relations, or the Attributes and External Relations, respectively, discussed above, and relate more to the predicate side of propositions. However, we also need to reference or point to the subjects of the proposition and to predicates, for which the Representations portion applies. Our need here was to organize a full breadth of relations in context with the universal categories, the needs of knowledge representation, and the structure of properties within OWL2 [5]. The result, we think, is consistent with the Peircean architectonic, but modernized for KR purposes.

For example, Peirce notes “the law of logic governs the relations of different predicates of one subject” (CP 1.485). In expanding on this law he states:

“Now logical terms are of three grand classes.

“The first embraces those whose logical form involves only the conception of quality, and which therefore represent a thing simply as “a ──”. These discriminate objects in the most rudimentary way, which does not involve any consciousness of discrimination. They regard an object as it is in itself as such (quale); for example, as horse, tree, or man. These are absolute terms.

“The second class embraces terms whose logical form involves the conception of relation, and which require the addition of another term to complete the denotation. These discriminate objects with a distinct consciousness of discrimination. They regard an object as over against another, that is as relative; as father of, lover of, or servant of. These are simple relative terms.

“The third class embraces terms whose logical form involves the conception of bringing things into relation, and which require the addition of more than one term to complete the denotation. They discriminate not only with consciousness of discrimination, but with consciousness of its origin. They regard an object as medium or third between two others, that is as conjugative; as giver of ── to ──, or buyer of ── for ── from ──. These may be termed conjugative terms.

“The conjugative term involves the conception of third, the relative that of second or other, the absolute term simply considers an object. No fourth class of terms exists involving the conception of fourth, because when that of third is introduced, since it involves the conception of bringing objects into relation, all higher numbers are given at once, inasmuch as the conception of bringing objects into relation is independent of the number of members of the relationship. Whether this reason for the fact that there is no fourth class of terms fundamentally different from the third is satisfactory or not, the fact itself is made perfectly evident by the study of the logic of relatives.” (CP 3.63) [1]

We take the first “class” above as largely relating to the Attributes. The next two classes, including the conjugative terms, we relate to External Relations. To this we add the Representations as the Thirdness within our revised relations category. Each of these three categories is described more fully below with further discussion as to the rationale for these splits.

We think this organization of relational categories is consistent with Peirce’s thinking, even though he never had today’s concepts of computerized knowledge representation as an objective for his analysis. For example, he labeled one of his major sections “The Conceptions of Quality, Relation and Representation, Applied to this Subject.” (1867, “Upon Logical Comprehension and Extension”; CP 2.418) Thirty-five years later, Peirce still held to this split: “. . . there are but three elementary forms of predication or signification, which as I originally named them (but with bracketed additions now made to render the terms more intelligible) were qualities (of feeling), (dyadic) relations, and (predications of) representations.” (1903, EP 424; CP 1.561)

And, of course, human intelligence and communication is a symbolic world. So, our computer-reasoning basis should also be geared to the manipulation of ideas, which in a knowledge context is the accumulation of (approximately) known false and known true assertions about the world. These are our statements or propositions or assertions. Peirce elaborates:

“Now every simple idea is composed of one of three classes; and a compound idea is in most cases predominantly of one of those classes. Namely, it may, in the first place, be a quality of feeling, which is positively such as it is, and is indescribable; which attaches to one object regardless of every other; and which is sui generis and incapable, in its own being, of comparison with any other feeling, because in comparisons it is representations of feelings and not the very feelings themselves that are compared [Attributes]. Or, in the second place, the idea may be that of a single happening or fact, which is attached at once to two objects, as an experience, for example, is attached to the experiencer and to the object experienced [External Relations]. Or, in the third place, it is the idea of a sign or communication conveyed by one person to another (or to himself at a later time) in regard to a certain object well known to both [Representations].” (CP 5.7) (Emphasis brackets added.)

Peirce’s recommendations as to how to analyze a question proceed from defining the domain and its relations (the speculative grammar) to the logical analysis of it, including hypotheses about still questionable areas or emerging from new insights or combinations. The methods of this progression should be purposeful and targeted to produce a better likelihood of economic results or outcomes. This overall process he called the pragmatic maxim, and it is a key insight into Peirce’s reputation as the father of pragmatism.

The concepts above, then, represent our starting speculative grammar for how to organize the relations, including the choice of the three adics, or branches of the trichotomy [6]. We also set the guidance of how each adic branch may be analyzed and split according to the universal categories (which is the subject of the next part, Part IV, in this series).

Nature of Propositions and Predicates

In terms of Peirce’s more formal definition of signs, a proposition is a dicisign, and it consists of a subject + predicate. (CP 2.316) A predicate is a rhema (CP 2.95). In terms of OWL2 with its RDF triples (subject – property – object), the predicate in this model is property + object, or a multitude of annotations that are representations of the subject. Further, “Every proposition refers to some index” (CP 2.369), that is its subject (also referred to as object). “Thus every kind of proposition is either meaningless or has a real Secondness as its object.” (CP 2.315) The idea of an individual type (or general) is also a Secondness [7]. “I term those occasions or objects which are denoted by the indices the subjects of the assertion.” (CP 2.238) The assertion gives us the basic OWL statement, also known as a triple.

A proposition captures a relation, which is the basis for the assertion about the ‘subjects’. “Any portion of a proposition expressing ideas but requiring something to be attached to it in order to complete the sense, is in a general way relational. But it is only a relative in case the attachment of indexical signs will suffice to make it a proposition, or, at least, a complete general name.” (CP 3.463)  “But the Logic of Relations has now reduced logic to order, and it is seen that a proposition may have any number of subjects but can have but one predicate which is invariably general.” (CP 5.151)

We now have the building blocks to represent the nature of the proposition:

The Proposition Subject-Predicate Model

Subject(s) and a general predicate make up the proposition (statement, assertion). Subjects need to be individual things (including generals) and are defined, denoted and indicated by various indexical representations, including icons and images. Active predicates that may be reasoned over include the attributes (characteristics) of individual subjects or the relations between objects.
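To make the subject-predicate reading of a triple concrete, here is a minimal Python sketch. The `Triple` type, the `predicate` helper and the `ex:` names are hypothetical and chosen only for illustration; the one idea taken from the text is that the predicate of a proposition is the property together with its object.

```python
from typing import NamedTuple, Tuple


class Triple(NamedTuple):
    """An RDF-style statement: subject, property, object."""
    subject: str
    prop: str
    obj: str


def predicate(triple: Triple) -> Tuple[str, str]:
    """Return the predicate of the proposition: property plus object."""
    return (triple.prop, triple.obj)


def as_proposition(triple: Triple) -> str:
    """Render the triple in subject + predicate form."""
    prop, obj = predicate(triple)
    return f"[{triple.subject}] + [{prop} {obj}]"


if __name__ == "__main__":
    # Hypothetical example using Peirce's "father of" relative term.
    statement = Triple("ex:A", "ex:fatherOf", "ex:B")
    print(as_proposition(statement))
```

Read this way, an external relation such as “father of” is incomplete until an object is attached, which matches Peirce's description of simple relative terms.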

This basic structure also lends itself to information theoretics. “Every term has two powers or significations, according as it is subject or predicate. The former, which will here be termed its breadth, comprises the objects to which it is applied; while the latter, which will here be termed its depth, comprises the characters which are attributed to every one of the objects to which it can be applied.” (CP 2.473) Peirce importantly defines the total information regarding a subject to consist of the “sum of synthetical propositions in which the symbol is subject or predicate, or the information concerning the symbol.” (CP 2.418) In other words, information = breadth x depth. We can reason over attributes and external relations, but our total information also consists in our representations.
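As a toy illustration of breadth and depth, the Python sketch below treats breadth as the set of objects a term applies to and depth as the set of characters shared by every one of those objects. The data, the `ex:` names and the decision to turn “breadth x depth” into a product of two counts are all assumptions made for this example; Peirce's notion of information is not a simple count, so this is one crude reading rather than a definition.

```python
# Toy data, invented for illustration: which objects a term applies to,
# and which characters are attributed to each object.
TERM_APPLIES_TO = {
    "horse": {"ex:horse1", "ex:horse2"},
}
OBJECT_CHARACTERS = {
    "ex:horse1": {"animate", "four-legged", "maned"},
    "ex:horse2": {"animate", "four-legged", "maned", "grey"},
}


def breadth(term):
    """Objects to which the term is applied."""
    return TERM_APPLIES_TO.get(term, set())


def depth(term):
    """Characters attributed to every object the term applies to."""
    objects = breadth(term)
    if not objects:
        return set()
    return set.intersection(*(OBJECT_CHARACTERS[o] for o in objects))


def information(term):
    """Crude numeric reading of breadth x depth: a product of set sizes."""
    return len(breadth(term)) * len(depth(term))


if __name__ == "__main__":
    print(sorted(depth("horse")))
    print(information("horse"))
```

Note that “grey” drops out of the depth of “horse” because it is not attributed to every object in the term's breadth.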

These insights give us some powerful bases for defining and categorizing the terms or tokens within our knowledge space. By following these constructs, I believe we can extract the maximum information from our input content.

Definitions of the Relations

Best practice using semantic technologies includes providing precise, actionable definitions to key concepts and constructs. Here are the official statements regarding this trichotomy of relations.

Attribute Relations

Attributes are the intensional characteristics of an object, event, entity, type (when viewed as an instance), or concept. The relationship is between the individual instance (or Particular) and its own attributes and characteristics, in the form of A:A. Attributes may be intrinsic characteristics or essences of single particulars, such as colors, shapes, sizes, or other descriptive characteristics. Attributes may be adjunctual or accidental happenings to the particular, such as birth or death. Or, attributes may be contextual in terms of placing the particular within time or space or in relation to external circumstances.

Attributes are specific to the individual, and only include events that are notable for the individual. They are a Firstness, and in totality try to capture the complete characteristics of the individual particular, which is a Secondness.

These attributes are categorized according to these distinctions and grouped and organized into types, which will be presented in the next part.

External Relations

External relations are assertions between an object, event, entity, type, or concept and another particular or general. An external relationship has the form of A:B. External relations may be simple ones of a direct relationship between two different instances. External relations may be copulative by combining objects or asserting membership, quantity, action or circumstance. Or, external relations may be mediative to provide meaning, context, relevance, generalizations, or other explanations of the subject with respect to the external world. External relations are extensional.

External relations are by definition a Secondness. These external relations are categorized according to these distinctions and grouped and organized into types, which will be presented in the next part.

The events discussion in the previous Part II pertained mostly to external relations.

Representational Relations

Representations are signs (CP 8.191), and the means by which we point to, draw or direct attention to, or designate, denote or describe a particular object, entity, event, type or general. A representational relationship has the form of re:A. Representations can be designative of the subject, that is, be icons or symbols (including labels, definitions, and descriptions). Representations may be indexes that more-or-less help situate or provide traceable reference to the subject. Or, representations may be associations, resemblances and likelihoods in relation to the subject, more often of indeterminate character.

The representational relation includes what is known as annotations or metadata in other contexts, such as images, links, labels, descriptions, pointers, references, or indexes. Representations cannot be reasoned over, except through abductive reasoning, but some characteristics may be derived or analyzed through non-inferential means.

These representations are categorized according to these distinctions and grouped and organized into types, which will be presented in the next part.

Summary of the Three Relations

We can now pull these threads together to present a summary chart of these three main relational branches:

[Figure: KBpedia Three Relations Model]

This trichotomy sets the boundaries and affirms the method by which further sub-divisions will be presented in the next installment in this series.
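As a rough illustration of the trichotomy, the sketch below records the three branches with their shorthand forms and the approximate OWL property kind each corresponds to, following the mapping given in the endnotes. The names `RelationBranch`, `BRANCHES`, and `branch_for` are hypothetical and are not part of KKO. The OWL correspondence is only approximate, since some attributes drawn from controlled vocabularies are modeled as object properties.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelationBranch:
    """One of the three top-level relation branches (illustrative only)."""
    name: str
    shorthand: str      # the A:A, A:B, re:A forms used in the text
    owl_property: str   # approximate OWL counterpart, per the endnotes


# Hypothetical lookup table; the keys are arbitrary labels for this sketch.
BRANCHES = {
    "attribute": RelationBranch("Attributes", "A:A", "owl:DatatypeProperty"),
    "external": RelationBranch("External Relations", "A:B", "owl:ObjectProperty"),
    "representation": RelationBranch("Representations", "re:A", "owl:AnnotationProperty"),
}


def branch_for(shorthand: str) -> RelationBranch:
    """Return the branch whose shorthand form matches, e.g. 'A:B'."""
    for branch in BRANCHES.values():
        if branch.shorthand == shorthand:
            return branch
    raise ValueError(f"unknown relation form: {shorthand!r}")


if __name__ == "__main__":
    for form in ("A:A", "A:B", "re:A"):
        b = branch_for(form)
        print(f"{form:5} {b.name:20} ~ {b.owl_property}")
```

A flat table is used here only because the sub-divisions of each branch are deferred to the next installment.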

A Strong Relation Schema

We now have a much clearer way to build up the assertions in our knowledge representations, according to linguistic predicate construction and predicate calculus. We can now explicitly state a premise underlying our choice of Peirce and his architectonic for the design of KBpedia: it is the most accurate, expressive basis for capturing human language and logical reasoning, both individually and together. Our ability to create new symbolic intelligence from human knowledge requires that we be able to compute and reason over human language.

In the next part we will establish sub-categories for each of these three branches according to the universal categories [8].

This series on KBpedia relations covers topics from background, to grammar, to design, and then to implications from explicitly representing relations in accordance with the principles put forth through the universal categories by Charles Sanders Peirce. Relations are an essential complement to entities and concepts in order to extract the maximum information from knowledge bases. This series accompanies the next release of KBpedia (v 150), which includes the relations enhancements discussed.

[1] Peirce citation schemes tend to use an abbreviation for source, followed by volume number using Arabic numerals followed by section number (such as CP 1.208) or page, depending on the source. For CP, see the electronic edition of The Collected Papers of Charles Sanders Peirce, reproducing Vols. I-VI, Charles Hartshorne and Paul Weiss, eds., 1931-1935, Harvard University Press, Cambridge, Mass., and Arthur W. Burks, ed., 1958, Vols. VII-VIII, Harvard University Press, Cambridge, Mass. For EP, see Nathan Houser and Christian Kloesel, eds., 1992. The Essential Peirce – Volume 1, Selected Philosophical Writings (1867–1893), Indiana University Press, 428 pp. For EP2, see The Peirce Edition Project, 1998. The Essential Peirce – Volume 2, Selected Philosophical Writings (1893-1913), Indiana University Press, 624 pp.

[2] Attributes, External Relations and Representations comprise OWL properties. In general, Attributes correspond to the OWL datatype properties; External Relations to the OWL object properties; and Representations to the OWL annotation properties. These specific OWL terms are not used in our speculative grammar, however, because some attributes may be drawn from controlled vocabularies, such as colors or shapes, that can be represented as one of a list of attribute choices. In these cases, such attributes are defined as object properties. Nonetheless, the mapping of our speculative grammar to existing OWL properties is quite close. In the actual KKO, these labels are replaced with AttributeTypes, RelationTypes, and RepresentationTypes, respectively, when talking about Generals, to conform to the typing terminology of the ontology.

[3] C.S. Peirce, 1870. “Description of a Notation for the Logic of Relatives, Resulting from an Amplification of the Conceptions of Boole’s Calculus of Logic”, Memoirs of the American Academy of Arts and Sciences 9, 317–378, 26 January 1870. Reprinted, Collected Papers (CP3.45–149), Chronological Edition (CE2, 359–429).

[4] C.S. Peirce, 1897. “Logic of Relatives,” The Monist Vol VII, No 2, pp. 161-217. See https://ia801703.us.archive.org/12/items/jstor-27897407/27897407.pdf

[5] The attributes-relation split has been a not uncommon one in the KB literature, insofar as such matters are discussed. For example, see Nicola Guarino, 1997. “Some Organizing Principles for a Unified Top-level Ontology,” in AAAI Spring Symposium on Ontological Engineering, pp. 57-63. Also, see Yankai Lin, Zhiyuan Liu, and Maosong Sun, 2016. “Knowledge Representation Learning with Entities, Attributes and Relations.” ethnicity 1 (2016): 41-52; the authors propose splitting existing KG-relations into attributes and relations, and propose a KR model with entities, attributes and relations (KR-EAR).

[6] See further M.K. Bergman, 2016. “A Speculative Grammar for Knowledge Bases”, AI3:::Adaptive Information blog, June 20, 2016.

[7] Peirce recognized the importance of being able to talk of the individual type or general as an object in itself. It was not until the OWL 2 revision that such punning was added to the OWL language.

[8] Some additional useful quotes from Peirce related to this topic of relations and these splits are (with emphases per the originals):

  • “Whether or not every proposition has a principal subject, and, if so, whether it can or cannot have more than one, will be considered below. A proposition may be defined as a sign which separately indicates its object. For example, a portrait with the proper name of the original written below it is a proposition asserting that so that original looked. If this broad definition of a proposition be accepted, a proposition need not be a symbol. Thus a weathercock “tells” from which direction the wind blows by virtue of a real relation which it would still have to the wind, even if it were never intended or understood to indicate the wind. It separately indicates the wind because its construction is such that it must point to the quarter from which the wind blows; and this construction is distinct from its position at any particular time. But what we usually mean by a proposition or judgment is a symbolic proposition, or symbol, separately indicating its object. Every subject partakes of the nature of an index, in that its function is the characteristic function of an index, that of forcing the attention upon its object. Yet the subject of a symbolic proposition cannot strictly be an index. When a baby points at a flower and says, “Pretty,” that is a symbolic proposition; for the word “pretty” being used, it represents its object only by virtue of a relation to it which it could not have if it were not intended and understood as a sign. The pointing arm, however, which is the subject of this proposition, usually indicates its object only by virtue of a relation to this object, which would still exist, though it were not intended or understood as a sign. But when it enters into the proposition as its subject, it indicates its object in another way. For it cannot be the subject of that symbolic proposition unless it is intended and understood to be so. Its merely being an index of the flower is not enough. 
It only becomes the subject of the proposition, because its being an index of the flower is evidence that it was intended to be. In like manner, all ordinary propositions refer to the real universe, and usually to the nearer environment. Thus, if somebody rushes into the room and says, “There is a great fire!” we know he is talking about the neighbourhood and not about the world of the Arabian Nights’ Entertainments. It is the circumstances under which the proposition is uttered or written which indicate that environment as that which is referred to. But they do so not simply as index of the environment, but as evidence of an intentional relation of the speech to its object, which relation it could not have if it were not intended for a sign. The expressed subject of an ordinary proposition approaches most nearly to the nature of an index when it is a proper name which, although its connection with its object is purely intentional, yet has no reason (or, at least, none is thought of in using it) except the mere desirability of giving the familiar object a designation.” (CP 2.357)
  • “But it remains to point out that there are usually two Objects, and more than two Interpretants. Namely, we have to distinguish the Immediate Object, which is the Object as the Sign itself represents it, and whose Being is thus dependent upon the Representation of it in the Sign, from the Dynamical Object, which is the Reality which by some means contrives to determine the Sign to its Representation. In regard to the Interpretant we have equally to distinguish, in the first place, the Immediate Interpretant, which is the interpretant as it is revealed in the right understanding of the Sign itself, and is ordinarily called the meaning of the sign; while in the second place, we have to take note of the Dynamical Interpretant which is the actual effect which the Sign, as a Sign, really determines. Finally there is what I provisionally term the Final Interpretant, which refers to the manner in which the Sign tends to represent itself to be related to its Object. I confess that my own conception of this third interpretant is not yet quite free from mist.” (CP 4.536)
  • “A rhema which has one blank is called a monad; a rhema of two blanks, a dyad; a rhema of three blanks, a triad; etc. A rhema with no blank is called a medad, and is a complete proposition. A rhema of more than two blanks is a polyad. A rhema of more than one blank is a relative. Every proposition has an ultimate predicate, produced by putting a blank in every place where a blank can be placed, without substituting for some word its definition.” (CP 4.438)
  • “Hence, as soon as we admit the idea of absurdity, we are bound to class the rejection of an argumentation among argumentations. Thus, as was said, a proposition is nothing more nor less than an argumentation whose propositions have had their assertiveness removed, just as a term is a proposition whose subjects have had their denotative force removed.” (CP 2.356)
  • “The only way of directly communicating an idea is by means of an icon; and every indirect method of communicating an idea must depend for its establishment upon the use of an icon. Hence, every assertion must contain an icon or set of icons, or else must contain signs whose meaning is only explicable by icons. The idea which the set of icons (or the equivalent of a set of icons) contained in an assertion signifies may be termed the predicate of the assertion.” (CP 2.278)
  • “Thus, we have in thought three elements: first, the representative function which makes it a representation; second, the pure denotative application, or real connection, which brings one thought into relation with another; and third, the material quality, or how it feels, which gives thought its quality.†” (CP 5.290)
  • “Every informational sign thus involves a Fact, which is its Syntax. It is quite evident, then, that Indexical Dicisigns equally accord with the definition and the corollaries.” (CP 2.320)
  • “The monad has no features but its suchness, which in logic is embodied in the signification of the verb. As such it is developed in the lowest of the three chief forms of which logic treats, the term, the proposition, and the syllogism.” (CP 1.471)
  • “The unity to which the understanding reduces impressions is the unity of a proposition. This unity consists in the connection of the predicate with the subject; and, therefore, that which is implied in the copula, or the conception of being, is that which completes the work of conceptions of reducing the manifold to unity.” (CP 1.548)
  • “This search resulted in what I call my categories. I then named them Quality, Relation, and Representation. But I was not then aware that undecomposable relations may necessarily require more subjects than two; for this reason Reaction is a better term. Moreover, I did not then know enough about language to see that to attempt to make the word representation serve for an idea so much more general than any it habitually carried, was injudicious. The word mediation would be better.” (CP 4.3)
  • “Every thought, or cognitive representation, is of the nature of a sign. “Representation” and “sign” are synonyms. The whole purpose of a sign is that it shall be interpreted in another sign; and its whole purport lies in the special character which it imparts to that interpretation.” (CP 8.191)

KBpedia Relations, Part II: An Event-Action Model

Mon, 05/15/2017 - 17:59

Events are a Quasi-Entity and Cross All of Peirce’s Universal Categories

Most knowledge graphs have an orientation to things and concepts, what might be called the nouns of the knowledge space. Entities and concepts have certainly occupied my own attention in my work on the UMBEL and KBpedia ontologies over the past decade. In Part I of this series, I discussed how knowledge graphs, or ontologies, needed to move beyond simple representations of things to embrace how those things actually interact in the world, which is the understanding of context. What I discuss in this part is an event-action model: a way to see actions and events that is logical and coherent.

As we all know, knowledge statements or assertions are propositions that combine a subject with a predicate. Those predicates might describe the nature or character of the subject or might relate the subject to other objects or situations. For covering this aspect we need to pay close attention to the verbs or relations that connect these things.

How we model or represent these things is one of the critical design choices in a knowledge graph. But these choices go beyond simply using RDF or OWL properties (or whatever predicate basis your modeling language may provide). Modeling relations and predicates needs to capture a worldview of how things are connected, preferably based on some coherent, underlying rationale. Similar to how we categorize the things and entities in our world, we also need to make ontological choices (in the classic sense of the Greek ontos, or the nature of being) as to what a predicate is and how predicates may be classified and organized. As I noted in Part I, much less is discussed about this topic in the literature.

Some Things Just Take Time

Information and information theory have been my passion my entire professional career. In that period, two questions stand out as the most perplexing to me, each of which took some years to resolve to some level of personal intellectual satisfaction. My first perplexing question was how to place information in text onto a common, equal basis with the information in a database, such as a structured record. (Yeah, I know, kind of a weird question.) This problem, which we would now describe as placing unstructured, semi-structured and structured information onto a common footing, was finally solved for me by the RDF (Resource Description Framework) data model. But, prior to RDF, and for perhaps a decade or more, I thought long and hard and read much on this question. I’m sure there were other data models out there at the time that could have perhaps given me the way forward, but I did not discover them. It took RDF and its basic subject-predicate-object (s-p-o) triple assertion to show me the way forward. It was not only a light going on once I understood, but the opening of a door to a whole new world of thinking about knowledge representation.
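A minimal sketch of why the triple puts text-derived and record-derived information on one footing: both reduce to the same (subject, predicate, object) shape and can be merged into one set. The `Triple` type, the `ex:` identifiers, and the helper `triples_from_record` are illustrative assumptions; no RDF library is used here.

```python
from typing import Dict, List, NamedTuple


class Triple(NamedTuple):
    """A bare subject-predicate-object assertion (illustrative, not an RDF API)."""
    subject: str
    predicate: str
    obj: str


def triples_from_record(record_id: str, record: Dict[str, object]) -> List[Triple]:
    """Turn each field of a structured record into one s-p-o assertion."""
    return [Triple(record_id, f"ex:{field}", str(value))
            for field, value in record.items()]


# Structured source: a database-style record.
record_triples = triples_from_record(
    "ex:War_of_1812", {"startYear": 1812, "endYear": 1815}
)

# Unstructured source: an assertion read off the sentence
# "The War of 1812 ended in 1815."
text_triples = [Triple("ex:War_of_1812", "ex:endYear", "1815")]

# Both sources land in the same graph; the duplicate assertion merges.
graph = set(record_triples) | set(text_triples)

if __name__ == "__main__":
    for t in sorted(graph):
        print(t.subject, t.predicate, t.obj)
```

Using a set makes the point that an assertion is the same assertion regardless of whether it came from text or from a record.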

My second question is one that has been gnawing at me for at least five or six years. The question is, What is an event? (Yeah, I know, another kind of weird question.) When one starts representing information in a knowledge graph, we model things. My early idea of an entity was that it was some form of nameable thing. By that light, the War of 1812, or a heartbeat, or an Industrial Age are all entities. But these things are events, granted of greatly varying length, that are somehow different from tangible objects that we can see or sense in the world, and different from ideas or thoughts. These things all differ, but how and why?

Actually, the splits noted in the prior paragraph give us this clue. Events are part of time, occupy some length of time, and sometimes are so notable as to get their own names, either as types or named events. They have no substance or tangibility. These characteristics are surely different than tangible objects which occupy some space, have physicality, exist over some length of time, and also get their own names as types or named instances. And both of these are different still than concepts or ideas that are creatures of thought.

These distinctions, mostly first sensed or intuited, are hard to think about because we need a vocabulary and mindset (context) by which to evaluate and discern believable differences. For me, the idea and definition of What is an event? was my focus and entry point to try to probe this question. Somehow, I felt events to be a key to the very structures used for knowledge representation (KR) or knowledge-based artificial intelligence (KBAI), which need to be governed by some form of conceptual schema. In the semantic Web space, such schemas are known as “ontologies”, since they attempt to capture the nature or being (Greek ὄντως, or ontós) of the knowledge domain at hand. Because the word ‘ontology’ is a bit intimidating, a better variant has proven to be the knowledge graph (because all semantic ontologies take the structural form of a graph). In Cognonto’s KBAI efforts, we tend to use the terms ontology and knowledge graph interchangeably.

A key guide to this question of What is an event? is the work of Charles Sanders Peirce, the great 19th century American logician, polymath and philosopher. His theory of the universal categories, which he termed Firstness, Secondness and Thirdness, provides the groundings for his views on logic and sign-making. As we’ve noted before about KBpedia, Peirce’s theory of universal categories greatly informs how we have constructed it [1]. Peirce’s categories, while unique, are an organizational framework not unlike the categories of being put forward by many philosophers; see [1] for more background on this topic.

Thus, with liberal quotes from Peirce himself [2], I work through below some of the background context for how we treat events — and related topics such as actions, relations, situations and predicates — in our pending KBpedia v 150 release.

What is an Event?

Of course, I am hardly raising new questions. The philosophical question of What is an event? is readily traced back to Plato and Aristotle. The fact we have no real intellectual consensus as to What is an event? after 2500 years suggests both that it is a good question, but also that any “answer” is unlikely to find consensus. Nonetheless, I think through the application of Peircean principles we can still find a formulation that is coherent and logically consistent (and, thus, computable).

To begin this evaluation, let’s first summarize the diversity of views of What is an event? Given the long history of this question, and the voluminous writings and diversity of opinion on the matter, a good place to start is the Stanford Encyclopedia of Philosophy, which offers a kind of CliffsNotes version overviewing various views on events [3], among many other articles in philosophy. I encourage interested students of this question to study that entry in detail. However, we can summarize the various views as fitting into one or more of these definitions:

  • Events are objects (also potentially referred to as entities)
  • Events are facts
  • Events are actions
  • Events are properties
  • Events are times
  • Events are situations.

Within the context of current knowledge bases, the Cyc knowledge base, for example, asserts situations are a generalization of events, and actions are a specialization of events. Most current semantic Web ontologies place events in the same category or class as entities. Some upper ontologies model time in a different way by viewing objects as temporal parts that change over time, or by other dichotomous splits around the questions of events and actions. As I said, there is really no consensus model for events and actually little discussion of them.

From our use in general language, I think we can fairly define events as having some of these characteristics:

  • Events occur in time; “For example, everyday experience is that events occur in time, and that time has but one dimension.” (CP 1.273)
  • Events have a beginning, duration and end
  • Events may range from nearly instantaneous (beta decay) to periods spanning centuries or millennia (the Industrial Age, the Cenozoic Era)
  • Events can refer to individual instances (tokens in Peircean terms) or general types
  • Events may be single or recurring (birthdays)
  • Events occur, or “take place”
  • Events, if sufficiently notable, can be properly named (World War II).
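The characteristics listed above can be sketched as a small data structure. This is only an illustration under stated assumptions: the `Event` class and its field names are hypothetical, time is reduced to plain numbers (calendar years in the example), and nothing here reflects how KKO actually models events.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    """Hypothetical record of the event characteristics listed in the text."""
    label: str
    start: float                       # beginning, in arbitrary time units
    end: float                         # end, in the same units
    is_type: bool = False              # general type rather than individual token
    recurring: bool = False            # e.g. birthdays
    proper_name: Optional[str] = None  # set only for sufficiently notable events

    def __post_init__(self) -> None:
        # An event occurs in one-dimensional time, so it cannot end before it begins.
        if self.end < self.start:
            raise ValueError("an event cannot end before it begins")

    @property
    def duration(self) -> float:
        """Length of time the event occupies."""
        return self.end - self.start


# An individual, named event (units are calendar years in this example).
ww2 = Event("World War II", start=1939, end=1945, proper_name="World War II")

# A recurring event type; one unit stands for one day here.
birthday = Event("birthday", start=0, end=1, is_type=True, recurring=True)

if __name__ == "__main__":
    print(ww2.label, "lasted", ww2.duration, "years")
```

The validation in `__post_init__` is the only behavior; everything else is descriptive, which matches the list being a set of characteristics rather than a procedure.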

For Peirce, “We perceive objects brought before us; but that which we especially experience — the kind of thing to which the word ‘experience’ is more particularly applied — is an event. We cannot accurately be said to perceive events;” (CP 1.336). He further states that “If I ask you what the actuality of an event consists in, you will tell me that it consists in its happening then and there. The specifications then and there involve all its relations to other existents. The actuality of the event seems to lie in its relations to the universe of existents.” (CP 1.24)

We often look for causes for events, but Peirce cautions us that, “Men’s minds are confused by a looseness of language and of thought which leads them to talk of the causes of single events. They ought to consider that it is not the single actuality, in its identity, which is the subject of a law, but an ingredient of it, an indeterminate predicate. Consequently, the question is, not whether each and every event is precisely caused, in one respect or another, but whether every predicate of that event is caused.” (EP p 396) Peirce notes that the chance flash or shock, say a natural phenomenon like a lightning strike or an accident, which by definition is not predictable, can itself through perception of or reaction to the shock “cause” an event. Chance occurrences are a central feature in Peirce’s doctrine of tychism.

Though events are said to occur, to happen or to take place, entities are said to exist. From Peirce again:

“The event is the existential junction of states (that is, of that which in existence corresponds to a statement about a given subject in representation) whose combination in one subject would violate the logical law of contradiction. The event, therefore, considered as a junction, is not a subject and does not inhere in a subject. What is it, then? Its mode of being is existential quasi-existence, or that approach to existence where contraries can be united in one subject. Time is that diversity of existence whereby that which is existentially a subject is enabled to receive contrary determinations in existence.” (CP 1.494)

Nonetheless, “Individual objects and single events cover all reality . . . .” (CP 5.429). Other possibly useful statements by Peirce regarding events may be found under [4].

In these regards, we can see both entities and single events as individual instances within our KBpedia Knowledge Ontology, what we call Particulars, which represent the second (or Secondness) of the three main branches in KKO. Per our use of the universal categories to evaluate our subsequent category structures (see [1]) within Particulars, events are treated as a Secondness due to their triggering and “quasi-existence” nature, with entities treated as a Thirdness [5]. Like entities, we can also generalize events into types, which are placed under the third main branch of KKO, the Generals. Event types can be defined and are real in a similar way to entity types.

Within events, we can also categorize according to the three universal categories. What I present below are comments with respect to the event examples first mentioned in the introductory material above.

Events Resulting from Chance or Flash

As noted, the chance event, the unexpected flash or shock, is a Firstness within events. Peirce’s doctrine of tychism places a central emphasis on chance, being viewed as the source of processes in nature such as evolution and the “surprising fact” that causes us to re-investigate our assumptions leading to new knowledge. “Chance is any event not especially intended, either not calculated, or, with a given and limited stock of knowledge, incalculable.” (CP 6.602 ref). The surprising fact is the spark that causes us to continually reassess the nature of the world, to assess and categorize anew [6].

“Anything which startles us is an indication, in so far as it marks the junction between two portions of experience. Thus a tremendous thunderbolt indicates that something considerable happened, though we may not know precisely what the event was.” (EP What is a Sign, Sec 5) The chance event or shock joins energetic effort or perception as the stimulants of action. “Effort and surprise are the only experiences from which we can derive the concept of action.” (EP p 385)

The chance shock produces sensation, which itself may be a stimulant (reaction) to produce further action. “This is present in all sensation, meaning by sensation the initiation of a state of feeling; — for by feeling I mean nothing but sensation minus the attribution of it to any particular subject. In my use of words, when an ear-splitting, soul-bursting locomotive whistle starts, there is a sensation, which ceases when the screech has been going on for any considerable fraction of a minute; and at the instant it stops there is a second sensation. Between them there is a state of feeling.” (CP 1.332)

Still, the chance shock is the one form of event for which there is not a discernible cause-and-effect. It remains inexplicable because the triggering event remains unpredictable. “In order to explain what I mean, let us take one of the most familiar, although not one of the most scientifically accurate statements of the axiom viz.: that every event has a cause. I question whether this is exactly true . . . . may it be that chance, in the Aristotelian sense, mere absence of cause, has to be admitted as having some slight place in the universe . . . . ” (W4:546)

Events Resulting from Actions

We more commonly associate an event with action, and that is indeed a major cause of events. (Though, as we saw, chance events or accidents, as an indeterminate group, may trigger events.) An action is a Secondness, however, because it is always paired with a reaction. Reactions may then cause new actions, each itself a new event. In this manner activities and processes can come into being, which while combinatorial and compound, can also be called events, including those of longer duration. That entire progression of multiple actions represents increasing order, and thus the transition to Thirdness.

One of Peirce’s more famous quotes deals with the question of action and reaction, even with respect to our cognition, and their necessary pairing:

“We are continually bumping up against hard fact. We expected one thing, or passively took it for granted, and had the image of it in our minds, but experience forces that idea into the background, and compels us to think quite differently. You get this kind of consciousness in some approach to purity when you put your shoulder against a door and try to force it open. You have a sense of resistance and at the same time a sense of effort. There can be no resistance without effort; there can be no effort without resistance. They are only two ways of describing the same experience. It is a double consciousness. We become aware of ourself in becoming aware of the not self. The waking state is a consciousness of reaction; and as the consciousness itself is two sided, so it has also two varieties; namely, action, where our modification of other things is more prominent than their reaction on us, and perception, where their effect on us is overwhelmingly greater than our effect on them.” (CP 1.324)

So, we see that actions can be triggered by chance, energetic effort, perceptions, and reactions to prior actions, sometimes cascading into processes involving a chain of actions, reactions and events.

Thoughts as Events

Peirce makes the interesting insight that thoughts are events, too. “Now the logical comprehension of a thought is usually said to consist of the thoughts contained in it; but thoughts are events, acts of the mind. Two thoughts are two events separated in time, and one cannot literally be contained in the other.” (CP 5.288) Similarly, in “The Law of Mind” Peirce calls an idea “an event in an individual consciousness” (CP 6.105). Through these assertions, the sticky question of thinking and cognition (always placed as a Thirdness by Peirce) is clearly put into the event category.

Events Resulting from Continuity

Of course, the essence of Thirdness is continuity, what Peirce called synechism. Continuity in a temporal sense consists of events, some infinitesimal, transitioning from one to another; the breaks, if we are to become aware of them, are merely breaks in the continuity of time. Entities, at least as we define them, provide a similar function, but now over the continuity of space. All objects are deformations of continuous space. By this neat trick of relating events to time and entities to space, all of which is (yes, singular) continuous, Peirce cracked one of the hard metaphysical nuts. Some claim that Peirce was the first philosopher to anticipate the space-time continuum [7].

In addition to its embeddedness and embedding into continuity, there are also actions which themselves are expressions of triadic relations. Peirce first says:

“Let me remind you of the distinction … between dynamical, or dyadic, action; and intelligent, or triadic action. An event, A, may, by brute force, produce an event, B; and then the event, B, may in its turn produce a third event, C. The fact that the event, C, is about to be produced by B has no influence at all upon the production of B by A. It is impossible that it should, since the action of B in producing C is a contingent future event at the time B is produced. Such is dyadic action, which is so called because each step of it concerns a pair of objects.” (CP 5.472)

In triadic action, the classic example is ‘A gives B to C’ (EP 2 170-171). The other classic triadic example is Peirce’s sign relation between object, sign and interpretant. Peirce adopted the term semiosis for this triadic relation and defined it to mean an “action, or influence, which is, or involves, a coöperation of three subjects, such as a sign, its object, and its interpretant, this tri-relative influence not being in any way resolvable into actions between pairs” (EP 2 411). Peirce’s reduction thesis also maintains that all higher order relationships (polyadic with more than three terms) can be decomposed to monadic, dyadic or triadic relations. All three are required to capture the universe of potential relations, but any relation can be reduced to one of those three [1]. Further, Peirce also maintained that the triadic relation is primary, with monadic and dyadic relations being degenerate forms of it.

Now, all symbols (therefore, also the basis for human language) are also a Thirdness. The symbol (sign) stands for an object which the interpretant understands to have a meaning relationship to the object. Symbols, too, may be causes of events. As Peirce states, “Thus a symbol may be the cause of real individual events and things. It is easy to see that nothing but a symbol can be such a cause, since a cause is by its definition the premiss of an argument; and a symbol alone can be an argument.” (EP, p 317).

It is not always easy to interpret Peirce. The ideas of creation and destruction, for example, would seem to be elements of Firstness, being closely allied to the idea of potentiality. Yet, as Peirce states, “But the event may, on the other hand, consist in the coming into existence of something that did not exist, or the reverse. There is still a contradiction here; but instead of consisting in the material, or purely monadic, repugnance of two qualities, it is an incompatibility between two forms of triadic relation . . . .” (CP 1.493) This statement seems to suggest that creation and destruction are somehow related to Thirdness. It is not always easy to evaluate where certain concepts fit within Peirce’s universal categories.

While I am unsure of some aspects of my Peircean analysis as to exact placements into Firstness, Secondness and Thirdness, what is also true is that Peirce sets conditions and mindsets for looking at these very questions. Ultimately open questions such as I mention are amenable to analysis and argumentation according to the principles underlying Peirce’s universal categories of Firstness, Secondness and Thirdness. The challenge is not due to the criteria for evaluation; rather, it comes from probing what is truly meant and implied within any question.

The Event-Action Model

What does not seem complicated or confusing is Peirce’s basic model for actions. Actions are grounded in events. As discussed above, Peirce provides for us broad and comprehensive examples of events — chance, actions, thoughts and continuity. Events are always things that occur to an individual (including the idea of an individual type). Peirce categorically states that “. . . the two chief parts of the event itself are the action and the reaction . . . .” (CP 5.424)

We now have the building blocks to enable us to diagram the event-action model embodied in KBpedia:

The KBpedia Event-Action Model

The single event may arise from any of the bulleted items shown on this diagram. Though every action is paired with a reaction, one or the other might be more primary for different kinds of events. As Peirce notes, the event represents a juxtaposition of states, the comparison of the subject before and after the event providing the basis for the nature of the event. Each change in state represents a new event, which can trigger new actions and reactions leading to still further events. Simple events represent relatively single changes in state, such as turning off a light switch or a bolt of lightning. More complicated events are the topic of the next section.
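To make the state-change reading of an event concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the `Event` structure, the `detect_event` and `cascade` helpers, and the `REACTIONS` table are hypothetical names invented for this example and are not part of KBpedia or KKO.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Event:
    """A change in state of one subject (hypothetical structure, not KKO)."""
    subject: str
    before: str
    after: str


def detect_event(subject: str, before: str, after: str) -> Optional[Event]:
    """Compare the subject before and after; no change in state, no event."""
    if before == after:
        return None
    return Event(subject, before, after)


# Hypothetical reaction table: an event may trigger a further change of state.
REACTIONS = {
    Event("light switch", "on", "off"): ("room", "lit", "dark"),
}


def cascade(first: Event) -> List[Event]:
    """Follow triggered reactions from event to event until none remains."""
    chain = [first]
    seen = {first}
    while chain[-1] in REACTIONS:
        nxt = detect_event(*REACTIONS[chain[-1]])
        if nxt is None or nxt in seen:
            break  # nothing changed, or we would loop forever
        chain.append(nxt)
        seen.add(nxt)
    return chain


switch_off = detect_event("light switch", "on", "off")
chain = cascade(switch_off)
for event in chain:
    print(f"{event.subject}: {event.before} -> {event.after}")
```

The point of the sketch is only that an event is identified by comparing two states of the same subject, and that one event can be the trigger for the next.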

Relating Events More Broadly to KBpedia

The next diagram relates events more broadly to KBpedia particulars and generals. Events may be the triggers for actions, both embedded within situations that provide their context (and often influence the exact nature and course of the event). Events also precede the creation or destruction of entities, which are manifestations, and events do occur over time to affect those very same entities. These events and these entities are singular, and if notable, we may give the individual instances of them names. These items are shown at the top of the diagram:

KBpedia Particulars and Types

On the left-hand portion of the diagram we have the cascade from events to actions to activities and then processes. The progression down the cascade requires the chaining together of events and actions and paired reactions, getting more ordered and purposeful as we move down the cascade. Of course, single events may trigger single actions and reactions, and events express themselves at every level of the cascade. Events and actions always occur in a situation, the context of which may have influence on the resulting nature of the event and its actions and reactions. This mediation is the exact reason that Thirdness is a logical foundation.

Parallel with events are entities, which themselves result from events. Entities, too, may be named. This side of the diagram cascade leads to classes or types of individual entities, which now become generals, and that may be classified or organized by types. A similar type aggregation may be applied to individual events, actions, activities and processes. At this point, we are moving into the territory of what is known as Peirce’s token-type distinction (particulars v generals). With generals, we now move beyond the focus of events; see further my own typology piece for this transition [8].

Events Provide the Entree to Understand Relations

Events are like the spark that leads us to better understand actions and what emerges from them, which in turn helps us better understand predicates and relations. These are topics for the next parts in this series.

What we learn from Peirce is that events are quasi-entities, based on time rather than space, and, like entities, are a Secondness. Like entities, we can name events and intrinsically inspect their attributes. Events may also range from the simple to the triadic and durative. Events are the fundamental portions of activity and process cascades, and also capture such seemingly non-energetic actions as thought. Thought, itself, may be a source of further events and action, as may be the expressions of our thought, symbols. And actions always carry with them a reaction, which can itself be the impetus for the next action in the event cascade.

What this investigation shows us is that events are the real triggering and causative factors in reality. Entities are a result and manifestation of events, but less central to the notion of relations. Events, like entities, can be understood through Peirce’s universal categories of Firstness, Secondness and Thirdness.

Events help give us a key to understand the dynamic nature of Charles Peirce’s worldview. I hope in subsequent parts of this series to help elucidate further how an understanding of events helps to unmask the role and purpose of relations. Though entities, events and generals may all be suitable subjects for our assertions within knowledge bases and knowledge graphs, it is really through the relations of our system, in this case KBpedia, where we begin to understand the predicates and actions of our chosen domain.

This series on KBpedia relations covers topics from background, to grammar, to design, and then to implications from explicitly representing relations in accordance with the principles put forth through the universal categories by Charles Sanders Peirce. Relations are an essential complement to entities and concepts in order to extract the maximum information from knowledge bases. This series accompanies the next release of KBpedia (v 150), which includes the relations enhancements discussed.

[1] M.K. Bergman, 2016. “The Irreducible Truth of Threes,” AI3:::Adaptive Information blog, September 27, 2016.

[2] Peirce citation schemes tend to use an abbreviation for source, followed by volume number using Arabic numerals followed by section number (such as CP 1.208) or page, depending on the source. For CP, see the electronic edition of The Collected Papers of Charles Sanders Peirce, reproducing Vols. I-VI, Charles Hartshorne and Paul Weiss, eds., 1931-1935, Harvard University Press, Cambridge, Mass., and Arthur W. Burks, ed., 1958, Vols. VII-VIII, Harvard University Press, Cambridge, Mass. For EP, see Nathan Houser and Christian Kloesel, eds., 1992. The Essential Peirce – Volume 1, Selected Philosophical Writings (1867–1893), Indiana University Press, 428 pp. For EP2, see The Peirce Edition Project, 1998. The Essential Peirce – Volume 2, Selected Philosophical Writings (1893-1913), Indiana University Press, 624 pp.

[3] Roberto Casati and Achille Varzi, 2014. “Events,” Stanford Encyclopedia of Philosophy, Aug 27, 2014. Retrieved May 2, 2017.

[4] What constitutes the potentials, realized particulars, and generalizations that may be drawn from a query or investigation is contextual in nature. That is why the mindset of Peirce’s triadic logic is a powerful guide to how to think about and organize the things and ideas in our world. We can apply this triadic logic to any level of information granularity. Here are some further Peirce statements about events:

  • “An event always involves a junction of contradictory inherences in the subjects existentially the same, whether there is a simple monadic quality inhering in a single subject, or whether they be inherences of contradictory monadic elements of dyads or polyads, in single sets of subjects. But there is a more important possible variation in the nature of events. In the kind of events so far considered, while it is not necessary that the subjects should be existentially of the nature of subjects — that is, that they should be substantial things — since it may be a mere wave, or an optical focus, or something else of like nature which is the subject of change, yet it is necessary that these subjects should be in some measure permanent, that is, should be capable of accidental determinations, and therefore should have dyadic existence. But the event may, on the other hand, consist in the coming into existence of something that did not exist, or the reverse. There is still a contradiction here; but instead of consisting in the material, or purely monadic, repugnance of two qualities, it is an incompatibility between two forms of triadic relation, as we shall better understand later. In general, however, we may say that for an event there is requisite: first, a contradiction; second, existential embodiments of these contradictory states; [third,] an immediate existential junction of these two contradictory existential embodiments or facts, so that the subjects are existentially identical; and fourth, in this existential junction a definite one of the two facts must be existentially first in the order of evolution and existentially second in the order of involution.”(CP 1.493)
  • “A Sinsign (where the syllable sin is taken as meaning “being only once,” as in single, simple, Latin semel, etc.) is an actual existent thing or event which is a sign. It can only be so through its qualities; so that it involves a qualisign, or rather, several qualisigns. But these qualisigns are of a peculiar kind and only form a sign through being actually embodied.” (EP Nomenclature of Triadic; p 291) That is, a sinsign is either an existing thing or event; further, events have attributes.
  • “Another Universe [Secondness] is that of, first, Objects whose Being consists in their Brute reactions, and of, second, the facts (reactions, events, qualities, etc.) concerning those Objects, all of which facts, in the last analysis, consist in their reactions. I call the Objects, Things, or more unambiguously, Existents, and the facts about them I call Facts. Every member of this Universe is either a Single Object subject, alike to the Principles of Contradiction and to that of Excluded Middle, or it is expressible by a proposition having such a singular subject.” (EP p 479) The latter is an event.
[5] Under Particulars, the instantiation of qualities (making them subjects as opposed to unformed potentiality) is the Firstness.

[6] M.K. Bergman, 2016. “A Foundational Mindset: Firstness, Secondness, Thirdness,” AI3:::Adaptive Information blog, March 21, 2016.

[7] See, for example, A. Nicolaidis, 2008. “Categorical Foundation of Quantum Mechanics and String Theory,” arXiv:0812.1946, 10 Dec 2008. It is not unusual to see grand claims for the foresight exhibited by Peirce in his writings, which sometimes have an inadequate basis for the claims of prescience. However, Peirce’s general observations often pre-date current, modern interpretations, even if not fully articulated.

[8] M.K. Bergman, 2016. “Threes All the Way Down to Typologies,” AI3:::Adaptive Information blog, October 13, 2016.

Pulse: A Major Survey of KBAI

Wed, 05/10/2017 - 13:36
This is the Place to Start with the Academic Literature

Dan Roth and his former post-doc, Yangqiu Song, yesterday released a major paper on machine learning with knowledge bases, “Machine Learning with World Knowledge: The Position and Survey” [1]. This 20-page paper with 250 references is a goldmine of starting points and a useful organizational schema for how to look at various machine learning applications and methods based on large-scale knowledge bases. The paper covers the exact territory that we refer to as knowledge-based artificial intelligence, or KBAI.

I always have my eye out for papers by Roth. Both he and his colleague at the University of Illinois Urbana-Champaign, Jiawei Han, publish well-thought-out and practical papers in the areas of knowledge representation and data mining. Various groups at Illinois also offer open-source software resources useful to these tasks. These efforts, in my view, are some of the best available worldwide.

The authors are proponents for the use of knowledge bases in machine learning for the same reasons we are: “Two essential problems of machine learning are how to generate features and how to acquire labels for machines to learn. Particularly, labeling large amount of data for each domain-specific problem can be very time consuming and costly. It has become a key obstacle in making learning protocols realistic in applications.” The authors then go on to address in specifics how knowledge bases can overcome these problems.

Besides the valuable reference listing, the other real contribution of the paper is how it frames and organizes what roles knowledge bases can play in AI. Like the paper’s problem statement, the authors organize their presentation around features and labels. They discuss the use and role of various techniques in relation to machine learning applications. The way they structure their presentation should be a help to those new to KBAI and the variety of terminology inherent to the field. Based on my own experience, I find their characterizations and guidance to be spot on.

I really have only one bone to pick with this otherwise excellent paper. Nowhere do the authors discuss the quality, coherence or accuracy of the underlying knowledge bases used to perform these tasks. Many of the cited KB sources have known quality problems, some of which we have discussed before, such as Wikipedia (coverage and category structure), YAGO (reliance on WordNet), Freebase (highly variable quality and no longer maintained), Cyc (questionable upper ontology and mis-assignments), DBpedia (simplistic schema), etc. One of the major reasons for our efforts with KBpedia is to continue to work to create cleaner training environments suitable to machine learning so as to reduce the GIGO problem.

Still, quibbles aside, this paper will prove highly useful to anyone interested in distant supervised machine learning and knowledge-based artificial intelligence. For the foreseeable future, this paper should be a standard reference in your KBAI library.

[1] Yangqiu Song and Dan Roth, 2017. “Machine Learning with World Knowledge: The Position and Survey,” arXiv:1705.02908, 8 May 2017.

KBpedia Relations, Part I: Smarter Knowledge Graphs

Mon, 05/08/2017 - 16:19
It’s Time for Ontologies to Put on Their Big Boy Pants

Many of us have been in the semantic Web game for more than a decade. My own first exposure in the early 2000s was spent trying to figure out what the difference was between XML and RDF. (Fortunately, that confusion has long since passed.) We also grappled with the then-new concept of ontologies, now more easily understood as knowledge graphs. In this process many lessons have been learned, but also much promise has yet to be realized.

One of the most important lessons is that the semantic Web is best seen not as an end unto itself. Rather, it and the semantic technologies that underlie it are really just a means to get at important, longstanding challenges in data interoperability and artificial intelligence. Our work with knowledge graphs needs to be viewed through this lens of what we can do with these technologies to address real problems, not solely for technology’s sake.

It is with this spirit in mind that we are working on our next release of KBpedia, the knowledge structure that knits together six major public knowledge bases for the purpose of speeding machine learning and providing a scaffolding for data interoperability. This pending new release will expand KBpedia in important ways. It will provide a next step on the path to realizing the promise of knowledge graphs.

I will be sharing a series of articles to lay the groundwork for this release, as well as then, after release, to explain what some of it means. This first article begins by discussing the state-of-the-art in semantic knowledge graphs, what they currently do, and what they (often) currently don’t. I grade each of three major areas related to knowledge graphs, in declining order of achievement. My basic report is that we have gotten many things right — witness the growth and credibility of knowledge graphs across all current search services and intelligent agents — but it is time for knowledge graphs to grow out of knickers and don big boy pants.

Important note: Some ontologies in industrial, engineering and biomedical realms do take a more sophisticated view of relations and data. However, these are not the commonly known knowledge graphs used in artificial intelligence, natural language understanding, or intelligent, virtual agents. These latter areas are my primary focus due to our emphasis on knowledge-based artificial intelligence.

The Current Knowledge Graph Reader: Concepts and Entities

We watch our children first learn the names of things as they begin mastering language. The learning focus is on nouns, and building a vocabulary about the things that populate the tangible world. By the time we begin putting together our first sentences, lampooned in early books such as Dick and Jane and the dog Spot, our nouns are getting increasingly numerous and rich, though our verbs remain simple. Early language acquisition, as with the world itself, is much more populated by different kinds of objects than different kinds of actions. Our initial verbs tend to be fewer in number and much less varied than the differences of form and circumstance we can see from objects. Most knowledge graphs have an orientation to things and concepts, the nouns of the knowledge space, much like a Dick and Jane reader. Entities and concepts have occupied my own work on the UMBEL and KBpedia ontologies over the past decade. It is clear that a similar emphasis has occurred in public knowledge bases, as well. Nouns and categorizing things have been the major focus of efforts to date.

For example, major knowledge base constituents of KBpedia, such as Wikidata, Wikipedia or GeoNames, have millions of concepts or entities within them, but fewer than a few thousand predicates (approx. 2500 useful in Wikidata and 750 or so in DBpedia and schema.org). Further, reasoners that we apply over these graphs have not been expanded to deal with rich predicates. Reasoners mostly rely on inference over subsumption hierarchies, disjointness, and property conditions like cardinality and range. Mapping predicates are mostly related to subsumption and equivalence, with the latter commonly misused [1].

Yet, even within the bounds of nouns, we unfortunately have not done well in identifying context. Disambiguation is made difficult without context. Though context may be partially described by nouns related to perception, situations, states and roles, we ultimately require an understanding of events, actions and relations. Until these latter factors are better captured and understood, our ability to establish context remains limited.

The semantic technology languages of RDF and OWL give us the tools to handle these constructs, at least within the limits of first-order logic, but we have mostly spent the past 15 years mastering kindergarten-level basics. To illustrate how basic this is, try to understand how different knowledge graphs treat entities (are they individuals, instances, particulars, or including events or concepts?) versus concepts (are they classes, types, generals, including or not abstractions?). There is certainly not uniformity of treatment of these basic noun grammars. Poor mappings and the inability to capture context further drag down this grade.

Grade: B-

Only the Simplest of Relations

I’ve already noted the paucity of relations in (most) current knowledge graphs. But a limited vocabulary is not the only challenge.

There is no general or coherent theory expressed in how to handle relations in use within the semantic Web. We have expressions that characterize individual things, what we, in our own work, term attributes. We have expressions that name or describe things, including annotations or metadata, what we term denotatives. We have expressions that point to or indicate things, what we term indexicals. And, we have expressions that characterize relations between external objects or actions an agent might take, what we term external relations. These are some of our terms for these relations, which we will describe in detail in the second and third parts of this series, but it is unlikely you will find most or all of these distinctions in any knowledge graph. This lack is a reflection of the inattention to relations.

Modeling relations and predicates needs to capture a worldview of how things are connected, preferably based on some coherent, underlying rationale. Similar to how we categorize the things and entities in our world, we also need to make ontological choices (in the classic sense of the Greek ontos, or the nature of being) as to what a predicate is and how predicates may be classified and organized. Not much is discussed about this topic in the knowledge graph literature, let alone put into practice.

The semantic Web has no well-known or accepted ontology of relations or properties. True, OWL offers the distinction of annotation, object and datatype properties, and also allows property characteristics such as transitivity, domain, range, cardinality, inversion, reflexivity, disjunction and the like to be expressed, but it is a rare ontology that uses any or many of these constructs. The subProperty expression is used, but only in limited instances and rarely (none, to my knowledge) in a systematic schema. For example, it is readily obvious that some broader predicate such as animalAction could be split into involuntaryAction and voluntaryAction, and then into specific actions such as breathing or walking, and so on, but schemas with these kinds of logical property subsumptions are not evident. Structurally, OWL can be used to reason over actions and relations in much the same way as we reason over entities and types, but our common ontologies have yet to do so. Yet creating such schemas is within grasp, since we have language structures such as VerbNet and other resources we could put to the task.
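As a rough illustration of what reasoning over such a property hierarchy could look like, here is a small Python sketch. It uses plain dictionaries rather than an OWL reasoner, and the table, function names and sample fact are assumptions made up for this example; only the predicate names animalAction, involuntaryAction, voluntaryAction, breathing and walking come from the text above.

```python
# Hypothetical property hierarchy, child -> parent, in the spirit of
# rdfs:subPropertyOf. This is an illustration, not an actual KBpedia schema.
SUB_PROPERTY_OF = {
    "involuntaryAction": "animalAction",
    "voluntaryAction": "animalAction",
    "breathing": "involuntaryAction",
    "walking": "voluntaryAction",
}


def super_properties(prop):
    """Return every ancestor of a property, nearest first."""
    ancestors = []
    while prop in SUB_PROPERTY_OF:
        prop = SUB_PROPERTY_OF[prop]
        ancestors.append(prop)
    return ancestors


def infer(triples):
    """Add each triple entailed by property subsumption:
    if (s, p, o) holds and p is a subproperty of q, then (s, q, o) holds."""
    inferred = set(triples)
    for s, p, o in triples:
        for parent in super_properties(p):
            inferred.add((s, parent, o))
    return inferred


# Sample assertion (invented): Rex walks in the park.
facts = {("Rex", "walking", "park")}
closure = infer(facts)
for triple in sorted(closure):
    print(triple)
```

With a hierarchy like this in place, a query for every animalAction involving Rex would also find the walking assertion, which is the kind of inference over relations that is routine for classes and types but rarely set up for properties.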

We want to establish such a schema so as to be able to reason and organize (categorize) actions and relations. We further want such a schema to segregate out intrinsic relations (attributes) from relations between things, or from descriptions about or indexes to things. This greater understanding is exactly what is needed to reason over relations. It is also what is called for in being able to relate parsed tokens to a semantic grammar. Relation and fact extraction from text further requires this form of schema. Without these broader understandings, we cannot adequately capture situations and context, necessary for disambiguating the things in our world.

Though the splits and names may not be exactly as I would have preferred, we nonetheless have sufficient syntax and primitives in OWL by which we can develop such a schema of relations. However, since virtually nothing has been done in this regard over the 15 years of the semantic Web, I have to decrement its grade accordingly.

Grade: C

Oh, and Then There’s the Problem with Data

Besides machine learning, my personal motivations and strongly held beliefs in semantic technologies have been driven by the role they can play in data interoperability. By this term I mean the ability to bring information together from two or more sources so as to effectively analyze and make decisions over the combined information. The first challenge in data interoperability is to ensure that when we talk about things in two or more settings, we understand whether we are talking about the same or different things. To date, this has been a primary use of semantic technologies, though equivalence distinctions remain problematic [1]. We can now relate information in unstructured, semi-structured and structured formats to a common basis. Ontologies are getting mature for capturing nouns. That portion of data interoperability, as noted above, gets a grade of B-.

But there are two additional factors in play with data interoperability. The first is to ensure we are understanding situations and contexts, what received a grade of C above. The remaining factor is actually relating the values associated with the entities or things at hand. In this regard, our track record to date has been abysmal.

As Kingsley Idehen is wont to explain, the linked data model of the semantic Web can be seen to conform to the EAV (entity-attribute-value) data model. We can do pretty well about entities (E), so long as we agree what constitutes an entity and we can accept some mis-assignments. No one really agrees as to what constitutes an attribute (A) (a true attribute, a property, or something other). And while we all intuitively know what constitutes a value, there is no agreement as to data types, units, or ways to relate values in different formats to one another. Though the semantic Web knows how to pump out data using the EAV model, there’s actually very little guidance on how we ingest and conform values across sources. Without this factor, there is no data interoperability. The semantic Web may know how to port relational data to a semantic model, but it still does not know how to reconcile values. The ABox, in description logic terms, is barely being tapped [2].
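The value-reconciliation gap can be shown with a short Python sketch. Everything in it is an assumption for illustration: the two sources, the “Bridge X” records, the attribute mapping and the hand-written conversion table are invented, and a real system would draw on a shared vocabulary of units and datatypes rather than a small dictionary.

```python
# Hypothetical EAV records from two sources: (entity, attribute, value, unit).
source_a = [("Bridge X", "length", 100.0, "m")]
source_b = [("Bridge X", "span_length", 328.084, "ft")]

# Assumed mapping of source attribute names onto one shared attribute.
ATTRIBUTE_MAP = {"length": "length", "span_length": "length"}

# Conversion factors to metres (one international foot is exactly 0.3048 m).
TO_METRES = {"m": 1.0, "ft": 0.3048}


def conform(records):
    """Rewrite each record onto the shared attribute and a common unit."""
    conformed = []
    for entity, attribute, value, unit in records:
        metres = round(value * TO_METRES[unit], 1)
        conformed.append((entity, ATTRIBUTE_MAP[attribute], metres, "m"))
    return conformed


merged = set(conform(source_a)) | set(conform(source_b))
print(merged)
```

Only after both the attribute names and the units have been conformed do the two records collapse into one comparable assertion; without those steps the sources look as though they disagree.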

We fortunately have a rich reservoir of past logical, semantic and philosophical writings to draw upon in relation to all of these factors. We also have many formalized measuring systems and crosswalks between many of them. We are also seeing a renewed effort surrounding more uniform ways to characterize the essential characteristics of data, namely quantities, units, dimensions and datatypes (QUDT) [3]. Better models for data interoperability and resolving these areas exist. Yet insufficient time and effort have been expended to bring these resources together into a logical, computable schema. Until all of these factors are brought together with focus, actual data interoperability based on semantic technologies will remain limited.

Grade: C-

Why Important to AI?

Relations identification and contextual understanding are at the heart of current challenges in artificial intelligence applications related to knowledge and text. Without these perspectives, it is harder to do sentiment analysis, “fact” (or assertion) extraction, reasoning over relations, reasoning over attributes, context analysis, or disambiguation. We need to learn how to speak the King’s English in these matters, and graduate beyond kindergarten readers.

Deep learning and the inclusion of both supervised and unsupervised machine learning are best served when the feature (variable) pool is rich and logically coherent, and when the output targets are accurately defined. “Garbage in, garbage out” applies to artificial intelligence learning in the very same ways it applies to any kind of computational activity. We want coherence, clarity and accuracy in our training sets and corpuses no less than we want it in our analysis and characterizations of the world.

“Dirty” training bases with embedded error can be trained to do no better than their inputs. If we want to train our knowledge applications with Dick and Jane reader inputs, too often in error to begin with, we will not get beyond the most basic of knowledge levels. We cannot make the transition to more sophisticated levels without a more sophisticated understanding of the symbolic means for communicating knowledge: that is, human language. Predicate understanding expressed through predicate representations is necessary for predicate logic.

To be sure, progress has been made in the first decade and one-half of the semantic Web. We have learned many best practices and have started to get pretty good in capturing nouns and their types. But what results is a stilted, halting conversation. To begin to become fluent, our knowledge bases must be able to capture and represent verbs, actions and events.

The Anticipated Series

Part II of this series will discuss what the ontological role is of events, and how that relates to a broader model of actions, activities and situations. This foundation will enable a discussion in Parts III and IV of the actual relations model in KBpedia, and how it is expressed in the KBpedia Knowledge Ontology (KKO). A summary of the KBpedia grammar will be provided in Part V. These next parts will set the context for, and are timed to coincide with, our release of KBpedia v 150, which incorporates these new representations.

After this release of KBpedia, the series will continue to discuss such topics as what is real and reality, and some speculations as to practical applications arising from the new relations capabilities in KBpedia. Some of the topics to be discussed in concluding parts will be semantic parsers and natural language understanding, robotics as a driving force in expanded knowledge graphs, and best practices for constructing capable ontologies.

Throughout this series I will repeatedly hearken to the teachings of Charles Sanders Peirce, and how his insights in logic and sign-making help inform the ontological choices that we are making. We have been formulating our thoughts in this area for years, and Peirce provides important guidance for how to crack some very hard nuts. I hope we can help the evolving state of knowledge graphs grow up a bit, in the process establishing a more complete, coherent and logical basis for constructing knowledge structures useful for advances in artificial intelligence.

This series on KBpedia relations covers topics from background, to grammar, to design, and then to implications from explicitly representing relations in accordance with the principles put forth through the universal categories by Charles Sanders Peirce. Relations are an essential complement to entities and concepts in order to extract the maximum information from knowledge bases. This series accompanies the next release of KBpedia (v 150), which includes the relations enhancements discussed.

[1] M. K. Bergman, 2009. “When Linked Data Rules Fail,” AI3:::Adaptive Information blog, November 16, 2009.
[2] As I have previously written: “Description logics and their semantics traditionally split concepts and their relationships from the different treatment of instances and their attributes and roles, expressed as fact assertions. The concept split is known as the TBox (for terminological knowledge, the basis for T in TBox) and represents the schema or taxonomy of the domain at hand. The TBox is the structural and intensional component of conceptual relationships. The second split of instances is known as the ABox (for assertions, the basis for A in ABox) and describes the attributes of instances (and individuals), the roles between instances, and other assertions about instances regarding their class membership with the TBox concepts.”
[3] See QUDT – Quantities, Units, Dimensions and Data Types Ontologies, Retrieved May 6, 2017.

Fare Thee Well, OpenCyc

Tue, 04/04/2017 - 18:18

AI Brings an End to an Era

OpenCyc is no longer available to the public. Without notice and with only some minor statements on Web pages, Cycorp has pulled OpenCyc from the marketplace. It appears this change occurred in March 2017. After 15 years, the abandonment of OpenCyc represents the end of one of the more important open source knowledge graphs of the early semantic Web.

OpenCyc was the first large-scale, open-source knowledge base provided in OWL format. OpenCyc preceded Wikipedia in a form usable by the semantic Web, though it never assumed the prominent position that DBpedia did in terms of helping to organize semantic Web content.

OpenCyc was first announced in July 2001, with the first release occurring in 2002. By release 0.9, OpenCyc had grown to include some 47,000 concepts in an OWL distribution. By the time of OpenCyc’s last version 4.0 release in mid-2012, the size of the system had grown to some 239,000 concepts. This last version also included significant links to DBpedia, WordNet and UMBEL, among other external sources. This last version included references to about 19 K places, 26 K organizations, 13 K persons, and 28 K business-related things. Over the course of its lifetime, OpenCyc was downloaded at least 60,000 times, perhaps more than 100,000, and was a common reference in many research papers and other semantic Web projects [1].

At the height of its use, the distribution of OpenCyc included not only the knowledge graph, but also a Java-based inference engine, a browser for the knowledge base, and specifications of the CycL language and of the Cyc API for application development.

The company has indicated it may offer a cloud option in the future for research or educational purposes, but the date and plans are unspecified. Cycorp will continue to support its ResearchCyc and EnterpriseCyc versions.

Reasons for the Retirement

Cycorp’s Web site states OpenCyc was discontinued because OpenCyc was “fragmented” and was confused by the technical community with the other versions of Cyc. Current verbiage also indicates that OpenCyc was an “experiment” that “proved to be more confusing than it was helpful.” We reached out to senior Cycorp officials for additional clarification as to the reasons for its retirement but have not received a response.

I suspect the reasons for the retirement go deeper than this. As recently as last summer, senior Cycorp officials were claiming a new major release of OpenCyc was “imminent”. There always appeared to be a tension within the company about the use and role of an open source version. Key early advocates for OpenCyc, including John De Oliveira, Stephen Reed and Larry Lefkowitz, are no longer with the company. The Cyc Foundation established to support the open source initiative was quietly shut down in 2015. The failure last year of the major AI initiative known as Lucid.ai, which was focused on a major commercialization push behind Cyc and was reportedly to be backed by “hundreds of millions of dollars” of venture capital that never materialized, also apparently took its toll on company attention and resources.

Whatever the reasons, and there are likely others, it is hard to see how a 15-year effort could be characterized as experimental. While versions of OpenCyc v 4.0 can still be downloaded from third parties, including a fork, it is clear this venerable contributor to the early semantic Web will soon no longer be available, third parties or not.

Impact on Cognonto

OpenCyc is one of the six major core knowledge bases that form the nucleus of Cognonto’s KBpedia knowledge structure. This linkage to OpenCyc extends back to UMBEL, another of the six core knowledge bases. UMBEL is itself a subset extraction of OpenCyc [2].

As we began the KBpedia effort, it was clear to us that major design decisions within Cyc (all versions) were problematic to our modeling needs [3]. Because of its common-sense nature, Cyc places a major emphasis on the “tangibility” of objects, including “partial tangibility”. We also found (in our view) major modeling issues in how Cyc handles events vs. actions vs. situations. KBpedia’s grounding in the logic and semiosis of Charles Sanders Peirce was at odds with these basic ontological commitments.

I have considered at various times writing one or more articles on the differences we came to see with OpenCyc, but felt it was perhaps snarky to get into these differences, given the different purposes of our systems. We continue to use portions of OpenCyc with important and useful subsumption hierarchies, but have also replaced the entire upper structure with one that better reflects our approach to knowledge-based artificial intelligence (KBAI). We will continue to retain these existing relations.

Thus, fortunately, given our own design decisions from some years back, the retirement of OpenCyc will have no adverse impact on KBpedia. However, UMBEL, as a faithful subset of OpenCyc designed for possible advanced reasoning, may be impacted. We will await what form possible new Cycorp initiatives take before making any decisions regarding UMBEL. Again, however, KBpedia remains unaffected.

Fare Thee Well!

So, it is with sadness and regret that I bid adieu to OpenCyc. It was a noble effort to help jump-start the early semantic Web, and one that perhaps could have had more of an impact had there been greater top-level commitment. But, like many things on the Internet, generations come and go at ever increasing speed.

OpenCyc certainly helped guide our understanding and basis for our own semantic technology efforts, and for that we will be eternally grateful to the system and its developers and sponsors. Thanks for a good ride!

[1] You can see some of these statistics yourself from the Wayback Machine of the Internet Archive using the URLs of http://www.opencyc.org/, http://www.cyc.com/, https://sourceforge.net/projects/opencyc/ and http://cycfoundation.org.
[2] The intent of UMBEL is to provide a lightweight scaffolding for relating concepts on the Web to one another. About 99% of UMBEL is a direct subset extraction of OpenCyc. This design approach was purposeful to allow systems linked to UMBEL to further reach through to Cyc (OpenCyc, but other versions as well) for advanced reasoning.
[3] I discuss some of these design decisions in M.K. Bergman, 2016. “Threes All the Way Down to Typologies,” blog post on AI3:::Adaptive Information, October 3, 2016.

New Cognonto Entry Page

Mon, 03/27/2017 - 09:27

When we first released Cognonto toward the end of 2016, we provided a starting Web site that had all of the basics, but no frills. In looking at the competition in the artificial intelligence and semantic technology space, we decided a snazzier entry page was warranted. So, we are pleased to announce our new entry page:

We also had fun playing around with using recent AI programs to generate images based on various input visual styles. We used AI imagery and our own Cognonto logo as the way to generate some of these.

As I said to a colleague, maybe it was time for us to try to “run with the cool kids.” We hope you like it. We made some other site tweaks as well along the way to releasing this new entry page.

Let me know if you have any comments (good or bad) on this site re-design. Meanwhile, it’s time to get back to the substance . . . .

Uses and Control of Inferencing in Knowledge Graphs

Wed, 03/15/2017 - 17:10

Dialing In Queries from the General to the Specific

Inferencing is a common term heard in association with semantic technologies, but one that is rarely defined and still less frequently described as to value and rationale. I try to redress this gap in part with this article.

Inferencing is the drawing of new facts, probabilities or conclusions based on reasoning over existing evidence. Charles Sanders Peirce classed inferencing into three modes: deductive reasoning, inductive reasoning and abductive reasoning. Deductive reasoning extends from premises known to be true and clear to infer new facts. Inductive reasoning looks at the preponderance of evidence to infer what is probably true. And abductive reasoning poses possible explanations or hypotheses based on available evidence, often winnowing through the possibilities based on the total weight of evidence at hand or what is the simplest explanation. Though all three reasoning modes may be applied to knowledge graphs, the standard and most used form is deductive reasoning.

An inference engine may be applied to a knowledge graph and its knowledge bases in order to deduce new knowledge. Inference engines apply either backward- or forward-chaining deductive reasoning. In backward chaining, the reasoning tests are conducted “backwards” from a current consequent, a goal or candidate “fact”, to determine what antecedents can support that conclusion, based on the rules used to construct the graph. (“What reasons bring us to this fact?”) In forward chaining the opposite occurs; namely, the engine starts from the existing facts and applies the rules to them to see what new facts follow, checking whether a stated goal or series of goals can be reached. (“What follows from what we already know?”) The process is iterated until the goal is reached or not; if reached, new knowledge in terms of heretofore unstated connections may be added to the knowledge base.
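The two chaining strategies can be contrasted with a minimal sketch in plain Python. This is a toy illustration only, not how any particular OWL reasoner is implemented; the rule base and the fact names (`is_ontology`, `is_published` and so on) are invented for the example.

```python
# Toy rule base (hypothetical): each rule is (set of antecedents, consequent).
RULES = [
    ({"is_ontology"}, "is_knowledge_representation"),
    ({"is_knowledge_representation"}, "is_conceptual_work"),
    ({"is_conceptual_work", "is_published"}, "is_citable"),
]


def forward_chain(facts, rules):
    """Data-driven: apply rules to known facts until nothing new is derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for antecedents, consequent in rules:
            if antecedents <= known and consequent not in known:
                known.add(consequent)
                changed = True
    return known


def backward_chain(goal, facts, rules, _seen=frozenset()):
    """Goal-driven: work back from a goal to antecedents that support it."""
    if goal in facts:
        return True
    if goal in _seen:  # guard against circular rules
        return False
    seen = _seen | {goal}
    return any(
        consequent == goal
        and all(backward_chain(a, facts, rules, seen) for a in antecedents)
        for antecedents, consequent in rules
    )
```

Forward chaining returns every fact that can be derived, whether or not anyone asked for it; backward chaining answers one question at a time and only explores the rules relevant to that goal.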

Inference engines can be applied at the time of graph building or extension to test the consistency and logic of the new additions. Or, semantic reasoners may be applied to a current graph in order to expand queries for semantic search or for other reasoning purposes. In the case of Cognonto’s KBpedia knowledge structure, which is written in OWL 2, though the terminology is slightly different, the groundings are in first-order logic (FOL) and description logics. These logical foundations provide the standard rules by which reasoners can be applied to the knowledge graph [1]. In this article, we will not be looking at how inferencing is applied during graph construction, a deserving topic in its own right. Rather, we will be looking at how inferencing may be applied to the existing graph.

Use of Reasoning at Run Time

Once a completed graph passes its logic tests during construction, perhaps importantly after being expanded for the given domain coverage, its principal use is as a read-only knowledge structure for making subset selections or querying. The standard SPARQL query language, occasionally supplemented by rule-based queries using SWRL or by the OWL API for bulk actions, is the means by which we access the knowledge graph in real time. In many instances, such as for the KBpedia knowledge graph, these are patterned queries. In such instances, we substitute variables in the queries and pass those from the HTML to query templates.
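A patterned query of this kind might be sketched as follows. The actual KBpedia templates are not shown here, so the template text, the function names and the validation rule are all assumptions made for illustration; only the graph IRI and the subClassOf predicate are taken from the queries used in this article.

```python
from string import Template

# Hypothetical query template; $graph and $concept are filled in at run time.
DIRECT_PARENTS = Template(
    "select ?directParent\n"
    "from <$graph>\n"
    "where {\n"
    "  <$concept> <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?directParent .\n"
    "}"
)


def is_safe_iri(value):
    """Reject characters that may not appear inside a SPARQL IRI reference (<...>)."""
    forbidden = set('<>"{}|^`\\')
    return bool(value) and not any(
        ch in forbidden or ord(ch) <= 0x20 for ch in value
    )


def fill_template(template, **variables):
    """Substitute variables into a template after checking each one is a plain IRI."""
    for name, value in variables.items():
        if not is_safe_iri(value):
            raise ValueError(f"unsafe value for {name}: {value!r}")
    return template.substitute(**variables)


query = fill_template(
    DIRECT_PARENTS,
    graph="http://kbpedia.org/1.40/",
    concept="http://kbpedia.org/kko/rc/OntologyInformationScience",
)
```

Checking the values before substitution matters because they arrive from the Web page; a value containing `>` or whitespace could otherwise change the shape of the query.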

When doing machine learning, generally slices get retrieved via query and then staged for the learner. A similar approach is taken to generate entity lists for things like training recognizers and taggers. Some of the actions may also do graph traversals in order to retrieve the applicable subset.

However, the main real-time use of the knowledge structure is search. This relies totally on SPARQL. We discuss some options on how this is controlled below.

Hyponymy, Subsumption and Natural Classes

The principal basis for reasoning in the knowledge graph is hierarchical, hyponymous relations and instance types. These establish the parent-child lineages, and enable individuals (or instances, which might be entities or events) to be related to their natural kinds, or types. Entities belong to types that share certain defining essences and shared descriptive attributes.

For inferencing to be effective, it is important to try to classify entities into the most natural kinds possible. I have spoken elsewhere about this topic [2]; clean classing into appropriate types is one way to ensure the benefits from related search and related querying are realized. Types may also have parental types in a hyponymous relation. This ‘accordion-like’ design is an important aspect that enables external schema to be tied into multiple points in KBpedia [3].

Disjointness assertions, where two classes are logically distinct, and other relatedness options provide other powerful bases for winnowing potential candidates and testing placements and assignments. Each of these factors also may be used in SPARQL queries.
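As a rough illustration of how disjoint classes winnow candidates, consider the sketch below. The class names and the disjoint pairs are hypothetical, and the check ignores disjointness inherited through the subsumption hierarchy, which a real reasoner would take into account.

```python
# Hypothetical disjointness assertions, stored as unordered pairs of class names.
DISJOINT = {
    frozenset({"Person", "Organization"}),
    frozenset({"Person", "Place"}),
}


def are_disjoint(a, b):
    """True if the two classes have been asserted to share no members."""
    return frozenset({a, b}) in DISJOINT


def winnow(candidates, asserted_types):
    """Drop candidate types that are disjoint with any type already asserted."""
    return [
        c for c in candidates
        if not any(are_disjoint(c, t) for t in asserted_types)
    ]
```

For an individual already typed as a `Person`, the candidates `Organization` and `Place` are ruled out at once, leaving only the assignments that are still logically possible.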

These constructs of semantic Web standards, combined with a properly constructed knowledge graph and the use of synonymous and related vocabularies in semsets as described in a previous use case, provide powerful mechanisms for how to query a knowledge base. By using these techniques, we may dial in or broaden our queries, much in the same way that we choose different types of sprays for our garden watering hose. We can focus our queries to the particular need at hand. We explain some of these techniques in the next sections.

Adjusting Query Focus

We can see a crude application of this control when browsing the KBpedia knowledge graph. When we enter a particular query, in this case, ‘knowledge graph’, one result entry is for the concept of ontology in information science. We see that a direct query gives us a single answer:

However, by picking the inferred option, we now see a listing of some 83 super classes for our ontology concept:

By reasoning for deductive inference, we are actually broadening our query to include all of the parental links in the subsumption chain within the graph. Ultimately, this inference chain traces upward into the highest order concept in the graph, namely owl:Thing. (By convention, owl:Thing itself is excluded from these inferred results.)

By invoking inference in this case, we have indeed broadened the query, but the result is quite indiscriminate. We are reaching all of the ancestors of our subject concept, all of the way to the root of the graph. This broadening is perhaps more than what we actually seek.

Scoping Queries via Property Paths

Among many other options, SPARQL also gives us the ability to query specific property paths [4]. We can invoke these options either in our query templates or programmatically in order to control the breadth and depth of our desired query results.

Let’s first begin with the SPARQL query that uses ‘knowledge graph’ in its altLabel: 

==============
select ?s ?p ?o
from <http://kbpedia.org/1.40/>
where {
  ?s <http://www.w3.org/2004/02/skos/core#altLabel> "Knowledge graph"@en ;
     ?p ?o .
}
==============

You can see from the results below that only the concept of ontology (information science) is returned as a prefLabel result, with the concept’s other altLabels also shown:

==============
s p o
http://kbpedia.org/kko/rc/OntologyInformationScience http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://www.w3.org/2002/07/owl#Class
http://kbpedia.org/kko/rc/OntologyInformationScience http://www.w3.org/2000/01/rdf-schema#isDefinedBy http://kbpedia.org/kko/rc/
http://kbpedia.org/kko/rc/OntologyInformationScience http://www.w3.org/2000/01/rdf-schema#subClassOf http://kbpedia.org/kko/rc/KnowledgeRepresentation-CW
http://kbpedia.org/kko/rc/OntologyInformationScience http://www.w3.org/2000/01/rdf-schema#subClassOf http://kbpedia.org/kko/rc/Ontology
http://kbpedia.org/kko/rc/OntologyInformationScience http://www.w3.org/2000/01/rdf-schema#subClassOf http://wikipedia.org/wiki/Ontology_(information_science)
http://kbpedia.org/kko/rc/OntologyInformationScience http://www.w3.org/2004/02/skos/core#prefLabel "Ontology (information science)"@en
http://kbpedia.org/kko/rc/OntologyInformationScience http://www.w3.org/2004/02/skos/core#altLabel "Ontological distinction (computer science)"@en
http://kbpedia.org/kko/rc/OntologyInformationScience http://www.w3.org/2004/02/skos/core#altLabel "Ontological distinction(computer science)"@en
http://kbpedia.org/kko/rc/OntologyInformationScience http://www.w3.org/2004/02/skos/core#altLabel "Ontology Language"@en
http://kbpedia.org/kko/rc/OntologyInformationScience http://www.w3.org/2004/02/skos/core#altLabel "Ontology media"@en
http://kbpedia.org/kko/rc/OntologyInformationScience http://www.w3.org/2004/02/skos/core#altLabel "Ontologies"@en
http://kbpedia.org/kko/rc/OntologyInformationScience http://www.w3.org/2004/02/skos/core#altLabel "New media relations"@en
http://kbpedia.org/kko/rc/OntologyInformationScience http://www.w3.org/2004/02/skos/core#altLabel "Strong ontology"@en
http://kbpedia.org/kko/rc/OntologyInformationScience http://www.w3.org/2004/02/skos/core#altLabel "Ontologies (computer science)"@en
http://kbpedia.org/kko/rc/OntologyInformationScience http://www.w3.org/2004/02/skos/core#altLabel "Ontology library (information science)"@en
http://kbpedia.org/kko/rc/OntologyInformationScience http://www.w3.org/2004/02/skos/core#altLabel "Ontology Libraries (computer science)"@en
http://kbpedia.org/kko/rc/OntologyInformationScience http://www.w3.org/2004/02/skos/core#altLabel "Ontologing"@en
http://kbpedia.org/kko/rc/OntologyInformationScience http://www.w3.org/2004/02/skos/core#altLabel "Computational ontology"@en
http://kbpedia.org/kko/rc/OntologyInformationScience http://www.w3.org/2004/02/skos/core#altLabel "Ontology (computer science)"@en
http://kbpedia.org/kko/rc/OntologyInformationScience http://www.w3.org/2004/02/skos/core#altLabel "Ontology library (computer science)"@en
http://kbpedia.org/kko/rc/OntologyInformationScience http://www.w3.org/2004/02/skos/core#altLabel "Populated ontology"@en
http://kbpedia.org/kko/rc/OntologyInformationScience http://www.w3.org/2004/02/skos/core#altLabel "Knowledge graph"@en
http://kbpedia.org/kko/rc/OntologyInformationScience http://www.w3.org/2004/02/skos/core#altLabel "Domain ontology"@en
http://kbpedia.org/kko/rc/OntologyInformationScience http://www.w3.org/2004/02/skos/core#definition "In computer science and information science, an ontology is a formal naming and definition of the types, properties, and interrelationships of the entities that really or fundamentally exist for a particular domain of discourse."@en
http://kbpedia.org/kko/rc/OntologyInformationScience http://kbpedia.org/ontologies/kko#superClassOf http://wikipedia.org/wiki/Ontology_(information_science)
==============

This result gives us the basis for now asking for the direct parents of our ontology concept, using this query:

==============
select ?directParent
from <http://kbpedia.org/1.40/>
where {
  <http://kbpedia.org/kko/rc/OntologyInformationScience> <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?directParent .
}
==============

We see that the general concepts of knowledge representation-CW and ontology are parents to our concept, as well as the external Wikipedia result on ontology (information science):

==============
directParent
http://kbpedia.org/kko/rc/KnowledgeRepresentation-CW
http://kbpedia.org/kko/rc/Ontology
http://wikipedia.org/wiki/Ontology_(information_science)
==============

If we turn on the inferred option, we will get the full listing of the 83 concepts noted earlier. This is way too general for our current needs.

While it is not possible to specify a depth using SPARQL, it is possible to use property paths to control the extent of the query results from the source. In this case, we specify a path length of 1:

==============
select ?inferredParent
from <http://kbpedia.org/1.40/>
where {
  <http://kbpedia.org/kko/rc/OntologyInformationScience> <http://www.w3.org/2000/01/rdf-schema#subClassOf>{,1} ?inferredParent .
}
==============

Which produces results equivalent to the “direct” search (namely, direct parents only):

==============
directParent
http://kbpedia.org/kko/rc/KnowledgeRepresentation-CW
http://kbpedia.org/kko/rc/Ontology
http://wikipedia.org/wiki/Ontology_(information_science)
==============

However, by expanding our path length to two, we now can request the parents and grandparents for the ontology (information science) concept:

==============
select ?inferredParent
from <http://kbpedia.org/1.40/>
where {
  <http://kbpedia.org/kko/rc/OntologyInformationScience> <http://www.w3.org/2000/01/rdf-schema#subClassOf>{,2} ?inferredParent .
}
==============

This now gives us 15 results from the parental chain:

==============
inferredParent
http://kbpedia.org/kko/rc/OntologyInformationScience
http://wikipedia.org/wiki/Ontology_(information_science)
http://kbpedia.org/kko/rc/Ontology
http://kbpedia.org/kko/rc/KnowledgeRepresentation-CW
http://umbel.org/umbel/rc/KnowledgeRepresentation-CW
http://kbpedia.org/kko/rc/PropositionalConceptualWork
http://wikipedia.org/wiki/Knowledge_representation
http://sw.opencyc.org/concept/Mx4r4e_7xpGBQdmREI4QPyn0Gw
http://umbel.org/umbel/rc/Ontology
http://kbpedia.org/kko/rc/StructuredInformationSource
http://kbpedia.org/kko/rc/ClassificationSystem
http://wikipedia.org/wiki/Ontology
http://sw.opencyc.org/concept/Mx4rv7D_EBSHQdiLMuoH7dC2KQ
http://kbpedia.org/kko/rc/Technology-Artifact
http://www.wikidata.org/entity/Q324254
==============

Similarly, we can expand our query request to a path length of 3, which gives us the parental chain of parents + grandparents + great-grandparents:

==============
select ?inferredParent
from <http://kbpedia.org/1.40/>
where {
  <http://kbpedia.org/kko/rc/OntologyInformationScience> <http://www.w3.org/2000/01/rdf-schema#subClassOf>{,3} ?inferredParent .
}
==============

In this particular case, we do not add any further results for great-grandparents:

==============
inferredParent
http://kbpedia.org/kko/rc/OntologyInformationScience
http://wikipedia.org/wiki/Ontology_(information_science)
http://kbpedia.org/kko/rc/Ontology
http://kbpedia.org/kko/rc/KnowledgeRepresentation-CW
http://umbel.org/umbel/rc/KnowledgeRepresentation-CW
http://kbpedia.org/kko/rc/PropositionalConceptualWork
http://wikipedia.org/wiki/Knowledge_representation
http://sw.opencyc.org/concept/Mx4r4e_7xpGBQdmREI4QPyn0Gw
http://umbel.org/umbel/rc/Ontology
http://kbpedia.org/kko/rc/StructuredInformationSource
http://kbpedia.org/kko/rc/ClassificationSystem
http://wikipedia.org/wiki/Ontology
http://sw.opencyc.org/concept/Mx4rv7D_EBSHQdiLMuoH7dC2KQ
http://kbpedia.org/kko/rc/Technology-Artifact
http://www.wikidata.org/entity/Q324254
==============

Without a property path specification, our inferred request would produce the listing of 83 results shown by the Inferred tab on the KBpedia knowledge graph, as shown in the screen capture provided earlier.

The online knowledge graph does not use these property path restrictions in its standard query templates. But these examples show how programmatically it is possible to broaden or narrow our searches of the graph, depending on the relation chosen (subClassOf in this example) and the length of the specified property path.
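Where a triple store does not accept a bounded path length, the same scoping can be done in application code. As far as I know, the counted `{,n}` form appeared in working drafts of SPARQL 1.1 but not in the final Recommendation, so support for it varies by engine. The sketch below is a plain Python stand-in, assuming the subClassOf edges have already been fetched into a dictionary. Only the two direct KBpedia parents come from the results above; the `Grandparent` and `GreatGrandparent` names are invented placeholders.

```python
from collections import deque

# Child -> direct parents. The first entry mirrors the direct parents shown
# earlier; every other node is a hypothetical placeholder.
SUBCLASS_OF = {
    "OntologyInformationScience": ["Ontology", "KnowledgeRepresentation-CW"],
    "Ontology": ["GrandparentA"],
    "KnowledgeRepresentation-CW": ["GrandparentA", "GrandparentB"],
    "GrandparentA": ["GreatGrandparent"],
}


def ancestors(start, graph, max_depth=None):
    """Breadth-first walk up the hierarchy, returning {node: depth}.

    Depth 0 is the start node itself, which mirrors a zero-length path match.
    With max_depth=None the walk continues to the root (full inference).
    """
    depths = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if max_depth is not None and depths[node] >= max_depth:
            continue  # do not expand beyond the requested path length
        for parent in graph.get(node, ()):
            if parent not in depths:
                depths[parent] = depths[node] + 1
                queue.append(parent)
    return depths
```

Calling `ancestors("OntologyInformationScience", SUBCLASS_OF, max_depth=2)` returns the concept, its parents and its grandparents, each tagged with its distance, so a caller can also rank or filter results by how far up the chain they sit.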

Many More Options and Potential for Control

This use case is but a small example of the ways in which SPARQL may be used to dial in or control the scope of queries posed to the knowledge graph. Besides all of the standard query options provided by the SPARQL standard, we may also remove duplicates, identify negated items, and search inverses, selected named graphs or selected graph patterns.

Beyond SPARQL and now using SWRL, we may also apply abductive reasoning and hypothesis generation to our graphs, as well as mimic the action of expert systems in AI through if-then rule constructs based on any structure within the knowledge graph. A nice tutorial with examples that helps highlight some of the possibilities in combining OWL 2 with SWRL is provided by Kuba [5].

A key use of inference is its ability to be applied to natural language understanding and the extension of our data systems to include unstructured text, as well as structured data. For this potential to be fully realized, it is important that we chunk (“parse”) our natural language using primitives that themselves are built upon logical foundations. Charles S. Peirce made many contributions in this area as well. Semantic grammars that tie directly into logic tests and reasoning would be a powerful addition to our standard semantic technologies. Revisions to the approach taken to Montague grammars may be one way to achieve this elusive aim. This is a topic we will likely return to in the months to come.

Finally, of course, inference is a critical method for testing the logic and consistency of our knowledge graphs as we add new concepts, make new relations or connections, or add attribute data to our instances. All of these changes need to be tested for consistency moving forward. Nurturing graphs by testing added concepts, entities and connections is an essential prerequisite to leveraging inferencing at run time as well.

This article is part of an occasional series describing non-machine learning use cases and applications for Cognonto’s KBpedia knowledge graph. Most center around the general use and benefits of knowledge graphs, but best practices and other applications are also discussed. Prior machine learning use cases, and the ones from this series, may be found on the Cognonto Web site under the Use Cases main menu item.

[1] See, for example, Markus Krötzsch, Frantisek Simancik, and Ian Horrocks, 2012. “A Description Logic Primer.” arXiv preprint, arXiv:1201.4089; and Franz Baader, 2009. “Description Logics,” in Sergio Tessaris, Enrico Franconi, Thomas Eiter, Claudio Gutierrez, Siegfried Handschuh, Marie-Christine Rousset, and Renate A. Schmidt, editors, Reasoning Web. Semantic Technologies for Information Systems – 5th International Summer School, 2009, volume 5689 of LNCS, pages 1–39. Springer, 2009.
[2] M.K. Bergman, 2015. “‘Natural Classes’ in the Knowledge Web,” in AI3:::Adaptive Information blog, July 13, 2015.
[3] M.K. Bergman, 2016. “Rationales for Typology Designs in Knowledge Bases,” in AI3:::Adaptive Information blog, June 6, 2016.
[4] Steve Harris and Andy Seaborne, eds., 2013. SPARQL 1.1 Query Language, World Wide Web Consortium (W3C) Recommendation, 21 March 2013; see especially Section 9 on property paths.
[5] Martin Kuba, 2012. “Owl 2 and SWRL Tutorial,” from Kuba’s Web site.