
Introduction to ontologies


Gregor Pickert

Abstract

"People can't share knowledge if they don't speak a common language." [T. Davenport]

People, organizations and software systems need to communicate with one another. They bring different background knowledge, assumptions, viewpoints and vocabularies with them, so misunderstandings and communication errors arise again and again. One solution is to reduce the confusion of concepts and terminology or, better, to eliminate it and replace it with a uniform terminology, with corresponding concepts, that everyone understands. Ontologies are one such approach. To support the sharing and reuse of formally represented knowledge among AI systems, it is useful to define the common vocabulary in which the knowledge is represented. A specification of a representational vocabulary for a shared domain of discourse (definitions of classes, relations, functions and other objects) is called an ontology. [1]

1. Introduction

The first step is to clarify what an ontology is and what it can be used for. Ontology is originally a concept from philosophy.

1.1 Definition in philosophy

What we may call ontology is the attempt to say what entities exist. Metaphysics, by contrast, is the attempt to say, of these entities, what they are. [2]

The term was later adopted by computer science.

1.2 Definitions in computer science

"... An ontology is a catalog of the types of things that are assumed to exist in a domain of interest D from the perspective of a person who uses a language for the purpose of talking about D." [3]

"Ontology is the term used to refer to the shared understanding of some domain of interest which may be used as a unifying framework to solve the above problems. An ontology necessarily entails or embodies some sort of world view with respect to a given domain.
The world view is often conceived as a set of concepts (e.g. entities, attributes, and processes), their definitions and their inter-relationships; this is referred to as a conceptualization." [4]

"The main purpose of an ontology is to enable communication between computer systems in a way that is independent of the individual system technologies, information architectures and application domain. The key ingredients that make up an ontology are a vocabulary of basic terms and a precise specification of what those terms mean .... An ontology provides a set of well-founded constructs that can be leveraged to build meaningful higher level knowledge. The terms in an ontology are selected with great care, ensuring that the most basic (abstract) foundational concepts and distinctions are defined and specified. The terms chosen form a complete set, whose relationship one to another is defined using formal techniques ...." [5]

"A specification of a representational vocabulary for a shared domain of discourse [...] is called an ontology." [1]

In summary, ontologies are meant to improve or enable communication within a given domain: between computer systems, between computer systems and humans, and also between humans. An ontology usually consists of a defined basic vocabulary, definitions and entities built on that vocabulary, and a description of the relationships among them.

The concept of ontology is closely linked to that of the knowledge base. The term database is well known; in the simplest case a database is just an Excel table. A knowledge base is more powerful. According to [6], a knowledge base is a set of representations of facts about the world. Each fact is formulated as a sentence in a representation language. You can add facts, query facts and, with the help of an inference mechanism, draw new conclusions, i.e. compute over the knowledge base.
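The three operations just named (adding facts, querying facts, deriving new facts by inference) can be illustrated with a minimal sketch. It is not taken from [6]; the class `KnowledgeBase`, the triple format and the `part_of` facts are assumptions made up for this illustration.

```python
class KnowledgeBase:
    """Toy knowledge base: every fact is a (subject, predicate, object) triple."""

    def __init__(self):
        self.facts = set()
        self.rules = []  # each rule maps the current fact set to a set of derived facts

    def tell(self, fact):
        """Add a fact."""
        self.facts.add(fact)

    def add_rule(self, rule):
        """Register an inference rule."""
        self.rules.append(rule)

    def infer(self):
        """Apply all rules repeatedly until no new fact appears (forward chaining)."""
        changed = True
        while changed:
            changed = False
            for rule in self.rules:
                new_facts = rule(self.facts) - self.facts
                if new_facts:
                    self.facts |= new_facts
                    changed = True

    def ask(self, fact):
        """Query whether a fact is stored or has been derived."""
        return fact in self.facts


def part_of_is_transitive(facts):
    """Hypothetical rule: x part_of y and y part_of z imply x part_of z."""
    return {
        (x, "part_of", z)
        for (x, p1, y) in facts if p1 == "part_of"
        for (y2, p2, z) in facts if p2 == "part_of" and y2 == y
    }


kb = KnowledgeBase()
kb.tell(("finger", "part_of", "hand"))
kb.tell(("hand", "part_of", "arm"))
kb.add_rule(part_of_is_transitive)
kb.infer()

print(kb.ask(("finger", "part_of", "arm")))  # True: derived, never stated explicitly
print(kb.ask(("arm", "part_of", "finger")))  # False: neither stated nor derivable
```

Real systems use far richer representation languages and inference procedures; the sketch only shows how a derived fact differs from a stored one.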
Normally the closed-world assumption (as in Prolog) is made for a knowledge base: one assumes that nothing exists that is not represented in the knowledge base. Taking the customer database of a savings bank as an example, it can be concluded that a person who is not listed in this database is not a customer. Other possibilities, such as an error in the database or a customer who has not yet been entered, are excluded. This restriction simplifies the logical structure of the knowledge base. For example, one can conclude that something that is not black must be white if the knowledge base knows no other classes. It also makes computations over the knowledge base much easier. On the other hand, it restricts the world considerably, because there will always be something that has not yet been classified.

Before creating a knowledge base, you have to decide how to build it. There are two broad kinds:

- Informal knowledge bases: the types in the knowledge base are not defined, or are defined only in natural language.
- Formal knowledge bases: they consist of a collection of concepts and relations organized in a type-subtype hierarchy. The types are logically distinct from one another, and the things in the world are structured as a tree. One advantage is that knowledge can be derived. For example, if an object is a taxi driver, I can conclude that it holds a driver's license, is at least 18 years old and is a human, provided these types appear in its chain of ancestors.

Problems arise with this kind of knowledge base mainly when knowledge is to be reused, for example when several knowledge bases are combined. John F. Sowa [3] lists three classes of problems, using natural language as an example:

- accidental: In German, for example, the term "hand" covers the area from the fingertips to the wrist, but in Russian the corresponding word "ruka" extends to the elbow. So when a Russian surgical report uses "ruka" for an operation on the forearm, it must not be translated as "hand". Such problems simply exist; they follow no system.
- systematic: Differences in sentence structure. In English, for example, the order subject, predicate, object applies, while Latin or Japanese use subject, object, predicate.
- cultural: Problems that arise from cultural differences. For example, German has no words for Chinese spices that are not used in Germany.

The consequence of these difficulties is the Tower of Babel problem: one knowledge base cannot communicate with another, or only with immense effort, because the knowledge is not formalized identically. The problem is not the internal representation itself but the classification, the theoretical division of the world.

This is where the concept of ontology comes in. As already mentioned, every knowledge base, and indeed every database, has a classification of the objects it contains. However, if you consider the Tower of Babel problem and the notion of ontology beforehand, you decouple the classification of the world from the rest of the structure of the knowledge base. You think about how best to divide and abstract the world to be represented. Particular emphasis is placed on knowledge reuse. The ontology is therefore not built in a completely application-specific way; instead you try to make it applicable in other areas as well. This further abstraction requires compromises, for several reasons:

- It is inefficient to model the whole world if you only want to classify 100 screws.
- Many classification decisions are ambiguous, for example because they depend on a specific theory of language.

The opposing goals of level of detail and breadth of applicability must therefore be reconciled, or an ideal compromise must be found. One approach is the use of modules: every ontology has an identical structure in the upper, general part, and application-specific classes can be loaded as modules at the lower levels [7]. Another problem so far has been the lack of standards that regulate the classification.
Only if the hierarchies in which the ontologies are built can be mapped onto one another can knowledge be reused in a meaningful way [8].

2. Definition of ontologies

To understand ontologies, one first has to understand the meaning triangle [Figure 1], which describes the interaction between symbols, concepts and objects in the world.

Figure 1. Meaning triangle

Symbols are mapped onto objects [Figure 2]. An ontology reduces the number of mappings from symbols to objects in the real world; in the ideal case, the mapping is unambiguous [Figure 3].
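The reduction of mappings described above can be made concrete with a small sketch. The symbols and object labels are invented for illustration (the "hand" entry borrows from the German/Russian example earlier in the text); the paper itself gives no such data.

```python
# Without a shared ontology, one symbol may denote several real-world objects.
# All entries are illustrative assumptions, not data from the paper.
without_ontology = {
    "hand": {"fingertips-to-wrist", "fingertips-to-elbow"},
    "bank": {"financial-institution", "river-bank"},
}

# An ontology commits each symbol to a single agreed meaning.
with_ontology = {
    "hand": {"fingertips-to-wrist"},
    "bank": {"financial-institution"},
}


def count_mappings(mapping):
    """Number of symbol-to-object mappings."""
    return sum(len(objects) for objects in mapping.values())


def is_unambiguous(mapping):
    """True if every symbol denotes exactly one object (the ideal case)."""
    return all(len(objects) == 1 for objects in mapping.values())


print(count_mappings(without_ontology), is_unambiguous(without_ontology))  # 4 False
print(count_mappings(with_ontology), is_unambiguous(with_ontology))        # 2 True
```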

Figure 2. Mapping symbols onto objects

Figure 3. Reduction

3. Abstract model of an ontology

Definition: An ontology is a tuple

(1) $O := (L, C, R, F, G, H, A)$

whose components are defined as follows:

- Lexicon $L$: The lexicon contains a set of symbols (lexical entries) for concepts, $L^C$, and a set of symbols for relations, $L^R$. Their union is the lexicon $L := L^C \cup L^R$.
- Set $C$ of concepts: For every $c \in C$ there is at least one statement in the ontology through which it is embedded in the ontology.
- Set $R$ of binary relations: $R$ denotes a set of binary relations, each with a specified domain and range $(C_D, C_R)$, where $C_D, C_R \in C$. In addition, the functions $d$ and $r$ are introduced. Applied to a relation in $R$, they return its domain concept $C_D$ and its range concept $C_R$ respectively; $d$ stands for domain and $r$ for range.
- Two mapping functions $F$ and $G$ with $F: 2^{L^C} \to 2^{C}$ and $G: 2^{L^R} \to 2^{R}$. $F$ and $G$ link sets of symbols $\{l_1, l_2, \ldots, l_n\} \subset L$ with the associated concepts and relations in the given ontology. A symbol can refer to several concepts or relations; conversely, several symbols can refer to the same concept or relation. Remark: since there is an n:m mapping between the lexicon and the concepts/relations, $F$ and $G$ are defined on sets.
- Set $A$ of ontology axioms.
- Taxonomy $H$: Concepts are taxonomically linked by an irreflexive, acyclic and transitive relation $H \subset C \times C$. $H(C_i, C_j)$ means that $C_i$ is a subconcept of $C_j$.

4. Method for developing ontologies

Since there is neither a standard method for developing ontologies nor a clearly delimited discipline of ontology engineering, I would like to introduce a method from Uschold [4], which comprises the following steps:

I Identify the purpose
II Develop the ontology
IIa Capture the ontology
IIb Program the ontology
IIc Integrate existing ontologies
III Evaluation
IV Documentation
V Development guidelines
Figure 4. Method for ontology development (stages: identify the purpose; development of the ontology, consisting of capture [collect, generate definitions, review], programming [basic concepts, language selection, programming] and integration of external ontologies; evaluation; documentation; guidelines: clarity, coherence, extensibility)
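Before turning to the individual steps of the method, the tuple O := (L, C, R, F, G, H, A) from section 3 can be made more tangible with a small sketch. This is only one possible encoding under stated assumptions: the class `Ontology`, its method names and the taxi-driver concepts are hypothetical, H is stored as direct subconcept pairs with transitivity computed on demand, and the axioms A are left as an opaque list.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    lexicon_c: set    # L^C: symbols for concepts
    lexicon_r: set    # L^R: symbols for relations
    concepts: set     # C
    relations: dict   # R: relation name -> (domain concept, range concept)
    F: dict           # symbol -> set of concepts it refers to
    G: dict           # symbol -> set of relations it refers to
    H: set            # direct (subconcept, superconcept) pairs
    axioms: list = field(default_factory=list)  # A, not interpreted here

    def lexicon(self):
        """L := L^C united with L^R."""
        return self.lexicon_c | self.lexicon_r

    def d(self, relation):
        """Domain concept of a relation."""
        return self.relations[relation][0]

    def r(self, relation):
        """Range concept of a relation."""
        return self.relations[relation][1]

    def concepts_for(self, symbols):
        """F applied to a set of symbols, returning a set of concepts."""
        result = set()
        for symbol in symbols:
            result |= self.F.get(symbol, set())
        return result

    def superconcepts(self, concept):
        """All concepts reachable from `concept` by following H transitively."""
        found, stack = set(), [concept]
        while stack:
            current = stack.pop()
            for sub, sup in self.H:
                if sub == current and sup not in found:
                    found.add(sup)
                    stack.append(sup)
        return found

    def taxonomy_is_valid(self):
        """Irreflexive and acyclic: no concept is its own superconcept."""
        return all(c not in self.superconcepts(c) for c in self.concepts)


onto = Ontology(
    lexicon_c={"taxi driver", "cabbie", "driver", "adult", "human", "licence"},
    lexicon_r={"holds"},
    concepts={"TaxiDriver", "LicensedDriver", "Adult", "Human", "DrivingLicence"},
    relations={"holds": ("LicensedDriver", "DrivingLicence")},
    F={"taxi driver": {"TaxiDriver"}, "cabbie": {"TaxiDriver"},
       "driver": {"LicensedDriver", "TaxiDriver"}},
    G={"holds": {"holds"}},
    H={("TaxiDriver", "LicensedDriver"), ("LicensedDriver", "Adult"),
       ("Adult", "Human")},
)

print(sorted(onto.superconcepts("TaxiDriver")))  # ['Adult', 'Human', 'LicensedDriver']
print(sorted(onto.concepts_for({"driver"})))     # ['LicensedDriver', 'TaxiDriver']
print(onto.taxonomy_is_valid())                  # True
```

The symbol "driver" deliberately refers to two concepts, and "cabbie" and "taxi driver" refer to the same one, which mirrors the n:m relationship between lexicon and concepts noted in the definition.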

Identifying the purpose

In this step, essentially the following questions need to be answered:

1. Which domain should the ontology cover?
2. What should the ontology be used for?
3. Which questions should the ontology be able to answer? (competency questions)

Development of the ontology

Capturing the ontology

Capturing an ontology can in turn be divided into four phases: collecting terms, generating definitions, reviewing, and developing a meta-ontology.

1) Collecting terms - The first thing to do is to hold a brainstorming session in which all potentially relevant terms are gathered. Each term represents a concept in its own right. Ambiguities and differing opinions are normal at this stage. Then you group the terms, i.e. you sort out which terms are related to one another and which are not. Finally, you identify semantic relationships.

2) Generating definitions - Before starting to resolve ambiguities and to look for wording for the definitions, consider whether to proceed top-down, bottom-up, or middle-out. Each method has advantages and disadvantages. The bottom-up method achieves a high level of detail, which makes it harder to discover commonalities and increases the risk of inconsistencies. The top-down method allows better control of the level of detail, but carries the risk of choosing higher-level categories arbitrarily, which can make the model unstable. The middle-out method, by contrast, yields an appropriate level of detail with little additional effort. It starts with the most important concepts and uses them to formulate the concepts of higher levels. But how do you deal with ambiguity? First, stop using the ambiguous term. Second, become clear about the various concepts for which the term has been used, and rephrase those concepts and definitions.
It can also be helpful to consider several definitions in parallel in order to decide which of them may be insignificant for the ontology. Finally, you should decide on a new term and avoid the originally ambiguous one. In general, the natural language definitions should be formulated as precisely as possible, and inconsistencies and circular definitions should be avoided.

3) Review - All definitions should be critically reviewed, especially those that have changed significantly in the course of the process.

4) Development of a meta-ontology - The natural language definitions should be used as implicit requirements when creating the meta-ontology. They then provide the basis for the formal definitions.

Programming the ontology

Programming an ontology means representing the captured concepts and relationships explicitly in a formal language. This includes defining the basic terms required to specify the ontology (e.g. classes, entities, relations), also known as the meta-ontology, selecting a language, and the programming itself. Capturing and programming an ontology are sometimes combined into one step.

Integrating existing ontologies

While the ontology is being captured and programmed, the question arises whether and how existing ontologies should be used. In general, this is a very difficult problem. D. Skuce has proposed an approach: he holds that the main work should be to secure agreement between the ontologies. One way to achieve this is to represent explicitly all assumptions on which the ontologies rest.

Evaluation

Uschold [4] is of the opinion that one should first look at what the field of knowledge sharing has to offer on evaluation and then adapt it to ontologies. Gómez-Pérez offers a good definition of evaluation in the context of knowledge sharing: "to make a technical judgment of the ontologies, their associated software environment, and documentation with respect to a frame of reference ...
The frame of reference may be requirements specifications, competency questions, and/or the real world." [9]

Documentation

Poor documentation is the main barrier to using and reusing ontologies. To avoid this, one should document all important assumptions, both about the concepts defined in the ontology and about the primitive terms used to define the ontology (the meta-ontology). Tools such as "Ontolingua" and the "KSL Ontology Editor" support the formal and informal documentation of assumptions.

Development guidelines

The following criteria should be observed at all stages of development.

Clarity: This means avoiding ambiguities, justifying distinctions, and giving examples to help the reader. If possible, definitions should be specified using formal axioms and described in natural language.

Coherence: An ontology should be internally consistent. This applies above all to the defined axioms. However, the non-axiomatic parts of the definitions, such as natural language descriptions and examples, should also be coherent.

Extensibility: An ontology should be developed in such a way that it anticipates the use of additional vocabulary and offers a conceptual foundation for a range of tasks. It should be possible to define terms for specific uses without having to revise existing definitions. This means that there should be as few ontological commitments as possible, so that an ontology can be used for several knowledge bases and domains. As few assumptions as possible should therefore be made about the world being modeled. Furthermore, the concepts and relationships should be specified at a level that is independent of any particular encoding, so that they can be implemented in multiple systems.

5. Ontology-based applications

Useful applications are conceivable in many areas of data processing and knowledge storage. Here are just a few, following Smith [10]:

- Information retrieval and extraction: Internet navigation is one area of application; the aim would be to classify the immense variety of sources. Systems like Yahoo already work this way by offering categories. To set up an ontology here, these classifications would have to be refined into a generally recognized system.
- Enterprise integration: When companies from around the world or from one region join forces, many problems arise from different company structures, different fields of work or simply different languages. An ontology could help to understand and solve these problems more quickly, so that the intended synergy effects could be achieved sooner.
- Military applications: Here, too, there is a wide range of possible applications, such as the integration of multinational troops. The compilation of militarily relevant information and the merging of cartographic data for missile control are further areas of application.

Other areas include database design, natural language processing, knowledge engineering, knowledge representation, knowledge management, knowledge sharing and knowledge integration, to name just a few keywords.

6. Summary and outlook

The preceding sections explained the general and theoretical foundations of ontologies. There is now broad agreement on the usefulness of ontologies; the definitions differ because their authors come from different backgrounds. The situation is different for development methods. Uschold's development method, for example, is divided into "identify purpose", "develop the ontology" ("capture", "program", "integrate external ontologies") and "evaluation". It provides guidelines and recommends documenting the ontology during development. This area will clearly continue to be a subject of research, particularly with regard to the development of ontologies for specific fields such as biology, medicine and business administration. Looking at ontologies in practice, it becomes clear that research is needed on tools that support development, in particular to improve the integration of existing ontologies and to enable the use of ontology libraries. It is foreseeable that the use of ontologies will extend beyond the current areas of communication, software engineering and interoperability.

References

[1] Gruber: A Translation Approach to Portable Ontology Specifications. gicl.mcs.drexel.edu/people/regli/classes/ KBA / Readings / KSL pdf
[2] Dictionary of Philosophy of Mind
[3] John F. Sowa
[4] M. Uschold, M. Gruninger: "Ontologies: Principles, Methods and Applications", Knowledge Engineering Review, 11(2), 1996
[5]
[6] Russell / Norvig (1995): Artificial Intelligence - A Modern Approach. Prentice Hall International Editions, ISBN
[7]
[8]
[9] A. Gómez-Pérez, N. Juristo, and J. Pazos: Evaluation and assessment of knowledge sharing technology. In N. J. Mars: Towards Very Large Knowledge Bases - Knowledge Building and Knowledge Sharing, pages IOS Press, Amsterdam, 1995
[10] Smith /ontologies.htm

FactCIA
LiLog
Klose / Lang / Pierlein (eds.) (1992): Ontology and axiomatics of the knowledge base of LILOG. Computer science reports 307, Springer Verlag