The Use of Ontologies in Practice

Research Paper (undergraduate) 2015 21 Pages

Computer Science - Applied

Excerpt

Content

1. Introduction

2. Revision of Semantic Concepts
2.1 Knowledge and Semantic Web
2.1.1 DIKW Pyramid
2.1.2 Semantic Web Technology Stack
2.1.3 Description Logics
2.2 Ontologies, XML and RDF(S)
2.2.1 Ontologies
2.2.2 XML
2.2.3 RDF(S)
2.3 OWL and SPARQL
2.3.1 OWL
2.3.2 SPARQL

3. Ontology-based Information Integration
3.1 Methods
3.1.1 Mappings
3.1.2 Ontology Integration Architectures
3.1.3 Reasoning, Inferring, Expert Systems
3.2 Tools
3.2.1 Semantic Web Tools
3.2.2 Ontology Editors

4. Ontology Editor - Protégé
4.1 General Background
4.2 Historical Background
4.3 Features
4.4 Internals
4.5 Building Ontologies
4.6 Critical Appraisal

5. Practical Applications
5.1 Industry Solutions
5.2 Established Ontologies
5.3 Biomedical Science and Protégé

6. Conclusion

7. References

1. Introduction

Enormous costs and efforts arise when distributed and heterogeneous information systems have to be integrated at the information and data level in order to create value. It is often a challenge to establish a commonly agreed understanding of the content of knowledge bases. In medicine, for example, physicians have to work with a tremendous number of expert terms. These terms are related to one another through logical interrelations.

The web has high potential as a platform for storing and sharing knowledge. The problem is that web content was initially made to be understood by human users, not machines. Thus, the meaning of terms and phrases is not necessarily clear to non-human agents, and defective systems can be the result.

The semantic web is a web concept in which the meaning of web content is made explicit so that it can be examined and analyzed. A particularly valuable ability is the inference of new knowledge. Ontologies help here because they are formal and explicit representations of concepts with intrinsic logical relationships. The W3C and the open-source community have developed a variety of standards for representing ontologies, including their semantic relationships, in an expressive, formal and explicit way. Furthermore, many methods and tools are available to support programmers, scientists and decision makers in creating ontologies. In particular, the ontology editor Protégé is a well-established tool to create, edit and visualize ontologies and to reason over them.

The remainder of this paper is organized as follows: Section 2 reviews some of the fundamental semantic concepts necessary to create ontologies in OWL, the Web Ontology Language. Section 3 is devoted to a description of methods and tools for information integration based on ontologies. In Section 4 we present the Protégé tool and discuss its characteristics in detail. Section 5 gives insight into some major practical applications where ontologies are used. Section 6 summarizes the main findings.

2. Revision of Semantic Concepts

Semantic concepts are technologies and standards that describe web content in a machine-readable format enriched by semantic meaning and logical interrelations between terms. In this section we give insight into the definition of knowledge and the interdependencies between semantic web technologies. Furthermore, we define the basics of the logics that are represented in ontologies to infer new knowledge from existing domain concepts. Finally, we cover the technologies on which OWL (Web Ontology Language) and SPARQL (SPARQL Protocol And RDF Query Language) build to represent and query knowledge in ontologies.

2.1 Knowledge and Semantic Web

2.1.1 DIKW Pyramid

To understand the semantic web it is necessary to understand what knowledge is. According to the knowledge pyramid, information is raw data extended by meaning, whereas knowledge is information extended by context. Thus, knowledge can be derived from an information base and a context. Later we will call this approach reasoning (see Section 3.1). Figure 1 illustrates an adapted version of the original knowledge pyramid extended by examples [9],[36].

illustration not visible in this excerpt

Figure 1 - DIKW (Data Information Knowledge Wisdom) Pyramid [9]

2.1.2 Semantic Web Technology Stack

The semantic web technology stack (Figure 2) comprises the standards and technologies of the semantic web. The foundation layer of the stack encompasses the standards for symbols and resources that make up the web platform. URIs (Uniform Resource Identifiers) identify web content, which is accessed through a web protocol such as HTTP (Hypertext Transfer Protocol). Sharing of structured information is supported by formats like XML (Extensible Markup Language). Graph-based data models are created, for example, with RDF (Resource Description Framework), which incorporates the URIs. Supported by a strong vocabulary and logical interdependencies, OWL (Web Ontology Language) can then serve as the global language of ontologies, with even more expressiveness than RDFS (RDF Schema). At the top, the logical interdependencies built into the proof layer and OWL provide the ability to infer knowledge and to discover new relations [36].

illustration not visible in this excerpt

Figure 2 - Semantic Web Technology Stack [28]

2.1.3 Description Logics

Unlike propositional logic, which deals only with entire propositions, and first-order logic, for which automated reasoning is inefficient (and in general undecidable), description logic is expressive enough to represent information together with its semantics while still offering the logical capability to infer new knowledge [4]. Furthermore, its properties are formally well defined and its reasoning algorithms are well known. Basic constructors in description logics include:

Atomic negations

Concept intersection

Universal restrictions

Limited existential quantification

Nominals

Inverse properties

Cardinality restrictions
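The constructors listed above can be made concrete by evaluating them over a small finite interpretation. The following Python sketch is only an illustration under our own assumptions: the individuals, the concepts `Person` and `Physician`, the role `hasChild` and all function names are hypothetical, and a real reasoner works on axioms rather than on one fixed interpretation.

```python
# Hypothetical finite interpretation: a domain of individuals, concept
# extensions (sets) and role extensions (sets of pairs).
DOMAIN = {"anna", "ben", "carl", "dora"}
CONCEPTS = {
    "Person": {"anna", "ben", "carl", "dora"},
    "Physician": {"anna", "dora"},
}
ROLES = {
    "hasChild": {("anna", "ben"), ("anna", "dora"), ("carl", "ben")},
}


def successors(role, x):
    """All y such that (x, y) is in the extension of the role."""
    return {y for (s, y) in ROLES[role] if s == x}


def negation(concept):
    """Atomic negation: everything in the domain outside a named concept."""
    return DOMAIN - CONCEPTS[concept]


def intersection(ext_a, ext_b):
    """Concept intersection: individuals that belong to both extensions."""
    return ext_a & ext_b


def universal(role, ext):
    """Universal restriction: every role successor lies in ext."""
    return {x for x in DOMAIN if successors(role, x) <= ext}


def existential(role):
    """Limited existential quantification: at least one role successor."""
    return {x for x in DOMAIN if successors(role, x)}


def nominal(*individuals):
    """Nominal: a concept defined by enumerating its individuals."""
    return set(individuals)


def inverse(role):
    """Inverse property: swap subject and object of every pair."""
    return {(y, x) for (x, y) in ROLES[role]}


def at_least(n, role):
    """Cardinality restriction: at least n role successors."""
    return {x for x in DOMAIN if len(successors(role, x)) >= n}


non_physicians = negation("Physician")
parents = existential("hasChild")
physician_parents = intersection(CONCEPTS["Physician"], parents)
only_physician_children = universal("hasChild", CONCEPTS["Physician"])
two_or_more_children = at_least(2, "hasChild")
has_parent = inverse("hasChild")
```

Note that the universal restriction also holds for individuals without any `hasChild` successor, which is the standard (vacuous) reading of this constructor.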

2.2 Ontologies, XML and RDF(S)

To understand OWL, which is introduced later, it is essential to look carefully at the properties of ontologies. OWL builds on several basic technologies such as XML and RDF(S).

2.2.1 Ontologies

As Gruber formulated, an ontology is “an explicit, formal specification of a shared conceptualization” [17]. Thus, it represents concepts and their relationships within a specific domain in a formal and explicit way. As we have already seen ontologies can be modelled and analyzed with technologies of the semantic web technology stack.

Ontologies have several purposes. They are needed to represent a shared common knowledge of a specific domain and facilitate the reuse and analysis of this knowledge. Furthermore, they declare semantics explicitly and enable the knowledge sharing among various agents like software or people. Also, ontologies are helpful to make clear expressive statements.

An ontology consists of classes and their properties, as well as individuals (instances) and semantic relationships [31],[44]. Some of the most used relationships are:

Meronymy (“part of”)

Holonymy (“the whole of”)

Synonymy (“equal”)

Antonymy (“opposite”)

Hyponymy (specialization)

Hypernymy (generalization)

Figure 3 shows an example of an ontology about pizzas. As we can see the ontology helps to clarify what a “cheesy pizza” is. We can identify specializations like “CheesyPizza is a Pizza” as well as special relations like “CheesyPizza hasTopping CheeseTopping”.

illustration not visible in this excerpt

Figure 3 - Exemplary pizza ontology (based on [20])

2.2.2 XML

XML (Extensible Markup Language) is a tag-based meta-language. All elements, attributes and content are defined by named markup tags. Content is always plain text or further tagged elements, arranged in a tree structure.

From this point of view, XML is useful for building knowledge only to a limited extent: it provides a sharable format, is a widespread web standard and can be used to develop markup languages for specific domains. Ultimately, however, XML can only describe syntax, not semantics or relations. XML tags by themselves are largely meaningless to software agents [31].
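The limitation can be illustrated with a short Python sketch that uses the standard library parser `xml.etree.ElementTree`. The two documents and the helper `tag_paths` are hypothetical examples of ours. Both documents arguably state the same fact, yet the parser only sees two different trees; nothing in XML itself says that `topping` and `ingredient` mean the same thing.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents that a human reads as the same statement.
doc_a = "<pizza><topping>Mozzarella</topping></pizza>"
doc_b = "<dish><ingredient>Mozzarella</ingredient></dish>"


def tag_paths(xml_text):
    """Return (path, text) pairs for every leaf element of the tree."""
    root = ET.fromstring(xml_text)
    pairs = []

    def walk(elem, prefix):
        path = f"{prefix}/{elem.tag}"
        children = list(elem)
        if not children:
            pairs.append((path, elem.text))
        for child in children:
            walk(child, path)

    walk(root, "")
    return pairs


paths_a = tag_paths(doc_a)
paths_b = tag_paths(doc_b)

# The leaf values agree, but the tag paths share nothing: a software
# agent has no syntactic basis for treating the documents as equivalent.
same_values = [text for _, text in paths_a] == [text for _, text in paths_b]
shared_paths = {p for p, _ in paths_a} & {p for p, _ in paths_b}
```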

2.2.3 RDF(S)

RDF (Resource Description Framework) is a data model for providing metadata about web resources. Its core consists of triple statements that describe resources and their attributes in a simple subject-predicate-object form, where the subject is a resource, the predicate is the relation and the object is a resource or a literal.
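The triple model can be sketched in a few lines of plain Python. This is a minimal illustration, not an RDF library: the namespace `http://example.org/pizza#`, the resource names and the helper `objects` are assumptions made for this example.

```python
from typing import NamedTuple

EX = "http://example.org/pizza#"  # hypothetical namespace


class Triple(NamedTuple):
    subject: str              # URI of the described resource
    predicate: str            # URI of the relation
    obj: str                  # URI of a resource, or a literal value
    is_literal: bool = False  # True if obj is a literal, not a resource


# An RDF graph is simply a set of such statements.
graph = {
    Triple(EX + "Margherita", EX + "hasTopping", EX + "Mozzarella"),
    Triple(EX + "Margherita", EX + "hasTopping", EX + "Tomato"),
    Triple(EX + "Margherita", EX + "name", "Pizza Margherita", True),
}


def objects(graph, subject, predicate):
    """All objects stated for a given subject and predicate, sorted."""
    return sorted(
        t.obj for t in graph
        if t.subject == subject and t.predicate == predicate
    )


toppings = objects(graph, EX + "Margherita", EX + "hasTopping")
```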

On the one hand, RDF supports formalizing knowledge because it describes semantics in a machine-readable, formal, explicit and standardized way. On the other hand, it neither offers a format to convey content nor can it describe knowledge on an instance level. RDF alone is still not expressive enough to describe ontologies. Combining RDF with XML (RDF/XML) makes it possible to represent the syntax as well. However, it would still not be possible to represent class descriptions.

RDFS (RDF Schema) compensates for some of the shortcomings of RDF in representing ontologies. It is a domain-neutral, formal schema language that provides a basic structure for classes and their properties. At this point, some features needed to represent ontologies are still missing, e.g. advanced logics to infer new knowledge from existing knowledge [31].
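What RDFS adds over plain RDF can be shown with three of its entailment rules: `rdfs11` (transitivity of `rdfs:subClassOf`), `rdfs9` (instances of a subclass are instances of the superclass) and `rdfs2` (the subject of a property belongs to the property's domain). The sketch below applies only these three rules until nothing new is derived; the `ex:` names and the function `rdfs_closure` are hypothetical, and the full RDFS rule set is larger.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
RDFS_DOMAIN = "rdfs:domain"


def rdfs_closure(triples):
    """Apply rdfs2, rdfs9 and rdfs11 until a fixed point is reached."""
    inferred = set(triples)
    while True:
        new = set()
        for (s, p, o) in inferred:
            if p == SUBCLASS:
                for (s2, p2, o2) in inferred:
                    # rdfs11: s subClassOf o, o subClassOf o2
                    if p2 == SUBCLASS and s2 == o:
                        new.add((s, SUBCLASS, o2))
                    # rdfs9: s2 type s, s subClassOf o
                    if p2 == RDF_TYPE and o2 == s:
                        new.add((s2, RDF_TYPE, o))
            if p == RDFS_DOMAIN:
                for (s2, p2, _) in inferred:
                    # rdfs2: s has domain o, s2 uses s as predicate
                    if p2 == s:
                        new.add((s2, RDF_TYPE, o))
        if new <= inferred:
            return inferred
        inferred |= new


# Hypothetical schema and data in the spirit of the pizza example.
stated = {
    ("ex:CheesyPizza", SUBCLASS, "ex:Pizza"),
    ("ex:Pizza", SUBCLASS, "ex:Food"),
    ("ex:hasTopping", RDFS_DOMAIN, "ex:Pizza"),
    ("ex:myPizza", RDF_TYPE, "ex:CheesyPizza"),
    ("ex:otherPizza", "ex:hasTopping", "ex:mozzarella"),
}
entailed = rdfs_closure(stated)
derived_only = entailed - stated
```

The sketch also shows the limit mentioned above: from a statement such as `ex:otherPizza ex:hasTopping ex:mozzarella`, these rules cannot conclude that `ex:otherPizza` is a `ex:CheesyPizza`, because RDFS has no way to define a class by a condition on its properties.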

2.3 OWL and SPARQL

The two most widespread standards in the area of ontology creation are OWL and SPARQL, and ontology editors make heavy use of them. Other ontology standards are SHOE and OIL (DAML+OIL).

2.3.1 OWL

OWL stands for “Web Ontology Language” and is a W3C standard. It extends the earlier semantic standards by expressive definitions of classes and properties, as well as semantics based on description logic. There are two OWL versions (OWL 1.0, 2004 and OWL 2.0, 2009) and three sub-languages (OWL Lite, OWL DL and OWL Full). The sub-languages differ in their expressiveness and decidability. OWL is known as the de facto language of global ontologies. Thanks to its strong vocabulary and its use of expressive description logic, the consistency and satisfiability of ontologies can be checked. Furthermore, reasoning and the inference of new knowledge are supported [21],[37].

OWL elements include namespaces, an ontology header (with metadata about the ontology), class and subclass definitions, properties and their characteristics, restrictions, maps and individuals. The new vocabulary includes many options to logically combine classes (disjunction, equivalence, complement), to restrict relations by cardinalities and to define characteristics of properties such as transitivity, symmetry, functionality and inverses.
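The effect of these property characteristics on a set of facts can be sketched as follows. The code is a simplified illustration under our own assumptions, not an OWL reasoner: the facts, the property names `partOf`, `hasPart`, `adjacentTo` and `hasMother`, and both functions are hypothetical. For functional properties the sketch follows the OWL reading that two values for the same subject denote the same individual, rather than reporting an error.

```python
from itertools import combinations


def apply_property_axioms(facts, transitive=(), symmetric=(), inverse_of=None):
    """Close a set of (subject, property, object) facts under the axioms."""
    inverse_of = dict(inverse_of or {})
    # An inverse declaration works in both directions.
    inverse_of.update({q: p for p, q in list(inverse_of.items())})
    closed = set(facts)
    while True:
        new = set()
        for (s, p, o) in closed:
            if p in symmetric:
                new.add((o, p, s))
            if p in inverse_of:
                new.add((o, inverse_of[p], s))
            if p in transitive:
                for (s2, p2, o2) in closed:
                    if p2 == p and s2 == o:
                        new.add((s, p, o2))
        if new <= closed:
            return closed
        closed |= new


def same_as_from_functional(facts, functional):
    """Pairs of values inferred to denote the same individual."""
    values = {}
    for (s, p, o) in facts:
        if p in functional:
            values.setdefault((s, p), set()).add(o)
    return {
        frozenset(pair)
        for vals in values.values()
        for pair in combinations(sorted(vals), 2)
    }


facts = {
    ("finger", "partOf", "hand"),
    ("hand", "partOf", "arm"),
    ("hand", "adjacentTo", "wrist"),
    ("tom", "hasMother", "mary"),
    ("tom", "hasMother", "m_smith"),
}
closed = apply_property_axioms(
    facts,
    transitive={"partOf"},
    symmetric={"adjacentTo"},
    inverse_of={"partOf": "hasPart"},
)
same_individuals = same_as_from_functional(facts, functional={"hasMother"})
```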

2.3.2 SPARQL

SPARQL is a graph-based query language for RDF and OWL documents. Triple patterns in the query are matched against the triples in the data, and the result of a query is the combination of all matches. The syntax is similar to SQL (SELECT, FROM, WHERE clauses). A SPARQL query consists of prefix declarations (namespace URIs), a query result clause (result form, dataset sources and query pattern) and optional query modifiers [41].
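The matching of query triples against data triples can be sketched in plain Python. The code below is a toy evaluator for a conjunction of triple patterns only, written for illustration; it is not a SPARQL engine, and the data, the `ex:` names and the functions `match_pattern` and `select` are our assumptions. Terms starting with `?` are variables. The corresponding SPARQL query is given in the comment.

```python
# Roughly corresponds to the SPARQL query:
#   PREFIX ex: <http://example.org/pizza#>
#   SELECT ?pizza
#   WHERE { ?pizza a ex:Pizza ;
#                  ex:hasTopping ?t .
#           ?t a ex:CheeseTopping . }

DATA = {
    ("ex:margherita", "rdf:type", "ex:Pizza"),
    ("ex:margherita", "ex:hasTopping", "ex:mozzarella"),
    ("ex:mozzarella", "rdf:type", "ex:CheeseTopping"),
    ("ex:marinara", "rdf:type", "ex:Pizza"),
    ("ex:marinara", "ex:hasTopping", "ex:tomato"),
    ("ex:tomato", "rdf:type", "ex:VegetableTopping"),
}


def match_pattern(pattern, triple, binding):
    """Extend binding so that pattern equals triple; None if impossible."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was bound to earlier.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def select(variables, where, data):
    """Evaluate a list of triple patterns and project the given variables."""
    bindings = [{}]
    for pattern in where:
        bindings = [
            extended
            for binding in bindings
            for triple in data
            if (extended := match_pattern(pattern, triple, binding)) is not None
        ]
    return sorted({tuple(b[v] for v in variables) for b in bindings})


cheesy = select(
    ["?pizza"],
    [
        ("?pizza", "rdf:type", "ex:Pizza"),
        ("?pizza", "ex:hasTopping", "?t"),
        ("?t", "rdf:type", "ex:CheeseTopping"),
    ],
    DATA,
)
```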

3. Ontology-based Information Integration

A variety of methods and tools exist to overcome heterogeneity in information systems with the help of ontologies and the semantic web. Ontologies support the process in that they enable automatic, semantics-oriented interoperability between machines. The supporting methods and concepts are mainly mappings, architectural styles and reasoners. Many tools such as semantic frameworks or ontology editors are available to programmers and engineers to put their concepts into practice [36].

3.1 Methods

Semantic heterogeneity arises when the meaning that can be derived from a schema is interpreted in diverging ways. With the help of ontologies and their logical interdependencies, concepts can be inferred from the implicit semantics of a schema. Semantic heterogeneity can thus be overcome by specifying the shared knowledge of a domain and applying logical inference to it.

Ontologies help to link semantically heterogeneous systems because they deliver a vocabulary to describe the concepts and relations of formal models. Furthermore, they act as global schemas of the mediation layer, a layer of indirection between users or applications and the data source layer. They offer explicit definitions of terms and relationships so that data from multiple different sources can be interpreted accurately. In addition, ontologies provide a global query schema and verification techniques to guarantee correctness across multiple sources [1],[36].
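The role of the ontology as a global schema of a mediation layer can be sketched as follows. This is a minimal illustration under our own assumptions, not an architecture taken from the cited literature: the two sources, their field names, the global terms `Patient` and `hasDiagnosis`, and all functions are hypothetical.

```python
# Hypothetical global vocabulary provided by the ontology.
GLOBAL_TERMS = {"Patient", "hasDiagnosis"}

# Two hypothetical sources that store the same kind of fact differently.
SOURCE_A = [{"pat_id": "p1", "dx": "influenza"}]
SOURCE_B = [{"patient": "p2", "diagnosis_text": "asthma"}]
SOURCES = {"A": SOURCE_A, "B": SOURCE_B}

# Mappings from each local schema to the global vocabulary.
MAPPINGS = {
    "A": {"subject_field": "pat_id", "fields": {"dx": "hasDiagnosis"}},
    "B": {"subject_field": "patient",
          "fields": {"diagnosis_text": "hasDiagnosis"}},
}


def unknown_targets():
    """Verification step: mapping targets missing from the global schema."""
    targets = {g for m in MAPPINGS.values() for g in m["fields"].values()}
    return targets - GLOBAL_TERMS


def lift(source_name):
    """Translate the records of one source into global-schema triples."""
    mapping = MAPPINGS[source_name]
    for record in SOURCES[source_name]:
        subject = record[mapping["subject_field"]]
        yield (subject, "rdf:type", "Patient")
        for local_field, global_property in mapping["fields"].items():
            if local_field in record:
                yield (subject, global_property, record[local_field])


def mediated_query(predicate):
    """Answer a query phrased in global terms across all sources."""
    triples = [t for name in sorted(SOURCES) for t in lift(name)]
    return sorted((s, o) for (s, p, o) in triples if p == predicate)


diagnoses = mediated_query("hasDiagnosis")
```

The application only uses the global term `hasDiagnosis`; the local field names `dx` and `diagnosis_text` stay hidden behind the mappings.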

[...]

Details

Pages: 21
Year: 2015
ISBN (eBook): 9783656976103
ISBN (Book): 9783656976110
File size: 1.8 MB
Language: English
Catalog Number: v300018
Institution / College: Technical University of Berlin
Grade: 1,0
Tags: Ontology Editors Protégé Semantic Web Technologies Heterogeneity Information Integration
