
Ontology Unification/Merging

Seminar Paper, 2004, 35 Pages

Computer Science - Commercial Information Technology

Excerpt

Table of Contents

1 Introduction

2 Concept of the Semantic Web
2.1 Basic Layers - Unicode, URI, XML, RDF
2.2 Ontologies
2.3 Logic, Proof, Trust

3 Terminology

4 OWL
4.1 Basic Elements
4.1.1 Namespaces
4.1.2 Classes / Individuals
4.1.3 Properties
4.2 Import and Mapping Mechanisms
4.2.1 Import
4.2.2 Mapping Classes, Properties and Individuals

5 Merging OWL Ontologies
5.1 Mapping
5.1.1 Mismatches
5.1.2 Mapping Techniques
5.2 Aligning
5.3 Combining

6 Merging Tools: Two Examples
6.1 Testing Environment
6.2 MoA
6.2.1 Algorithm
6.2.2 Test Results
6.3 Prompt
6.3.1 Algorithm
6.3.2 Test Results
6.4 Comparison

7 Conclusion

Appendix A: Test Ontologies

Cars Ontology

Automobiles Ontology

Appendix B: MoA’s Results

MoA’s User Interface Output

MoA’s Merged Ontology

Appendix C: Prompt’s Results

Prompt’s Merged Ontology

References

1 Introduction

Within the last two decades, the internet has grown from a small network used by military and educational facilities to a world-wide system reaching about 800 million people[1]. With the growth in users came an increase in available web pages: every day, thousands of pages are added to the net, each of them bearing additional information.

Today, a lot of the world’s knowledge can already be found somewhere on the internet. Still, it is hard to find the information a user needs. Practically all of it is stored in the form of natural language, and not just in a single one: almost every language or dialect that exists in the world can also be found on the internet. This makes it very hard for machines to retrieve information efficiently and almost impossible for them to infer new knowledge.

The concept of the Semantic Web[2] is an attempt to address this issue. Its basic idea is to create a representation mechanism that makes internet content understandable and usable by machines.

Since 1998, when Tim Berners-Lee, “inventor” of the World Wide Web, first presented the vision of a Semantic Web[3], a lot of research has been done in this field. But while the basic technologies, like XML, are available and widely accepted, the upper-level tools have not matured yet. As they are the ones that are supposed to fill the Semantic Web with life, they are a necessary prerequisite for any Semantic Web system to work. In particular, without a standard ontology language, there is no common way to encode knowledge on the Semantic Web.

In the last couple of years, many proposals have been made for such a standard language. Finally, in February 2004, the W3C recommended specifications for the Web Ontology Language (OWL), which is likely to become a de facto standard. This paper will therefore focus on OWL, which will be introduced later.

Still, for the Semantic Web to work, a language to encode knowledge is not enough: machines must also be able to infer new information. To achieve this goal, they need to be able to combine several different knowledge bases in order to extract new facts that were not encoded in any single one of them before. This process is called Ontology Merging or Ontology Unification. Without it, the Semantic Web would not really be a “Web”; searching it would only mean investigating single sites. Only by merging different sources can the full potential of the Web be revealed. This paper will introduce some methods for combining various ontologies. Furthermore, OWL integrates some syntax features that support this process. They will be presented as well.

Some well-developed techniques for Ontology Merging already exist, although most of them use proprietary ontology languages. Hopefully, now that the W3C specifications have been published, research can concentrate on this common language. Still, a lot of work has to be done before the vision of the Semantic Web can come true.

2 Concept of the Semantic Web

In 1998, Tim Berners-Lee suggested a layer model for the development of the Semantic Web (Figure 1)[3]. Today, this model is widely accepted.

[Figure not included in this excerpt]

Figure 1: Layers of the Semantic Web[4]

2.1 Basic Layers - Unicode, URI, XML, RDF

The technical bases for the Semantic Web (SW) are Unicode and URIs.

In the Semantic Web, URIs are used to refer to concepts[2]. This fixes a problem of natural language, in which the same word might mean different things depending on the context. By using different URIs for different entities or concepts, it is easy to distinguish between them.

Ontologies are encoded using the underlying Extensible Markup Language (XML)[4], which has become a widely accepted standard in the internet world. Additionally, XML Schemas define the structure of XML documents[5]. To avoid confusion when two different tags in a common system have the same name, Namespaces (NS) are utilized, too. That way, concepts in ontologies can be clearly recognized and distinguished, even if they have identical labels.

The Resource Description Framework (RDF) can be seen as the first layer of the Semantic Web[6] and is used as the base for describing machine-readable metadata. RDF can be completely encoded in XML.

2.2 Ontologies

RDF supplies a common syntax for expressing information. Its major drawback, however, is that it offers only basic semantics for describing knowledge[7]. RDF documents describe a labelled graph, but nothing more. RDF Schema basically provides a subclass hierarchy and a property hierarchy. Unfortunately, it is not expressive enough to encode the complex semantics needed for the Semantic Web. This is where ontologies provide a solution.
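To make the labelled-graph view concrete, the following sketch (in Python, not part of the original paper) stores statements as subject-predicate-object triples and follows rdfs:subClassOf edges to collect all superclasses of a class. The class names and the helper superclasses are illustrative assumptions, not taken from the paper's test ontologies.

```python
# A tiny RDF-like graph: each statement is a (subject, predicate, object) triple.
# The names below are hypothetical examples from a car domain.
triples = {
    ("Car", "rdfs:subClassOf", "Vehicle"),
    ("SportsCar", "rdfs:subClassOf", "Car"),
    ("JimsCar", "rdf:type", "SportsCar"),
}


def superclasses(cls, triples):
    """Collect every class reachable from `cls` via rdfs:subClassOf edges."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in triples:
            if (subject == current
                    and predicate == "rdfs:subClassOf"
                    and obj not in found):
                found.add(obj)
                frontier.append(obj)
    return found


sports_car_parents = superclasses("SportsCar", triples)
```

A subclass hierarchy like this is roughly what RDF Schema can express; richer statements about classes and properties are what an ontology language adds on top.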

“Ontology” originates in philosophy, where it is a theory about the nature of existence[2]. Philosophers working on ontology “try to answer the questions ‘what being is?’ and ‘what are the features common to all being?’”[8].

In recent years, ontologies have attracted widespread interest in computer science, too. Here, the term “refers to an engineering artefact, constituted by a specific vocabulary used to describe a certain reality, plus a set of explicit assumptions regarding the intended meaning of the vocabulary”[9].

Usually, ontologies are used to represent knowledge. In order for a reader to understand this knowledge, reader and developer must use the same language. A lot of different ontology languages were developed in recent years¹, all of which have specific advantages and disadvantages. In chapter 4, OWL will be introduced as an example of such a language.

2.3 Logic, Proof, Trust

In the Logic Layer, rules should be encoded that allow reasoning on the facts stored in the ontology layer[6]. That way, new information can be derived, which can then be used in the Proof Layer to verify a given statement. Conceptually, a user will be able to ask a certain question that will then be proved right or wrong by the machine. Logic and Proof Layers are used in this process to verify the information.

Of course, not everyone on the Web is trustworthy. To deal with this issue, a Trust Layer will be used to set different levels of trust[11]. In order to verify the ownership of a piece of information, a Digital Signature can be applied.

Although there has already been some research on these topics, no well-developed technologies exist to support these requirements[6]. As the Semantic Web develops bottom-up, it can be expected that they will follow in the next few years.

3 Terminology

Many different terms exist concerning ontology merging. In the related literature, some of the terms are used synonymously or in contradictory ways. Nevertheless, it is important to have a clear understanding of what the terms mean when reading this paper. In this section, some of them will be clarified.

Before two ontologies can actually be merged, the concepts they contain need to be mapped. Mapping relates similar concepts from different sources to each other by an equivalence relation[12]. To do this, one has to define which concepts can be considered similar. In Mapping, the source ontologies remain unchanged[13].
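As a rough illustration of this definition, the sketch below (Python, added here and not part of the original paper) keeps two source ontologies as immutable sets of concept names and stores the mapping separately as a set of pairs. All concept names and the helper equivalents are hypothetical.

```python
# Two source ontologies, reduced to their concept names (hypothetical examples).
# frozenset makes it explicit that mapping leaves the sources unchanged.
cars = frozenset({"Car", "Motor", "Colour"})
automobiles = frozenset({"Automobile", "Engine", "Color"})

# The mapping lives outside both ontologies: pairs of concepts judged equivalent.
mapping = {
    ("Car", "Automobile"),
    ("Motor", "Engine"),
    ("Colour", "Color"),
}


def equivalents(concept, mapping):
    """Return the concepts mapped to `concept`, reading each pair symmetrically."""
    related = set()
    for left, right in mapping:
        if left == concept:
            related.add(right)
        elif right == concept:
            related.add(left)
    return related


motor_matches = equivalents("Motor", mapping)
engine_matches = equivalents("Engine", mapping)
```

Reading each pair in both directions reflects the symmetry of an equivalence relation; deciding which pairs belong in the mapping is the hard part and is not addressed by this sketch.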

After the mapping has been done, the ontologies can be aligned. Aligning means finding a representation in which both ontologies agree with each other[14]. Both of them can then be modified to correspond to that representation. Aligned ontologies are typically required to be consistent and coherent[15]. Usually, sub-concepts are introduced into each ontology in order to create counterparts for concepts in the other one.

A partial alignment is, as the name implies, an alignment of only certain parts of the ontologies[16]. This can prove useful if a user or developer needs only specific pieces of a rather big ontology. A complete alignment could be too much effort in that case.

There are two basic methods to combine ontologies: merging and integrating. Mapping and aligning are sub-processes of both of them.

Merging is the process of combining several ontologies into one without additional information[17]. Thus, there must be a subject common to all of the ontologies, i.e. they must have the same domain. Sowa calls the merging process Unification[18]. Throughout this paper, the terms Merging and Unification will be used synonymously.

In contrast to merging, Integration considers ontologies that have no common subject[17]. In order to combine them, an integration tool can try to obtain further ontologies that can be used to establish a connection[19]. It will then combine the given ontologies using the acquired data to link them.

Section 4.2 introduces some mapping mechanisms that are supported at the OWL language level. Later, in chapter 5, Merging and its sub-processes Mapping and Aligning will be discussed in more detail. This paper will not cover Ontology Integration.

4 OWL

OWL is the official recommendation of the W3C for a standard Web Ontology Language[20]. It was chosen out of many submissions proposed during the recommendation process and is based on DAML+OIL, a language developed in an EU/US cooperation. Experience has shown that most of the W3C’s recommendations have become standards after some time; OWL is likely to develop into one, too. Thus, this paper will introduce only OWL as an ontology language. It should be noted, though, that there are many other ontology languages² with quite similar features[10].

In this chapter, the fundamental concepts of OWL will be introduced, as well as some more specific techniques used for concept mapping³. For a better understanding, most of them will be explained using examples picked from a car domain.

4.1 Basic Elements

In contrast to some other ontology languages, OWL is completely based on RDF and XML. Every correct OWL document is also a legal RDF and XML document.

4.1.1 Namespaces

When building two different ontologies, it might happen that the same tags are used in both of them to describe different ideas. To avoid this problem, namespaces are used to put all of the terms into a specific context. With them, the same keyword can be used in different ontologies, or even within the same one, without conflict. Namespaces also allow concepts from another ontology to be referenced.

[Figure not included in this excerpt]

The first line declares the namespace with the prefix cars, linking it to a specific URI. The second line makes the same resource the default namespace. The last line sets the namespace for the owl-prefixed tags. Usually, this statement is followed by more declarations for the rdf, rdfs and xsd namespaces.
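Since the original listing is not part of this excerpt, the following is only a sketch of what such a declaration could look like, wrapped in a few lines of Python that check it is well-formed XML. The ontology URI http://example.org/cars# is a placeholder assumption; the owl, rdf, rdfs and xsd URIs are the standard W3C namespace URIs.

```python
import xml.etree.ElementTree as ET

# Placeholder URI (assumption); the paper's actual ontology URI is not shown here.
CARS_NS = "http://example.org/cars#"

NAMESPACE_DOC = f"""<rdf:RDF
    xmlns:cars="{CARS_NS}"
    xmlns="{CARS_NS}"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#">
  <owl:Ontology rdf:about=""/>
</rdf:RDF>"""

root = ET.fromstring(NAMESPACE_DOC)

# ElementTree resolves each prefix to its URI and reports tags as {uri}localname.
root_tag = root.tag
child_tag = root[0].tag
```

The parser output shows why prefixes avoid conflicts: two tags with the same local name but different namespace URIs end up with different fully qualified names.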

4.1.2 Classes / Individuals

OWL implements a system of classes, objects and inheritance as known from object-oriented programming languages. A class is defined using the owl:Class tag. Within a class declaration, the rdfs:subClassOf statement can be used to declare inheritance from another class. This way, complex taxonomies of many different classes can be built. The following example specifies a class with the identifier Car, which is a direct subclass of another class Vehicle (declared elsewhere):

[Figure not included in this excerpt]
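The original listing is not reproduced in this excerpt. As a stand-in, here is a minimal sketch of a class declaration matching the description (Car as a subclass of Vehicle), embedded in Python so that it can be parsed and inspected. The surrounding rdf:RDF wrapper and the variable names are assumptions.

```python
import xml.etree.ElementTree as ET

OWL = "http://www.w3.org/2002/07/owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

# Sketch of an OWL class declaration in RDF/XML; Vehicle is assumed to be
# declared elsewhere in the same document.
CLASS_DOC = """<rdf:RDF
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <owl:Class rdf:ID="Car">
    <rdfs:subClassOf rdf:resource="#Vehicle"/>
  </owl:Class>
</rdf:RDF>"""

root = ET.fromstring(CLASS_DOC)
car = root.find(f"{{{OWL}}}Class")

# Read back the identifier and the declared superclass.
class_id = car.get(f"{{{RDF}}}ID")
parent = car.find(f"{{{RDFS}}}subClassOf").get(f"{{{RDF}}}resource")
```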

An individual in OWL corresponds to what is known as an object in programming languages, i.e. a member of a class. Individuals are declared using the class identifiers specified earlier. The following defines JimsCar, a member of the Car class:

[Figure not included in this excerpt]
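The listing itself is missing from this excerpt. A plausible sketch, assuming the cars ontology is the default namespace under the placeholder URI http://example.org/cars#, uses the class name as the tag of the individual:

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
CARS = "http://example.org/cars#"  # placeholder URI (assumption)

# The class identifier Car is used as the element name of the individual.
INDIVIDUAL_DOC = """<rdf:RDF
    xmlns="http://example.org/cars#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <Car rdf:ID="JimsCar"/>
</rdf:RDF>"""

root = ET.fromstring(INDIVIDUAL_DOC)
individual = root[0]

# The tag names the class, the rdf:ID attribute names the individual.
individual_class = individual.tag
individual_id = individual.get(f"{{{RDF}}}ID")
```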

4.1.3 Properties

Properties can be used to state facts either about complete classes or about individuals. There are two different types of properties:

An object property refers to another object and can be used to define relationships between different classes or individuals. It uses a quite complex syntax:

[Figure not included in this excerpt]

These statements define a property of the class Car. It has the identifier hasEngine and relates the Car class to another class named Motor.
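Because the listing is not included here, the following sketch shows one common way to write such a property in RDF/XML, with rdfs:domain and rdfs:range naming the two classes. It is an illustration of the described statement, not the paper's original listing.

```python
import xml.etree.ElementTree as ET

OWL = "http://www.w3.org/2002/07/owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

# hasEngine relates members of Car (domain) to members of Motor (range).
OBJECT_PROPERTY_DOC = """<rdf:RDF
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <owl:ObjectProperty rdf:ID="hasEngine">
    <rdfs:domain rdf:resource="#Car"/>
    <rdfs:range rdf:resource="#Motor"/>
  </owl:ObjectProperty>
</rdf:RDF>"""

root = ET.fromstring(OBJECT_PROPERTY_DOC)
prop = root.find(f"{{{OWL}}}ObjectProperty")

property_id = prop.get(f"{{{RDF}}}ID")
domain = prop.find(f"{{{RDFS}}}domain").get(f"{{{RDF}}}resource")
range_ = prop.find(f"{{{RDFS}}}range").get(f"{{{RDF}}}resource")
```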

For individuals, object properties are applied by using their names as tags. If Jim’s car has a V8 engine, that could be stated as follows:

[Figure not included in this excerpt]
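The original statement is not shown in this excerpt. A sketch of how it could look follows; the identifier V8 for the engine individual and the placeholder namespace URI are assumptions.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
CARS = "http://example.org/cars#"  # placeholder URI (assumption)

# The property name hasEngine is used as a tag inside the individual;
# #V8 refers to an engine individual assumed to be declared elsewhere.
PROPERTY_USE_DOC = """<rdf:RDF
    xmlns="http://example.org/cars#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <Car rdf:ID="JimsCar">
    <hasEngine rdf:resource="#V8"/>
  </Car>
</rdf:RDF>"""

root = ET.fromstring(PROPERTY_USE_DOC)
jims_car = root.find(f"{{{CARS}}}Car")

# Look up the engine that the hasEngine element points to.
engine = jims_car.find(f"{{{CARS}}}hasEngine").get(f"{{{RDF}}}resource")
```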

A datatype property is used to relate classes or individuals to RDF literals or XML Schema datatypes. Its OWL syntax is quite similar to that of an object property:

[Figure not included in this excerpt]

The above assigns a property named hasHorsepower to the class Car; its values lie in the positive integer range.
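As the listing is missing from this excerpt, here is a hedged sketch of such a declaration. It assumes the range is given as the full URI of the XML Schema type positiveInteger; the original may well have used an entity or prefix shorthand instead.

```python
import xml.etree.ElementTree as ET

OWL = "http://www.w3.org/2002/07/owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
XSD = "http://www.w3.org/2001/XMLSchema#"

# Same shape as an object property, but the range is a datatype, not a class.
DATATYPE_PROPERTY_DOC = """<rdf:RDF
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <owl:DatatypeProperty rdf:ID="hasHorsepower">
    <rdfs:domain rdf:resource="#Car"/>
    <rdfs:range rdf:resource="http://www.w3.org/2001/XMLSchema#positiveInteger"/>
  </owl:DatatypeProperty>
</rdf:RDF>"""

root = ET.fromstring(DATATYPE_PROPERTY_DOC)
prop = root.find(f"{{{OWL}}}DatatypeProperty")

property_id = prop.get(f"{{{RDF}}}ID")
value_type = prop.find(f"{{{RDFS}}}range").get(f"{{{RDF}}}resource")
```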

Datatype properties for individuals are used with the same syntax as object properties.

4.2 Import and Mapping Mechanisms

If separate ontologies are to be merged, connections need to be made between individual elements within them. OWL supports this process by supplying some syntax elements that allow relationships between ontologies to be introduced.

Using these features, a developer can not only support easy unification of ontologies but also spare herself work that has already been done before. That is, these instruments allow the reuse of already existing ontologies.

4.2.1 Import

Using namespaces (see 4.1.1), an ontology A can directly reference another ontology B. Still, this method does not include the semantics of the referenced items in A. To fix this problem, the developers of OWL have included an import statement that adds ontology B completely to A.

Imagine you want to add color information to the Car ontology. Because there is an ontology on the web that covers everything about paints and varnishes, you do not want to model everything again. Just use the following statements to import the other ontology:

[Figure not included in this excerpt]
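The import statements themselves are not reproduced in this excerpt. In OWL, imports are expressed with owl:imports inside the owl:Ontology header, so a sketch could look as follows; the URI http://example.org/paints is a made-up placeholder for the paints ontology.

```python
import xml.etree.ElementTree as ET

OWL = "http://www.w3.org/2002/07/owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Ontology header importing a (hypothetical) paints ontology.
IMPORT_DOC = """<rdf:RDF
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <owl:Ontology rdf:about="">
    <owl:imports rdf:resource="http://example.org/paints"/>
  </owl:Ontology>
</rdf:RDF>"""

root = ET.fromstring(IMPORT_DOC)

# Collect the URIs of all ontologies this document imports directly.
imported = [
    element.get(f"{{{RDF}}}resource")
    for element in root.iter(f"{{{OWL}}}imports")
]
```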

You probably also want to add a namespace declaration and a doctype definition in order to be able to use the imported ontology more conveniently:

[Figure not included in this excerpt]

Here, the paint Bordeaux Metallic from the imported ontology is used to describe the color of Jim’s car.
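The listing is missing from this excerpt, so the following is only a guess at its general shape: an entity declared in the doctype abbreviates the URI of the imported ontology, and a namespace prefix is bound to it as well. The property name hasColor, the identifier BordeauxMetallic and both URIs are assumptions made for this sketch.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
CARS = "http://example.org/cars#"      # placeholder URI (assumption)
PAINTS = "http://example.org/paints#"  # placeholder URI (assumption)

# The entity &paints; is a shorthand for the paints URI in attribute values.
COLOR_DOC = """<!DOCTYPE rdf:RDF [
  <!ENTITY paints "http://example.org/paints#">
]>
<rdf:RDF
    xmlns="http://example.org/cars#"
    xmlns:paints="http://example.org/paints#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <Car rdf:ID="JimsCar">
    <hasColor rdf:resource="&paints;BordeauxMetallic"/>
  </Car>
</rdf:RDF>"""

root = ET.fromstring(COLOR_DOC)
jims_car = root.find(f"{{{CARS}}}Car")

# The parser expands the entity, yielding the full URI of the paint.
color = jims_car.find(f"{{{CARS}}}hasColor").get(f"{{{RDF}}}resource")
```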

Note that the import statement is transitive. That means that if you import an ontology that in turn imports other ontologies, they will also be loaded.
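The effect of transitivity can be sketched as a simple graph traversal. The helper import_closure and the ontology names below are hypothetical; real tools resolve imports by fetching and parsing the referenced documents, which this sketch replaces with a plain dictionary.

```python
def import_closure(ontology, imports):
    """Return all ontologies loaded when `ontology` is loaded.

    `imports` maps each ontology name to the names it imports directly.
    """
    loaded = []
    seen = set()
    pending = [ontology]
    while pending:
        current = pending.pop()
        if current in seen:
            continue  # already loaded; also protects against cyclic imports
        seen.add(current)
        loaded.append(current)
        pending.extend(imports.get(current, []))
    return loaded


# Hypothetical import graph: cars imports paints, paints imports chemistry.
imports = {
    "cars": ["paints"],
    "paints": ["chemistry"],
    "chemistry": [],
}

closure = import_closure("cars", imports)
```

Loading cars here also pulls in chemistry, although cars never mentions it directly. The seen set keeps the traversal finite even if two ontologies import each other.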

4.2.2 Mapping Classes, Properties and Individuals

If two ontologies are to be merged, a tool needs to know where they overlap. For this purpose, the developer can state which of their concepts are equivalent.

[...]


¹ For an overview, see [10].

² For example, SHOE, OML and XOL, as well as the predecessors of OWL, DAML and OIL.

³ For a more detailed description of all the OWL syntax elements presented in this chapter, refer to [21].

Details

Pages: 35
Year: 2004
ISBN (eBook): 9783638309066
File size: 1 MB
Language: English
Catalog Number: v29389
Institution / College: University of Trier – Information Systems
Grade: 1,0 (A)