1.2 Goals and Scope
2.1 Fields of eGovernment
2.3 Process Integration
2.4 Semantic Web meets eGovernment
2.5 eGovernment Scenario
2.6 From eGovernment to eGovernance
3.3 Web Services
3.4 Web Service Composition
3.5 Semantic Web Services
3.6 Semantic Information Integration
3.7 Related Work
4.3 General Idea
4.4 Overall View
4.5 Domain Ontologies and Semantic Web Services
4.6 Semantic Bridges
4.7 Matching and Data Flow
4.8 Process Execution
4.9 Logical Architecture
5.2 System Architecture for Composition Design
5.3 Semantic Web Services
5.4 Semantic Bridges
5.5 Semantic Web Service Composer
5.6 System Architecture for Composition Execution
5.7 Execution Engine
5.9 Validation and Verification
6.1 Coverage of Goals
6.2 Network Effect
7 Conclusion and Outlook
7.2 Open Problems
7.3 Future Work
This chapter elaborates the motivation for the thesis. It gives an insight into the relation between the domain of electronic government (eGovernment) and Semantic Web research. It outlines the interoperability challenges in eGovernment process integration and the domain-specific conditions that encourage the application of Semantic Web concepts. Furthermore, the goals of this thesis are explained and the scope of this work is clarified. Finally, an outline of the thesis is given in the last section.
Electronic Government is a term for describing the application of information and communication technology (ICT) in public sector processes. Focus is put on process re-engineering to achieve more effective and efficient public sector activities and to enhance the relationship between citizen and government. eGovernment covers a wide range of fields such as eDemocracy or eHealth, which are discussed in more detail in chapter 2. One cross-field problem is the integration of processes whenever different public agencies are involved in an activity. Processes are backed by ICT systems, and applications have to be interconnected to seamlessly support process integration. In order to enable process integration, applications need to provide interfaces which are accessible by other applications. The demand for process integration has arisen in many domains, which has led to standardized interfaces and protocols for inter-application communication. In recent years Web services have taken the lead role in this field.
Today XML-based standards such as WSDL, SOAP and BPEL are widely used for describing, composing and invoking Web services to achieve process integration. These technologies represent the foundation for establishing syntactical interoperability between different applications. But semantic interoperability issues are left open. Because the semantics of Web services are not described explicitly, they cannot be processed automatically. Thus, the necessary steps for composing services are primarily done manually, which leads to a high degree of complexity. For example, in a first step of composing, appropriate services have to be selected from hierarchical repositories [4, 5], whereby the specific category for each service needs to be known a priori. Having chosen appropriate services, a user needs to understand their implicit semantics in order to design the control flow and data flow.
With respect to the multitude of services participating in a process, the creation of the data flow, i.e. the parameter assignments between the activities, is a complex task, and it requires the user to have extensive knowledge of the underlying type representations.
In particular, when using services from different application domains, comprehensive data type transformations have to be added manually because the domains use different information representations.
The idea of bringing implicitly defined service semantics to an explicit level by providing machine-understandable Web service descriptions with formally defined semantics promises to ease the composition process. The long-term vision is to enable dynamic goal-oriented service composition and to use powerful inference engines and matchmaking mechanisms in order to automate the whole composition process including discovery, composition, execution and interoperation of Web services. As has been argued, many problems still need to be solved on the road towards this goal, whereby each further step can increase the level of automation.
The interconnection between eGovernment and Semantic Web research is twofold: On the one hand from the perspective of Semantic Web research eGovernment provides an ideal testbed. Due to the heterogeneity of information space in eGovernment, it provides the challenge to achieve interoperability for process integration in a largely distributed environment. At the same time the eGovernment domain exhibits a high degree of formality in key areas imposed by laws and regulations, thus encouraging the application of Semantic Web concepts based on formal modeling and description logics.
On the other hand from the perspective of eGovernment research Semantic Web technologies are the key for semantic interoperability in process integration and represent the foundation for achieving the vision of a knowledge-based, user centric, distributed and networked eGovernment [Mg2006].
1.2 Goals and Scope
As discussed before, achieving interoperability is an extensive process and thus often leads to significant obstacles in process integration. Therefore, this thesis aims at reducing the complexity of process integration by means of enhancing interoperability. Chapter 3 describes the state-of-the-art in this context and elaborates the three different aspects of interoperability, namely technical, semantic, and organizational aspects. This work focuses on semantic interoperability.
In particular, this thesis elaborates an approach for the semi-automatic design of data flows within the composition of semantically described eGovernment Web services across different domains that make use of different information representations.
This thesis does not target automated planning as known in the context of artificial intelligence research. The approach aims at easing the composition task by semi-automatic data flow design but leaves the planning task (i.e. which services to include at which point in the composition) to the eGovernment process expert. The data flow design should be supported by graphical modeling, which guides the process expert through the composition design and recommends parameter assignments between the involved services. Thus, the technical and complex data flow design is shifted to a more abstract level. Necessary type transformations due to different information representations will be kept transparent in the design process. The core concept of the thesis will be the integration of a mechanism into the visual composition design that realizes mediation between different information representations based on Semantic Web technologies. However, with respect to already existing or early evolving eGovernment Web services, the approach should be realized as an additional layer on top of existing technology.
The concept should allow the mediation to be expressed on the level of domain models rather than on application level, so that mappings between different information representations only need to be defined once. Furthermore, the concept should take into account the realistic perspective that different domain models and the resulting information representations need to evolve relatively independently of each other to serve their domain best. Additionally, the mediation mechanism has to provide a level of expressiveness that enables complicated mappings between information representations from different domain models. At the same time the mechanism should be easy to handle to ensure maintainability.
To validate the approach, a prototypical composition tool has been developed, which includes a graphical composer and an execution engine to run the composed services. However, the tool assumes that all services involved in a composition have already been discovered. In order to focus on the data flow challenges, the composition description only allows sequences of services to be defined, and no further control flow design is supported. As input the tool requires eGovernment service descriptions and mappings between their different domain models. As the result of the composition design, the tool produces an execution plan of the composed services, specified in a proprietary format. Additionally, an execution engine has been realized which interprets the execution plan. Furthermore, it will be outlined how this execution plan, which is based on Semantic Web technology, can be mapped to existing industry languages for business process execution such as BPEL. The thesis will not focus on organizational questions, e.g. how to manage the representation of domain models and their mutual mappings.
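The sequence-only composition model described above can be illustrated with a small Python sketch. It is not the prototype's proprietary execution plan format. The plan layout, the two services and all parameter names are hypothetical and chosen only to show how explicit parameter assignments carry the data flow from one service to the next.

```python
from typing import Callable, Dict, List, Tuple

# A service maps named inputs to named outputs (hypothetical simplification).
Service = Callable[[Dict[str, object]], Dict[str, object]]
# A step pairs a service with its parameter assignments:
# input parameter name -> name of a value already available in the context.
Step = Tuple[Service, Dict[str, str]]


def check_resident(inputs: Dict[str, object]) -> Dict[str, object]:
    # Hypothetical resident registry check: accepts any non-empty surname.
    return {"resident_ok": inputs["surname"] != ""}


def issue_certificate(inputs: Dict[str, object]) -> Dict[str, object]:
    # Hypothetical vital records service.
    return {"certificate": f"Birth certificate for {inputs['holder']}"}


def run_sequence(plan: List[Step], initial: Dict[str, object]) -> Dict[str, object]:
    """Run the services strictly in order, resolving each input from the context."""
    context = dict(initial)
    for service, assignments in plan:
        inputs = {param: context[source] for param, source in assignments.items()}
        context.update(service(inputs))
    return context


plan: List[Step] = [
    (check_resident, {"surname": "surname"}),
    (issue_certificate, {"holder": "surname"}),
]
result = run_sequence(plan, {"surname": "Lovelace"})
```

In this simplified form every assignment is a plain copy between values of the same type. The mediation between different information representations, which is the subject of the thesis, would take place where the sketch simply copies a value.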
Chapter 6 refers back to the goals presented in this section to evaluate whether and how far they have been reached.
Chapter 2 introduces the domain of eGovernment and describes challenges arising from process integration. Subsequently, it outlines how these issues can be addressed by applying Semantic Web concepts. Furthermore, an eGovernment scenario spanning multiple domains is presented to demonstrate the aforementioned integration challenge. The scenario is taken as a reference point in subsequent chapters. Finally, the chapter highlights the evolution from eGovernment to eGovernance and points out its relation to this work.
The state-of-the-art is presented in Chapter 3. It describes how the challenges in process integration can be addressed and sets the background for the approach presented later. The dimensions of interoperability are analyzed and related to the thesis goals. Base technologies for achieving interoperability are presented, including Web services, Web service composition, Semantic Web services and information integration approaches. Finally, related work in the context of Web service composition support is presented.
Chapter 4 explains the concept of the developed approach for cross-ontology Semantic Web service composition. After specifying the requirements for the concept, the general idea of lifting the abstraction level in Web service composition is explained. Starting with an overview, the different aspects of the concept are presented, including domain ontologies and Semantic Web services, semantic bridges, the matching mechanism for data flow design and the process execution. Finally, a logical architecture for the concept is illustrated.
Chapter 5 deals with the implementation of the developed prototype. The implementation is divided into three parts: composition design, composition execution and scenario. Based on an illustrated system architecture, the realization of the different components is explained. Finally, the validation and verification of the realized prototype is described and the usage of the prototype is clarified.
Chapter 6 covers the evaluation of the presented approach. It analyzes to which extent the goals set in section 1.2 have been met. Furthermore, the impact of the presented approach on a large scale is analyzed.
Chapter 7 summarizes the presented approach and outlines the problems that have been left open. Moreover, it gives an outlook on future work.
2.1 Fields of eGovernment
Electronic Government is a term for describing the application of information and communication technology (ICT) in the fields of state activity, divided into its three branches: legislature, judiciary and executive. It aims to improve operational processes in all three branches by re-engineering traditional processes towards increased effectiveness and efficiency and by enhancing the participation and involvement of citizens.
Each branch targets the aforementioned goals from its specific perspective. Applications in the legislature fall into categories such as eDemocracy or eParticipation. eDemocracy aims at enabling more direct democracy. By improving the citizen's awareness and knowledge through better information access and by easing the voting process through online voting, referendums could be held more frequently. eParticipation targets the involvement of citizens in public affairs. The idea is to increase participation, for example through online forums and polls based on openly communicated government activities. Consequently, transparency increases, which in the long run should lead to a more accountable government.
Applications within the judiciary focus on assistance in legal processes. Terms like eJustice describe the application of trust and security technologies, e.g. digital identification. Furthermore, legal processes can be enhanced by legal knowledge-based systems. The objective is to achieve better quality and efficiency in decision-making processes as well as improved accuracy and consistency. To this end, legal documents such as laws and regulations, which are the foundation for decision making in legal processes, are modeled by trying to extract the semantics and logical relationships within these documents. Thus, expert systems using inference machines can be developed to support the mentioned goals.
The executive branch is in charge of the implementation of law, thus opening a wide range of operational processes. Here, the focus of eGovernment, or eAdministration in particular, is on process automation and on process re-engineering to achieve more effective and efficient public administration and public services. Process re-engineering often includes process integration, which raises the interoperability problems that are addressed in this thesis. As process integration is relevant in all three eGovernment fields, but in particular in eAdministration, the next section covers it in more detail.
eAdministration covers a wide range of activities. These comprise the provision of online public services such as online resident registration, online application for official documents such as a birth certificate, online application for social aid or online tax declaration for businesses, to name a few examples. Another field of activity is the improvement of administration in the health sector (eHealth), where large numbers of documents containing patient data or treatment calculations need to be processed. An additional promising field is eProcurement. The idea is that online aggregation of procurement orders from different collaborating agencies can reduce purchase prices due to large-scale orders.
These various activities can be categorized into the following interaction fields:
- G2C: public services delivery from government to citizens
- G2B: public services delivery from government to businesses
- G2N: public services delivery from government to non-governmental organizations or to the non-profit third sector
- G2G: public services delivery from government to other governmental agencies on local, regional, or inter-/national level
However, G2C, G2B and G2N interactions that deliver services from the front office often include G2G interactions with back offices of other governmental agencies.
Within these interaction fields the following interaction levels can be distinguished:
1. information: provision of government information to citizens, e.g. Web pages of public agencies
2. communication: interactive exchange of information, e.g. email communication with public agencies
3. transaction: online transaction to consume public services, e.g. online application for social aid
Present administrative systems are exceedingly paper intensive and require multiple levels of processing, especially in transaction processes. In order to enable a successful transition from manual to electronic and highly automated processes, re-engineering of these processes is required. In this context process re-engineering means that during digitalization traditional operating processes are not mapped one-to-one into electronic processes; whenever digitalization allows new and more adequate ways of realizing the process goals, these should be applied. In particular, this concerns the perspective from which the processes are modeled. In order to achieve user-centric eGovernment, these processes are re-engineered around so-called life events, e.g. birth, marriage or job loss, or business events, e.g. tax declaration or registration of a new company. This approach to process design demands more direct inter-agency communication to integrate the digitalized processes.
2.3 Process Integration
To get a clear understanding of the terminology, the following two definitions are given.
Integration: Forming of a temporary or permanent larger unit of government entities for the purpose of merging processes and/or sharing information.
Business Process: The complete response that a business makes to an event. A business process entails the execution of a sequence of one or more process steps. It has a clearly defined deliverable or outcome. A Business Process is defined by the business event that triggers the process, the inputs and outputs, all the operational steps required to produce the output, the sequential relationship between the process steps, the business decisions that are part of the event response, and the flow of material and/or information between process steps.
Focusing on the operational issues, processes in eAdministration can be regarded as equivalents to business processes in the business domain. In recent years business process management systems have emerged to support business processes by the application of ICT. In order to enable process integration, business process interoperability (BPI) needs to be present. The basis for achieving BPI is service oriented architectures (SOA) in general and Web service technologies in particular. Furthermore, information integration issues have to be considered, which are discussed in more detail in chapter 3. These concepts are the foundation for implementing processes spanning multiple organizational domains and applications.
The heterogeneous and distributed nature of the eGovernment domain makes process integration discussed above a crucial issue. Still many challenges need to be tackled (cp. 1.1). Therefore, research agendas include the exploration of integration and interoperability issues.
For example, the European Commission funded ICT research project for innovative government, Roadmapping eGovernment RTD2020, includes the following goals:
- single access point to reach all public agencies
- different ICT systems need to be seamlessly networked together; in particular, new innovative systems shall exchange data interoperably with older ICT systems and applications
- ICT research shall emphasize new forms of dynamic networked co-operative business processes and optimized work organizations
The eGovernment action plan of the German Society for Informatics named the following overall eGovernment research areas to be investigated in the near future:
- Monitoring - Adaptation - Transfer
- Inter-Government Integration
- Information and Knowledge
- Digital Identity
- Human Resource and Change Management
2.4 Semantic Web meets eGovernment
The eGovernment domain is introduced in this chapter, whereas the background on Semantic Web research is given in chapter 3. The interconnection of these two research fields can be seen from different perspectives. On the one hand eGovernment can be regarded as a use-case or application for Semantic Web research. On the other hand one can start from eGovernment and apply Semantic Web technologies for progress towards distributed and networked eGovernment. This thesis mainly takes the latter approach. It starts from the interoperability challenge in eGovernment and elaborates an approach to ease interoperability based on Semantic Web technologies.
The eGovernment domain is a large, heterogeneous, dynamic and shared information space with various semantic differences of interpretation. This results in the challenge to achieve interoperability. Technical interoperability issues could mostly be overcome through standardization of protocols and technical interfaces (cp. 3.2). However, semantic interoperability is still a key obstacle for networked and thus integrated eGovernment processes due to different representations of data objects and interfaces. Therefore, concepts and technologies are required to express and handle the semantics of these entities.
Furthermore, the eGovernment domain exhibits some key characteristics:
- a high degree of formality in key areas imposed by laws and regulations
- strong requirements to come to same decisions in similar situations
These characteristics encourage the application of Semantic Web concepts that are based on formal modeling and description logics.
International research conferences reflect the potential for mutual gain in these domains, e.g. the 2006 AAAI Spring Symposium Series at Stanford, "The Semantic Web meets eGovernment", or the European Semantic Web Conference 2006 workshop "Semantic Web for eGovernment".
2.5 eGovernment Scenario
In order to demonstrate the interoperability challenge arising from process integration within the eGovernment domain, a cross-organizational scenario for the online application for a birth certificate, as illustrated in figure 2.1, is given. Taking into account the step-by-step digitalization of eGovernment processes, the output of the birth certificate application is still a paper-based document, assuming the lack of an infrastructure for digital signatures. The process includes a service for handling the payment of the birth certificate fee, a resident registry service for checking the citizen input for consistency, a vital records office responsible for issuing the birth certificate, and a statistical office to which the vital records office reports its activities.
Usually, in order to ensure interoperability, eGovernment applications need to provide standard Web service interfaces including well-defined message sets. Indeed, in various countries national interoperability frameworks define XML schemas and Web service interfaces for exchanging data between administrations. An example of such a national effort is the Danish eGovernment initiative, which focuses not only on the definition but also on the reuse of base types and XML domain data structures. A key achievement of the initiative is the "InfoStructureBase", a shared repository for XML-based schemas. In Germany, due to its federal structure, the approach is less centralized. However, there are some initiatives such as OSCI-XÖV. But this initiative only ensures seamless interoperability within domain boundaries, e.g. through OSCI/XMeld (information exchange between registration offices) or OSCI/XJustiz (XML Schema exchange standard for legal authorities). It has to be taken into account that cross-organizational and cross-border eGovernment processes involve services of various public agencies from different domains and with different areas of operation. In such scenarios the lack of cross-domain semantic interoperability results in enormous integration efforts.
Figure 2.1: Internet application of a birth certificate.
Referring back to the given cross-organizational scenario (figure 2.1), the semantic interoperability challenge arises in the following way: The domain standard employed by the resident registry uses a different data representation for names and addresses than that used by the vital records office. In one domain Name and Address might be different entities, where Name might be a complex type consisting of different attributes for Given Name and Surname, while Address might be a complex type consisting of attributes such as Street, Street Number, etc. In the other domain standard a concept for an address named PostalAddress might be modeled as a complex type that contains just one single attribute for the whole address instead of separate attributes for Street and Street Number. FullName might be modeled similarly. Moreover, ZIP from the vital records office and PostalCode from the statistical office might represent the same concept of a zip code but are represented as different XML schema types.
It is a fact that most eGovernment data exchange standards are being developed independently of each other and that the requirements for information granularity differ significantly between eGovernment application domains. Therefore, it is not feasible to address this problem by introducing a global ontology or a global schema. In order to serve intra-domain integration best, domain standards need to evolve independently of each other.
Chapter 4 refers back to this scenario (figure 2.1) and presents a composition approach that provides a mechanism to ease semantic interoperability for inter-domain integration while at the same time preserving the independence of domain-specific standards.
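The mismatch between the two representations can be made concrete with a minimal Python sketch. It is an illustration only and not the mediation mechanism developed in this thesis. The class and field names are hypothetical stand-ins for the two domain standards, and the concatenation rules are assumptions.

```python
from dataclasses import dataclass


# Hypothetical model of the first domain standard: fine-grained attributes.
@dataclass
class Name:
    given_name: str
    surname: str


@dataclass
class Address:
    street: str
    street_number: str


# Hypothetical model of the second domain standard: one attribute per concept.
@dataclass
class FullName:
    value: str


@dataclass
class PostalAddress:
    value: str


def name_to_full_name(name: Name) -> FullName:
    # Assumed rule: given name first, then surname, separated by a space.
    return FullName(value=f"{name.given_name} {name.surname}")


def address_to_postal_address(address: Address) -> PostalAddress:
    # Assumed rule: street followed by street number.
    return PostalAddress(value=f"{address.street} {address.street_number}")


full_name = name_to_full_name(Name(given_name="Ada", surname="Lovelace"))
postal = address_to_postal_address(Address(street="Main Street", street_number="12"))
```

The mapping is straightforward in this direction. The reverse direction is harder, because splitting a single string back into Given Name and Surname or into Street and Street Number is ambiguous without further conventions.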
2.6 From eGovernment to eGovernance
In recent years the focus on multi-layered governance processes has gained popularity in political science as an alternative to traditional isolated government analysis. In consideration of globalization and technological progress governance describes the growing involvement of non-state actors in collective problem-solving at all levels from local to global, addressing the three main functions of collective problem-solving, i.e. policymaking, regulation, and service delivery.
Figure 2.2: Interaction of political Actors forming Governance
The focus is put on the various interactions in networked governance processes and how they contribute to achieving public interest objectives. The same terminology follows the shift from eGovernment to eGovernance, where the impact of ICT in these networked governance processes is addressed. Hence, the role of process integration and interoperability becomes a central issue. However, there is not yet a common conceptualization of eGovernance, and sometimes the term is misused for describing the governance of ICT infrastructure. Nevertheless, eGovernance can be a crucial part in enhancing governance. Especially with regard to global governance - where no government exists - virtual collaboration, which is much easier to achieve and to maintain than traditional collaboration, can contribute to linking the various actors that target the challenges raised by globalization.
This chapter gives background information on how the challenges in process integration discussed in chapter 1 and chapter 2 can be addressed. The interoperability challenge is analysed and related to the goals of this thesis. Technologies for achieving interoperability are presented, including Web services, Web service composition, Semantic Web services and semantic information integration approaches. Finally, related work is presented for both the composition aspect of this work and the resulting ontology mediation aspect.
In the context of the European Union's Information Society activities interoperability is defined as the means by which the inter-linking of systems, information and ways of working, whether within or between administrations, nationally or across Europe, or with the enterprise sector, occurs. Furthermore, it says that interoperability is the chain that allows information and computer systems to be joined up both within organisations and then across organisational boundaries with other organisations, administrations, enterprises or citizens. Interoperability has three aspects:
- technical interoperability, which is concerned with the technical issues of linking up computer systems, the definition of open interfaces and telecommunications
- semantic interoperability, which is concerned with ensuring that the precise meaning of exchanged information is understandable by any other application not initially developed for this purpose
- organisational interoperability, which is concerned with modeling business processes, aligning information architectures with organisational goals and helping business processes to co-operate
While the requirement for interoperability seems obvious, it is a fact that information systems today are not interoperable to the degree that would allow eGovernment process integration to be realized to its full potential. Only with the ubiquity of internet technologies based on open standards and specifications such as TCP/IP, HTTP and SMTP has it been possible to achieve a high degree of technical interoperability. In the context of process integration the recent development of Web service standards, which are discussed in more detail in section 3.3, has to be considered as well.
In order to enable ICT applications to exchange and combine information and to process it in a meaningful manner, agreement is required on more complex issues such as the relation to the context within which information is created and used. This is what semantic interoperability is about. It involves agreement on how to represent and give context to information in order to exchange it. Semantic interoperability is a core requirement for distributed information systems to share and process information, even when they have been designed independently.
As this work focuses on semantic interoperability, the subsequent sections concentrate on the one hand on technologies and concepts for achieving technical interoperability, which is a foundation for semantic interoperability, and on the other hand on technologies and concepts for accomplishing semantic interoperability itself. In chapter 7 this work will also touch on the aspect of organizational interoperability and outline to what extent the presented approach raises requirements for organizational measures.
3.3 Web Services
Web service standards are the foundation for technical interoperability in process integration. The W3C defines Web services as programmatic interfaces for application to application communication over the World Wide Web. Web services feature the following characteristics:
- programmable - Web services are accessible by programmable interfaces
- self descriptive - Web services include meta-data which are processable during runtime, e.g. name, description, version, quality of service etc.
- encapsulated - Web services encapsulate independent and discrete functionalities
- loosely coupled - Web services communicate over messages, implementation details are hidden
- location transparent - Web services are accessible from anywhere at any time only dependent on access rights
- reusable - Web services can be reused and combined to a new Web service
The interaction model of Web services is illustrated in the following figure 3.1:
Figure 3.1: Service Interaction Model
The idea behind Web services is not new. Some researchers consider the CORBA middleware specification, using Interface Description Language (IDL) and Object Request Broker (ORB) communicating over internet protocols, to be a first solution. However, with the emergence of XML as a standard exchange format, Web services based on XML message exchange formats and XML-based interface descriptions have taken the lead role. Within this context a couple of Web service standards have emerged in recent years. The base specifications are:
- Simple Object Access Protocol (SOAP) - XML-based message exchange format and protocol,
- Web Service Description Language (WSDL) - XML-based description of the Web service's functionality, and
- Universal Description, Discovery and Integration (UDDI) - directory service for registration and dynamic discovery of Web services.
Standards for non-functional properties such as transaction support or security and encryption issues are evolving.
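As an illustration of the XML-based message exchange format, the following Python sketch builds a SOAP 1.1 request envelope with the standard library. Only the envelope namespace is taken from the SOAP 1.1 specification. The service namespace, the operation name and the parameter name are hypothetical, and a real service would prescribe them in its WSDL description.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://example.org/vital-records"  # hypothetical service namespace

# Use a readable prefix for the envelope namespace when serializing.
ET.register_namespace("soapenv", SOAP_ENV)


def build_request(operation: str, params: dict) -> bytes:
    """Wrap one operation call with its parameters in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    op = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(op, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


# Hypothetical operation and parameter, loosely following the birth certificate scenario.
message = build_request("RequestBirthCertificate", {"Surname": "Lovelace"})
```

The resulting bytes would typically be sent as the body of an HTTP POST request to the service endpoint. The message fixes the syntax of the exchange only; what Surname means to the receiving service is not expressed anywhere in it.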
3.4 Web Service Composition
Web services are mostly applied as an instantiation of a service oriented architecture (SOA). Thereby, ICT systems supporting business processes are split of into a set of loosely coupled reusable services, where each service realizes one modular unit of busi- ness logic. Due to standardization Web services can be implemented independent from any platform and programming language. Subsequently, ICT applications that support business processes can be exibly realized as a composition of several services together realizing the business goal. The targeted exibility is ensured based on the characteristics of Web services described above. SOA promises to allow exible application integration and adaptation on changing business processes and thus has received much interest to support business-to-business applications or enterprise application integration (EAI).
The design process of such service compositions is also called programming in the large, and their execution is referred to as Web service orchestration. In order to keep the composition independent from the underlying ICT infrastructure, the exact data flow and control flow are specified in a composition language, which can be interpreted by workflow execution engines. Different approaches for such languages have arisen, e.g. WSFL or XLANG. However, the standardized Business Process Execution Language for Web services (BPEL), which is based on the aforementioned Web service specifications, has been the most successful. BPEL¹ defines a business process as an XML-serialized description of data flow and control flow between participating Web services and allows the process to run in a long-running, asynchronous manner. Data flow and manipulation can be expressed in XML-related languages such as XPath and XSLT. In order to ease the design of service compositions in BPEL, vendors offer a range of graphical integrated development environments as illustrated in figure 3.2, e.g. Oracle BPEL Process Manager.
illustration not visible in this excerpt
Figure 3.2: Integrated graphical development environment for BPEL process design
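The separation of control flow and data flow in a composition can be sketched in a few lines. The following is not BPEL but a toy interpreter in Python under simplifying assumptions: services are invoked strictly in sequence (control flow), and each invocation reads its input from and writes its output to named process variables (data flow). The two service functions are hypothetical stand-ins for remote Web service operations.

```python
from typing import Callable, Dict, List, Tuple


# Hypothetical stand-ins for remote Web service operations.
def lookup_address(citizen_id: str) -> str:
    return f"address-of-{citizen_id}"


def issue_permit(address: str) -> str:
    return f"permit-for-{address}"


# Each step: (service, name of input variable, name of output variable).
Step = Tuple[Callable[[str], str], str, str]


def run_sequence(steps: List[Step], variables: Dict[str, str]) -> Dict[str, str]:
    """Invoke the services one after another (control flow) and pass
    results on through named process variables (data flow)."""
    state = dict(variables)
    for service, source, target in steps:
        state[target] = service(state[source])
    return state


process = [
    (lookup_address, "citizenId", "address"),
    (issue_permit, "address", "permit"),
]
result = run_sequence(process, {"citizenId": "4711"})
print(result["permit"])
```

Because the process is plain data interpreted by `run_sequence`, it stays independent of how the individual services are implemented, which mirrors the motivation for a dedicated composition language.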
However, composition design is still complex and time-consuming (cf. chapter 1). The approach behind WSDL-based Web service descriptions and BPEL Web service composition is mainly syntactical. Thus, the implicit semantics of services can only be understood by a human composer. The lack of explicit semantics in Web service descriptions is an obstacle to increasing automation and further tool support in the process of composition design.
3.5 Semantic Web Services
In view of the shortcomings discussed above, the idea of raising implicit service semantics to an explicit level has arisen. Machine-understandable Web service descriptions with formally defined semantics could enable powerful inference engines and matchmaking mechanisms that automate the whole composition process, including discovery, composition, execution, and interoperation of Web services.
3.5.1 Semantic Web
The concepts and technologies for expressing explicit service semantics are based on Semantic Web research. The Semantic Web is an extension of the current World Wide Web that enriches its content with machine-processable meaning, or semantics. In order to enable machines to process Web content with regard to its meaning, the content needs to be expressed in terms of a machine-understandable ontology. An ontology is a formal and explicit specification of a shared conceptualization of a domain. The formal and explicit manner ensures that the modelled meaning can be processed by machines, and the shared aspect ensures a commonly accepted understanding, so that the modelled meaning can be processed the same way anywhere. In more detail, an ontology consists of the following elements:
- individuals or instances² - are the base components of ontologies and represent concrete or abstract objects
- classes³ (also called concepts) - represent sets of individuals and can be considered as types
- properties - represent characteristics of individuals and concepts
- relations - relate individuals, concepts, and properties to each other and are expressed by properties
- rules - formulate statements about individuals, concepts, properties, and relations dependent on other statements
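The elements listed above can be illustrated with a deliberately small sketch in Python. It is not an ontology language; the class names, the individual `nils`, and the property `livesIn` are hypothetical, and the only rule modelled is that membership in a class implies membership in all its superclasses.

```python
# Classes (terminological knowledge): each class maps to its direct superclass.
subclass_of = {"Citizen": "Person", "Person": None}
# Individuals with their asserted class (assertional knowledge).
instance_of = {"nils": "Citizen"}
# Properties relating individuals: (subject, property, object).
facts = {("nils", "livesIn", "berlin")}


def is_a(individual: str, cls: str) -> bool:
    """Rule: an individual belongs to its asserted class and to every superclass."""
    current = instance_of.get(individual)
    while current is not None:
        if current == cls:
            return True
        current = subclass_of.get(current)
    return False


print(is_a("nils", "Person"))
```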
Some traditional approaches regard rules as separate from ontologies. However, since ontologies have been formalized mathematically in terms of description logics, this distinction has become obsolete and rules have become an essential part of domain conceptualizations. Description logics are a subset of first-order predicate logic. The ontology elements discussed above are represented as predicates⁴ and logical operators within formulas. Description logics aim at being tractable on the one hand while keeping a high degree of semantic expressiveness on the other. Therefore, description logics are designed to be decidable, in contrast to predicate logic, which is undecidable. Thus, knowledge modelling gets a solid mathematical foundation, and this formalism enables machines to interpret and reason over knowledge representations. However, there is a trade-off between semantic expressiveness and computational complexity of reasoning, and thus many different variants of description logics have emerged. Once content has been modelled in this manner, knowledge-based systems using inference engines and reasoners, as illustrated in figure 3.3, can query and process the content as a knowledge base.
illustration not visible in this excerpt
Figure 3.3: The primary Components of a typical Knowledge Representation System based upon Description Logics
The vision of the Semantic Web is about applying these concepts to the World Wide Web and using it as a huge knowledge base that enables powerful knowledge-based applications. Consequently, reasoning has to be realized on a partial and incomplete knowledge base. This background has led to the concept of open-world semantics, in contrast to the closed-world semantics used in the context described before. Open-world semantics assumes that the absence of information about a fact does not indicate that this fact is false. Hence, it is possible to reason over a dynamic knowledge base without generating contradictions.
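The difference between the two assumptions can be shown with a small sketch. The facts and the three-valued answer (`True`, `False`, `None` for unknown) are illustrative assumptions and do not correspond to the behaviour of any particular reasoner.

```python
from typing import Optional, Set, Tuple

Fact = Tuple[str, str, str]

known: Set[Fact] = {("nils", "authorOf", "thesis")}
# Explicitly asserted negative knowledge.
known_false: Set[Fact] = {("nils", "authorOf", "cookbook")}


def ask_closed_world(fact: Fact) -> bool:
    """Closed world: whatever is not known to be true is taken to be false."""
    return fact in known


def ask_open_world(fact: Fact) -> Optional[bool]:
    """Open world: a missing fact is unknown (None), not false."""
    if fact in known:
        return True
    if fact in known_false:
        return False
    return None


query = ("nils", "authorOf", "paper")
print(ask_closed_world(query), ask_open_world(query))
```

If the fact in `query` is added to the knowledge base later, the open-world answer changes from unknown to true without contradicting any earlier answer, whereas the closed-world answer flips from false to true.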
The W3C has released several standards to realize the Semantic Web vision as illustrated in figure 3.4.
illustration not visible in this excerpt
Figure 3.4: Semantic Web Vision
The foundation is built by means of a standardized encoding of data (Unicode), which joins different character sets into one international character set, together with the Uniform Resource Identifier (URI) standard, which allows the identification of any resource in the Semantic Web.
XML enables the structuring of data through opening and closing tags, which eases structured processing by parsers. Tag names can be specified in different namespaces (NS) to avoid name collisions. The underlying structural model of XML is hierarchical, and thus an XML instance can be regarded as a tree. XML Schema allows grammars to be specified that define how the different tags can be structured.
The Resource Description Framework (RDF) provides a mechanism to make statements about data. These statements, also called triples, consist of subject, predicate, and object. A set of statements spans a graph, which is also referred to as the RDF graph. Conceptually, RDF is based on the semantic relational data model. For example, it can be stated that Nils is the author of this thesis. As RDF can be serialized in XML, the statement above could be represented as follows:
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://exampleVocabulary.org/1.0/">
  <rdf:Description rdf:about="http://www.ThesisURI.net/">
    <ex:author>Nils</ex:author>
  </rdf:Description>
</rdf:RDF>
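Such a serialization can be read back as a subject-predicate-object triple. The following sketch uses only Python's standard XML parser, whereas a real application would use a dedicated RDF library. The closing part of the serialization and the literal value `Nils` are assumptions derived from the example sentence.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Assumed complete form of the statement "Nils is the author of this thesis".
DOCUMENT = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://exampleVocabulary.org/1.0/">
  <rdf:Description rdf:about="http://www.ThesisURI.net/">
    <ex:author>Nils</ex:author>
  </rdf:Description>
</rdf:RDF>
"""


def extract_triples(xml_text: str):
    """Return (subject, predicate, object) for each literal-valued property."""
    root = ET.fromstring(xml_text.strip())
    triples = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree writes qualified names as {namespace}local;
            # concatenating both parts yields the predicate URI.
            predicate = prop.tag.replace("{", "").replace("}", "")
            triples.append((subject, predicate, prop.text))
    return triples


triples = extract_triples(DOCUMENT)
print(triples)
```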
All important resources in the Web should be identified by a URI, so that statements can refer to them. Objects can also be blank nodes or literals. Furthermore, predicates are defined in vocabularies, which are referred to by namespaces. These vocabularies are specified in RDF Schema, which is similar to an ontology definition language but less expressive and less formalized. For example, RDF Schema allows classes to be defined using inheritance and properties to be specified with their domain and range. The specification of RDF Schema is given in.
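How domain, range, and class inheritance declarations let a machine derive additional type information can be sketched as follows. The vocabulary (`author`, `Document`, `Person`, `Thesis`) is hypothetical, and the code covers only this small part of RDF Schema.

```python
# Hypothetical vocabulary: a property with domain and range, one subclass axiom.
domain = {"author": "Document"}
range_ = {"author": "Person"}
subclass_of = {"Thesis": "Document"}

triples = {("thesis1", "author", "nils"), ("thesis1", "type", "Thesis")}


def infer_types(statements):
    """Derive type statements from domain, range, and subclass declarations."""
    inferred = {(s, o) for s, p, o in statements if p == "type"}
    for s, p, o in statements:
        if p in domain:
            inferred.add((s, domain[p]))  # the subject is in the domain class
        if p in range_:
            inferred.add((o, range_[p]))  # the object is in the range class
    # Propagate types up the class hierarchy until nothing new is added.
    changed = True
    while changed:
        changed = False
        for individual, cls in list(inferred):
            parent = subclass_of.get(cls)
            if parent and (individual, parent) not in inferred:
                inferred.add((individual, parent))
                changed = True
    return inferred


types = infer_types(triples)
print(sorted(types))
```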
The ontology layer is based on the RDF layer. With the Web Ontology Language (OWL) the W3C has specified a widely accepted language to define ontologies. OWL is based on former ontology projects, namely the DARPA Agent Markup Language and the Ontology Inference Layer, and uses the RDF syntax as well as most of the RDF Schema constructs. Together with RDF and RDF Schema, OWL belongs to the key standards of the current Semantic Web.
The layers discussed so far represent the part of Semantic Web research that has reached a clear conceptualization, whereas the upper layers represent future requirements for the Semantic Web and are still very much under construction. Besides the logic layer, which is discussed below, the far-reaching vision is to enable heuristic engines that can prove whether a statement is correct or wrong based on ontologies and rules queried from the Semantic Web. Furthermore, the aim is to create a Semantic Web of Trust based on authentication mechanisms including digital signatures and trust relations.
However, the main debate currently focuses on the logic layer and on how to integrate rules to make OWL more expressive. Several candidates for a rule extension of OWL have been submitted to the W3C, of which the Semantic Web Rule Language (SWRL) has received much attention recently. In that context a debate has arisen as to whether this extension by rules should be realized by splitting the Semantic Web stack, with rules and OWL ontologies sitting side by side on the same level on top of an extra intermediate layer. The purpose is to allow closed-world semantics as an alternative to the open-world semantics of OWL. With respect to compatibility with the existing languages, namely RDF and OWL, Horrocks et al. present an approach that allows for forms of the closed-world assumption while retaining the stack architecture. This discussion is also related to the concept for data flow design presented in chapter 4, because in the specific context of Semantic Web service composition design the closed-world approach is much more suitable.
3.5.2 Semantic Web Services
Web services are the dynamic part of the World Wide Web. The Semantic Web service vision illustrated in figure 3.5 is about applying Semantic Web technology to Web services in order to combine the flexibility, reusability, and universal access of Web services with the power of semantic markup and reasoning.
illustration not visible in this excerpt
Figure 3.5: Semantic Web Service Vision
Raising Web service semantics to an explicit level by providing machine-understandable Web service descriptions with formally defined semantics through ontologies promises to enable automation of services on the Semantic Web. The long-term vision behind Semantic Web services is to enable dynamic goal-oriented service composition and to use powerful inference engines and matchmaking mechanisms in order to automate the whole composition process, including discovery, composition, execution, and interoperation of Web services. Research background comes from the Semantic Web community on the one hand and from the field of dynamic planning in artificial intelligence research on the other hand.
Many approaches for semantic enrichment of service descriptions have arisen with sometimes overlapping or opposing concepts. The following four different specifications have been submitted to the W3C in recent years:
- OWL Web Ontology Language for Services (OWL-S)
- Web Service Modeling Ontology (WSMO)
- Semantic Web Services Framework (SWSF)
- Web Service Semantics (WSDL-S)
OWL-S is an OWL-based Web service ontology, which supplies a core set of markup language constructs for describing the properties and capabilities of Web services in unambiguous, computer-interpretable form. OWL-S markup of Web services facilitates fuller automation of Web service tasks, such as Web service discovery, execution, composition, and interoperation. As OWL-S is applied in the approach elaborated in chapter 5, this section covers OWL-S in more detail below. The other approaches are also briefly explained, whereas chapter 5 points out why OWL-S was chosen for the realization.
WSMO shares the vision of OWL-S, but it differs greatly in the approach for achieving it. WSMO is an alternative approach that is not built on OWL. Furthermore, in contrast to OWL-S it does not define explicit service ontologies, but provides a conceptual framework in which ontologies can be specified. Additionally, WSMO includes a mediator concept to deal with the interoperation problems between Semantic Web services. WSMO defines specific mediator services which perform translations between ontologies. However, these mediators attempt to reconcile the differences between goals of Web services, and it can be difficult to map this approach to the classical, non-planning-oriented problems of Web service interoperation, i.e. discovery, composition, and invocation as stated in.
SWSF is another alternative approach that is likewise not built on OWL. It is very complex and aims at being a comprehensive framework that spans the full range of service-related issues including orchestration and mediation. However, the design of the orchestration concept focuses on automated planning, as does the mediation concept, which therefore concentrates on goals in a similar way to the mediator concept in WSMO.
In contrast, WSDL-S, the most recent submission to the W3C, is a light-weight approach for Semantic Web service description. It is based on WSDL (see section 3.3) and defines a mechanism to associate semantic annotations with a WSDL description. It externalizes the ontology language representation for the semantic annotations and thus allows a binding to OWL.
OWL-S is probably the most mature and most widely deployed comprehensive Semantic Web service technology. In particular, OWL-S is an upper ontology for services. It is structured into three complementary parts, which are further illustrated in figure 3.6:
- service profile - for advertising and discovering services
- service model - for describing the operations of services
- service grounding - for describing the invocation of services and their binding to traditional Web services
OWL-S specifies that a service can have multiple service profiles. Furthermore, a service can be described by at most one service model, and each grounding must be associated with exactly one service.
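These cardinality constraints can be expressed as a small data model. The sketch below uses Python dataclasses rather than OWL; the class and attribute names are chosen for illustration and are not the identifiers of the OWL-S ontology.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Service:
    name: str
    profiles: List[str] = field(default_factory=list)  # any number of profiles
    model: Optional[str] = None  # at most one service model
    groundings: List["Grounding"] = field(default_factory=list)


@dataclass
class Grounding:
    name: str
    service: Service  # exactly one service, hence a mandatory field

    def __post_init__(self) -> None:
        # Register the grounding with the service it supports.
        self.service.groundings.append(self)


# Hypothetical service description.
svc = Service("PassportRenewal",
              profiles=["citizen-view", "agency-view"],
              model="RenewalProcess")
grounding = Grounding("wsdl-grounding", svc)
print(len(svc.profiles), svc.model, grounding.service.name)
```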
illustration not visible in this excerpt
Figure 3.6: Top Level of the Service Ontology
The service profile tells "what the service does"; it is used by service requesters for discovery and by directories to categorize advertised services. The service profile consists of three pieces of information. The first is a provider description, which holds contact information about the responsible operator. The second and most important is the functional description of a service. It consists of a description of input and output parameters that relates them to OWL concepts from domain ontologies. Section 3.6 refers back to these parameter descriptions. Additionally, the functional description states the preconditions required by the service and its expected effects. They are represented by logical formulas specified in terms of OWL-based concepts for expressions. These concepts for modelling expressions are defined in OWL-S and refer to language constructs of SWRL. Finally, it is possible to describe various non-functional properties, e.g. quality-of-service ratings or response time information, in terms of OWL concepts.
Once a service has been discovered, the service profile is not used anymore. Subsequently, the service model is processed, which specifies how to interact with the service by regarding the service as a process. The service model can consist of either an atomic process or a composite process. An atomic process expects one message and produces one message, whereas a composite process builds upon several atomic processes and can expect different messages over time, depending on previously received messages. Thus, describing the service model in terms of a composite process describes a stateful service. The different dependencies can be expressed by various control constructs which specify the message flow. In order to make it possible for the service client to interact properly with the service, the service model also presents input, output, precondition, and effect descriptions (IOPEs) for each atomic process as specified in the profile.
Finally, the grounding of a service specifies how to access the service in terms of protocol, addressing, and message formats. Furthermore, the service grounding needs to deal with the mapping of abstract input and output parameters of atomic processes to concrete messages processed by the concrete service realization. The default mapping is the WSDL grounding mechanism; however, different mappings are possible. An OWL-S service can be bound to a concrete WSDL-based Web service by mapping OWL-S atomic processes to WSDL operations and OWL-S input and output parameters to WSDL messages. However, as message parts in WSDL are specified using XML Schema by default and parameters in OWL-S are expressed in terms of OWL classes, this mapping task becomes tricky because XML Schema cannot express the description logic based semantics of OWL classes. In order to avoid this difficulty it is also possible to use OWL classes directly as the abstract types of message parts in WSDL and to bind their RDF/XML serialization to the message type specification in WSDL. But that would require that the concrete Web services are so-called "OWL native speakers" with support from their underlying implementation, which is commonly not the case. The common use case for OWL-S is rather to describe the semantics of existing WSDL-based Web services, whose message parts are declared in terms of XML Schema types. Therefore, an OWL-S service grounding provides an XSLT mapping mechanism that transforms OWL instances serialized in RDF/XML into corresponding XML instances structured according to a given XML Schema type for service inputs, and vice versa for service outputs. However, XSLT is based on XPath and, like XML Schema, operates on the syntactic tree structure of XML documents, which is a completely different abstraction level. Therefore, it cannot capture the semantics of OWL. Thus, a successful mapping demands complicated XSLT scripts specific to each RDF/XML serialization.
However, by providing the three ontology parts for specifying a service profile, service model, and service grounding, OWL-S enables explicit semantic enrichment of existing traditional WSDL-based Web services without any impact on the underlying implementation.
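The input direction of this mapping can be sketched as follows. The sketch replaces XSLT with plain Python and the standard XML parser to keep it self-contained; the ontology namespace, the `Address` class, and the target element structure are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ONT_NS = "http://example.org/ontology#"  # hypothetical domain ontology

# OWL individual serialized as RDF/XML (one of several possible serializations).
OWL_INSTANCE = f"""
<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:ont="{ONT_NS}">
  <ont:Address rdf:about="{ONT_NS}addr1">
    <ont:street>Main Street 1</ont:street>
    <ont:city>Berlin</ont:city>
  </ont:Address>
</rdf:RDF>
"""


def lower_address(rdf_xml: str) -> ET.Element:
    """Map the OWL individual to the element structure that a hypothetical
    XML Schema type of the target Web service expects."""
    root = ET.fromstring(rdf_xml.strip())
    individual = root.find(f"{{{ONT_NS}}}Address")
    message = ET.Element("address")
    for local_name in ("street", "city"):
        value = individual.findtext(f"{{{ONT_NS}}}{local_name}")
        ET.SubElement(message, local_name).text = value
    return message


message = lower_address(OWL_INSTANCE)
print(ET.tostring(message, encoding="unicode"))
```

The function only works for this particular serialization. If the same individual were written as an `rdf:Description` element with a separate `rdf:type` property, the lookup of `ont:Address` would fail, which illustrates why such purely syntactic mappings have to be written per serialization.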
3.6 Semantic Information Integration
Having discussed how Web services and Web service composition can contribute to technical interoperability and how their semantics can be described, the focus now turns to how these explicit semantics enhance semantic interoperability and enable semantic information integration. Generally, semantic information integration, often also referred to as enterprise information integration, is required when information from disparate sources with different conceptual representations needs to be processed uniformly.
Before referring back to ontologies, this section highlights two traditional approaches to semantic information integration, one in database systems and the other in the context of the reference model for open distributed processing (RM-ODP).
3.6.1 Semantic Information Integration in Database Systems
In database systems the topic arises in the context of data integration. Two different approaches to data integration can be distinguished. Firstly, data integration can be realized by so-called materialized integration, where data from different sources is extracted, transformed, and loaded (ETL) into one single data store for uniform processing. This approach is also called data warehousing and is used for data analysis in support of business decision making. The weakness of this approach is the lack of data coherence when the original sources are updated but the single data store still contains the old data; ETL processing then needs to be repeated. Alternatively, data can be integrated virtually by loosely coupling the different sources. This avoids the repeated ETL process but increases complexity. Instead of integrating the data physically, a mediator with an integrated query interface is provided, which transforms queries to the virtual integrated database into specific queries to each original source.
Since data is represented differently in the underlying database schemas of the original sources, the different source schemas need to be mapped to a so-called global schema of the virtual or materialized database. This is where semantic information integration takes place. Defining an appropriate global schema is a challenging task. The global schema needs to express the overlapping concepts from different source schemas in a uniform manner. This task is mainly done manually; however, various approaches have been developed for (semi-)automatic schema matching. Such a matching can be used to define the global schema by matching two sources to extract the overlapping part. The matching covers only the design-time task of semantic information integration. At runtime the global queries need to be translated into queries for the local sources. For this process a mapping between the schemas needs to be defined. Ideally, a mapping is the output of an automatic schema matching. A mapping can be expressed by means of so-called views. Views are read-only virtual tables of a database schema composed of the result set of a query. The main approaches for mapping are the following:
- Global-as-View (GAV) requires that the global schema is expressed in terms of the data sources. More precisely, every element of the global schema is associated with a view over the data sources, so that its meaning is specified in terms of the data residing at the sources.
- Local-as-View (LAV) requires the global schema to be specified independently from the sources. In turn, the sources are defined as views over the global schema. The relationships between the global schema and the sources are thus established by specifying the information content of every source in terms of a view over the global schema.
In the GAV approach the views need to be updated whenever a source changes or a new one is added, which is inflexible in a dynamic environment. In this regard LAV is more appropriate, as the global schema remains unchanged even when sources are changed or added. However, in GAV the query reformulation task of the mediator is straightforward, as the queries for the sources are already defined in the views. In contrast, query reformulation in LAV is more complicated. Queries need to be constructed by analysing the views over the global schema, because the relation between entities in the global schema and entities in the local schemas is only given in the inverse direction. Subsection 3.6.3 relates the presented approaches to database schema mapping to integration approaches with ontologies.
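Why query reformulation is straightforward in GAV can be seen in a minimal mediator sketch. The two sources, their local schemas, and the global relation `person(name, city)` are hypothetical, and in-memory lists stand in for databases.

```python
# Two hypothetical sources with different local schemas.
registry_office = [{"pid": 1, "surname": "Meier", "town": "Berlin"}]
tax_office = [{"tax_id": 7, "last_name": "Schulz", "city": "Potsdam"}]


# GAV: the global relation person(name, city) is defined as a view,
# i.e. as a query over the local schemas of the sources.
def person_view():
    for row in registry_office:
        yield {"name": row["surname"], "city": row["town"]}
    for row in tax_office:
        yield {"name": row["last_name"], "city": row["city"]}


def query_person(city: str):
    """Mediator: answer a global query by unfolding it into the view."""
    return [p["name"] for p in person_view() if p["city"] == city]


print(query_person("Potsdam"))
```

Adding a third source means editing `person_view` itself, which reflects the inflexibility of GAV in dynamic environments. In LAV each source would instead be described separately in terms of the global relation, at the price of a harder reformulation step.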
3.6.2 Semantic Information Integration in the Reference Model for Open Distributed Processing
The reference model for open distributed processing (RM-ODP) is a joint standard of the International Organization for Standardization (ISO) and the International Telecommunication Union (ITU). RM-ODP offers a conceptual framework and an architecture that integrates aspects related to the distribution, interoperation, and portability of software systems in such a way that hardware heterogeneity, operating systems, networks, programming languages, databases, and management systems are transparent to the user. In this sense, RM-ODP manages complexity through a "separation of concerns", addressing specific problems from different points of view. It is very comprehensive and aims at being a coordinating framework for any current and future standards in the field of open distributed systems. One of ODP's fundamental concepts is the use of a common object model, thus following the object-oriented paradigm. Software components are modelled as objects that interact via interfaces with other objects. These objects can be remote objects, each running on a different machine. Therefore, interactions are realized as remote procedure calls. These considerations could explain why RM-ODP has received much attention in the context of object-oriented distributed systems and especially the various standards related to the Common Object Request Broker Architecture (CORBA). In contrast, RM-ODP is rarely mentioned in the context of recent Web service developments, although many concepts and approaches of Web services can be found in RM-ODP.
Another fundamental concept of RM-ODP is the specification of a distributed system in terms of viewpoints. Besides the enterprise, the computational, the engineering, and the technology viewpoint, RM-ODP provides the information viewpoint, which focuses on the semantics of information and its processing. It describes the information managed by the system and the structure and content type of the supporting data. One of the common functions for which RM-ODP gives outline definitions is the trading function, which targets these information viewpoint issues. In general, the trading function provides a centralized service for discovery, binding, and interaction between different objects by making use of attribute-based descriptions, e.g. security policies or service advertisements. The foreseen usage of the trading function in the ODP framework is to support inter-object communication via interrogations and announcements.
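The attribute-based matching performed by a trading function can be illustrated with a toy trader. This is a sketch under simplifying assumptions and not the interface defined in the ODP standards; the method names `export` and `import_`, the offered interfaces, and the attributes are chosen for illustration.

```python
from typing import Dict, List


class Trader:
    """Toy trading function: stores service offers with attributes and
    returns those that satisfy a requester's constraints."""

    def __init__(self) -> None:
        self._offers: List[Dict[str, str]] = []

    def export(self, interface: str, **attributes: str) -> None:
        # A provider advertises an interface together with descriptive attributes.
        self._offers.append({"interface": interface, **attributes})

    def import_(self, **constraints: str) -> List[str]:
        # A requester asks for interfaces whose attributes match all constraints.
        return [
            offer["interface"]
            for offer in self._offers
            if all(offer.get(key) == value for key, value in constraints.items())
        ]


trader = Trader()
trader.export("RegistryLookup", domain="civil-registry", security="tls")
trader.export("TaxLookup", domain="tax", security="none")
print(trader.import_(security="tls"))
```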
1 BPEL is also applied to model non-ICT-driven workflows within business processes by modelling each involved activity as a virtual Web service.
2 A-Box: assertional knowledge
3 T-Box: terminological knowledge
4 e.g. unary predicates for atomic concepts, binary predicates for atomic relations