XML Schema or just RELAX?

Seminar Paper, 2001, 22 Pages

Computer Science - Commercial Information Technology

Excerpt

Table of contents

1 Introduction

2 E-commerce requirements

3 The impact of DTD and XML on Internet commerce

4 The need for schema

5 Evaluation of the XML Schema recommendation
5.1 Schema and schema languages
5.2 XML Schema recommendation
5.3 Contrasting XML Schema against DTDs

6 RELAX as an alternative schema language
6.1 Why alternative schemas were born
6.2 RELAX grammar definition
6.3 The better schema: RELAX versus XML Schema

7 Summary and conclusion

Table of abbreviations

illustration not visible in this excerpt

1 Introduction

Today’s e-business consists largely of interactive marketplaces and portals which offer services to their customers. Especially in the field of Business-to-Business (hereafter referred to as B2B) commerce, vertical and horizontal interoperability between the heterogeneous IT systems and platforms of trading partners is essential in order to automate transactions and to cut costs.1 With technologies such as XML (Extensible Markup Language) and DTDs (Document Type Definitions), the disparate systems of stakeholders can be integrated into the corporate value chain: because the content of the data exchanged between them is described and interpreted by machines, human interaction is eliminated.2 However, due to the lack of one agreed-upon format or schema standard, transenterprise communication remains a sophisticated challenge for e-commerce businesses.3 The attempt to develop a standard schema language has been one of the most watched efforts of the last couple of years. Recently, the World Wide Web Consortium (W3C) has released XML Schema, which is supposed to facilitate data exchange among trading partners.4 During its development phase, the need for interoperability among e-businesses grew to the extent that other organizations came up with alternative schema languages such as RELAX.5 However, the variety of emerging schema languages has so far prevented XML Schema from becoming a standard.6

The intention of this paper is to show how schema languages emerged from the need for interoperability and how they solve many challenges presented by e-commerce requirements. Furthermore, the ideas and philosophies behind two significant schema languages for XML, XML Schema and RELAX, are highlighted and contrasted against each other and against the intention of DTDs. The question to be answered is whether XML Schema can compete against alternative schema languages in the long run and whether it can still become a widespread standard within the rapidly growing e-commerce environment.

The main part of this paper is divided into four core parts. The first section gives the reader an overview of current e-commerce needs and requirements, followed by a critical evaluation of DTD (Document Type Definitions) and XML (Extensible Markup Language) as options for meeting the e-business requirements presented earlier. The next section introduces a recent advance in XML technology, the schema language, and highlights its contribution and utility for e-commerce. In the fourth section, XML Schema as the W3C recommendation is evaluated and critically compared with its significant competitor RELAX. The last part comprises a critical discussion of the schemas’ potential of becoming an enterprise-wide standard for XML document description. The end of this paper constitutes a summary of the results presented and conclusions drawn.

2 E-commerce requirements

In the nineties the Internet revolutionized the information age in many ways.7 In its first generation (circa 1993-1997), Internet applications were basically focused on simple browsers and static web sites. Web content was rather unstructured and presented in HTML. Later, between 1997 and 1999, the emphasis of the second generation of Internet applications shifted towards interactive applications, enterprise integration and application servers. As the number of commercial activities over the Internet, the number of trading partners and the sophistication of Internet applications increased significantly, businesses which deployed e-commerce systems started stressing the importance of transenterprise communication and of harmonizing business models, processes and representation formats.8 The idea of unifying the heterogeneous platforms, applications and frameworks between trading partners changed the view of e-commerce. Frictionless e-transactions and sustainable network relationships with trading partners would add significant value for customers, since real-time data could be collected from them and businesses could thus respond better and faster to their needs. In addition, smooth transactions over the Web would increase productivity and improve data mining among the organizations involved. The need for interoperability among systems deployed by businesses engaged in commerce on the Internet became ubiquitous: “Your business simply won’t scale if you have to handle orders manually.”9

The bottom line is that the integration of physically separated Internet applications is one of the main objectives organizations in the e-business field are focusing on: “The rapid proliferation - and sometimes equally rapid disappearance - of a wide variety of portals, marketplaces and other B2B resources means IT managers have to come up with a way to quickly develop any type of online relationship that can help their companies achieve their business objectives.”10 Moreover, the highly competitive environment in e-commerce demands this higher level of interoperability and urges companies to integrate business and trading partners electronically into their own ERP or SCM systems in order not to be locked out of major opportunities.11 Examples within the company’s borders such as EAI (Enterprise Application Integration) and EDI (Electronic Data Interchange) have shown that exchanging information between disparate corporate applications is difficult but worth the effort and investment.12

In fact, the main barrier to e-commerce is the inability of Internet applications to share information with each other. What has to be done is to make the mass of information on the Internet, and especially e-business related documents, much more usable than before; usable in a way that even trading partners a company is not doing business with today will be able to interpret and use the documents, and can easily be integrated into the corporate IT infrastructure.13 Another challenge is that interoperability among disparate systems can be achieved in multiple ways, for example with rich documents, content syndication or business semantics, but the effectiveness and efficiency of these proposals depend on the angle from which they are seen.14

The second generation of Internet applications brought Document Type Definitions (DTD) and the Extensible Markup Language (XML) to the fore. These technologies opened new opportunities for chaining companies together with business partners, suppliers and customers. They are described and evaluated thoroughly in the following section.15

3 The impact of DTD and XML on Internet commerce

DTD (Document Type Definitions) and XML (Extensible Markup Language) are key technologies for describing the structure of documents. Document descriptions are needed to make documents usable among a variety of disparate systems. This is achieved if the descriptions are readable by automated processors such as compilers, parsers, editors or other tools. Such data descriptions define, for example, which elements and attributes the data contains, how they should be used, and which interdependencies and interactions exist between parts of the documents.16 Apart from defining data structures, DTDs are also able to validate data in a document, at least to a limited extent. This validation relies upon post-processing and consists largely of declaring default values for attributes.17 DTDs originated from SGML (Standard Generalized Markup Language), a description of how to specify a document markup language or tag set.18 The syntax of DTDs is very compact and limited. Now over 20 years old, DTDs were initially designed for document publishing rather than for supporting data-interchange intensive applications, as Internet applications were not as sophisticated as they are today.19 Therefore, although DTDs enjoy widespread tool support, deployment (for instance in HTML or XHTML), enterprise-wide expertise and practical application, they cannot cope with the requirements and needs of today.20 For this reason it is difficult to compare DTDs with newly emerging technologies, because these are intended to meet the requirements of a newer generation of Internet applications than DTDs were.
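The attribute defaulting described above can be observed with a small sketch. It assumes Python's standard `xml.parsers.expat` module; the `order` vocabulary, the `currency` attribute and the function name are hypothetical and chosen only for illustration. Expat is a non-validating parser, so it reads the default declared in the internal DTD subset but does not check the declared content model.

```python
import xml.parsers.expat

# Hypothetical order document with an internal DTD subset.
# The ATTLIST declaration gives every <item> a default currency of "EUR".
DOC = """<!DOCTYPE order [
  <!ELEMENT order (item+)>
  <!ELEMENT item (#PCDATA)>
  <!ATTLIST item currency CDATA "EUR">
]>
<order>
  <item>Printer paper</item>
  <item currency="USD">Toner</item>
</order>"""


def collect_item_currencies(document):
    """Return the currency attribute the parser reports for each <item>."""
    currencies = []

    def on_start(name, attrs):
        # attrs contains explicitly written attributes and, by default,
        # attributes supplied from DTD declarations.
        if name == "item":
            currencies.append(attrs.get("currency"))

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.Parse(document, True)
    return currencies


print(collect_item_currencies(DOC))
```

The first item carries no `currency` attribute in the document itself, yet the parser reports one taken from the DTD. This is about as far as DTD-based data checking goes, since a DTD cannot state that a price must be a number.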

The second generation of Internet applications brought another significant technology that had an enormous impact on Internet commerce: the Extensible Markup Language (XML).

Since content on the Web was defined almost exclusively for display and presentation (using SGML-based languages), the solution for processing Web content without human intervention was to put documents into a format that could easily be parsed, interpreted and understood by heterogeneous machines. These machines should also be able to react to the particular content of a document.21 Since 1996 the World Wide Web Consortium (W3C) has been developing the Extensible Markup Language, with a first release in 1998 and a revised version in 2000. According to the W3C, XML is especially designed with the intention of enabling applications to recognize and process information passed to them by other applications.22 The XML standard is likewise derived from SGML and was designed for the markup of documents; it therefore includes a subset of SGML’s Document Type Definitions. It is an extensible rather than a fixed framework, which allows users to create their own tags to form a markup language; XML can therefore be defined as a metalanguage.23 Its linear syntax for trees gained industry-wide acceptance and thereby opened many opportunities for automating content processing between disparate Internet applications and systems. The hype around XML was largely based on the fact that its tree view and SGML-based syntax, which were important to the development of the Web, made XML a “universal solution to the pervasive problem of format incompatibility”24.25 Before XML, one had to implement costly middleware adapters which converted data between applications. A software upgrade required the matching middleware to be available. A common alternative was and still is EDI (Electronic Data Interchange), an application-specific solution which is, unfortunately, relatively error-prone and costly.26 XML overcomes many of the negative aspects of EDI, and its features enable application-independent communication, making XML ideal for enterprise data exchange.27 Basically, XML gives documents a meaning, thus creating a semantic Web that enables better system interoperability. Its structure is adaptable, very flexible and capable of capturing data from databases, objects, legacy programs and other IT systems: “Its descriptive scalability enables the easy reuse of information for document publishing and data interchange solutions”28.29 The fact that XML is self-describing means that an XML document contains not only the data (payload) transferred to another application but also a universal description of it. The destination application does not require any customized data translators to open and decipher the contents; it is able to interpret and understand the data solely on the basis of its description. This leads to time and cost savings, as transenterprise communication relies solely on standardized means. Because XML eliminates custom coding for application interfaces and is text-based, and thereby human-readable and easy to debug, it is likely to become the “lingua franca of virtually Web-connected applications.”30
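The self-describing property can be illustrated with a short sketch. It uses Python's standard `xml.etree.ElementTree` module; the purchase order vocabulary (`purchaseOrder`, `line`, `unitPrice`) and all values are invented for this example and do not come from any standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical purchase order. The tags were made up by the sender,
# which is exactly what XML as a metalanguage permits.
PURCHASE_ORDER = """
<purchaseOrder number="4711">
  <buyer>Example Trading GmbH</buyer>
  <line sku="A-100" quantity="3">
    <description>Printer paper</description>
    <unitPrice currency="EUR">4.50</unitPrice>
  </line>
  <line sku="B-200" quantity="1">
    <description>Toner</description>
    <unitPrice currency="EUR">59.00</unitPrice>
  </line>
</purchaseOrder>
"""


def order_total(xml_text):
    """Return the order number and the sum of quantity * unit price."""
    root = ET.fromstring(xml_text)
    total = 0.0
    for line in root.findall("line"):
        quantity = int(line.get("quantity"))
        price = float(line.findtext("unitPrice"))
        total += quantity * price
    return root.get("number"), total


print(order_total(PURCHASE_ORDER))
```

A generic parser is enough to reach the payload, because the element and attribute names travel with the data. What the receiver still needs is agreement on what those names mean, which is the gap that schemas address.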

The separation of the management of content from its presentation opened further opportunities to make the technology more flexible in terms of reusing and repurposing information. The Extensible Stylesheet Language (XSL) emerged. It was intended as a style sheet language for XML that allows XML documents to be printed on paper and published on the Web. XSL consists of two parts, the transformation part and the formatting part. However, the transformation part, which allows for example transforming XML data into HTML, split off into its own specification called XSLT (Extensible Stylesheet Language Transformations).31 XSL complements XML and makes it a powerful tool for Internet-based applications and business endeavors.
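The kind of XML-to-HTML transformation mentioned above can be approximated by hand. The following sketch does not use XSLT itself, since Python's standard library has no XSLT processor; it performs in ordinary code the mapping that a stylesheet would declare as template rules. The `catalog` vocabulary and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical product catalog that holds content only, no presentation.
CATALOG = (
    "<catalog>"
    "<product><name>Toner</name><price>59.00</price></product>"
    "<product><name>Printer paper</name><price>4.50</price></product>"
    "</catalog>"
)


def catalog_to_html(xml_text):
    """Build an HTML table with one row per <product> element."""
    catalog = ET.fromstring(xml_text)
    table = ET.Element("table")
    for product in catalog.findall("product"):
        row = ET.SubElement(table, "tr")
        for field in ("name", "price"):
            cell = ET.SubElement(row, "td")
            cell.text = product.findtext(field)
    return ET.tostring(table, encoding="unicode")


print(catalog_to_html(CATALOG))
```

The same catalog could be given a second function that produces a print layout, which is the reuse that separating content from presentation is meant to enable.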

As a result, XML as a new Web standard represents a big step forward for both e-commerce and Internet applications by improving the way transenterprise communication is conducted. But it also has its limitations and weaknesses. XML establishes an alphabet with which developers form a natural language by building words, but there is no guidance on how to build meaningful sentences out of these words.32 Besides performance concerns and the lack of communication with SQL databases, there has been considerable debate about how to deal with the multiple XML vocabularies, how they should be developed and managed, and how to support environments with both industry-standard and proprietary vocabularies.33 This is the reason why XML is too rigid to allow Internet applications to deal on the fly with the variety of formats and types of data which are passed among businesses. The problem of XML is that the technology has become splintered, forcing users to convert documents among multiple XML flavors. The reason is that many businesses work at cross-purposes in the way they implement XML, requiring costly conversion between the XML dialects.34 While some make a profit with these XML-to-XML conversion business models, others are hesitant to implement XML in their IT environments. This is remarkable, because the original intention of XML was that it should not require any kind of value-added network services and costs. For this reason many believe that XML technology alone is not yet mature.35
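The conversion burden between XML dialects can be made concrete with a minimal sketch. Both vocabularies and the mapping table `TAG_MAP` are hypothetical; real dialects usually differ in structure and data types as well, not only in tag names.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from one partner's tag names to another's.
TAG_MAP = {"PurchaseOrder": "order", "Buyer": "customer", "Item": "line"}

PARTNER_DOC = (
    "<PurchaseOrder>"
    "<Buyer>Example Trading GmbH</Buyer>"
    "<Item>Toner</Item>"
    "</PurchaseOrder>"
)


def convert_dialect(xml_text, tag_map):
    """Rename every element that has an entry in tag_map, keep the rest."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        element.tag = tag_map.get(element.tag, element.tag)
    return ET.tostring(root, encoding="unicode")


print(convert_dialect(PARTNER_DOC, TAG_MAP))
```

Each pair of dialects needs its own mapping of this kind, so the number of converters grows quickly with the number of trading partners unless the partners agree on a shared vocabulary.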

4 The need for schema

Despite the impact of XML, the third generation of Internet applications, with its increasing number of emerging Web and e-commerce based services, is still loosely coupled. It is now more important than ever to be able to bridge disparate applications and networks and to share information among them.36 Information in XML received from multiple different trading partners must be validated to fit the local format and business process requirements. Incoming data should enter the corporate databases only if it conforms to a proper schema. To validate documents in this way, the applications of business partners that exchange documents have to agree on a common XML vocabulary.37 With schema synchronization the potential for ambiguity and misunderstanding can be decreased significantly, as the applications on each side are able to identify, understand and use the information passed to them. Consistent schema semantics enable greater interoperability and integration opportunities for content aggregation and syndication.38 The absence of such a common schema would mean that content syndication has to work at a “least-common denominator level of abstraction”. The focus on schemas is therefore an important feature distinguishing this generation of Internet applications from earlier ones. Moreover, schemas are essential for “XML’s continued adoption” in enterprises.39
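The rule that incoming data may enter the corporate database only after validation can be sketched as a simple gate. The rule table `ORDER_RULES` is a hand-rolled stand-in for a schema and is an assumption of this example; it is written in neither XML Schema nor RELAX, and all names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a schema: required child elements of <order>
# and the Python type each value must convert to.
ORDER_RULES = {"buyer": str, "quantity": int, "unitPrice": float}


def validation_errors(xml_text, rules):
    """Return a list of problems; an empty list means the document passes."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed: {exc}"]
    errors = []
    for tag, convert in rules.items():
        value = root.findtext(tag)
        if value is None:
            errors.append(f"missing element <{tag}>")
            continue
        try:
            convert(value.strip())
        except ValueError:
            errors.append(f"<{tag}> is not a valid {convert.__name__}")
    return errors


def accept_into_database(xml_text, database):
    """Store the document only if it passes validation."""
    errors = validation_errors(xml_text, ORDER_RULES)
    if not errors:
        database.append(xml_text)
    return errors


database = []
good = "<order><buyer>ACME</buyer><quantity>3</quantity><unitPrice>4.50</unitPrice></order>"
bad = "<order><buyer>ACME</buyer><quantity>three</quantity></order>"
print(accept_into_database(good, database))
print(accept_into_database(bad, database))
```

A schema language moves the rule table out of the program and into a document that trading partners can exchange, so both sides validate against the same definition.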

5 Evaluation of the XML Schema recommendation

5.1 Schema and schema languages

Schemas represent “the components and rules for a reusable and sharable vocabulary”40, whereas schema languages are tools used to describe schemas.41 Schemas aim at intelligently utilizing information between heterogeneous XML-based Internet applications, thus “avoiding the interoperability problems that impact the formation and sustainability of very large scale electronic trading environments”42.43 Especially in e-commerce environments they solve many technical challenges. An XML schema tests the validity of XML-based business documents with regard to their appropriateness for the sender’s own or the recipient’s IT environment.44 In addition, schemas are needed to ensure that business partners involved in a transaction are authorized and that they are following proper procedures. Usually schemas are exchanged between trading partners, so that each party is able to understand the content and the rules for exchanging information. As a result, the exchange of metadata enables transenterprise business transactions to be processed on the fly. This is a major advantage, since with schema-powered Internet applications developers do not need to anticipate and code all data formats in advance; the users simply help themselves.45 To achieve interoperability, attention should be drawn to the reusability and refinement of information, which is an important component of schemas: “For supporting global interoperability, the ability to extend, reuse, rename, and refine other people’s components is a major enabling technology”46. This is due to the fact that businesses have different trading practices, regulations and conventions that need to be accommodated in order to chain their processes together successfully. In particular, distinct industries come up with their own proprietary sets of semantics referring to their particular business arenas, processes and requirements. Therefore, establishing an interconnected network of transactions becomes an even greater challenge when doing commerce between vertical industries, as each business’s schema will differ significantly from the others’. However, there are industry-specific initiatives such as IOTP (Internet Open Trading Protocol) and RosettaNet, which translate customized semantic definitions into the schemas of other industries.47

[...]


1 Holland, 2001; Gregory, 2000; Liebmann, 2000; Smith, 1999

2 TIBCO, 2000; Liebmann, 2000; Klarlund, 2000; Holland, 2001; Floyd, 1999; Walsh, 1999

3 Holland, 2001; Gregory, 2000; O'Kelly, 2000

4 O'Kelly, 2000; Girishankar, 2000; Dyck, 2001; Dyck, 2000

5 Levitt, 2000; Sliwa, 2000 [1]; Dotts, 2001; Ogbuji, 2000; Klarlund, 2000; Alschuler, 2000

6 Holland, 2001; O'Kelly, 2000; Patrizio, 2001; Dyck, 2001; Alschuler, 2000

potential of becoming an enterprise-wide standard for XML document description. The end of this paper constitutes a summary of the results presented and conclusions drawn.

7 O'Kelly, 2000; Robie, 2000; Shantaram, 2001

8 O'Kelly, 2000; Smith, 1999

9 Liebmann, 2000

10 Smith, 1999; Liebmann, 2000

11 Liebmann, 2000; Gregory, 2000

12 TIBCO, 2000

13 Gregory, 2000; Smith, 1999; Shantaram, 2001

14 Smith, 1999

15 O'Kelly, 2000

16 Jeliffe, 2000; Lee, 2000; Floyd, 1999; Walsh, 1999

17 Laurent, 1999; Patrizio, 2001; Ogbuji, 2000; Walsh, 1999

18 WhatIs.com

19 Laurent, 1999; TIBCO, 2000; Lee, 2000

20 Laurent, 1999; O'Kelly, 2000; Wong, 2001; Walsh, 1999

21 Robie, 2000; Shantaram, 2001

22 Abdualsamid, 2001; Girishankar, 2000

23 Laurent, 1999; TIBCO, 2000; Gregory, 2000; McVicker, 2000; Patrizio, 2001; Abdualsamid, 2001

24 Klarlund, 2000

25 Sliwa, 2001; Girishankar, 2000; Klarlund, 2000

26 TIBCO, 2000; Patrizio, 2001; Yager, 2001

27 TIBCO, 2000; Yager, 2001

28 TIBCO, 2000

29 TIBCO, 2000; Gregory, 2000; Yager, 2001; Robie, 2000; Schmelzer, 2001

30 Johnston, 2000

31 TIBCO, 2000; Abdualsamid, 2001; Morgenthal, 2000; Robie, 2000; Klarlund, 2000

32 Smith, 1999

33 McVicker, 2000; Morgenthal, 2000

34 Messmer, 2000; Girishankar, 2000

35 Messmer, 2000

36 O'Kelly, 2000

37 Holland, 2001; Dyck, 2000; Ogbuji, 2000; Walsh, 1999

38 Holland, 2001; O'Kelly, 2000; Smith, 1999

39 Dyck, 2000; O'Kelly, 2000

40 TIBCO, 2000

41 O'Kelly, 2000

42 Smith, 1999

43 Smith, 1999; TIBCO, 2000

44 Abdualsamid, 2001; Walsh, 1999

Details

Pages: 22
Year: 2001
ISBN (eBook): 9783638111270
File size: 522 KB
Language: English
Catalog Number: v1829
Institution / College: European Business School - International University Schloß Reichartshausen Oestrich-Winkel – Department of Information Systems
Grade: 2,7 (B-)
Tags: XML Schema, RELAX, DTD, grammar, W3C, interoperability, electronic commerce, validation
