Table of Contents
2 Information Retrieval on the WorldWideWeb
2.1 The structure of information
2.2 The meaning of information
2.3 Locating information
2.4 The nature of a search query
2.5 Information gathering and query
3 The agent paradigm
3.1 An agent from the user’s point of view
3.2 Agent properties
3.2.6 Multi-agent systems
4 The CEMAS information model
4.1 Definition of user and search
4.2 The concept architecture
4.2.1 Definition of a link
4.2.2 Definition of a concept
4.2.3 The concept tree
5 The CEMAS agent architecture
5.1 The Knowledge Query and Manipulation Language
5.2 The Java Agent Template (JAT)
5.2.1 JAT knowledge
5.2.2 JAT communication
5.3 CEMAS agents
5.3.1 CEMAS knowledge
5.3.2 CEMAS communication
5.3.3 The ConceptBroker agent
5.3.4 The ConceptServer agent
5.3.5 The ConceptClient agent
5.3.6 The ConceptSearch agent
6.1 Example concept tree
6.2 Example session
6.2.1 Example ConceptClient session
6.2.2 Example ConceptSearch Session
7.1 JAT agent overview
7.2 JAT messaging
7.3 Package agent
7.4 Package context
7.5 Package resource
7.5.1 Resource types
7.5.2 ConceptFile class
7.6 ConceptUtil package
8.1 The agent paradigm
8.2 The concept model
This Diploma thesis describes the implementation of a prototype multi-agent system. The system consists of four different types of agents and is based on the Java Agent Template, an agent framework freely available from Stanford University.
The purpose of the multi-agent system is to aid users in searching and retrieving information available on the WorldWideWeb. Information is categorized in concepts, and the different agents share and exchange knowledge about concepts and about documents on the WWW that match these concepts.
This thesis presents how the information is modeled and how it is communicated between the agents of the system. It also includes prototypes of the agents that demonstrate a working implementation of the approach.
I wish to thank everybody who has contributed in making this work possible. It started out with a semester at a foreign university and since it has taken a little longer than expected, I really appreciate everybody’s patience and support.
First of all I wish to thank Martin Strecker for his help in organizing the foreign semester in France. Great thanks to the Laboratoire d’Intelligence Artificielle, my supervisor Jacqueline Ayel, the members of LIA for their helpful hints and companionship and for a very productive working environment (the office with the most awesome view).
I also wish to thank my supervising professor, H. W. von Henke for supporting and encouraging my uncommon idea of starting a thesis at a foreign university and for his helpful advice and cooperation.
I especially wish to thank my assistant Dr. Adelinde Uhrmacher for her extensive support, for the time she took to review the parts of my project and for her many helpful hints. I have learned a lot in our discussions.
Last but not least I would like to thank my family for supporting me through all these years.
List of Figures
Figure 4.1 A Link
Figure 4.2 A Concept
Figure 4.3 Example of a Concept Tree
Figure 5.1 Two-layer Message
Figure 5.2 CEMAS Architecture
Figure 5.3 ConceptBroker Window
Figure 5.4 ConceptServer Window
Figure 5.5 ConceptClient Window
Figure 6.1 Example Concept Tree
Figure 6.2 Example ConceptClient Window at Startup
Figure 6.3 Example ConceptClient Communication: startup
Figure 6.4 Example ConceptClient Communication: address lookup
Figure 6.5 Example ConceptClient Communication: root concept
Figure 6.6 Example ConceptClient Communication: concept
Figure 6.7 Example ConceptClient Communication: links
Figure 6.8 Example ConceptClient Communication: insert
Figure 6.9 Example ConceptSearch Communication: request concept
Figure 6.10 Example ConceptSearch Communication: request links
Figure 6.11 Example ConceptSearch Communication: add new link
Figure 7.1 JAT Packages
Figure 7.2 AgentContext Startup Class Instantiation
Figure 7.3 Agent Startup Class Instantiation
Figure 7.4 Class Invocation when Receiving a Message
Figure 7.5 Class Invocation when Sending a Message
Figure 7.6 Agent Class Hierarchy
Figure 7.7 MessageHandler Class Hierarchy
Figure 7.8 ResourceManager Class Hierarchy
Figure 7.9 Context Class Hierarchy
Figure 7.10 Interface Classes Hierarchy
Figure 7.11 Resource Container Class Hierarchy
Figure 7.12 Resource Class Hierarchy
Figure 7.13 ConceptFile Class Hierarchy
Figure 7.14 Class Hierarchy of the ConceptUtil Package
List of Tables
Table 5.1 Reserved KQML Performatives
Table 5.2 Performative Parameters
Table 5.3 JAT Resources
Table 5.4 JAT Knowledge
Table 5.5 JAT Ontology
Table 5.6 JAT Performatives
Table 5.7 CEMAS Knowledge
Table 5.8 CEMAS Ontologies
Table 5.9 Performatives, Object: service
Table 5.10 Performatives, Object: concept
Table 5.11 Performatives, Object: link
Table 7.1 Resource classes
The amount of information available on the WorldWideWeb today can only be called enormous, and it keeps growing daily. It can be assumed that sooner or later information about any topic known to man will be available somewhere on the WWW (see “What is available on the Web”, [Boute96]). As new information is constantly being added, the chances are high and continuously increasing that information on any given topic is already available.
The average user searching for information about a particular topic is therefore increasingly overwhelmed by the sheer volume of information on the WWW. The manual search process of browsing or using one of the available search engines is time-consuming, and quite often the user ends up without finding what he was looking for. Therefore a number of different approaches have been suggested to improve the current situation and simplify finding the location of required information. These approaches range from changing the standards and structure of the WWW to providing additional services that are more or less tightly integrated into the existing WWW structure.
One type of approach revolves around the general idea of an agent or a system of multiple agents: software tools that relieve the user of certain tasks or make them easier to handle. These tools are called agents in analogy to a human agent who specializes in providing a certain set of services. The general paradigm of a software agent is not bound to a specific problem domain. Instead it can be seen as a different way to interpret a piece of software that has certain qualities.
Our approach was motivated by two main ideas:
1. To implement a multi-agent system prototype and to experiment with the question of how to apply the agent paradigm to the problem of information retrieval on the WWW. The aim was to find a way in which such a multi-agent system could actually be implemented and to find out how this agent-oriented approach would differ from conventional WWW search mechanisms and whether it provides any advantages over them.
2. To create an agent tool that helps to find information on the WWW. The users we had in mind were scientists typically specializing in a certain topic area of research, looking for comprehensive information about this topic area. This type of user is not only interested in finding all of the information related to his interest, but also wants to be kept up-to-date about newly available information. Another important requirement was to share each user's knowledge or expertise about information with other users in the community. Hence the distributed multi-agent approach, in which each user is represented by his own agent in the system. The agents are thought of as representatives of their users, requesting and offering information on their behalf.
This thesis will explain our multi-agent implementation, and will also give an overview of the agent-oriented approach to searching and finding information on the WWW. The next chapter discusses the information available on the WWW and some of the problems of searching and finding the wanted information. The third chapter gives an overview of the agent paradigm and Internet-based software agents. In the fourth chapter we will describe how information on the WWW was modeled. The fifth presents our own implementation called CEMAS (Concept Exchanging Multi-Agent System), a multi-agent system of cooperating agents, followed by an example session. The seventh chapter explains the technical details of the implementation. We will conclude with a final discussion.
2 Information Retrieval on the WorldWideWeb
This chapter will give an introduction to the problems of searching and finding information on the WorldWideWeb. Even though the main object of this paper is the implementation of a multi-agent system, it is necessary to review the problem field the system was applied to, since that had an impact on several design decisions. The intention is to show how the agent paradigm can be applied to a practical problem.
In terms of a general Information Retrieval model, the WWW can be seen as a single large database, with the URLs pointing to specific documents as the objects. To find information about a certain topic therefore means to find documents containing text about this topic.
There are several differences to keep in mind though. Since the number of WWW servers and the documents they serve is very large and keeps growing steadily, the exact size of the database at any given point in time is not known and can be treated as approaching infinity (see [Beigb97]). As opposed to a closed-world model like a local database where the contents are exactly known, the WWW is an open-world database model with no boundaries.
This results in the following differences for information contained in our virtual database:
- With regard to format: since there is no requirement to adhere to a standard or common format when publishing information on the WWW (thereby entering it into the database), the document formats vary widely, depending on their authors.
- With regard to content: as opposed to a fixed database where the topic of the contents is known (e.g. medical database), no a priori confining assumptions about the contained information can be made. Every topic should be expected to be available and a user may search for anything inside the database, so the system should not constrain the search by limiting it to some topics.
- With regard to existence: a piece of information may be available somewhere, but it still cannot be accessed because its existence is not known. Since the size of the database is virtually infinite, there can never be a complete index of its content (as in a closed-world model).
These differences and their consequences will be discussed in the next three sections, followed by some additional considerations. Due to the size of the WWW, the amount of available information cannot be managed manually. Consequently, there is a need for methods that support automated processing.
2.1 The structure of information
Throughout this thesis, the term information will be used as a synonym for HTML text documents. Types of data other than text (pictures, sounds, films, etc.) are not taken into account, because they are much more difficult to analyze and process (using automated techniques) and consequently pose a set of altogether different problems. Formatted or typed data from a knowledge base or database (e.g. X.500 directory service) is also not taken into account, even though such data can be processed more easily by automatic means. A strict requirement for the data to be formatted or typed limits the application domain. Information that does not meet such a format criterion cannot be properly processed. In a system where the aim is to be able to process most of the available documents, there should be no such requirements towards format. This does not exclude the possibility of using document format or type to gain additional information about the document content, but it should be optional.
Thus, HTML documents are the main source of information, because HTML is the most frequently used format on the WWW. Even though most of the documents are in HTML format, an agent system should assume no more than plain text, simply because it is the only common denominator. The fact that HTML contains additional tags is of no concern, as they can be filtered out easily, thus reducing a document to unstructured plain text. Only some HTML tags define the structure of a document, so they could be used to extract additional knowledge about the information contained, using certain tags as indicators to assign the marked text a higher weight or importance (e.g. the Meta, Title, Heading or Anchor tags). This may lead to errors though, since tags are commonly used to achieve a certain layout and not only to structure content. Layout tags (e.g. Bold, Font, etc.) do not express any semantic value of the contained text. In general it seems to be difficult to automatically extract a concept schema from the HTML tags (see [Catar97]). Consequently, a system should use document structure tags to extract additional information whenever possible, but it should not rely on them completely.
For a system that needs to be open to include all the information that exists on the WWW, any prior requirement in terms of format or structure limits the flexibility. The only safe assumption is that each document found can be converted to plain text. A severe limitation remains: it is still challenging and complex to extract semantic knowledge from unstructured, natural-language text (see [Boone98]).
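The idea of reducing a document to plain text while giving structurally marked text a higher weight can be illustrated with a minimal Python sketch. It is not part of the system described in this thesis. The tag set and the weight values in TAG_WEIGHTS are assumptions chosen for illustration only, and the names WeightedTextExtractor and weighted_words are hypothetical. The Meta tag is left out because its content is stored in attributes rather than in enclosed text.

```python
from html.parser import HTMLParser

# Hypothetical weights (an assumption for this sketch): text enclosed in
# these structure tags counts more than ordinary body text, which gets 1.0.
TAG_WEIGHTS = {"title": 3.0, "h1": 2.0, "h2": 2.0, "a": 1.5}


class WeightedTextExtractor(HTMLParser):
    """Strips all tags and accumulates a weight for every word."""

    def __init__(self):
        super().__init__()
        self.open_tags = []   # currently open tags that carry a weight
        self.weights = {}     # word -> accumulated weight

    def handle_starttag(self, tag, attrs):
        # Layout tags such as <b> or <font> are ignored on purpose.
        if tag in TAG_WEIGHTS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Close the most recently opened matching tag, if any.
        for i in range(len(self.open_tags) - 1, -1, -1):
            if self.open_tags[i] == tag:
                del self.open_tags[i]
                break

    def handle_data(self, data):
        # Use the highest weight among the enclosing structure tags.
        weight = max((TAG_WEIGHTS[t] for t in self.open_tags), default=1.0)
        for word in data.lower().split():
            word = word.strip(".,;:!?()\"'")
            if word:
                self.weights[word] = self.weights.get(word, 0.0) + weight


def weighted_words(html):
    """Returns a dict mapping each word of the document to its weight."""
    parser = WeightedTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.weights
```

Because layout tags fall back to the default weight, a word set in bold counts no more than surrounding body text, which matches the caution above against reading semantic value into layout markup. A real system would additionally have to skip script and style content.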
2.2 The meaning of information
By definition, not only the number of documents available on the WWW but also the number of topics contained therein is virtually infinite. It is not known what the information being dealt with will be about. At the beginning of every search, the information content that is being sought after must be semantically described. Such a semantic description can be given by a set of keywords or using a formal query language or ontology.
A basic assumption that is usually made when dealing with automated text processing is that the meaning of a text can be deduced in some abstract way from the words it contains, and that a document representative can be calculated automatically. This leads to the idea that two texts dealing with the same topic at least partly share the same semantic description (see chapters 2, 3 and 6 in [Rijsb79] for further details).
In a system that has to handle vast amounts of topics that change dynamically, it is practically impossible to implement static content-dependent formal descriptions (like description logics or ontologies) for all contained information (see [Oukse97]). Such descriptions can only be properly applied if the contents of a database are finite and exactly known (e.g. medical or technical databases) and more or less well-structured.
However, it is possible to apply such formal descriptions to a certain defined subset of information, a distinguished topic area. Some systems follow this approach and also try to interconnect different sets of formal descriptions into one large system (for example a combination or fusion of several medical databases). While this makes sense and seems to work well for related information that is typed, some limitations and losses of flexibility always remain. As a prerequisite, such approaches usually require the information dealt with to be well-categorized and well-formalized. This is rather strict and offers no flexible solution for a general information retrieval system.
For a broader approach, keywords offer a high flexibility, which is necessary for a domain like the WWW where the contents are not exactly known. Keywords have the drawback of not being interpreted in their context, but only as stand-alone objects. Furthermore, a system based on keywords is subject to synonymy and polysemy problems. Different words may be used to describe the same meaning, and the same word may be used in different senses. However, keywords can be used as a first step to limit the amount of potentially relevant documents to a smaller subset, upon which more sophisticated full-text analysis can be performed. These sophisticated methods are usually very demanding in terms of computational complexity, so they require too much time and processing power to be applied directly to large amounts of data.
2.3 Locating information
As an identifier for a document, the WWW uses URLs (Uniform Resource Locators). Indicating the existence of some piece of information to another agent is therefore synonymous with giving it a URL. It is assumed that the URL is an absolute and exact identifier of a document, that the document can be retrieved and that the information contained within it can therefore be accessed.
Due to the structure of the WWW, no complete index or map of its contents exists. What follows is neither an attempt to construct such a complete index nor a discussion of approaches or methods dealing with that problem. The idea is rather to use existing resources, tools and methods and to combine them with others to provide a value-added service.
There are basically two possibilities to search for information: by browsing and by a query to a search engine. Browsing is the activity of following hyperlinks (anchors, URLs) in HTML documents that lead to other documents. A search engine is commonly queried with a set of keywords and yields a list of hyperlinks as a result, so the documents can be accessed directly.
The first possibility requires the existence of some kind of “home page” to start with. In the best case, such a page may have a collection of links grouped thematically into categories serving as an introduction to a certain topic. In addition to requiring an appropriate starting point though, a single page can never give a complete overview of a topic, but only reveal the information that is linked from that page. The chance that all available documents related to a topic are completely interlinked is rather slim. Therefore only a subset of the overall information available on the WWW can be found by following hyperlinks to related documents.
The nature of hyperlinks between HTML documents is chaotic: there is no common standard or structure, and the criteria by which the links are organized depend entirely on the document’s author. In the worst case, a link leads to a new document that has nothing in common with the one just left. For these reasons browsing is suited only to a limited degree as a means of finding relevant information.
The second possibility to find information is to send a query to a search engine to retrieve a list of hyperlinks as a result. A query consists of one or more keywords (some search engines offer boolean logic operators) which are matched against the search engine's index. The search engines continually explore the WWW by following all the links they find and then index the documents found. Even though not all documents can be found and indexed, the search engines can be expected to cover a rather large percentage of the WWW, especially if the query results of several of them are combined.
A document can therefore be known to exist if it has either been indexed by a search engine or if another document links to it (and is in turn either indexed or linked to, recursively). An author is very likely to indicate a document's existence to a search engine or link to it from somewhere when it is published on the WWW, since he wants the document to be found by the public. It is also assumed that information about the topic the user is searching for is generally available somewhere on the WWW. This is important with regard to the general type of information: if it is known or can be expected that a certain kind of information simply is not provided anywhere on the WWW (because it is non-public or classified, for example), it makes no sense to try to use the WorldWideWeb as a source thereof.
The search engines' keyword indexing algorithms are rather simple, since they ignore the keyword context. Either a portion or all of the words of a document are indexed, if they have enough significance to be useful in identifying it (stop words like “and”, “the”, etc. are disregarded of course). Thus, a document is scored as a match if the query keyword is contained anywhere in it, regardless of the word’s actual context in the document.
Consequently, a search engine usually returns a very long list (in the range of hundreds or thousands) of matched documents. Such a list will contain mostly unwanted, irrelevant information (noise), because the context-free keyword interpretation returns a lot of out-of-context matches. Even if these search engines indexed the keywords in a manner that took the context into consideration, the query language would need to offer the possibility to describe a keyword’s context to actually make use of it. In other words, both parties (the user and the search engine) would need to have the same understanding of a keyword, a common ontology or formal language, some means of defining the context. As already noted in section 2.2 above, this is practically impossible to implement for the vast amounts of information available on the WWW.
An additional problem is to find significant keywords that have two qualities: identification and discrimination. If the keywords are too specific, the search engine often returns no results at all. If, on the other hand, the keywords are too broad, they match an extraordinarily large number of documents. So the keywords should properly identify the searched information and at the same time discriminate it from similar but unwanted information. This keyword significance is both document-dependent and query-dependent. Depending on the document collection indexed in the database, a good keyword occurs only in some documents, so that it clearly identifies them and distinguishes them from the rest. Good document-dependent keywords can be globally defined if the contents of the database are static. In the case of the WWW that is impossible: good keywords would have to be generated dynamically, since with every new query a user defines a new collection of documents (the ones that match the user's query). The keywords would have to include exactly the documents a user wants and exclude all others. Since the number of possible topics is very large and the topics are not necessarily distinct, good keywords for a topic cannot be statically calculated in advance (see [Rijsb79] for details).
Additionally, some of the search engine’s knowledge is not passed on to the user; it remains hidden. This is mostly due to the owner’s interest in protecting the technological know-how, so the exact algorithms are not revealed. For a system that makes use of these search engines, being able to access this knowledge would allow better exploitation of the results provided. This includes for example the ranking/scoring criteria of the returned matches, or the reasons why certain documents were returned in response to a query.
As a result, it is clear that the current method of searching with keywords and a search engine lacks the desired power and expressiveness, which is quite a limitation. That is the main problem, apart from the fact that a document may not even be indexed by a search engine and its existence is therefore virtually unknown.
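The context-free matching described here can be made concrete with a minimal sketch of a keyword index in Python. It does not reproduce the algorithm of any actual search engine, since those are not published. The stop-word list, the example URLs and the function names tokenize, build_index and search are assumptions made for illustration.

```python
# A small, hypothetical stop-word list; real engines use longer ones.
STOP_WORDS = {"and", "the", "a", "of", "in", "to", "is"}


def tokenize(text):
    """Lower-cases the text, strips punctuation and drops stop words."""
    words = (w.strip(".,;:!?()\"'").lower() for w in text.split())
    return [w for w in words if w and w not in STOP_WORDS]


def build_index(documents):
    """Maps every remaining word to the set of URLs that contain it."""
    index = {}
    for url, text in documents.items():
        for word in tokenize(text):
            index.setdefault(word, set()).add(url)
    return index


def search(index, query):
    """Returns the URLs that contain every keyword of the query.

    A document matches if the keyword occurs anywhere in it; the
    context in which the word is used is not considered at all.
    """
    keywords = tokenize(query)
    if not keywords:
        return set()
    result = None
    for keyword in keywords:
        postings = set(index.get(keyword, set()))
        result = postings if result is None else result & postings
    return result


# Two invented documents that use the same word in different senses.
documents = {
    "http://example.org/zoo": "The jaguar is a big cat in the jungle.",
    "http://example.org/cars": "The Jaguar is a fast car.",
}
index = build_index(documents)
```

With these two invented documents, a query for the single keyword jaguar matches both the page about the animal and the page about the car, because the index records only that the word occurs. Adding a second keyword such as car narrows the result, which is the only means of expressing context that such an index offers.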
2.4 The nature of a search query
Using a search engine on the WWW has several implicit properties. First of all, it is client-server based. That means that the user must actively start a query and gets a single answer in response to it. Since new documents continuously appear on the WWW, the user must repeat the search query from time to time to find out about newly added documents. It is not possible for the search engines to initiate a message informing the user that something new (of possible interest) has arrived.
Furthermore, search engines are aimed at providing an answer to a single topic query. They are not well suited to search for a collection of documents that provide a more-or-less extensive overview over a certain topic. Keywords that fit such a slightly broader range usually result in too much unwanted noise. To satisfy such an information need with a search engine, many queries with a lot of manual filtering are required.
It is important to distinguish between a search query for a topic and a single specific document. If the user searches for a specific document and knows the discriminating keywords, a query to a search engine usually proves successful. If the document of interest is the FAQ of a newsgroup for example, it can be located with a search engine, because the user knows the keywords that exactly identify the document and discriminate it from most others at the same time (e.g. “FAQ” and “comp.infosystems.www.browsers”). However such knowledge of proper keywords on the user’s part cannot be assumed when searching for a number of documents related to a topic.
2.5 Information gathering and query
An information retrieval method for the WWW can be roughly divided into two parts, indexing and query. The first part consists of creating an index of the documents contained in the WWW (our database), which is more or less the functionality that classical search engines provide. These indexers register known documents and generate a document representative, usually a set of describing keywords. For our model the WWW’s known contents can therefore be reduced to the information indexed by the search engines being used. The second part consists of matching a search query against the document representatives in the index, to find the relevant documents. As mentioned before, this approach does not discuss the problem of indexing documents on the WWW. The indexing functionality provided by Internet search engines is just used as it is.
The focus will be on what can be done with the results of such a query to a search engine, divided into two steps, a collection step and an analyzing step. In the collection step the system will try to find all conceivably relevant documents by means of a query (finding in this case means finding out about their existence and location on the WWW), while trying to keep the irrelevant noise to a minimum to reduce computation time for the second step. During the analyzing step, the contents of the documents returned by the first step are classified as either matching the search query topic or not. This means filtering out irrelevant documents as well as keeping (not accidentally filtering) relevant ones.
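The division into a collection step and an analyzing step can be sketched as two small functions. This is only an illustration of the control flow, not the CEMAS implementation. The names collect and analyze, and the callables passed to them (a search engine that maps a query to URLs, a function that fetches a document's text, and a relevance classifier), are hypothetical stand-ins.

```python
from typing import Callable, Iterable, Set

# Assumption for this sketch: a search engine is any callable that
# takes a query string and returns an iterable of URLs.
SearchEngine = Callable[[str], Iterable[str]]


def collect(query: str, engines: Iterable[SearchEngine]) -> Set[str]:
    """Collection step: merge the URL lists returned by several engines.

    Duplicates are removed because the same document is often indexed
    by more than one engine.
    """
    candidates: Set[str] = set()
    for engine in engines:
        candidates.update(engine(query))
    return candidates


def analyze(
    candidates: Iterable[str],
    fetch_text: Callable[[str], str],
    is_relevant: Callable[[str], bool],
) -> Set[str]:
    """Analyzing step: keep only documents whose content is classified
    as matching the query topic."""
    relevant: Set[str] = set()
    for url in candidates:
        text = fetch_text(url)   # retrieve the whole document
        if is_relevant(text):    # classify on content, not on keywords
            relevant.add(url)
    return relevant
```

Keeping the two steps separate means that the expensive content-based classifier runs only on the candidate set produced by the cheap keyword queries, which is the reason given above for keeping noise low in the first step.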
Two important measures are defined in classic information retrieval to describe a system’s effectiveness, document recall and precision. Recall is the ratio of the number of relevant documents retrieved to the total number of relevant documents existing. Precision is the ratio of the number of relevant documents retrieved to the total number of documents retrieved.
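As a small worked example of the two ratios, the following Python sketch computes recall and precision from two sets of document identifiers. The function names and the convention of returning 0.0 for an empty denominator are assumptions of this sketch, not part of the classic definitions.

```python
def recall(retrieved, relevant):
    """Relevant documents retrieved / all relevant documents existing."""
    if not relevant:
        return 0.0  # convention chosen for this sketch only
    return len(retrieved & relevant) / len(relevant)


def precision(retrieved, relevant):
    """Relevant documents retrieved / all documents retrieved."""
    if not retrieved:
        return 0.0  # convention chosen for this sketch only
    return len(retrieved & relevant) / len(retrieved)


# Invented example: four documents are retrieved, three are relevant,
# and two of the relevant ones are among those retrieved.
retrieved = {"a", "b", "c", "d"}
relevant = {"a", "b", "e"}
```

In this invented example two of the three relevant documents are found, so recall is 2/3, while two of the four retrieved documents are relevant, so precision is 1/2.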
These two measures cannot be exactly calculated of course, since the relevance of a document is subjective even for humans and (even if considered objective) the number of actually relevant documents on the WWW is unknown. To improve a search on the WWW, the measures can be used as follows:
- Towards high recall: Locating as many documents as possible that are likely to be relevant to the queried topic, which will generate a first intermediate set of documents. Then, eliminating as few relevant documents as possible from this collection in the analyzing process (not incorrectly eliminating good documents).
- Towards high precision: This is affected by the analyzing step, depending on the accuracy of the classification process, weeding out the unwanted noise from the relevant parts.
On the WWW, the core problems of IR remain the same: retrieving a number of relevant text documents from a collection in response to a query. However, the size of the WWW and its dynamic and unstructured nature make it difficult to apply classic IR solutions. Full-text classification or categorization algorithms require too much computation time to employ them in a search engine that dynamically calculates an answer to a single query on-the-fly.
This paper proposes a different approach to satisfy an information need, not as a single response to a one-time query (like a search engine), but as a continuous process. It combines automatic and user search efforts and allows the implementation of more sophisticated information comparison and matching using whole documents instead of simple keywords (e.g. IR methods). The approach is based on the agent paradigm presented in the next chapter.
3 The agent paradigm
In this chapter, the software agent paradigm and its possible application to the Internet and information retrieval on the WWW will be explained. Several papers have attempted to define what an agent is, but until now no commonly shared exact definition of agency exists, and none shall be attempted here. In addition, many papers describe agents in a very anthropomorphic way, with terms and attributes often used to describe humans (like intelligence, autonomy, learning, communication, etc.). One of the reasons for this may be to suggest some kind of similarity between agents and humans. In reality (and in many agent implementations including the one discussed in this paper), this is often reduced to rather pragmatic methods and algorithms. Consequently, these terms should be read and understood with caution. They are used here for reasons of familiarity, and their intention is not to suggest that currently existing agents exhibit anything near human-like qualities. Using these terms has become common to describe agent attributes; whether that is justified is a different issue that is omitted in this paper.
In the following overview, some of the properties often associated with agents are listed. These agent properties are not technologies; they rather describe the problem that is to be solved [Petri97]. Instead of arguing whether such a property is a “must have” for a piece of software to justify calling it an agent, it will be explained what the property can be good for. The important points are how a certain agent property is embedded into the agent paradigm and why it may be useful for a particular implementation. Not all possible properties are necessary for every agent; in a particular application some may be more important than others.
The properties of a software agent system, the tasks of “what it does” and “how it does that”, can be described from two points of view, a psychological and a technical one [Singh97]. The first is an abstract way of describing an agent, from the point of view of the user to whom the underlying techniques are transparent. Even though such a definition is rather vague and cannot be measured precisely, it can be discussed, compared and evaluated to a certain degree. Such a discussion is still of interest and important, not in order to know whether it is justified to call a piece of software an agent or not, but in terms of usability improvement for the user. Does the agent-oriented approach give the user an advantage in dealing with a problem and what are the differences?
As a second point, even though the exact techniques, mechanisms and algorithms used do not constitute part of a definition of agenthood, they are the means to implement the described properties. Looking at existing projects to see how they actually approach the implementation of agent-like behavior is certainly of interest. This will not necessarily result in a better understanding or definition of what an agent is or should be, but will hopefully bring in some new leads and ideas of how the ideal agent could be accomplished.
3.1 An agent from the user’s point of view
So why should we call these information-searching software tools agents? Let’s compare this to a classical example, the human travel agent. This agent is an expert that specializes in a knowledge domain (travels) and offers this as a service. He is called up once and then he works autonomously and most likely returns with a result a while later. Maybe the query for a travel connection has to be refined. Thus the agent communicates with the user and other information sources. The agent uses his intelligence to reason and reach the goal (finding the right travel connection), while having other goals with different priorities at the same time. If the user is a frequent customer, the agent will most likely learn some of the users preferences and sometimes pro-actively suggest or assume something without explicitly being asked for it.
The main emphasis here is on the fact that the agent relieves the user of some work, because a task can be delegated to it. On a lower level (with a simple agent) the user saves time, because he does not have to do it himself even if he could. On a higher level (with a powerful and very capable agent), the user gains some extra possibilities, because he lacks the domain-specific knowledge to do the task himself. Maybe the agent does not employ much intelligence to satisfy a query (e.g. just passing it on to a travel database), but that remains transparent to the user. How the agent carries out its task internally does not matter, as long as it comes up with a qualified answer to the user’s need and thus provides a solution to the problem. A related issue is the interface that is used to communicate with agents. Naturally it should be intuitive and easy to use (maybe natural language typed into a keyboard or even spoken), but such a definition is very ambiguous.
Humans tend to characterize complex systems like human beings, describing their behavior using attitudes like knowledge, belief, intention and obligation (see [Shoha93]). Thus the agent metaphor may improve the way a user interacts with an agent. Wooldridge [Woold94] takes the approach of defining agents through “mentalistic notions” one step further, by requiring that agents should not only be describable with such human attitudes, but that they actually need to be based on a formal logical framework that consists of such attitudes and allows reasoning about them. Attitudes are divided into two groups. Information attitudes store the agent’s knowledge and pro-attitudes store its actions. So the human attitudes are not “simulated”, but the core of the agent is actually modeled using these attitudes and a formal logic to represent and manipulate them. However, this view of agents is already an implementation-related issue which rules out many other agent models and is therefore too narrow for this discussion.
Maes [Maes94] mentions two broad problems to be solved with respect to the user: competence and trust. Competence is the knowledge the agent “needs to decide when to help the user, what to help the user with and how to help the user” and trust exists when “the user feels comfortable delegating tasks to the agent”. Acceptance depends mostly on the solution of these two problems; on the other hand, such a quality requirement holds for practically every kind of software.
The issues discussed here also apply to an information retrieval agent that finds information for the user on the WWW. The agent should be a helpful tool that the user can delegate this task to, working autonomously by itself. It should employ intelligence, its specific domain knowledge, reasoning and learning to search and find the information the user wants. The agent should fulfil the user’s information needs both reactively, in response to a request, and pro-actively, by suggesting new information that it deems interesting. A perfect IR agent would interface with the user in an intuitive way, maybe understanding the user’s natural language.
The conclusion is that from a user’s point of view, it makes a lot of sense to apply the agent paradigm to the problem of information retrieval on the WWW, because mentalistic intentional attitudes (belief, knowledge, free will, etc.) are a convenient abstraction for a complex system that humans are comfortable interacting with [Singh97], [Woold94]. So far a description using such terms remains a very colloquial one. The key problem remains how to implement such agents.
3.2 Agent properties
Several papers give an introduction to what constitutes an agent. At the same time they admit that these properties do not constitute a definition of agenthood, merely an attempt to capture the idea of the agent paradigm. The following part will list some of the properties that are frequently mentioned, while not claiming to be complete or extensive.
As already noted by [Frank96] and [Petri96], agents act, they do something. The task they accomplish or the service they offer is their most basic property. This acting takes place in a given environment and continues over a longer period of time as opposed to a one-time function being called.
The environment the agents “live” (sense and act) in determines their scope: they can sense their surroundings (input) and affect them through their actions (output) [Frank96]. As opposed to a physical robot, a software agent interacts using a direct user interface, by communication using a certain protocol or by calling external commands or functions. On the Internet, agents interact with the WWW using HTTP and with other agents using an appropriate communication language.
A frequently mentioned part of the agent paradigm is that agents live in a “real” environment. In this case “real” does not have the meaning of “physically real” as used in connection with robots, because the agents are software agents. Just the same, “real” captures an important quality of the environment: uncertainty. This usually results from the fact that the environment is open (very large or virtually unlimited in size) and inhabited by other agents or entities that may act in an undeterminable way. The Internet and the WWW share these qualities: it is impossible for an agent to know in advance what it will encounter. Agents should therefore be fault-tolerant and have a certain robustness. Otherwise, users or agents they interact with in their environment may do “damage” to them (on purpose or unintentionally)5. Just the same, an agent should not be the cause of problems or damage to its environment.
Basically there are two approaches in AI to model intelligence: symbolic knowledge-based systems and reactive architectures. In a deliberative symbolic approach, a symbolic model of the environment and a set of action descriptions are created. The agent then uses symbolic reasoning to combine a sequence of such actions into a plan. The plan describes a path of action that leads to a predicted desired goal state, which the agent is deliberately trying to reach.
Reactive systems are built on the basic assumption that intelligent behavior can be generated without explicit representation of the world model, but rather emerging from a complex system of behaviors. Essentially there is a set of rules or behaviors from which one is chosen or activated according to the currently sensed environment.
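The rule selection described for reactive systems can be illustrated with a minimal Java sketch. The names `Percept`, `Behavior` and `ReactiveAgent`, as well as the two example rules, are hypothetical and serve only as illustration. The sketch keeps no world model and builds no plan; it assumes a simple priority order in which the first behavior whose condition matches the current percept is activated.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** What the agent currently senses (hypothetical and deliberately tiny). */
final class Percept {
    final boolean newDocumentSeen;
    final boolean serverTimedOut;

    Percept(boolean newDocumentSeen, boolean serverTimedOut) {
        this.newDocumentSeen = newDocumentSeen;
        this.serverTimedOut = serverTimedOut;
    }
}

/** A behavior couples a triggering condition with the name of an action. */
final class Behavior {
    final Predicate<Percept> condition;
    final String action;

    Behavior(Predicate<Percept> condition, String action) {
        this.condition = condition;
        this.action = action;
    }
}

/** Reactive agent: no world model and no plan, only prioritized rules. */
final class ReactiveAgent {
    private final List<Behavior> behaviors = new ArrayList<>();

    /** Behaviors added earlier take priority over those added later. */
    void addBehavior(Predicate<Percept> condition, String action) {
        behaviors.add(new Behavior(condition, action));
    }

    /** Returns the action of the first behavior triggered by the percept. */
    String react(Percept percept) {
        for (Behavior b : behaviors) {
            if (b.condition.test(percept)) {
                return b.action;
            }
        }
        return "idle"; // no behavior was triggered
    }

    public static void main(String[] args) {
        ReactiveAgent agent = new ReactiveAgent();
        agent.addBehavior(p -> p.serverTimedOut, "retry-later");
        agent.addBehavior(p -> p.newDocumentSeen, "index-document");
        System.out.println(agent.react(new Percept(true, false)));
    }
}
```

A fixed priority order is only one way to choose among triggered behaviors; other arbitration schemes are equally compatible with the reactive idea.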
Since both of these approaches are imperfect, hybrid systems have been developed in an attempt to combine the advantages of both. Some properties that are still problematic or difficult to implement are dynamic generation of goals (usually these are predefined and fixed) and dealing with multiple, possibly conflicting goals. A good overview of this topic can be found in [Woold94].
If several possible ways exist to reach a goal and the result of actions in the given environment is uncertain, a decision-making process is required to find a solution. On the other hand, if there is only a small finite number of possible actions whose results are exactly known (e.g. more like a function call), the intelligence can also be “hard-coded” into an algorithm. In other words, even though the Internet as environment poses some degree of uncertainty, the results of actions are known more or less exactly. For example, if a URL is sent to a WWW server to retrieve a document, the server will either return that document or not (because of a time-out or because the document does not exist). An action like that is more like a function call and can very simply be used by an algorithm. Whether such an algorithm can still be called “intelligent” or not is a much-debated issue that will not be discussed here.
Learning is related to, or sometimes seen as part of, intelligence. It can also be described as adaptation with the goal of improving or optimizing the agent’s own performance [Imam96]. Single agents can learn to improve their knowledge or their problem-solving method. Within a multi-agent system the improvement can also aim at the interaction or cooperation between agents. The reward function reflects this in a way that agents try to maximize the combined reward of the whole system. In a competitive environment, however, agents would usually behave selfishly, trying to improve only themselves and maximize their own reward.
Another distinction can be made with respect to the system’s architecture. Either it is fixed and designed exactly to perform its task, in which case learning takes place through knowledge data being modified, or the architecture itself is adapted and evolves as part of the learning process.
Autonomy is sometimes used to explain that the agent does something without constant user interaction or supervision, or even more detached, without direct human intervention at all. But this is a slightly inaccurate description, as it would also include simple agents that only act in response to a user’s query. Therefore the emphasis is also on an agent having its own agenda or goals to pursue. It observes its environment to recognize changes and takes action when a certain state or change is observed. Such behavior can be seen as autonomy, since the agent itself can decide when to do what; it is not following a direct external order. This implies intelligence to a certain degree, some kind of rule set that the agent uses to make these decisions.
In a multi-agent system or an environment with many agents, autonomy also means being separate from other agents. Each of the agents can be clearly distinguished from the others: it is an encapsulated entity with its own internal state and goals, which may be different from or even contradictory to those of other agents. Note that this does not rule out a cooperative multi-agent system where some agents depend on others (voluntarily or not), because they still have their own goals and are thus autonomous, yet they may not be independent. The agent maintains the distinction between itself and its environment.
Agents should be able to interact - with other agents and the user (either directly or via an interface). Genesereth sees the ability to communicate as the most important property, as it gives societies of agents the opportunity to “solve problems that cannot be solved alone” [Genes94].
To be able to communicate, agents need a standardized communication language whose syntax and semantics are clearly defined. Both procedural and declarative languages have advantages. Depending on the underlying agent model and implementation either one will be better suited to communicate the agent’s expressions. Agent implementations based on a speech act, ontology or belief model tend to use a declarative language to express their speech. By virtue of being sent, a message is intended to result in some action being performed (see [DARPA93], [Labro96] and [Genes92]). Contrary to that, a procedural language seems to be useful for mobile agent approaches to send programs or blocks of code to another location (see [Gener96] for an example). In any case, communication is one of the important parts of software agents, since they interact with their environment mainly by sending messages or calling functions in one way or another.
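As an illustration of a declarative message, the following Java sketch builds a message from a performative and a list of named parameters and renders it in a parenthesized form loosely modeled on KQML. The class `AgentMessage`, the agent names `client-1` and `server-1` and the content expression are hypothetical. This is a simplified sketch and not the message class of any particular agent framework.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** A declarative message: a performative plus named parameters. */
final class AgentMessage {
    private final String performative;
    // LinkedHashMap keeps the parameters in insertion order when rendering.
    private final Map<String, String> parameters = new LinkedHashMap<>();

    AgentMessage(String performative) {
        this.performative = performative;
    }

    /** Adds a parameter and returns this message to allow chaining. */
    AgentMessage with(String name, String value) {
        parameters.put(name, value);
        return this;
    }

    /** Renders the message in a KQML-like parenthesized form. */
    String render() {
        StringBuilder sb = new StringBuilder("(").append(performative);
        for (Map.Entry<String, String> e : parameters.entrySet()) {
            sb.append(" :").append(e.getKey()).append(' ').append(e.getValue());
        }
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        AgentMessage msg = new AgentMessage("ask-one")
                .with("sender", "client-1")
                .with("receiver", "server-1")
                .with("content", "(links-for \"agents\")");
        System.out.println(msg.render());
    }
}
```

The message states what the sender wants to know, not how the receiver should compute the answer, which is the essential difference from a procedural language that ships code.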
The communication in a community of many agents can be direct, with every agent talking to every other agent. This requires some self-organizing architecture, since each agent has to know about the other agents’ presence and capabilities. It can also be indirect, as in a blackboard system, via a message router or a mixed approach in which communication is assisted by specialized agents (e.g. by an agent name server or service broker). With respect to autonomy, each agent must be able to initiate messages [Petri96]. So the communication must be based on peer-to-peer connections, which rules out client-server based protocols and architectures.
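The assisted form of communication can be sketched as a small registry that agents use to find each other. The class `AgentNameServer`, the service name and the address format are hypothetical. The sketch only shows the lookup step; after the lookup, the two agents would still exchange their messages directly as peers.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical agent name server: maps a service name to an agent address. */
final class AgentNameServer {
    private final Map<String, String> addressByService = new HashMap<>();

    /** An agent announces the service it offers and where it can be reached. */
    void register(String service, String agentAddress) {
        addressByService.put(service, agentAddress);
    }

    /** Another agent asks who offers a service; empty if nobody registered. */
    Optional<String> lookup(String service) {
        return Optional.ofNullable(addressByService.get(service));
    }

    public static void main(String[] args) {
        AgentNameServer nameServer = new AgentNameServer();
        nameServer.register("concept-search", "host-a:5001");
        System.out.println(nameServer.lookup("concept-search").orElse("unknown"));
    }
}
```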
3.2.6 Multi-agent systems
As already partly implied in the communication property, many approaches consist of multiple agents. Two main types may be distinguished, based on competition or cooperation. In a competitive architecture the different agents offer solutions to the same problem and compete with each other to offer a better solution than others (e.g. contract net). As opposed to that, in a cooperative architecture the different types of agents each specialize to solve a part of the problem, and work together to combine their capabilities (e.g. specification sharing, blackboard system).
[Sycar95] suggests an approach for a cooperative multi-agent architecture where agents are divided into two groups, task-specific and information-specific agents. Task-specific agents specialize in managing a certain task and usually interact with the user. Information-specific agents specialize in managing access to an information source and offer this functionality to other agents. Advantages like higher flexibility or reusability are well known from classic ideas of distribution, modularity or object-oriented approaches. Many users can share the functionality of agents, new functionality can be added easily and agent modules can be re-used, which allows an easier development of similar agents.
An additional strand of research focuses on mobility as a property of an agent, though it is mainly required for physical agents (robots), as part of autonomy. For a software agent, mobility means being able to relocate itself on its own initiative, while preserving execution state and data. While mobility may offer some advantages in certain application fields and the Internet is seen as an ideal environment for mobile agents, there seems to be no problem that cannot also be solved by a non-mobile agent system.
Additionally, a number of new problems arise when dealing with mobile agents moving to other hosts, mainly related to security issues. For our approach mobility was not required; a further discussion of it is therefore omitted.
While it is not clearly defined what exactly an agent is, the agent paradigm may serve in two ways. It may be used to describe the behavior of a complex software system to make it easier for a user to understand it, and it may serve as a guideline for the properties we need to implement to get something that could be called an intelligent autonomous agent.
An agent that searches for information on the WWW fits well into this paradigm. It lives in an open and uncertain environment, the Internet. The agent should be autonomous to a certain degree, searching on its own (unsupervised) for information the user wants and returning with a result. It should also pursue this goal with a certain amount of intelligence, especially concerning the quality of the information returned: it should filter out the relevant parts from the large amount of unwanted noise. In this respect the agent specializes in solving a distinct problem, and may therefore serve as an entity to which such problems can be delegated, expecting that the agent can solve them by applying its expertise.
4 The CEMAS information model
The implementation of the CEMAS architecture has the main goal of providing a working prototype. Since the amount of time for the implementation was limited, the objective was to create a platform that provides the necessary minimum requirements for a multi-agent system that handles information. A basic requirement of such a system is a model of how to represent and exchange the information which is available on the WWW.
Once that is provided and defined, additional new agents can always be added and included into the system later on. Moreover, the functionality of existing agents can be enhanced without the necessity to change the whole multi-agent system design. Thus, the result of this work was an open experimental platform that can be used to try out and combine some of the ideas or approaches of IR with a multi-agent system.
This chapter explains what the system’s typical user looks like and how information is modeled.
4.1 Definition of user and search
In this application, the overall problem field of searching information on the WWW will be applied to the scientific domain. The model user is a scientist who wishes to find an extensive overview of a scientific area or topic or has a general interest in the topic. Thus the user is not just searching for a single document or piece of information. Additionally, the search is not thought of as a single action, but as extended over a longer period of time. It should also reveal newly available information, to reflect the user’s continuing interest. Since scientific institutions are generally connected to the Internet and use the WWW to publish scientific documents, it can furthermore be assumed that requested information about the topic will be available somewhere on the WWW. The user normally has an idea of what he is looking for. This does not necessarily mean that he knows the right keywords to start a query, or that he is an expert on the topic in question. Consequently the system should provide some guidance by suggesting several topic categories as reference to start with.
One of the main reasons for using a multi-agent approach was the idea to have each user represented by his own agent in the system. These user agents are the interface to the user and communicate with other agents to acquire the desired information.
Another goal of this design was to share the knowledge about the existence and location of information on the WWW between users with the same interest, thus to re-use their work effort of finding certain information sources, since scientific users are commonly specialists in a certain area and know the available information sources reasonably well. The information model should also be able to support the query for all the information sources added by another user (in addition to a query for a certain concept). So if user A finds that user B has similar interests, user B’s knowledge about information sources on the WWW can be requested.
Information on the WWW is typically viewed with a browser client (Netscape Navigator, Microsoft Internet Explorer). Therefore the user interface should be integrated into the browser. It is assumed that the user has a direct connection to the Internet and knows how to use the WWW and browser clients to search and find information.
4.2 The concept architecture
In CEMAS, agents exchange knowledge6 about information (documents on the WWW). This requires that the agents can communicate several kinds of content to one another. To be able to do this, they need a common form of representation for the knowledge. Two general types of knowledge are used:
- Knowledge about one specific document on the WWW. This identifies precisely one document and will be called a Link.
- Knowledge about a topic, a collection of documents that contain similar or strongly related information. Such a topic of interest will be called a Concept. Multiple concepts are organized in a hierarchical concept tree.
4.2.1 Definition of a link
A link represents the knowledge about a single specific document on the WWW. Since each link contains one URL which is by definition unique on the WWW, a link is an exact pointer to the information contained in the document. As long as the original document exists, each agent has access to the same content. A link consists of the following fields:
- Title: The name of the document to which the link points
- URL: The Uniform Resource Locator that identifies a document on the WWW
- Description: A short description of the document’s contents
- Origin: Where this link came from (an identifier of the agent that added it to the system)
- Password: To verify access rights of the owner (change/delete)
Figure 4.1 A Link
The link includes the document’s location and file type (HTML, text, postscript, etc.), both of which are captured in the URL. The title and description provide additional information in human-readable form to indicate the contents of the document. The description may be an abstract or a set of keywords. The origin field is used to keep track of the agent that created this link. In case of a user’s personal agent, this field contains the user’s email address. For other agents it may contain the agent name or the email address of the person running it (the agent administrator). The password ensures that a link can only be modified or deleted by its owner, the agent that created it.
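The fields of a link can be summarized in a short Java sketch. The class below is an illustration only and not the class used in the prototype. Comparing links by URL and checking the password by plain string comparison are simplifying assumptions, and the URL in the example is a placeholder.

```java
import java.util.Objects;

/** Sketch of the link fields described above (not the prototype's class). */
final class Link {
    final String title;
    final String url;
    final String description;
    final String origin;
    private final String password; // only used for the owner check

    Link(String title, String url, String description, String origin, String password) {
        this.title = Objects.requireNonNull(title);
        this.url = Objects.requireNonNull(url);
        this.description = Objects.requireNonNull(description);
        this.origin = Objects.requireNonNull(origin);
        this.password = Objects.requireNonNull(password);
    }

    /** True if the caller supplied the owner's password (change/delete allowed). */
    boolean mayModify(String candidatePassword) {
        return password.equals(candidatePassword);
    }

    /** Assumption: two links denote the same document when their URLs are equal. */
    @Override
    public boolean equals(Object other) {
        return other instanceof Link && ((Link) other).url.equals(url);
    }

    @Override
    public int hashCode() {
        return url.hashCode();
    }

    public static void main(String[] args) {
        Link link = new Link(
                "Abteilung KI",
                "http://www.example.org/ai-lab/", // placeholder URL
                "Homepage of the AI lab, computer science department, University of Ulm",
                "email@example.com",
                "secret");
        System.out.println(link.title + " may be modified: " + link.mayModify("secret"));
    }
}
```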
Title: “Abteilung KI”
Description: “Homepage of the AI lab, computer science department, University of Ulm”
Origin: “email@example.com”
4.2.2 Definition of a concept
A concept represents the knowledge about a topic area. Here we must distinguish between the general concept model (the data structure) and what will be called the meaning of a specific concept (the topic it is about). The semantics of the model are precisely defined, since agents need to have the same exact definition of the data structure when they exchange concepts. The meaning represented by a specific concept is defined in a flexible way and not by a strict formal language or structure. The differences are described below.
4.2.2.1 The meaning of a concept
As explained in chapter 2, it is difficult to capture the meaning of a concept in such a way that two communicating agents have the same understanding of it. Keywords are too ambiguous and therefore too weak for an exact definition of meaning. An ontology7 which defines an absolute view of all existing information on the WWW is practically impossible. Therefore a new approach to defining a concept’s meaning was used.
Each concept is associated with a number of documents that are chosen as the reference or definition of said concept. The concept’s meaning is ultimately defined by the full text of these reference documents. To test whether a document in question matches a certain concept, it can be compared to the concept’s reference documents. Based on the experiences of Information Retrieval, it is assumed that text documents can be compared by algorithmic means and a similarity measure can be calculated. If the similarity of the document in question compared to the reference documents is higher than a certain threshold, it matches the definition of the concept and can be considered to contain information about the same topic.
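The threshold test can be sketched in Java as follows. The thesis does not prescribe a particular similarity measure, so the sketch assumes cosine similarity over simple word counts, one common choice in Information Retrieval. It further assumes that a candidate matches when its best similarity to any single reference document exceeds the threshold. The class `ConceptMatcher` and all method names are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Sketch: match a document against reference texts by cosine similarity. */
final class ConceptMatcher {

    /** Counts how often each lower-cased word occurs in the text. */
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> tf = new HashMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                tf.merge(token, 1, Integer::sum);
            }
        }
        return tf;
    }

    /** Cosine similarity of two term-frequency vectors (0 if either is empty). */
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0);
        }
        double normA = norm(a);
        double normB = norm(b);
        return (normA == 0.0 || normB == 0.0) ? 0.0 : dot / (normA * normB);
    }

    private static double norm(Map<String, Integer> vector) {
        double sum = 0.0;
        for (int count : vector.values()) {
            sum += (double) count * count;
        }
        return Math.sqrt(sum);
    }

    /** Assumption: the best similarity to any reference text must exceed the threshold. */
    static boolean matches(String candidate, List<String> referenceTexts, double threshold) {
        Map<String, Integer> candidateTf = termFrequencies(candidate);
        double best = 0.0;
        for (String reference : referenceTexts) {
            best = Math.max(best, cosine(candidateTf, termFrequencies(reference)));
        }
        return best > threshold;
    }

    public static void main(String[] args) {
        List<String> references = List.of("agents search the web for documents");
        System.out.println(matches("software agents search the web", references, 0.5));
    }
}
```

Averaging over all reference documents instead of taking the best match would be an equally plausible reading of the threshold rule; the choice affects how strongly a single close reference document counts.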
Thus, defining a concept through the full text of reference documents is more expressive than simple keywords. At the same time it preserves the flexibility to define any existing topic, simply by providing a number of reference documents. The number of reference documents is flexible as well, so a concept’s meaning can be refined by extending the list of reference documents.
Documents on the WWW are exactly identified by a link, so agents need only exchange a set of links. Each agent can thereby retrieve the full document text at any time. If agent A wants to communicate a concept’s meaning to agent B, it can simply pass a list of links to the reference documents to that agent. Any other document sufficiently similar to these reference documents can be assumed to match the concept.
In addition, the method of comparing the documents’ content (full text) is left open. How an agent actually implements the comparison between documents depends on the agent, so the concept is independent of the algorithms or architecture used to implement the agents. Each agent is free to interpret the definition of a concept in its own manner, yet the definition remains clear and is exactly the same for all participating agents8.
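That the comparison method is left to each agent can be expressed as a small interface with interchangeable implementations. The interface `DocumentComparator` and the two implementations below are hypothetical examples. They merely show that two agents may judge the same pair of texts differently while referring to the same reference documents.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Each agent may plug in its own way of comparing two document texts. */
interface DocumentComparator {
    double similarity(String textA, String textB);
}

/** One possible interpretation: word-set overlap (Jaccard coefficient). */
final class JaccardComparator implements DocumentComparator {
    private static Set<String> words(String text) {
        Set<String> set = new HashSet<>(
                Arrays.asList(text.toLowerCase(Locale.ROOT).split("\\W+")));
        set.remove(""); // split may yield an empty leading token
        return set;
    }

    @Override
    public double similarity(String textA, String textB) {
        Set<String> a = words(textA);
        Set<String> b = words(textB);
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        if (union.isEmpty()) {
            return 0.0;
        }
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        return (double) intersection.size() / union.size();
    }
}

/** A much stricter interpretation: only identical texts count as similar. */
final class ExactComparator implements DocumentComparator {
    @Override
    public double similarity(String textA, String textB) {
        return textA.equals(textB) ? 1.0 : 0.0;
    }
}

final class ComparatorDemo {
    public static void main(String[] args) {
        String reference = "agents search the web";
        String candidate = "agents browse the web";
        DocumentComparator lenient = new JaccardComparator();
        DocumentComparator strict = new ExactComparator();
        System.out.println(lenient.similarity(reference, candidate));
        System.out.println(strict.similarity(reference, candidate));
    }
}
```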
To summarize the above, a concept is a topic or area of interest which is defined by a number of links that point to documents about that topic on the WWW.
1 Meant are “classic” WWW search engines, some of which are listed in Appendix A.
2 A Uniform Resource Locator (URL) identifies a document on the WWW.
3 HyperText Markup Language (HTML), a language used to encode pages or documents on the WWW (see [Ragge95]).
4 This conclusion is derived from the author’s personal experience with search engines (e.g. the “refine search” feature of AltaVista). The actual algorithms used by the search engines are a well-kept secret of the companies that own them. For obvious reasons, these companies want to protect their own investment and prevent abuse of their search engines.
5 Search engines like AltaVista may serve as an example for an Internet-based agent that is being lied to (for this example, we will disregard the aspect of whether a search engine should be considered an agent or not). Users deliberately put false keywords into their HTML pages to trick the search engine into showing their page as a result when being queried with those keywords, even though the page’s content has nothing in common with them.
6 The term “knowledge” is used in its general abstract meaning, not referring to a specific formal representation.
7 See [Grube93] for a discussion of formal ontologies.
8 This is in fact similar to how humans would handle such an issue: even if a definition is clearly given by a text, each person’s interpretation may vary.