Copyright issues surrounding the use of services like Google Instant Preview

European copyright law in a digital challenge

Master's Thesis 2011 68 Pages

Law - Media, Multimedia Law, Copyright


Table of Contents


1 Introduction

2 Basics
2.1 Technical framework
2.1.1 Basic function of search engines
2.1.2 Function of a website preview
2.2 Basics of Copyrights and Author’s Rights
2.2.1 Differences and similarities of the two concepts
2.2.2 Territoriality principle of copyright
2.2.3 International treaties
2.2.4 European legislation

3 Work in European Copyright
3.1 Legal provisions defining original work
3.2 The Infopaq Decision
3.3 Requirement of an original work

4 Copyright protection of webpages
4.1 Technical basics of a webpage
4.2 Copyright protection as a computer program
4.2.1 Definition of computer program
4.2.2 Can a website be defined as a computer program
4.2.3 Graphical interface of a website
4.3 Copyright protection as a literary or artistic work
4.3.1 Website as an original work
4.3.2 Creativity requirement
4.4 Copyright protection as a database
4.4.1 Definition of a database
4.4.2 Website as a database
4.4.3 Website protection under the sui generis right
4.5 Copyright protection of parts of a website
4.5.1 Copyright protection of independent works
    Copyright protection of texts
    Copyright protection of images
    Copyright protection of computer programs
    Copyright protection of video files
    Copyright protection of audio files
4.5.2 Effect of works integrated in a website

5 Caching of websites
5.1 Copyright infringement of caching
5.1.1 Infringement in the context of computer programs
5.1.2 Infringement in the context of a literary or artistic work
5.1.3 Infringement in the context of databases
5.2 Cached link
5.2.1 Communication and making available
5.2.2 The public

6 The website preview
6.1 Creation of the preview
6.1.1 Direct or indirect reproduction
6.1.2 Temporary or permanent reproduction
6.1.3 Reproduction by any means and in any form
6.1.4 Reproduction in whole or in part
6.2 Recall of the website preview
6.2.1 Making available to the public
6.2.2 Downsizing of the preview image
    Reproduction as smaller image
    Making available as a smaller image

7 Limitations and exceptions
7.1 Caching of the website
7.1.1 Exceptions for the reproduction of computer programs
7.1.2 Exceptions for the reproduction of databases
7.1.3 Exceptions for the reproduction of literary or artistic works
    Article 5 I InfoSoc
    Article 5 II (c)
    Article 5 III (c)
    Article 5 III (d)
    Further exceptions
7.2 Making available of the website preview
7.3 Implied consent
7.4 Other measures

8 Conclusion



1 Introduction

Search engines have, for some years, raised a number of legal issues surrounding copyright. Since search engines do not own the majority of the content they show, but mostly provide access to the content of third parties, the most common concerns are the reproduction of copyright-protected works and the making available of those works to the public. Apart from the standard website search, specialised search functions like Google News[1], Infopaq[2] or Google Image Search[3] have also raised some fundamental copyright questions.

On 9 November 2010, Google introduced a new function for its search engine, named Instant Preview, which also has the potential to raise certain copyright issues[4]. Besides the title of a website, the URL and a snippet[5] appearing on the search result list, the new feature offers a graphic representation of a given search result, in the form of a small screenshot appearing to the right of the results[6]. This feature is useful for quickly evaluating the search results and, according to Google, is bound to raise satisfaction with the web search by 5 per cent[7].

This kind of feature is not completely new to web search engines. Yahoo Search used this kind of feature years ago[8], and MelZoo.com is a web search engine which has used this function from the beginning[9]. Bing has also offered a similar service for some time[10]. Nevertheless, with Google Web Search receiving several hundred million queries each day, and holding a worldwide market share of 80 per cent in this field[11], the introduction of Instant Preview is bound to gain a lot of media attention.

This function has some consequences for website operators. The process of search engine optimisation[12] (SEO), which is used to improve the rank and visibility of webpages, can be affected by this website preview function. Normally, a user visits the first few search results given for a certain query, on the assumption that the first results are the most fitting for the search topic. SEO is thus used to improve the rank, for example by adding many additional keywords to a webpage. With the introduction of the website preview in the search result list, users can now directly check whether a webpage appears to be what they were searching for. Therefore, websites relying on extensive optimisation could lose visitors through this service, and ultimately revenue through fewer advertisements. This is likely to provoke resistance against this service.

Consequently, copyright arguments could be used to obstruct or shut down this kind of function. Judging from the impact of earlier cases of copyright issues surrounding search engines, this website preview function might draw legal action. This would come as no surprise, as the controversial topics are diverse.

The first question in assessing the connected copyright issues must be how, and under which criteria, a website could be protected under copyright provisions. There are several possibilities to examine. A website could be protected as a whole, either as a literary or artistic work, or as a computer program or database. Apart from that, parts of the website could be protected, be it images, texts, videos or musical samples. This would raise the question of how a copyright-protected part interacts with the website as a whole.

The next question is how the preview of the website is created, and what the copyright implications of this process are. It must be examined whether the preview is created at the moment of a request, or whether it is stored as a screenshot on the server of a search engine. Connected with this is the question of how a web preview or a screenshot should be defined in the field of copyright law. Is a screenshot a reproduction of an original work, or a making available to the public?

Moreover, the question arises whether the fact that the content of the website preview is smaller than the original website (in fact so small that most of the text and many images are unrecognisable) could affect the applicability of copyright protection.

Finally, it remains to be seen whether there are specific or general exceptions and limitations to copyright protection that apply to the aforementioned issues. In this context there is some European and national case law which could have an impact on the interpretation, scope and usage of copyright exceptions and limitations.

The rise of digital technology has introduced a number of new issues in worldwide copyright law, not the least of which is the question of how to protect copyright works in a transnational environment. Although copyright law is still shaped by the territoriality principle, multiple international treaties and European Union Directives have internationalised or harmonised parts of national copyright legislation.

Owing to the advancing Europeanisation[13] of copyright law and the transnational structure of the digital world, this master's thesis will retain a European scope. European legislation has harmonised a great part of the national legal provisions of the Member States, and the European Court of Justice is also ruling in this direction. Thus, the aforementioned problems and issues will be primarily examined on the basis of European legislation and case law.

As a first step in evaluating the copyright issues surrounding the Instant Preview function, a basic overview of the technical process involved will be given, so that an accurate analysis of the legal implications can be achieved. Following that, an outline of international and European copyright legislation and principles will create a framework within which the questions raised can be answered precisely.

Due to some recent ECJ case law, the next part of the thesis will deal with the subject matter of an original work in a European perspective.

After that, the specific questions dealing with copyright issues of webpages, caching of the same, the implementation of the screenshot of websites in the search engine results list, and finally possible exceptions and limitations to copyright protection for these processes will be addressed in this order.

The aim of this thesis is to analyse whether the instruments provided by European legislation are able to deal reasonably with the challenges of the digital world, especially in connection with newer and more advanced functions provided by search engines, and to assess whether a function like Google Instant Preview is in breach of copyright or not.

2 Basics

It is fundamental to give a technical background of the functioning of a search engine in order to adequately analyse the issues and problems arising from the graphic representation of a website through a search function like Instant Preview. Furthermore, the basic principles of international and European Union copyright law must be addressed to create an effective groundwork for the further analysis of the copyright issues in connection with Google’s Instant Preview.

2.1 Technical framework

Although the technical operations of a search engine are quite extensive, some background knowledge is required to follow the system processes involved in the creation of a website preview.

2.1.1 Basic function of search engines

A web search engine searches for information on the internet, indexes the retrieved information, and presents the search results connected with a certain term in a list. Generally, web search engines operate with an algorithm to present the most fitting search results to a user’s search query.[14]

Basically, a search engine consists of three components: a web crawler, an index and a web search query.

A web crawler, or spider, is a software agent which automatically searches the internet and retrieves new or changed websites and other documents[15]. A web crawler visits a certain number of webpages, then follows every hyperlink found on those webpages and retrieves data from every successive website[16]. A web crawler can be prevented from following a hyperlink or retrieving certain data from a webpage by the Robots Exclusion Standard, or robots.txt protocol. Robots.txt is normally used to give instructions to visiting web crawlers, for example which specific parts of a website not to retrieve[17]. Thus, a web crawler can be instructed not to retrieve certain images or file types, or even not to index the website at all[18]. Although it is voluntary for a web crawler to follow the instructions of a robots.txt protocol, most respectable web search engines program their web crawlers to do so[19].
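The robots.txt check described above can be illustrated with a minimal Python sketch using the standard library's urllib.robotparser module. The robots.txt content, the example.org URLs and the user-agent name "ExampleBot" are hypothetical and chosen only for illustration; they do not reflect how any particular search engine implements its crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content of an example site (an assumption for illustration).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /images/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed_urls(parser: RobotFileParser, user_agent: str, urls: list) -> list:
    """Keep only the URLs a well-behaved crawler would retrieve."""
    return [url for url in urls if parser.can_fetch(user_agent, url)]


parser = build_parser(ROBOTS_TXT)
candidates = [
    "https://example.org/index.html",
    "https://example.org/private/report.html",
    "https://example.org/images/logo.png",
]
# Compliance is voluntary: the crawler itself decides to apply this filter.
crawlable = allowed_urls(parser, "ExampleBot", candidates)
print(crawlable)
```

The sketch makes visible that the exclusion is enforced by the crawler, not by the website: a crawler that skips the filter step would still be technically able to retrieve the excluded pages.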

The data found by the web crawler is then analysed and stored in an index, also called an Information Retrieval System. Indexing can be split into three separate processes. First, the retrieved data is analysed to determine how the content of a specific website should be saved in the index. This is achieved through the use of different filters, which are run partly by the web crawler and partly by the Retrieval System. The aim of this process is to define fitting keywords for the index[20]. The keywords are then stored in the index database together with, at least in Google’s case, a cached copy of the complete source website[21]. Finally, the keywords are stored in the Information Retrieval System database in such a way that every keyword points to the sources in which it can be found, which reduces the effort needed to scan and search the database[22].
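A structure in which every keyword points to its sources is commonly called an inverted index. The following Python sketch shows the idea under simplifying assumptions: the page texts and URLs are hypothetical, the "filter" is reduced to lower-casing and splitting into alphanumeric tokens, and the cached copy is simply the stored page text.

```python
import re
from collections import defaultdict

# Hypothetical crawled pages (URL -> page text); assumptions for illustration only.
PAGES = {
    "https://example.org/a": "Copyright law protects original works.",
    "https://example.org/b": "Search engines index original webpages.",
}


def tokenize(text: str) -> list:
    """Very simple keyword filter: lower-case alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages: dict) -> dict:
    """Map every keyword to the set of URLs in which it occurs."""
    index = defaultdict(set)
    for url, text in pages.items():
        for keyword in tokenize(text):
            index[keyword].add(url)
    return index


# The cached copy of each complete page is kept alongside the keyword index.
cache = dict(PAGES)
index = build_index(PAGES)
print(sorted(index["original"]))
```

From a copyright perspective, the two data structures differ: the keyword index stores only pointers, whereas the cache holds a full copy of the source page.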

The web search query is the interface between the actual search function and the Information Retrieval System. When a query is entered into a search engine by typing in specific keywords, the query processor examines the index, matching the entered keywords with those present in the index database[23]. For example, when two keywords are entered into the search engine, the query processor searches the index for every document containing these keywords. The documents are then compared, and only those containing both keywords are processed further. With the help of an algorithm, the fitting websites are ranked according to their relevance[24]. The relevance of a website is measured by many factors, for example how many hyperlinks connect the website with other websites[25]. This list of search results is then displayed on the search engine’s webpage. Usually the results consist of the webpage’s title, the site’s URL and a small text excerpt from the webpage, called a snippet, centred on the most relevant keywords.
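The query step can be sketched in Python as follows. This is a deliberately simplified illustration: the documents and the inbound-link counts are hypothetical, and ranking by a single link count is an assumption standing in for the many factors a real ranking algorithm would combine.

```python
import re

# Hypothetical documents and inbound-link counts (assumptions for illustration).
DOCS = {
    "https://example.org/a": "European copyright law protects original works of authors.",
    "https://example.org/b": "A search engine preview shows a small screenshot.",
    "https://example.org/c": "Copyright questions arise when a search engine shows a preview of original works.",
}
INBOUND_LINKS = {
    "https://example.org/a": 12,
    "https://example.org/b": 3,
    "https://example.org/c": 7,
}


def tokenize(text: str) -> set:
    """Lower-case alphanumeric tokens of a text, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search(query: str, docs: dict, inbound_links: dict) -> list:
    """Return URLs containing every query keyword, most-linked first."""
    terms = tokenize(query)
    if not terms:
        return []
    # Keep only documents that contain all entered keywords.
    matches = [url for url, text in docs.items() if terms <= tokenize(text)]
    # Simplified relevance: more inbound hyperlinks means a higher rank.
    return sorted(matches, key=lambda url: inbound_links.get(url, 0), reverse=True)


results = search("copyright original", DOCS, INBOUND_LINKS)
print(results)
```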

2.1.2 Function of a website preview

The search results page is the point where website preview functions like Google Instant Preview are used. Apart from the aforementioned parts of a search result[26], the website preview offers a graphical representation of the search result, in the form of a small screenshot of the source webpage. In this screenshot, the relevant sections containing the keywords of the search query are highlighted[27].

It is important to analyse the technical background of this function in order to identify the copyright issues surrounding it. There are two different ways in which the search engine generates these preview images. As was already addressed, a web crawler retrieves the data from a website, and the search engine processes the data and indexes part of the website. Especially in the case of the Google search engine, most of the time a complete copy of the webpage is stored as a backup on a Google server. If this is the case, the search engine creates a screenshot from the fetched and cached content, and places it beside the search result[28]. When a user hovers the mouse over the search result, an embedded window pops up, showing a graphical representation of the source website found. In this representation, the text surrounding the queried keyword is highlighted.

It is possible that not every searched-for website is cached on a server of the search engine, in the majority of cases because the robots.txt protocol contains an instruction not to fetch and cache the website. In that case, the search engine may decide to render a website preview on the fly, after a user requests a preview by clicking on the small lens beside the search result[29]. This happens with the help of another user-agent. The webpage is then cached, and a preview is generated and placed on the search result list in the same way as if a cached version of the website had already existed.
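The two ways of generating a preview can be summarised in a short Python sketch. All names here (render_preview, get_preview, fetch_on_demand) are hypothetical, and the renderer is a placeholder returning a text label instead of an image; the sketch only shows the order of the steps, not Google's actual implementation.

```python
from typing import Callable, Dict, Tuple


def render_preview(html: str) -> str:
    """Placeholder for a screenshot renderer (hypothetical); returns a label, not an image."""
    return f"<preview of {len(html)} characters>"


def get_preview(
    url: str,
    cache: Dict[str, str],
    fetch_on_demand: Callable[[str], str],
) -> Tuple[str, str]:
    """Return the preview and the path by which it was produced."""
    if url in cache:
        # Path 1: a cached copy exists, so the preview is rendered from it.
        return render_preview(cache[url]), "from-cache"
    # Path 2: no cached copy, so the page is fetched when the user asks for a preview.
    html = fetch_on_demand(url)
    cache[url] = html  # the fetched page is cached before rendering
    return render_preview(html), "on-the-fly"


cache = {"https://example.org/a": "<html>cached page</html>"}
fake_fetch = lambda url: "<html>freshly fetched page</html>"  # stands in for a real HTTP request

first = get_preview("https://example.org/a", cache, fake_fetch)
second = get_preview("https://example.org/b", cache, fake_fetch)
print(first[1], second[1])
```

In both paths a copy of the page exists on the search engine's side before the preview image is produced, which is the point the later copyright analysis of the preview builds on.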

The owner or webmaster of a page can block the generation of a preview. Including the nosnippet instruction in the Robot Exclusion prevents the search engine from showing a snippet of the website in the search result list. This also blocks the webpage preview from showing up in the list[30].
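As an illustration, the nosnippet instruction is commonly placed in a robots meta tag in the head of an HTML page. The following Python sketch, which uses only the standard library's html.parser module, shows how a crawler might detect it. The class and function names and the sample page are hypothetical, and the sketch assumes the meta-tag form of the instruction.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            # Directives are comma-separated, e.g. "noarchive, nosnippet".
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def preview_allowed(html: str) -> bool:
    """A preview is suppressed when the page carries a nosnippet directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "nosnippet" not in parser.directives


# Hypothetical page that opts out of snippets and previews.
BLOCKED_PAGE = '<html><head><meta name="robots" content="noarchive, nosnippet"></head></html>'
print(preview_allowed(BLOCKED_PAGE))
```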

2.2 Basics of Copyrights and Author’s Rights

Copyrights, in common law countries, and author’s rights, in civil law countries[31], are basically exclusive rights granted to the creators of original works to effectively exploit them. These exclusive rights conferred on creators of literary or artistic works are a way to encourage them to disclose their creations to the public, thus enriching culture[32]. Through this, the copyright holder owns, inter alia, the sole right to copy his work and disclose it to the public, and furthermore to authorise or license others to make copies or make the work available to the public[33].

2.2.1 Differences and similarities of the two concepts

There are some differences between the two intellectual property systems, the most prominent being the different direction of protection. In the copyright system the focus of the protection is on the work itself, while in the author’s right system the author is primarily protected[34]. Basically, the copyright system centres on the protection of the economic rights connected to a copyrighted work, whereas the author’s right system is based on the protection of the moral rights of the author[35]. In common law systems, the rights granted to the author can be sold or transferred entirely to third parties. Moreover, under the so-called work made for hire doctrine, an employee’s work originally belongs to the employer[36]. In contrast, in most civil law systems the author is the sole owner of the work and it is impossible for him to transfer the copyright to a third party. Only exploitation rights can be sold or otherwise transferred under these systems[37].

Another difference is the disparate interpretation in the two systems of the term original work, which is the requirement for being granted exclusive rights for one’s creation. Under the copyright system, particularly in the UK, the main requirement for a work to be original is that the work is not a copy of another work[38]. The other requirement is that the work is the result of the use of individual skill, judgement and labour[39]. Under the civil law systems, the prerequisite for a work to be considered original generally seems higher. Just investing skill, time and labour to create something is not enough. A work must have the imprint of the author’s personality under French law[40] or must be a personal intellectual creation according to German law[41].

Although the basic principles of copyright and author’s right are different, the result, that is, whether a work is considered original or not, is mostly the same in both systems.

Furthermore, exceptions and limitations are handled differently in the two systems. While the UK fair dealing and the US fair use doctrines are quite flexible, emphasising the well-being of society as a whole, limitations and exceptions in continental law systems are quite narrow and interpreted very restrictively, accentuating the protection of the author[42]. These narrow restrictions and limitations give rise to several copyright problems connected with the use of search engines and the internet in general, as can be seen in the German Thumbnails case[43].

2.2.2 Territoriality principle of copyright

The territoriality principle is a keystone of copyright law, and must be taken into consideration when dealing with copyrights at an international level. The core of the principle is that the legal protection granted by a country can only be infringed in that particular country[44]. Ultimately that means that copyright protection is considered to be subject to national law, and that foreign copyrights are not protected by domestic courts. The territoriality principle was one reason why the following international treaties were adopted, and is also capable of interfering with some fundamental freedoms of the EU[45].

2.2.3 International treaties

Due to the territoriality principle, copyright is still a matter of national legislation. In order to harmonise the different protection regimes, a multitude of international copyright treaties and conventions were adopted. The signatory countries, inter alia all the EU member states, thus have to adhere to a basic copyright framework. The most important treaties regarding copyright law are the Berne Convention for the Protection of Literary and Artistic Works[46], the Agreement on Trade-Related Aspects of Intellectual Property Rights[47] and the WIPO Copyright Treaty[48]. The main provisions important to the subject of this thesis are examined here.

The Berne Convention (BC), adopted in 1886, is the oldest copyright treaty in effect. The BC is based on some basic principles. The first is the national treatment principle, which provides that works of a national of one member state are given the same protection in every other member state as that state gives to the works of its own nationals[49]. Then there is the no formalities principle, which bars any requirement for a formal registration of copyright protection[50]. Article 2 BC contains a non-exhaustive list of works which are protected under the Berne Convention, while Article 5 I BC guarantees a set of minimum rights found in the Convention. Article 9 I BC grants the author the exclusive right to authorise reproductions of protected works, and Article 9 II BC restricts the exceptions and limitations to this exclusive right with the three-step-test.

TRIPS adds some important provisions on copyright in Articles 9 to 14. Article 9 I TRIPS requires signatory members to comply with the provisions of the Berne Convention. This is important for European Union law insofar as the European Community became a member of TRIPS in 1995, and thus both the provisions of the Berne Convention and of TRIPS are now part of the acquis communautaire[51]. Moreover, the European Court of Justice[52] is of the opinion that it has jurisdiction to interpret the provisions of the BC and TRIPS as far as they are part of EU law[53]. Article 10 TRIPS adds computer programs and databases as protected works under the BC, and Article 13 TRIPS includes the three-step-test.

The WIPO Copyright Treaty (WCT) provides additional protections deemed necessary due to advances in information technology. Articles 4 and 5 WCT grant computer programs and databases, respectively, the same protection as literary works under Article 2 I BC. Article 8 WCT establishes a digital transmission right[54], which means the right to authorise the communication of a work to the public via digital services. The WCT also mentions the three-step-test in Article 10.

2.2.4 European legislation

The starting point for examining EU copyright law is that there is no completely harmonised European Union copyright law. In contrast to European trademark law, which introduced the Community trademark, EU copyright law is still based on the territoriality principle, which essentially means that every EU member state has its own legal copyright framework[55]. Additionally, the EU has no direct legislative competence in the field of copyright. Thus, the basis for any harmonisation efforts in the European Union is mainly Article 114 TFEU.

The European Union has enacted several Directives relating to copyright and neighbouring rights. The main fields of harmonisation in the Union are primarily selected subject matters of protection like computer software and databases. Apart from that, the EU legislation is focusing on removing existing differences, like the term of copyright protection, which obstruct the function of the common market.[56] The most relevant Directives to the topic of this thesis are the Computer Programs Directive[57], the Database Directive[58] and the Information Society Directive[59].

The Computer Programs Directive mainly deals with the determination of when a computer program should be considered a work protected by copyright, and the effect of such protection. The Database Directive harmonised the requirements for a database to be granted copyright protection. This was due to the different standards of protection in common law and civil law member states[60]. Article 3 introduced the requirement of originality, in the sense of requiring a measure of creativity, for a database to be protected as a work, which sets a higher standard than the common law skill, judgement and labour doctrine. To counter that, Article 7 grants a sui generis right to databases which are the result of substantial investment.

The InfoSoc Directive was adopted in 2001 by the European Union after a lengthy process which began in 1995[61]. Besides implementing some provisions of the WCT, the InfoSoc Directive aims to harmonise the reproduction, communication and distribution rights by wire or wireless means and certain basic exceptions and limitations to these exclusive rights[62]. What the Directive does not do, in contrast to the other Directives mentioned, is provide a definition of the term work.

3 Work in European Copyright

For a creation to be granted copyright protection, it must be deemed to be an original work. There is no easy answer to the question of what exactly is an original work and what is not, neither according to international treaties or European Union law, nor according to the legislation of the Member States. Fundamentally, a work which can be protected is an idea of a creative work, expressed by the author and thus transformed into a copyright-protected work. Thus, the basic rule is that an idea cannot be protected, only the expression of the idea[63]. Alas, this is not concrete enough to provide a definition of an original work. As was already mentioned, there are different requirements under common and civil law for a production of any kind to be seen as an original work. Since the term original work is the foundation of any copyright protection, it is important to define this expression as concretely as possible.

3.1 Legal provisions defining original work

Neither TRIPS, the BC nor the WCT precisely defines an original work. In Article 2 I, the Berne Convention gives a very broad description of the term literary and artistic works, namely to include every production in the literary, scientific and artistic domain, whatever may be the mode or form of its expression. This expression is a foundation of the term work rather than a comprehensive definition[64].

According to the WIPO, for a work to enjoy copyright protection it must be an original creation[65]. Additionally, the WIPO requires an original intellectual creation, but no particular quality, imaginativeness or inventiveness[66]. This appears to be a definition at a very low level, similar to the common law approach with its shallow creativity requirement.

The acquis communautaire seems to be similarly broad and imprecise in the definition of an original work. There are some Directives where the term work is mentioned. The Computer Programs Directive states in Article 1 III that “a computer program shall be protected if it is original in the sense that it is the author's own intellectual creation”. The Term of Protection Directive also says that “Photographs which are original in the sense that they are the author's own intellectual creation shall be protected”[67]. In Article 3 I of the Database Directive, databases are granted copyright protection if they, “by reason of the selection or arrangement of their contents, constitute the author's own intellectual creation”. The InfoSoc Directive mentions the term work multiple times, but without giving a definition. The works defined in these Directives, that is, computer programs, photographs and databases, require an author’s own intellectual creation. Additionally, all of these definitions include the phrase “No other criteria shall be applied to determine their eligibility for that protection”[68].

The definitions given for the term original work, at least in the three mentioned Directives, are very similar. Furthermore, since there are no other definitions of this term found in European Directives, it can be assumed that the phrase author’s own intellectual creation[69], in connection with the requirement that no other criteria shall be applied to determine their eligibility for that protection[70], suggests a single standard for European copyright protection[71].

The single standard of European copyright protection for an original work seems persuasive in connection with the three mentioned Directives, notwithstanding the slight differences in the wording. The fact that later Directives, like the InfoSoc Directive, no longer define the term work thus leads to the question of whether it was deemed unnecessary to define this term again, or whether it was intended that the given definition applies only to these three Directives. Before the InfoSoc Directive was implemented, there were sporadic national cases dealing with this question. One German case[72] dealing with the copyright protection of service instructions argued that, due to the identical definitions of the term work in the European Directives, a uniform level of copyright protection must have been intended by the EU legislator. The same applies to decisions of the Austrian Supreme Court, which applied the European definition of an original work to other categories of works[73].

3.2 The Infopaq Decision

The question whether there is a single standard of the definition of original work was addressed by the ECJ in the Infopaq/DDF case, which attracted very contrasting comments in the legal literature[74][75].

Infopaq operates a service that searches Danish newspapers for certain keywords by means of a data capture process. When a keyword is found, the passage is stored and printed out with the five words before and after the found keyword, and users are notified of the find[76]. The questions referred to the ECJ by the Danish court were (a) whether the storing and the printing of the passage is a reproduction according to Article 2 InfoSoc Directive and (b) whether the individual steps of the process are temporary acts of reproduction in accordance with Article 5 I InfoSoc[77].
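The extraction step at issue, a keyword together with the five words on either side, can be illustrated with a short Python sketch. The function name and the sample sentence are hypothetical; the sketch only shows how an eleven-word extract arises and makes no claim about Infopaq's actual software.

```python
def extract_passages(text: str, keyword: str, context: int = 5) -> list:
    """Return each occurrence of keyword with `context` words before and after it."""
    words = text.split()
    passages = []
    for position, word in enumerate(words):
        # Ignore surrounding punctuation and letter case when comparing.
        if word.strip(".,;:!?\"'()").lower() == keyword.lower():
            start = max(0, position - context)
            passages.append(" ".join(words[start:position + context + 1]))
    return passages


# Hypothetical newspaper sentence (an assumption for illustration).
ARTICLE = (
    "The company announced on Monday that its planned merger with a rival "
    "publisher had been approved by regulators."
)
passages = extract_passages(ARTICLE, "merger")
print(passages)
```

With five words of context on each side, an occurrence in the middle of a text yields an extract of eleven words; occurrences near the beginning or end of a text yield shorter extracts.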

Unsurprisingly, the court decided that the storing and printing out of the eleven words, as well as the other steps involved in the process, are not temporary acts of reproduction[78]. On the other hand the court decided that the act of storing and printing out of an extract could be a reproduction according to Article 2 InfoSoc “if the elements thus reproduced are the expression of the intellectual creation of the author”[79].


Institution / College
University of Hannover – Institut für Rechtsinformatik