Spiders and other automated software agents are sent out to “search and retrieve” useful information from the internet on behalf of their owners. They are invaluable tools for keeping search engines up to date and for market research. But they are also used by companies to dig the dirt on their rivals.

To what extent is spidering legal and what can be done to prevent it? Solicitor Raffi Varoujian of City law firm Field Fisher Waterhouse discusses the options.

Spiders can be assigned a specific set of tasks to ensure that they only bring back information which is useful to the sender. This is done by providing them with either a URL or directions to a server’s file directory. Once there, the spider collates data from the chosen location, follows any hypertext links, and returns the results to its owner.
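
By way of illustration only, here is a minimal sketch in Python of how such a spider might operate; the starting URL, the page limit and the helper names are hypothetical, not any real product’s code:

    from html.parser import HTMLParser
    from urllib.parse import urljoin
    from urllib.request import urlopen

    class LinkCollector(HTMLParser):
        """Records the href of every hyperlink encountered on a page."""
        def __init__(self):
            super().__init__()
            self.links = []

        def handle_starttag(self, tag, attrs):
            if tag == "a":
                for name, value in attrs:
                    if name == "href" and value:
                        self.links.append(value)

    def crawl(start_url, max_pages=10):
        """Fetch pages breadth-first from start_url; return page HTML keyed by URL."""
        seen, queue, results = set(), [start_url], {}
        while queue and len(results) < max_pages:
            url = queue.pop(0)
            if url in seen:
                continue
            seen.add(url)
            html = urlopen(url).read().decode("utf-8", errors="replace")
            results[url] = html
            collector = LinkCollector()
            collector.feed(html)
            # Follow any hypertext links found on the page.
            queue.extend(urljoin(url, link) for link in collector.links)
        return results

    # Hypothetical starting point for the crawl.
    pages = crawl("https://www.example.com/")

A commercial spider would add politeness delays, error handling and filtering, but the principle is as above: fetch a page, collate its data, follow its links and return the results to the sender.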

Most businesses actually welcome visits from spiders that are used to keep search engine indexes up to date. Some even lure them in with metatags designed to guide spiders around their websites, in a bid to increase their chances of being favourably ranked.

However, the intentions behind spiders are not always so honourable. They can be used by competitors to find out everything from details of a rival’s products and pricing to its annual profits and the failing areas of its business – a worrying prospect, when one considers that a competitor can send out thousands of spiders a day.

Antidotes

The issue of exactly what legal measures one can take to prevent spiders entering a website has been discussed at great length in the US, where guidance is provided by case law. In the UK, however, it is a grey area and will remain so until, as in the US, a relevant dispute is brought to court and a decision is made. That said, existing legislation does offer a certain amount of protection in the meantime.

Database rights

The Council Directive 96/9/EC on the legal protection of databases aims to protect the investment in the creation and contents of a database. It applies to databases which are collections of independent works, data or other materials arranged in a systematic or methodical way. Providing they have a home page complete with index and separate web pages, systematically arranged and containing additional data, most websites will qualify as databases and will therefore be offered protection.

To date, there has only been one case in the UK in which the issue of database rights was addressed, that of British Horseracing Board Limited v William Hill Organisation Limited. Here, Mr Justice Laddie interpreted the Directive very widely and, as such, most websites will be protected by database rights providing there has been a substantial investment in obtaining, verifying or presenting the contents of the database (ie in terms of having collected, checked and formatted data).

Infringement will take place if either the whole, or a “substantial” part, of the database contents are “extracted” or “reutilised” without the database creator’s permission.

In this context, “extraction” applies to the temporary or permanent transfer of all, or a substantial part of, the contents to another medium by any means or in any form. “Re-utilisation” is any form of making available to the public all, or a substantial part of, the database contents. The assessment of whether or not “substantial” parts have been taken is carried out either on a qualitative or quantitative basis, or both.

However, repeated and systematic extraction, re-utilisation, or both, of insubstantial parts of the contents of the database are also illegal if they are seen to conflict with normal exploitation of the database, or unreasonably prejudice the legitimate interests of the database creator.

A spider permanently transfers information from the database it is searching back to the server from which it was originally sent. Theoretically, this could qualify as “extraction” of data under the Directive. That said, it could also be argued that, as spiders are merely compilers of data, whether the information collected is then “re-utilised”, for the purposes of the Directive, will depend on how it is used subsequently.

This would certainly be the case if the information were published on a website to show a comparison between different companies' products and prices: the website owner would clearly be “making available to the public” the information collated, and would therefore be likely to offend the principles which the Directive was enacted to protect.

However, it is worth noting that “extraction” or “re-utilisation” in isolation will not automatically constitute infringement. It will depend on whether or not “substantial” parts of the website have been extracted or re-utilised.

A spider can visit the same website several times a day. If, on each occasion, it were to take the prices and product descriptions of all of a company's products, this would probably constitute a “substantial” extraction from the database as a whole. However, if the spider only looked at certain products, then the extraction may not be “substantial” and would therefore not be considered infringing. That said, if a spider were only to extract data on certain products but did so, say, as often as once a day, the extraction might still be construed as “substantial” in the sense that it was repeated and systematic. A German court took this view in the Berlin Online case, in which repetitive use of a meta-search engine on a particular online database was found to amount to repeated and systematic extraction of “insubstantial” parts of the database.

The meaning of “substantial” will also be assessed in relation to how important the data in question would be to the extractor and what portion of the website’s data content as a whole is being used.

Just how far-reaching the scope of the Directive will be remains to be seen, as we still await the European Court of Justice’s views on the matter following the referral of certain aspects of the BHB case to it by the Court of Appeal. In the meantime, it can be said that the use of spiders to extract information from websites is capable of infringing database rights.

Breach of Contract

Spiders will not, of course, read on-line terms and conditions or “click-accept” them before entering a website, and are unlikely, in any case, to enter a website via its home page. Their senders could therefore argue that they were not aware of any terms and conditions of entry to the website and were therefore not bound by them.

The “Robots Exclusion Protocol” has been designed to help website owners and spider programmers agree on a way to catalogue websites. There is an understanding between website owners and search engines that, while websites are under construction, certain pages will not be ready for immediate access by spiders. By placing a metatag on a page, or a small text file (“robots.txt”) in the root directory of their main site, website owners can ensure that spiders programmed to recognise the protocol will not visit and log the pages so marked.
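
To illustrate, the sketch below pairs a hypothetical robots.txt file with the check a compliant spider could perform before fetching a page, using Python’s standard urllib.robotparser module; the paths and URLs are invented for the example (the metatag route works similarly, via a “robots” metatag placed in each page’s HTML):

    import urllib.robotparser

    # A hypothetical robots.txt that a website owner might place in the
    # root directory to keep compliant spiders away from unfinished pages.
    robots_lines = [
        "User-agent: *",
        "Disallow: /under-construction/",
        "Disallow: /private/",
    ]

    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_lines)

    # A well-behaved spider checks the rules before fetching a page.
    print(parser.can_fetch("*", "https://www.example.com/under-construction/page.html"))  # False
    print(parser.can_fetch("*", "https://www.example.com/products.html"))                 # True

It is worth stressing that compliance with the protocol is entirely voluntary: it only deters spiders programmed to honour it, which is one reason the legal remedies discussed in this article matter.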

In order to use breach of contract as an effective method of deterring spiders, website owners should consider the actual damage which they could claim in the event of such a breach. This issue has not yet been considered in the UK, but there have been several cases in the US in which website owners have brought actions for:

· trespass
· copyright infringement
· unfair business practices
· misappropriation
· false advertising
· federal trade mark dilution
· injury to business reputation
· interference with prospective economic advantage
· unjust enrichment

Loss of capacity to computer systems

In the US case of eBay v Bidder's Edge, the damages considered included loss of capacity to computer systems.

Spiders can consume some of the processing and storage resources of servers and, in doing so, can impair and slow down genuine visitors’ access to the site. In that case, spidering had consumed a mere 1.53% of eBay's capacity through some 100,000 hits per day, yet the court held that Bidder's Edge had nevertheless deprived eBay of the ability to use that portion of its personal property for its own use.

In the case of Ticketmaster v Tickets.com, the court took a different view. Although it recognised that damage from loss of capacity was possible, it did not consider the minimal use involved sufficient to support a claim that the spidering had damaged the website’s ability to carry on its business. In the more recent case of Oyster v Forms Processing, the court agreed with the decision originally reached in the eBay case. However, it also suggested that whether or not intrusion by spiders caused substantial interference was not the real point: the tort was committed by unauthorised use of the website owner’s property.

Computer Misuse Act 1990

The Computer Misuse Act predates the invention of spiders, but it could arguably be applied to their use. This might be the case if, for example, it could be proved that a spider had been used to gain unauthorised access to a website, which is in itself a criminal offence under the Act.

A website owner wishing to rely on the Act as a way of stopping spiders entering its website might wish to expressly prohibit their entry under its terms and conditions of use. The owner would also have to ensure that metatags were removed as these are primarily used to attract search engine spiders to a site.

Copyright

Websites are protected as databases or compilations under the Copyright, Designs and Patents Act 1988, although actually proving that the use of spiders had infringed copyright could be quite a challenge.

Conclusion

Only time will tell how the UK courts will view the use of spidering. In the meantime, however, website owners can at least fall back on existing legislation by way of protection, although there is by no means any guarantee that they will succeed.

Those responsible for sending spiders would be wise to seek legal advice as to the extent to which their intended extraction and re-utilisation of data, and the regularity of that extraction, can legitimately continue before they run the risk of ending up in court.

The content of this article does not constitute legal advice and should not be relied on in that way. Specific advice should be sought about your specific circumstances.