United States: Data For The Taking: Using The CFAA To Combat Web Scraping

Last Updated: July 25 2014
Article by Aaron P. Rubin and Tiffany Hu

"Web scraping" or "web harvesting"—the practice of extracting large amounts of data from publicly available websites using automated "bots" or "spiders"—accounted for 18% of site visitors and 23% of all Internet traffic in 2013. Websites targeted by scrapers may incur damages resulting from, among other things, increased bandwidth usage, network crashes, the need to employ anti-spam and filtering technology, user complaints, reputational damage and costs of mitigation that may be incurred when scrapers spam users, or worse, steal their personal data.

Though sometimes difficult to combat, scraping is quite easy to perform. A simple online search will return a large number of scraping programs, both proprietary and open source, as well as D.I.Y. tutorials. Of course, scraping can be beneficial in some cases. Companies with limited resources may use scraping to access large amounts of data, spurring innovation and allowing such companies to identify and fill areas of consumer demand. For example, Mint.com reportedly used screen scraping to aggregate information from bank websites, which allowed users to track their spending and finances. Unfortunately, not all scrapers use their powers for good. In one case on which we previously reported, the operators of the website Jerk.com allegedly scraped personal information from Facebook to create profiles labeling people "Jerk" or "not a Jerk." According to the Federal Trade Commission (FTC), over 73 million victims, including children, were falsely told they could revise their profiles by paying $30 to the website.

Website operators have asserted various claims against scrapers, including copyright claims, trespass to chattels claims and contract claims based on allegations that scrapers violated the websites' terms of use. This article, however, focuses on another tool that website operators have used to combat scraping: the federal Computer Fraud and Abuse Act (CFAA).

The CFAA imposes liability on "whoever...intentionally accesses a computer without authorization or exceeds authorized access, and thereby obtains...information from any protected computer..." While the CFAA is primarily a criminal statute, it also provides for a civil remedy where a plaintiff suffers more than $5,000 in aggregate losses during any one-year period arising from a violation of the CFAA. For large website operators asserting CFAA claims against scrapers, the $5,000 damages requirement has not proven to be a difficult obstacle to overcome. For example, in CollegeSource, Inc. v. AcademyOne, Inc., the District Court for the Eastern District of Pennsylvania found that the plaintiff's costs of initiating an internal investigation of the defendant's website, hiring a computer expert to analyze the scope of the defendant's actions and implementing increased security measures were well in excess of $5,000. Similarly, in Facebook, Inc. v. Power Ventures, Inc., the District Court for the Northern District of California found that the plaintiff's expenditures made in response to the defendant's specific acts, which included three to four days of engineering time, $75,000 in outside counsel costs and the costs of responding to a minimum of 60,000 instances of spamming by the defendant, were well in excess of the statutory threshold. The more difficult question is whether scraping violates the CFAA at all.

The CFAA was originally intended as an anti-hacking statute, and its application to scraping (which, after all, usually involves accessing publicly available data on a publicly available website) is not always a foregone conclusion. Does a scraper access a website "without authorization" or "exceed authorized access" when it harvests publicly available data on a publicly available website? Plaintiffs often argue that scrapers act without authorization because the websites' online terms of use prohibit scraping, prohibit the scrapers' use of the data that they harvest, or both. As discussed below, such claims have met with success in some cases, but courts have been less willing to find a CFAA violation in other scraping cases.

In Cvent, Inc. v. Eventbrite, Inc., Cvent sued Eventbrite for scraping Cvent's website to obtain venue information and using the information in Eventbrite's "Venue Directory." Cvent claimed that this was a violation of the CFAA because Cvent's terms of use specifically stated that such activities were unauthorized. The District Court for the Eastern District of Virginia held that Eventbrite's actions did not constitute "hacking" in violation of the CFAA because the information was publicly available; Cvent's website did not require any login, password or other individualized grant of access; and Cvent's terms of use were difficult to locate. Therefore, the court granted Eventbrite's motion to dismiss, concluding that Eventbrite was authorized to access the information on Cvent's website, and that the mere allegation that Eventbrite used the information inappropriately was not grounds for relief under the CFAA.

Power Ventures, the defendant in Facebook, Inc. v. Power Ventures, Inc., operated a social media account integration site. As part of a promotion to gain new members, Power Ventures provided users with a list of their Facebook friends, which Power Ventures obtained by scraping the Facebook website, and asked users to select friends to invite to use the Power Ventures site. Facebook notified Power Ventures that its access was unauthorized and blocked Power Ventures' IP addresses. However, Power Ventures' scraping technology was designed to circumvent such technological measures, and the scraping continued. The District Court for the Northern District of California held that Power Ventures' accessing of Facebook was without authorization and violated the CFAA and, accordingly, granted summary judgment to Facebook on the CFAA claim.

CollegeSource, the plaintiff in CollegeSource, Inc. v. AcademyOne, Inc., maintained an archive of college course catalogs in PDF format and a hyperlink service called CataLink, both of which it made available to paying subscribers. AcademyOne, a CollegeSource subscriber, hired a third party to download college catalogs directly from college websites in order to compile a course description database. However, the third party instead copied some of the PDF documents from CollegeSource through CataLink. AcademyOne removed the CollegeSource documents from its system after receiving a cease and desist letter from CollegeSource, but CollegeSource nonetheless proceeded to bring a number of claims against AcademyOne, including CFAA claims based on the argument that AcademyOne accessed the documents without authorization and exceeded authorized access. The court held, however, that AcademyOne did not access the documents without authorization because those documents were available to the general public. CollegeSource's argument that AcademyOne exceeded authorized access was based on AcademyOne's alleged violation of CollegeSource's terms of use. The court acknowledged that accessing a website in violation of the applicable terms of use has been held to support a CFAA claim in some cases, but was unconvinced by CollegeSource's argument here because CollegeSource's subscription agreement did not cover CataLink. Accordingly, the court granted summary judgment to AcademyOne on the CFAA claims.

In Craigslist Inc. v. 3Taps Inc., 3Taps allegedly scraped Craigslist's website and republished Craigslist ads on its own site, craiggers.com. In response, Craigslist sent 3Taps a cease and desist letter revoking 3Taps' authorization to access Craigslist's website for any purpose, and reconfigured the website to block 3Taps. When 3Taps allegedly continued its scraping activities by using different IP addresses and proxy servers to conceal its identity, Craigslist brought suit under the CFAA. Even though Craigslist's website was publicly available, the District Court for the Northern District of California declined to grant 3Taps' motion to dismiss the CFAA claim. According to the court, while Craigslist may have granted the world permission to access its website, it retained the power to revoke that permission on a case-by-case basis, a power it exercised when it sent the cease and desist letter and blocked 3Taps' IP addresses. Therefore, 3Taps' continued access was without authorization. The court also rejected 3Taps' attempt to invoke the Ninth Circuit's decision in United States v. Nosal. In Nosal, the Ninth Circuit had held that an employee's use of information in violation of an employer's policies did not constitute a CFAA violation where the employee's initial access to the employer's computer system was authorized. The court in 3Taps concluded, however, that the "calculus is different where a user is altogether banned from accessing a website," as was the case with 3Taps.

Fidlar, the plaintiff in Fidlar Technologies v. LPS Real Estate Data Solutions, Inc., provides its Laredo program to governmental agencies, such as county clerks' offices, which use Laredo to make public records available for viewing over the Internet. Laredo prevents users from downloading or electronically capturing the documents they view. Users who want a copy of a public record must pay the county a print fee. LPS, a real estate analytics company, contracted with many counties to access their public records using Laredo, but used a scraping program to capture documents electronically without paying any fees. Fidlar sued LPS for violating section 1030(a)(5)(A) of the CFAA, which imposes liability on anyone who "...knowingly causes the transmission of a program, code, or command, and as a result...intentionally causes damage without authorization, to a protected computer." The District Court for the Central District of Illinois denied LPS's motion to dismiss the CFAA claim, holding that Fidlar's complaint properly alleged that LPS undertook intentional actions that, among other elements of damage, compromised the integrity of Laredo.

In light of the cases discussed above, it seems that plaintiffs are likely to have more success asserting CFAA claims against scrapers where they clearly and unambiguously revoke authorization to access their websites and take affirmative steps to block the scrapers, as in 3Taps and Power Ventures. In contrast, when the scraper ceases scraping after access is revoked and takes remedial action, as in CollegeSource, courts may be less willing to impose CFAA liability. As seen in Cvent, a mere terms of use violation, particularly where the scraper may not have actual notice of the terms of use, may not support a CFAA claim. Whether the scraper is simply using software to collect publicly available information more efficiently or to do something else—such as to avoid paying fees for the information, as seen in Fidlar—may also be relevant. In any event, in an era when data is expensive to collect, valuable to have and cheap to take, the CFAA, when properly used, remains a viable tool to combat scrapers.

Because of the generality of this update, the information provided herein may not be applicable in all situations and should not be acted upon without specific legal advice based on particular situations.

© Morrison & Foerster LLP. All rights reserved

