Meta accuses data scrapers of taking more than their fair share • The Register

Facebook’s parent company Meta openly collects data from its billions of users, but when other companies collect that data, it can be a problem, judging by a pair of lawsuits filed today.

Jessica Romero, director of platform enforcement and litigation at Meta, said the US tech giant has filed two federal lawsuits: one against the scraping company Octopus, and one against Ekrem Ateş, a Turkish national who allegedly scraped Instagram data for use on a clone site.

Scraping involves the automated extraction of data from publicly available sources, such as profile pages, and, in some cases, of private data stored behind login pages. According to Romero, part of the problem with companies like Octopus is that they provide automated scraping services to anyone, no matter who they’re targeting or why, and, crucially, without permission from the source site.

Romero said Octopus is “an American subsidiary of a Chinese national high-tech company that claims to have over a million customers.” Its scraping software, Octoparse, is offered online and could scrape information from sites such as those owned by Meta, Amazon, Twitter, Google and LinkedIn.

According to Romero, users compromised their own accounts when signing up for Octopus services by handing over their login credentials to the company. Octoparse was designed “to retrieve data accessible to the user when logged into their account.” The data retrieved included email addresses, phone numbers, gender, date of birth, likes and comments, among other details.

The lawsuit against Octopus alleges violations of Meta’s terms of service and of the US Digital Millennium Copyright Act for engaging in automated scraping without Meta’s permission, as well as for attempting to conceal its activity. Meta seeks a permanent injunction barring Octopus from operating on its sites.

We reached out to Meta to find out more about Octopus and its claims.

As for Ateş, Meta alleges he harvested, without the internet giant’s blessing, the data of more than 350,000 Instagram users to repost on a “clone site” called MyStalk that displays Instagram profile information and posts. Romero said Meta has taken several actions against Ateş since 2021, including deactivating his accounts, sending him a cease-and-desist letter, and revoking his access to Meta services.

Facebook has been scraped before. Over the course of nearly two years from the start of 2018, a Ukrainian national by the name of Alexander Alexandrovich Solonchenko mined data on 178 million Facebook users. Facebook sued Solonchenko in October 2021.

Meta expanded its bug bounty program to include scraping attacks a few months later, but the language used in the lawsuit speaks volumes about Meta’s stance on the sanctity of the data it is responsible for.

“The goal of this program is to find bugs that attackers use to circumvent scraping limitations to access data at a larger scale than the intended product,” said Dan Gurfinkel, security engineering manager at Facebook. Romero’s statement echoes some of those sentiments, calling Octopus’s scraping “unauthorized” rather than expressing annoyance that the company was scraping data in the first place.

These carefully chosen words should not be ignored: Meta doesn’t seem to mind people extracting data from its sites, as long as they do so in a company-approved way. ®
