Data Scraping Is Not a Crime
South Carolina's NAACP and ACLU are challenging the state's ban on automated data collection.
South Carolina has the highest eviction rate in the country, and the state chapter of the NAACP wanted to find out why. Given the difficulty of tracking down every case by hand, the organization hoped to use a software program called a "scraper" to collect data from South Carolina's online repository of legal filings.
Researchers, academics, and investigative journalists frequently use scrapers to automate this kind of laborious, large-scale project. But the South Carolina Court Administration categorically bans such automated data collection.
Now the American Civil Liberties Union (ACLU) of South Carolina and the South Carolina NAACP are challenging the state's scraping ban in federal court. In a lawsuit they filed in the U.S. District Court for the District of South Carolina in March, the groups argue that the policy unreasonably restricts their First Amendment rights. "This case is about ensuring core First Amendment principles, like the right to access public court filings, are applied in a way that meets our rapidly expanding digital reality," Allen Chaney, the ACLU of South Carolina's legal director, said in a press release.
The NAACP says collecting eviction filings would allow it to research the issue and contact affected tenants to ensure they have meaningful access to the courts. But scraping has numerous other legitimate uses.
In 2018, for example, I wanted to find out how often Texas police used a loophole in the state's public records law to hide information on deaths in custody. So I wrote code to scrape more than 300,000 pages of public-record rulings that the Texas Attorney General's Office had posted on its website. Then I filtered the results for those that cited the specific provision I was investigating.
That would have been impossible without a bot to do the heavy lifting. By scraping data, I identified more than 80 cases in which Texas police withheld information about deaths in custody from families, lawyers, and journalists.
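For readers curious what the filtering step looks like in practice, here is a minimal sketch in Python. It is not the code used for the Texas project. The citation pattern is a placeholder, since the provision is not named here, and the function names and sample rulings are hypothetical.

```python
import re

# Placeholder citation pattern (an assumption for illustration only);
# swap in the statute citation you are actually researching.
PROVISION_PATTERN = re.compile(r"section\s+000\.000", re.IGNORECASE)


def cites_provision(text, pattern=PROVISION_PATTERN):
    """Return True if the ruling text mentions the provision."""
    return pattern.search(text) is not None


def filter_rulings(rulings, pattern=PROVISION_PATTERN):
    """Given a dict of ruling ID -> ruling text, return the matching IDs."""
    return [
        ruling_id
        for ruling_id, text in rulings.items()
        if cites_provision(text, pattern)
    ]


if __name__ == "__main__":
    # Made-up sample data standing in for already-downloaded pages.
    sample = {
        "ruling-1": "Records withheld under Section 000.000 of the code.",
        "ruling-2": "Records released in full.",
    }
    print(filter_rulings(sample))
```

A simple text match like this only narrows the pile. Each hit still has to be read by a person to confirm what the ruling actually says.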
The South Carolina lawsuit is the latest challenge to state anti-hacking laws and the federal Computer Fraud and Abuse Act (CFAA). The U.S. Court of Appeals for the 9th Circuit issued a landmark ruling in April that scraping publicly available data from websites does not constitute "unauthorized access" under the CFAA. While it's true that scrapers can bog down websites, ethical coders avoid that problem by adding courtesy delays to their programs. They also include identifying information in their HTTP requests so that government website administrators know who is collecting the data.
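Those two courtesies take only a few lines of code. The sketch below, in Python, shows one way to do it under stated assumptions: the project name, contact address, and two-second pause are placeholders rather than any official standard, and the function names are hypothetical.

```python
import time
import urllib.request

# Hypothetical identity (an assumption); use a real project name and address.
CONTACT_EMAIL = "researcher@example.org"
USER_AGENT = "example-research-scraper/0.1 (contact: researcher@example.org)"


def build_request(url):
    """Attach identifying headers so an administrator can see who is asking."""
    return urllib.request.Request(
        url,
        headers={"User-Agent": USER_AGENT, "From": CONTACT_EMAIL},
    )


def fetch(request):
    """Download one page and return its raw bytes."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read()


def polite_fetch_all(urls, delay_seconds=2.0, fetch_one=fetch, sleep=time.sleep):
    """Fetch pages one at a time, pausing between consecutive requests."""
    pages = {}
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # courtesy delay so the server is not flooded
        pages[url] = fetch_one(build_request(url))
    return pages
```

Calling `polite_fetch_all(list_of_urls)` downloads each page in turn, waiting between requests. The right delay depends on the site, and a scraper should also honor any limits the site publishes.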
Banning scrapers is not about preventing unauthorized hacking. It just makes it harder for the public to know what the government is doing.