Data scraping is the automated process of extracting large amounts of data from a website or program into a different format, channel, or file that can be saved locally onto a computer.
Web scraping is used across a range of industries and for a range of purposes, including:
- Price and product comparison websites, such as online travel agencies and product review blogs
- Market research and social listening
- Product and service review websites
- Keyword research tools for search engine optimisation (SEO)
- Data harvesting to train AI applications
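To make the definition at the top concrete, here is a minimal sketch of scraping in Python: it reads product names and prices out of a page's HTML and writes them to CSV, a format that can be saved locally. The sample HTML, the class names, and the `ProductParser` and `scrape_to_csv` helpers are all hypothetical and exist only for illustration. A real scraper would fetch the page over the network and would need to respect the site's terms of use.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical page fragment, standing in for HTML fetched from a website.
SAMPLE_HTML = """
<ul>
  <li class="product"><span class="name">Kettle</span><span class="price">24.99</span></li>
  <li class="product"><span class="name">Toaster</span><span class="price">31.50</span></li>
</ul>
"""


class ProductParser(HTMLParser):
    """Collects the text of <span class="name"> and <span class="price"> elements."""

    def __init__(self):
        super().__init__()
        self.rows = []      # one dict per product found
        self._field = None  # name of the field currently being read, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li" and attrs.get("class") == "product":
            # A new product starts: open a fresh row.
            self.rows.append({})
        elif tag == "span" and attrs.get("class") in ("name", "price") and self.rows:
            self._field = attrs["class"]

    def handle_data(self, data):
        # Only keep text that sits inside one of the spans we care about.
        if self._field:
            self.rows[-1][self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


def scrape_to_csv(html):
    """Extract products from the HTML and return them as CSV text."""
    parser = ProductParser()
    parser.feed(html)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["name", "price"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(parser.rows)
    return out.getvalue()


if __name__ == "__main__":
    print(scrape_to_csv(SAMPLE_HTML))
```

The same pattern scales from two products to thousands of pages, which is what makes scraping both useful and contentious.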
Is data scraping legal?
Scraping data without permission can carry real consequences. Besides the risk of a heavy fine and legal action, you could be severely limited in what you’re allowed to do with scraped data. For instance, you may not be able to reproduce, share, or sell this information without the owner’s consent or authorisation.
There has been much legal debate about what constitutes good, bad, legal, or illegal scraping, but generally speaking, businesses that use scraping tools should be mindful of fair use laws, a website’s terms and conditions of use, the scraping method used (e.g. a bot that circumvents login processes would be a big no-no), and whether they plan to use the scraped data in a way that is legal and ethical.
What are the privacy implications of web scraping?
One of the biggest concerns that privacy advocates have is the mass harvesting of email addresses with intent to share or sell this information to third parties without the owner’s consent or knowledge. Often, this means more spam and malicious emails make their way into people’s inboxes, in violation of anti-spam and privacy laws governing email marketing and unsolicited communications.
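One way a scraper can reduce this risk is to strip email addresses from text before storing it. The sketch below is only an illustration: the `redact_emails` helper and its placeholder text are hypothetical, and the pattern is deliberately simple, so it will not match every valid address.

```python
import re

# Deliberately simple pattern for illustration; it does not cover every valid address.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_emails(text, placeholder="[email removed]"):
    """Replace anything that looks like an email address with a placeholder."""
    return EMAIL_PATTERN.sub(placeholder, text)


if __name__ == "__main__":
    sample = "Contact jane.doe@example.com or sales@example.org today"
    print(redact_emails(sample))
```

Redacting at collection time means the addresses never reach the stored dataset, so they cannot later be shared or sold.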
Another insidious threat is data scraped from people’s social media profiles. As Facebook’s Cambridge Analytica scandal showed, vast amounts of personal information about us can be harvested, built into profiles, and weaponised through political advertising. Scraper bots can serve other unsavoury ends too, like fake profiles and online impersonation.
Keep this in mind: While data scraping offers a wealth of new opportunities in business intelligence, academic research, e-commerce, and other niche industries, it also generates a range of privacy and fair use challenges for everyday users and regulators.