Data visualisation is a powerful tool for showcasing research insights in a clear and engaging format. Incorporating graphs and charts into your articles can help draw readers in, simplify complex information, and increase your credibility as a subject matter expert.
Whether you write thought leadership content for your business or run a news site, you may choose to showcase data from user behaviour analytics and polls to highlight interesting trends.
Before you publish your next article or whitepaper, however, you will need to make sure that any data you disclose does not put people’s privacy at risk or violate data protection laws like the General Data Protection Regulation (GDPR).
In this article, we’ll discuss how organisations can de-identify data using techniques like data anonymisation and aggregation.
To stay within the limits set by data protection laws and still leverage the valuable insights that can be gleaned from a given dataset, many companies conduct a process known as “data anonymisation”. This is a data de-identification technique in which all personally identifiable information is edited or removed from a dataset.
For example, a restaurant that wants to share data about customers’ online ordering habits with an app development company can anonymise the data by removing names and credit card details and replacing home addresses with generalised locations.
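The restaurant example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (`name`, `card_number`, `address`, `item`, `hour`) and the rule of keeping just the last part of the address are hypothetical, and a real ordering system would need its own rules.

```python
from typing import Dict, List

# Hypothetical field names; a real ordering system will differ.
DIRECT_IDENTIFIERS = {"name", "card_number"}


def generalise_address(address: str) -> str:
    """Keep only the last comma-separated part (assumed to be the suburb or city)."""
    return address.split(",")[-1].strip()


def anonymise_order(order: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of the order without direct identifiers and with a coarse location."""
    cleaned = {
        key: value
        for key, value in order.items()
        if key not in DIRECT_IDENTIFIERS and key != "address"
    }
    cleaned["area"] = generalise_address(order["address"])
    return cleaned


# Made-up sample record, not real customer data.
orders: List[Dict[str, str]] = [
    {
        "name": "A. Example",
        "card_number": "0000 0000 0000 0000",
        "address": "1 Sample Street, Exampleville",
        "item": "pizza",
        "hour": "19",
    },
]

anonymised = [anonymise_order(order) for order in orders]
print(anonymised)
```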
While this is a common approach to anonymisation, it is not always effective.
Even if a dataset has been anonymised within your own internal database, existing technologies and external datasets could be used to look up, cross-reference and re-identify a data record.
Releasing data that can be re-identified in this way is a mistake that many organisations and even government institutions have made, and it can seriously threaten the privacy and safety of everyone involved. The GDPR only considers data to be “truly anonymised” if it is impractical or impossible for someone to re-identify it.
Once data meets that standard, an organisation can technically do what it likes with it, given that the GDPR no longer applies to anonymous information.
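To make the cross-referencing risk described above more concrete, here is a rough sketch of how a record with no name attached could still be matched to a person. Both datasets are invented for illustration, and the choice of `area` and `birth_year` as the matching fields is an assumption; real re-identification attempts can draw on many more fields and sources.

```python
from collections import Counter

# Hypothetical "anonymised" records: names removed, other details kept.
anonymised = [
    {"area": "Exampleville", "birth_year": 1980, "item": "pizza"},
    {"area": "Exampleville", "birth_year": 1991, "item": "salad"},
    {"area": "Sampleton", "birth_year": 1991, "item": "curry"},
    {"area": "Sampleton", "birth_year": 1991, "item": "pasta"},
]

# Hypothetical external dataset that still carries names.
external = [
    {"name": "A. Example", "area": "Exampleville", "birth_year": 1980},
    {"name": "B. Sample", "area": "Sampleton", "birth_year": 1991},
]


def matching_key(record):
    """Fields that appear in both datasets and can be used to link them."""
    return (record["area"], record["birth_year"])


# How many anonymised records share each combination of linking fields.
group_sizes = Counter(matching_key(record) for record in anonymised)

# A person is re-identified when their combination matches exactly one record.
reidentified = [
    {"name": person["name"], **record}
    for person in external
    for record in anonymised
    if matching_key(person) == matching_key(record)
    and group_sizes[matching_key(record)] == 1
]

print(reidentified)
```

In this made-up data, only the record whose combination of area and birth year is unique gets linked to a name. The two Sampleton records share the same combination, so the external dataset alone cannot tell them apart.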
Data aggregation is a data mining technique in which data from different sources is gathered and presented in a summarised format, as it is in tools like Google Analytics and Facebook Insights. The datasets produced through this technique offer some degree of privacy around personal data, given that aggregate data only shows information at a group level.
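As a minimal sketch of what group-level reporting looks like, the snippet below turns individual poll responses into simple counts. The poll data and field names are hypothetical, and this is not how any particular analytics tool works internally.

```python
from collections import Counter

# Hypothetical poll responses; only the group-level counts would be published.
responses = [
    {"age_group": "18-34", "answer": "yes"},
    {"age_group": "18-34", "answer": "no"},
    {"age_group": "18-34", "answer": "yes"},
    {"age_group": "35-54", "answer": "yes"},
    {"age_group": "35-54", "answer": "no"},
]


def aggregate(rows, field):
    """Count how many rows fall into each value of the given field."""
    return dict(Counter(row[field] for row in rows))


summary = aggregate(responses, "answer")
print(summary)  # {'yes': 3, 'no': 2}
```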
However, the more that someone can “drill down” into aggregate data using certain filters, the more likely it is that a piece of data could be used to re-identify someone. While there is no perfect process for de-identifying personal data, anonymisation and aggregation can still offer some protection, because most people do not have the skills or knowledge necessary to re-identify data that has been anonymised or aggregated.
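One way to picture the “drill down” problem is to watch group sizes shrink as filters are added. The sketch below hides any group smaller than a minimum size before it is reported. The threshold of 3, the sample data, and the function name `drill_down` are all assumptions chosen for illustration; an appropriate threshold depends on the dataset and the audience.

```python
from collections import Counter

MIN_GROUP_SIZE = 3  # hypothetical threshold, not a legal or industry standard

# Made-up poll respondents.
responses = [
    {"age_group": "18-34", "area": "Exampleville"},
    {"age_group": "18-34", "area": "Exampleville"},
    {"age_group": "18-34", "area": "Exampleville"},
    {"age_group": "35-54", "area": "Exampleville"},
    {"age_group": "35-54", "area": "Exampleville"},
    {"age_group": "35-54", "area": "Sampleton"},
]


def drill_down(rows, fields, min_size=MIN_GROUP_SIZE):
    """Count rows per combination of fields, replacing small counts with None."""
    counts = Counter(tuple(row[field] for field in fields) for row in rows)
    return {
        group: (count if count >= min_size else None)
        for group, count in counts.items()
    }


by_age = drill_down(responses, ["age_group"])
by_age_and_area = drill_down(responses, ["age_group", "area"])

print(by_age)
print(by_age_and_area)
```

With one filter, both age groups are large enough to report. With two filters, two of the three groups fall below the threshold and are suppressed, which is the point at which a published figure could start to describe an identifiable person.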
Generally speaking, the risk of re-identification is quite low for data that is presented in an aggregated visual format, such as a simple infographic on your blog or website. Nevertheless, you should always consider whether you’ve taken all possible measures to anonymise people’s data and, in the event that it is re-identified, what the consequences could be for their privacy and your legal liability.