Supposedly anonymous web browsing data can reveal far more than it seems. German researchers showed that it was possible to identify specific people from browsing histories obtained through the data broker ecosystem.
The uncovered information included highly sensitive details, such as intimate preferences and health-related clues. The case shows why anonymisation is not always enough when behavioural patterns are unique.
The uncomfortable question is simple: what would you think if someone had your full browsing history, day by day, hour by hour, page by page? In this case, no illegal system access was needed. The researchers created a fictitious marketing company and requested raw browsing and clickstream data from industry providers.
They built a corporate website, a professional-looking online presence and a supposed artificial intelligence advertising platform. They then contacted several companies asking for clickstream data. Eventually, one data broker supplied the data so that they could test the supposed advertising platform.
A small number of URLs can be enough to identify someone
Although the dataset was presented as anonymous, users could be reidentified by combining signals such as employer, bank, hobbies, preferred newspaper, digital services or mobile provider. Each combination creates a kind of browsing fingerprint.
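To make the fingerprint idea concrete, here is a minimal Python sketch. It is not the researchers' method: the pseudonymous IDs, the `.example` domains and the helper names `unique_fingerprints` and `matching_users` are all invented for illustration. It only shows how quickly a combination of ordinary sites can narrow a "pseudonymised" dataset down to one record.

```python
from collections import Counter

# Hypothetical, made-up clickstream: pseudonymous ID -> set of visited domains.
histories = {
    "user_a": {"bank-one.example", "daily-news.example", "chess-club.example"},
    "user_b": {"bank-one.example", "daily-news.example", "telco-x.example"},
    "user_c": {"bank-two.example", "daily-news.example", "chess-club.example"},
    "user_d": {"bank-one.example", "daily-news.example", "chess-club.example"},
}


def unique_fingerprints(histories):
    """Return the IDs whose exact set of visited domains is shared by nobody else."""
    counts = Counter(frozenset(domains) for domains in histories.values())
    return sorted(
        uid for uid, domains in histories.items()
        if counts[frozenset(domains)] == 1
    )


def matching_users(histories, known_domains):
    """Return the IDs whose history contains every domain an observer already knows."""
    known = set(known_domains)
    return sorted(uid for uid, domains in histories.items() if known <= domains)


# Records that are one of a kind within this toy dataset.
print(unique_fingerprints(histories))

# An observer who knows a person's bank and mobile provider narrows the candidates.
print(matching_users(histories, {"bank-one.example", "telco-x.example"}))
```

In this toy data, knowing only the bank and the mobile provider already leaves a single candidate record. Real browsing histories contain far more domains per person, which tends to make such combinations even more distinctive.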
A similar privacy risk appeared in the Netflix Prize dataset case, where researchers compared anonymous ratings with public IMDb profiles and identified specific users. The case later resulted in a privacy lawsuit.
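The linkage idea behind that kind of attack can be sketched in a few lines of Python. This is a deliberately simplified stand-in, not the scoring used in the actual study: the records, film titles, the `overlap` and `link` helpers and the `min_matches` threshold are all assumptions made for illustration. It matches a public profile against anonymous records by counting identical (title, rating) pairs and accepts the best candidate only if it clearly stands out.

```python
def overlap(anon_ratings, public_ratings):
    """Count titles that carry the same rating in both records."""
    return sum(
        1 for title, rating in public_ratings.items()
        if anon_ratings.get(title) == rating
    )


def link(anon_records, public_ratings, min_matches=3):
    """Return the anonymous ID that best matches the public profile, or None.

    A match is accepted only if it reaches min_matches (a hypothetical
    threshold) and strictly beats the second-best candidate.
    """
    if not anon_records:
        return None
    scored = sorted(
        ((overlap(ratings, public_ratings), rid) for rid, ratings in anon_records.items()),
        reverse=True,
    )
    best_score, best_id = scored[0]
    runner_up = scored[1][0] if len(scored) > 1 else 0
    if best_score >= min_matches and best_score > runner_up:
        return best_id
    return None


# Made-up anonymous ratings, keyed by an opaque record ID.
anon_records = {
    "r1": {"Film A": 5, "Film B": 2, "Film C": 4, "Film D": 1},
    "r2": {"Film A": 5, "Film E": 3, "Film F": 4},
    "r3": {"Film B": 2, "Film C": 1, "Film G": 5},
}

# Made-up ratings that someone posted publicly under their own name.
public_profile = {"Film A": 5, "Film B": 2, "Film C": 4}

print(link(anon_records, public_profile))
```

Once a record is linked, every other rating in it, including titles the person never rated publicly, becomes attributable to a named individual. A single shared rating is not enough here: `link(anon_records, {"Film A": 5})` returns `None` because no candidate reaches the threshold.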
Why this matters under the GDPR
Under the GDPR, data counts as anonymous only if individuals cannot be identified by any means reasonably likely to be used, including by combining it with additional information. Organisations must therefore assess whether a dataset is truly anonymous; if reidentification is reasonably likely, the data is still personal data.
Companies working with analytics, advertising, browsing data or behavioural profiles should apply data minimisation, privacy by design, risk assessments and strict contractual controls with providers and data brokers.
This is not only a technical issue. It is also a legal and ethical one: browsing patterns can reveal interests, health, ideology, relationships, approximate location and routines. Treating that data as harmless can create a serious risk for people’s rights.
Post based on information originally reported by The Guardian.