Each day, we access the internet for any number of reasons, uploading and downloading information that, whether captured on its own or read in context with other data, could be used to build a comprehensive profile of our identities and activities for personal or business rivals, spies and investigators, or cyber-criminals of all kinds.
It’s for this reason that security professionals so often advise us to mask or “anonymize” our activities online, so that our personal and financial information is either unavailable to the software we use to access the web or scrambled by encryption so strong that it couldn’t be deciphered even if it were intercepted.
With a growing market in Virtual Private Network (VPN) solutions and anonymous browser settings available, you might assume that we’re pretty much covered in this regard. But recent tests conducted by a team of German researchers have revealed how easy it is for anonymous browsing data to be exposed.
The Truth About Anonymous Browsing Data
Information on users powers (and quite often, finances) the apps and platforms we use to gain access to services on the internet. It’s that simple. Numbers of clicks, who clicks on what, names, addresses, and thoughts typed into text fields: all of this and more comes into the mix.
There’s huge potential and actual value in data like this, which is why the permissions requested by so many mobile apps and web services include the right to gather information on the user – typically with some bland assurance of anonymity.
How sincere they are about preserving user privacy – and how specific they are, in spelling out exactly what data is collected, and how it will be used or shared – really depends on the integrity of the developer or online resource making the request. Once that information is in their hands, we’re pretty much out of the loop.
For example, in 2008 a set of anonymized ratings that Netflix had published to help researchers improve its recommendation algorithm was compared against the public profiles of users who left film ratings on the Internet Movie Database (IMDb). The comparison produced a number of matches, which revealed the identities of some of the subscribers whose ratings appeared in the Netflix data set.
And the methodology used by the two researchers in the German study was even more audacious than this.
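The Netflix-style matching described above can be sketched in a few lines. This is a minimal illustration, not the method the researchers actually used: it assumes exact matches on (title, rating) pairs, and the pseudonyms, profile names, film titles, and the `link_records` helper are all hypothetical.

```python
# Hypothetical toy data: each record is a set of (title, rating) pairs.
anonymized = {
    "user_0417": {("Film A", 5), ("Film B", 2), ("Film C", 4)},
    "user_0932": {("Film A", 3), ("Film D", 1)},
}
public_profiles = {
    "alice_reviews": {("Film A", 5), ("Film C", 4), ("Film E", 3)},
    "bob_on_film": {("Film D", 1), ("Film F", 5)},
}


def link_records(anonymized, public_profiles, min_overlap=2):
    """Map each pseudonym to the public profile sharing the most ratings.

    A pseudonym is only linked when at least `min_overlap` ratings match,
    an arbitrary threshold chosen for this sketch.
    """
    matches = {}
    for pseudonym, ratings in anonymized.items():
        best_name, best_overlap = None, 0
        for name, public_ratings in public_profiles.items():
            overlap = len(ratings & public_ratings)  # identical pairs
            if overlap > best_overlap:
                best_name, best_overlap = name, overlap
        if best_overlap >= min_overlap:
            matches[pseudonym] = best_name
    return matches


matches = link_records(anonymized, public_profiles)
print(matches)
```

With this toy data, only the first pseudonym shares enough ratings with a public profile to be linked. A real attack would have to tolerate inexact ratings and dates, which this sketch does not attempt.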
Setting the Stage for Anonymous Browsing Data
Concerned about the misuses of anonymous browsing data collected from various sources, journalist Svea Eckert and data scientist Andreas Dewes teamed up to discover how easily they could acquire supposedly anonymous personal data – and what they could reveal from it.
Advertising networks and corporate marketing divisions are the biggest customers for harvested browsing data, so the duo created a fake marketing company which they claimed had developed a revolutionary new marketing algorithm based on artificial intelligence (AI). They cooked up a website for the platform (complete with stock marketing images, and text stuffed with buzzwords), and set up a LinkedIn page for its chief executive, plus a careers site which received several applications from potential staffers.
The only problem they really encountered was the lack of data sets available from users in Germany, where the study was based. Any concerns about how difficult it might be to pull in information were soon put to rest, when a data broker came through with a set of anonymous records from German internet users – free of charge.
And the Result?
Eckert and Dewes presented their findings at the DefCon hacking conference, held July 27th-30th in Las Vegas. Over a 30-day period of examination, they had managed to secure a database of 3 billion URLs, obtained from 3 million German internet users and spread over 9 million different sites. Some of those users had visited only a couple of dozen websites, while others had contributed tens of thousands of data points, enough to virtually reconstruct their digital lives.
Among the more newsworthy findings were the internet porn preferences of a sitting judge, and details on the medication used by a German Member of Parliament (MP).
The process of “de-anonymizing” previously anonymous data was accomplished on several fronts.
Clues from Social Media
Social media platforms often provide clues, or outright evidence, as to the activities of their subscribers. For the German study, Twitter proved the most fertile ground. Anyone who visits their own analytics page on Twitter leaves a URL in their browsing record which contains their Twitter username, effectively matching up an anonymous data record with an actual user. A similar situation exists for the German social networking site Xing.
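As a rough sketch of how a username can fall out of a browsing record, the snippet below scans each pseudonymous record for an analytics URL. The host name and the `/user/<username>/...` path layout are assumptions made for illustration (the article only says that the URL contains the username), and the function names and sample records are hypothetical.

```python
from urllib.parse import urlparse

# Assumed URL layout for this sketch: https://<ANALYTICS_HOST>/user/<username>/...
ANALYTICS_HOST = "analytics.twitter.com"


def username_from_analytics_url(url):
    """Return the username embedded in an analytics URL, or None."""
    parsed = urlparse(url)
    if parsed.hostname != ANALYTICS_HOST:
        return None
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) >= 2 and parts[0] == "user":
        return parts[1]
    return None


def label_records(records):
    """Map each pseudonym to the first username found in its URL list."""
    labels = {}
    for pseudonym, urls in records.items():
        for url in urls:
            name = username_from_analytics_url(url)
            if name:
                labels[pseudonym] = name
                break
    return labels


# Hypothetical browsing records keyed by pseudonymous ID.
records = {
    "rec_001": [
        "https://news.example/politics",
        "https://analytics.twitter.com/user/example_handle/home",
    ],
    "rec_002": ["https://shop.example/cart"],
}
labels = label_records(records)
print(labels)
```

A single such URL is enough to attach a name to every other entry stored under the same pseudonym.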
Statistical analysis showed that as few as ten URLs (workplace logins, banking website, mobile phone provider, hobbies, etc.) can be enough to build a “fingerprint” of a user’s browsing habits that is accurate enough to identify them as an individual. Comparing this fingerprint with activity on other sites (YouTube, social media, etc.) increases the probability of finding a match.
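The fingerprinting idea can be illustrated with plain set filtering: each site known to belong to a target shrinks the pool of records that could be theirs. This is a simplified stand-in for the statistical analysis described above, not the researchers' method, and the record IDs, domains, and `narrow_candidates` helper are hypothetical.

```python
def narrow_candidates(histories, known_sites):
    """Filter pseudonymous records by the sites a target is known to visit.

    Returns the remaining candidate IDs and the pool size after each site.
    """
    remaining = set(histories)
    sizes = []
    for site in known_sites:
        remaining = {rec for rec in remaining if site in histories[rec]}
        sizes.append(len(remaining))
    return remaining, sizes


# Hypothetical records: pseudonymous ID -> set of visited domains.
histories = {
    "rec_a": {"bank.example", "news.example", "employer.example", "chess.example"},
    "rec_b": {"bank.example", "news.example", "telco.example"},
    "rec_c": {"bank.example", "employer.example", "telco.example"},
    "rec_d": {"news.example", "video.example"},
}

# Sites an observer might already associate with one person.
known_sites = ["bank.example", "employer.example", "chess.example"]

remaining, sizes = narrow_candidates(histories, known_sites)
print(sizes, remaining)
```

In this toy example the pool drops from three candidates to two to one. Popular sites remove few candidates, while a niche hobby site or an employer login can remove almost all of them, which is why a short list of URLs can be so revealing.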
Found in Translation
Each time you use Google Translate on your mobile phone, the text of each query you make is stored as part of the URL. Using this loophole, the German researchers were able to uncover operational details about an ongoing German cyber-crime investigation, as the lead detective on the case had been translating requests for assistance to foreign police forces.
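To show why a URL alone can leak the content of a query, here is a minimal sketch that recovers text from a logged URL. The `translate.example` host and the `text` query parameter are assumptions chosen for illustration, not a description of any real service's URL format.

```python
from urllib.parse import urlparse, parse_qs


def text_from_url(url, param="text"):
    """Return the decoded value of `param` from a URL's query string, or None."""
    query = parse_qs(urlparse(url).query)
    values = query.get(param)
    return values[0] if values else None


# Hypothetical logged URL; host and parameter names are illustrative only.
logged = "https://translate.example/?sl=de&tl=en&text=Bitte+um+Amtshilfe"
recovered = text_from_url(logged)
print(recovered)
```

No decryption or guesswork is involved. Anyone holding the URL holds the text that was typed.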
How You Can Protect Yourself
Clearly then, anonymous browsing data can be analyzed to extract its origins (and the identity of the person who contributed it), with the application of some judgment and the proper techniques – if it can be read.
So your best protection is still the strong encryption of a secure VPN application or service. Even if your browsing data does find its way into the wrong hands (and with app permissions and the hidden marketing tactics of online resources, it’s not impossible), the time, effort, and resources required to break the encryption will discourage anyone but the most determined hacker from even bothering.