2020: A Facebook user's Odyssey?

Join us on a journey to try to solve the mystery behind the advertisers who have uploaded our personal data to Facebook but with whom we've never interacted before.

Key findings
  • In summer 2019, we noticed that "unknown" companies had been uploading our data to Facebook and we decided to send Data Subject Access Requests (DSARs) to find out more.
  • This ended up being a lengthy and tedious process, involving requests to fill in unnecessary forms or being asked to provide more data than needed as well as other obstacles to the proper and smooth exercise of our data access rights.
  • Eventually, we managed to shed some light on the "Facebook advertisers mystery" by finding out more about the involvement of third parties in the process.
  • However, our investigation demonstrates the need to continue challenging this overall opacity and Facebook's less than adequate transparency.
  • As a result of this investigation some companies reviewed their practices and we have written to Facebook to demand changes.
Long Read

Introduction

In August 2019, when Facebook announced a few new features for advertisers, such as ads in search, PI decided to take an in-depth look at what features the company offers its users when it comes to understanding its advertising practices. One of these features that caught our attention is Facebook Ads Preference, a tool that, among other things, lists businesses/advertisers that have uploaded your personal data to target you with (or exclude you from) ads on the platform.

 

When we decided to check some of our own Facebook profiles, we noticed that these advertisers and businesses ranged from well-known bands or artists like Led Zeppelin or James Blake, to the dating app Happn, to well-known companies such as Volkswagen and Cisco. For some of us, the list included hundreds of U.S.-based car rental companies... even though the people targeted had never been to the US or don't have a driving licence.

While this page already revealed companies we had never heard of and raised questions, it only listed companies which had uploaded data in the past seven days. To get a more comprehensive list, we used Facebook's "Download your Information" tool which, despite flagrant limitations we have already highlighted, offers a longer and somewhat more comprehensive list of advertisers who uploaded our data. Obviously, it is not that we don't know who Led Zeppelin are. However, we were pretty sure we did not have a contractual or customer relationship with them, so this made us wonder: how did they come to have our personal data?

Screenshot of an example of part of a list of advertisers that Facebook indicated had uploaded personal data on the platform

Being aware of the huge opacity the AdTech world is shrouded in, we suspected that there was something going on here. How did they obtain our data in the first place? Could it be possible that some of these companies actually had nothing to do with us? Might a third party (such as a marketing agency or data broker...) be involved, mixing up data and targeting us to seek to increase their client's popularity?

We decided to embark on what turned out to be a long and tedious quest to find out how all these brands got our data. Yes, including Led Zeppelin and James Blake, although we doubt the actual artists themselves have anything to do with data practices.

Note: Our DSAR template along with some tips can be found here, and is available for everyone to use in case you wanna embark on your own FB Brands Odyssey!

Warning to readers: this is gonna be a long and frustrating journey!

Chapter I: Whole lotta... data?

As it wasn't clear from Facebook what kind of personal data the companies listed had uploaded, we visited Facebook's business site, where the social media platform tries to sell itself to advertisers and businesses. Among its various ways of targeting users, the platform lets businesses target ads at individuals with whom they already have an existing relationship. They can also use this tool to exclude these people from ads. Fun fact: You will notice that Facebook seems to interpret "people who already know your business" as pretty much everyone that may visit a website, based on its custom audience advice to businesses.

Presentation of Custom Audiences feature on Facebook

Note: Facebook also prompts app developers to use the Facebook Software Development Kit (SDK), a set of software development tools that help developers build apps for a specific operating system. As PI's research revealed in December 2018, this can raise serious questions regarding unnecessary and unwanted personal data collection.

Focusing on the "contact lists" categories, we discovered a series of data categories, i.e. identifiers, that businesses can upload to create a "custom audience" list. These can then be used by Facebook to trace the profiles of individuals and display advertisements to them. According to Facebook's custom audience terms, advertisers do not get to see the actual profile of the users and Facebook does not use the uploaded personal data for any other purpose, although it suggests businesses upload as much data as possible.

FB Custom Audience Best Practices (emphasis in red underlining added)

If you're wondering what these identifiers might be, Facebook also provides businesses with an example custom audience Excel file. As the screenshot below illustrates, the identifiers Facebook suggests businesses provide are not limited to names, surnames and email addresses, but may go as far as gender, Facebook ID, address, phone number, date or year of birth, mobile device identifier or age.

This made us a lot more curious to see what data the "unknown" companies had on us.

Screenshot of a series of identifiers contained in the file template provided by Facebook
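To make the matching step a bit more concrete, here is a minimal sketch of how an uploaded contact list could be matched to profiles. Facebook's Custom Audiences terms refer to the uploaded identifiers as "Hashed Data"; everything else below is an assumption for illustration: the choice of SHA-256, the trim-and-lowercase normalisation, the function names and the example addresses are all hypothetical, and this is not Facebook's actual implementation.

```python
import hashlib


def normalise_email(email: str) -> str:
    """Trim whitespace and lowercase so the same address always hashes the same way."""
    return email.strip().lower()


def hash_identifier(value: str) -> str:
    """Return the SHA-256 hex digest of an already-normalised identifier."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def build_hashed_audience(emails):
    """Advertiser side: turn a contact list into a set of hashed identifiers."""
    return {hash_identifier(normalise_email(e)) for e in emails}


def match_profiles(hashed_audience, platform_index):
    """Platform side: look up which known profiles correspond to the uploaded hashes.

    platform_index is a hypothetical mapping of hashed email -> profile id.
    """
    return sorted(pid for h, pid in platform_index.items() if h in hashed_audience)


# Hypothetical data: an advertiser's contact list and the platform's own users.
advertiser_list = ["  Alice@Example.org ", "bob@example.org"]
platform_users = {
    "alice@example.org": "profile-1",
    "carol@example.org": "profile-3",
}
platform_index = {
    hash_identifier(normalise_email(email)): pid
    for email, pid in platform_users.items()
}

audience = build_hashed_audience(advertiser_list)
matched = match_profiles(audience, platform_index)
print(matched)  # ['profile-1']
```

Note that hashing on its own does not make an identifier anonymous: anyone who already holds the same email address can compute the same hash, which is exactly what makes the match possible.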

Chapter II: Babe I'm gonna... DSAR you

Fortunately, the General Data Protection Regulation (GDPR), which is the legal instrument governing data protection in the EU, provides us with a series of rights to seek to ensure that we have adequate control over our data and that we are able to verify the lawfulness of the processing at all times. Sounds cool, right? Well, it is. But enforcing it continues to prove challenging and illustrates how much companies still have to learn and do to demonstrate that they comply with the law.

We started off by submitting dozens of Data Subject Access Requests (aka DSARs) to the companies that we thought we had never interacted with before but appeared as having uploaded our data to Facebook (you can do the same following our model message). We asked them to confirm whether they hold any personal data on us, as well as to be provided with a copy of the personal data, together with information regarding their sources and transfers to Facebook and other third parties for advertisement-related purposes. We also provided the companies with a list of identifiers, such as our Facebook id or email we used to sign up with Facebook, that would help them locate any personal data, as we weren't sure whether they would be able to identify us just by using our name since we did not think we had engaged with them in the past.

Part of a DSAR we sent to Led Zeppelin

At this point, we should note that finding the contact details of the companies was quite a hassle. The only thing Facebook provided us with was the name of the advertiser or business and a link to their Facebook page. However, going through each company's privacy policy and finding the email addresses of their data protection officer or department required some extensive searching skills. For example, Facebook was telling us that a comedian had uploaded our personal data, but this actually meant that their management company was the one behind this.

Some of the questions contained in the DSARs we submitted to companies

Chapter III: Trampled under... no-response

Several of these companies did not even acknowledge receipt of our access requests, despite our reminders and insistence. Led Zeppelin and the Dissect Podcast, a serialized music podcast, were among those who did not respond, although Dissect Podcast did later respond to our follow-up in relation to this piece.

A good number of the companies we contacted are based in the United States. However, this does not mean that they are exempt from EU data protection laws. Under the GDPR, if a company is offering its services to people in the EU and/or monitoring their behaviour, then it is also covered by GDPR. We find the failure to respond to our DSARs deeply disappointing. It is unacceptable that companies ignore their legal obligation to respond to these requests and thus avoid scrutiny.

Chapter IV: Communication Breakdown

With the companies that did acknowledge receipt of our access requests, things were still far from easy.

First, some companies seemed to initially treat our access requests as erasure requests. As with Data Subject Access Requests, individuals can also submit erasure requests and ask for their personal data to be deleted. However, as the name suggests, an erasure request has a completely different purpose: to have data deleted. And, under UK data protection law, for instance, it is an offence for a company to delete personal data after receiving an access request from an individual.

An example of a company that replied to the clear access request with reference to an erasure request was Volkswagen UK.

Excerpt from the email response we received from Volkswagen UK

They weren't the only ones that appear to have been confused between an access request and an erasure request...

Dr. Oetker, for example, a German multinational company that sells baking products, replied asking us to fill in a form to facilitate our access rights. Once we opened the form, we were surprised to see that it was actually a form for an erasure request.

Excerpt from the form we received from Dr. Oetker

Both Volkswagen and Dr. Oetker apologised for the confusion, after we wrote back to them emphasising that we had submitted an access request, and proceeded with treating our request under the correct provisions.

Second, you will notice that both Volkswagen and Dr. Oetker asked us to fill in a form. Under GDPR, companies can ask, but cannot oblige, individuals to fill in forms, nor can they require them to provide reasons for exercising the right of access (as was the case with some of the companies). An access request could be as simple as just a message to the company saying "I would please like to receive a copy of my data. thank you". This is also in line with the guidance on dealing with DSARs from the Information Commissioner's Office (the UK data protection authority) and many other regulators.

Third, in our dealings with several other companies, we were confronted with disproportionate requests to verify our identity. Under GDPR, controllers (companies) may ask individuals to verify their identity. This can be an important step to ensure that data is handled in a secure way and that only the correct individuals will be granted access to their personal data. However, this does not mean that companies can ask for a huge amount of information with no justification. Data protection laws require companies to "use all reasonable measures" to verify individuals' identities without placing an unnecessary or disproportionate burden on them. In other words, if you've only used your email and nickname to sign up for an online service, why would that service ask for a copy of your passport to verify your identity, if they didn't have that data in the first place?

This was something we raised with the popular dating app Happn, for example. Some of us might have used the app a very very long time ago. So, we decided to include it among the lists of companies as we were surprised to see it was still uploading our personal data to Facebook, considering also that, according to its privacy policy, after one year of inactivity, personal data are saved and kept for one further year in an archive.

Initially, Happn asked for a passport copy to be able to verify our ID. We wrote back, asking them to explain why a copy of our passport or ID is necessary to confirm our identity, considering that we never provided a copy of these documents when signing up for the service. Eventually, Happn said we no longer had to send them a copy of our ID because they had no personal data relating to us. Happn explained the reasoning for this in detail in their response to PI.

Controllers, namely companies, are the ones obliged to make sure that they properly and securely verify the identity of individuals requesting access to their personal data. This is important to prevent data breaches; however, as we already mentioned, verification should be proportionate: it should use the least intrusive means, minimise any unnecessary data gathering and not be used as an obstacle to people's access.

Chapter V: DSARed and... Confused

And now, let's cut to the chase. Eventually, we managed to get a few of the companies to actually reply to our DSARs. The responses are definitely interesting, if not confusing, and, in some cases, a bit creepy.

Most of the companies that responded to our DSARs confirmed our initial hypothesis: they had no personal data on us.

Cisco, for example, claimed to have no personal data on us and was unable to explain why it was listed as one of the companies that had uploaded our personal data to Facebook.

Cisco directed us to Facebook policies and tools, which, as mentioned above, do not provide any more information about either the company or the personal data that they have uploaded to the platform.

Moreover, according to Facebook's Custom Audiences terms, Facebook merely processes the data provided by advertisers for the sole purpose of reaching individual profiles so that they can be targeted with ads. Control remains with the advertisers who must ensure they have "all necessary rights and permissions and a lawful basis to disclose and use the Hashed Data in compliance with all applicable laws, regulations and industry guidelines" (emphasis added).

Facebook Custom Audiences' Terms

Consequently, our question still remains: Why was Facebook telling us that these companies had uploaded our personal data to the platform?

The responses we received from Dr. Oetker, Telford Homes (a house building company) and Universal Music Group might shed some light on this question, even though they do not provide the full answer. (NOTE: Dr. Oetker's and Universal Music Group's responses to our request for comment can be found attached at the bottom of this page.)

Dr. Oetker replied to our access request stating that no personal data is shared with third parties and, particularly, no personal data of ours have ever been shared "with Facebook and/or any of its companies and/or subsidiaries and/or partners, both within and outside the European Union". So, we followed up and asked them to explain the following screenshot.

Screenshot of what we saw on our Facebook profile in October 2019

In October 2019 and just a few days after Dr. Oetker underlined that no personal data is shared with Facebook, Dr. Oetker and Chicago Town (a company belonging to Dr. Oetker) appeared to have targeted us with ads within the past 7 days!

Dr. Oetker, once again, underlined that no personal data of ours was shared with Facebook and provided the following explanation:

Parts of the explanation provided by Dr. Oetker when we asked why they would still show up as having uploaded our data and advertised to us

What Dr. Oetker seems to suggest is that Facebook has classified us as a "pizza lover", which could accordingly have been used as an interest to look for potential audiences to be targeted with pizza-related advertisements. The problem with that explanation is that what Facebook seemed to be telling us is that not only did Dr. Oetker target us with ads, but that they had also shared our personal data with Facebook. In relation to that point, Dr. Oetker first stated that they would investigate the matter and that they had contacted Facebook for an explanation as to why it is even suggested that they have used our first party data. Before publication, they returned to us with more information as to why this was happening; you will find their response attached at the end of this piece. The letter suggests the use of a third party, information not provided by Facebook. This highlights the need for Facebook to provide more details about the data uploaded so that users have more transparency regarding what data was used to target them and who uploaded it, something we are asking for in our letter to them.

Several other companies, including Happn and Cisco, indicated that they are or will be following suit by carrying out similar investigations into why Facebook showed them as targeting us with ads. But, still. We wanted answers. And we wanted them within the timeframe the law sets out. It should not be this difficult.

Chapter VI: A Stairway to... transparency?

And that is when we received a 4-page explanation from Telford Homes. In summary, Telford Homes explained its use of cookies and third parties. Telford Homes uses cookies, including third-party cookies, to identify 'common qualities' of visitors. The common qualities are collated by the marketing agency they contract with in order to specify a general target audience segment. The marketing agency then partners with the credit referencing agency/data broker Experian to identify the specific target audience, and a segment profile is then provided to Facebook. In a follow-up, Telford Homes emphasised that neither they nor their marketing agency upload any personal data to Facebook, nor do they provide data to Experian.

The response highlights the wider issues of the use of cookies, the difficulties of truly anonymising data and the use of third parties.

First, a company may use marketing cookies on their website to track visitors (not just customers!). This is not news to us. A few months ago, for example, we showed how mental health websites may even use tracking cookies to share users' answers to depression tests with third parties.

Second, even if cookies are provided by a third party, like in the case of Telford Homes, and even if the company claims that they merely collect common quality or statistical information, which are not meant to identify the individual, this is not always the case.

There is a fine line between pseudonymous and anonymised data. The first can still render an individual identifiable. For example, journalists from the German public broadcaster NDR were able to identify the sexual preference and medical history of judges and politicians, using online identifiers. This is just one example that serves to illustrate the insights that can be gleaned from seemingly mundane and pseudonymous data and the value it might have.

Even if it is not a company’s intention to directly identify an individual, this is still possible, due to the vast amount of data it might collect and generate. And, even when data may seem to be truly anonymised by companies, and consequently exempt from the protection guaranteed by the General Data Protection Regulation, this might not be the case. In a recent study, researchers were able to demonstrate that, despite the anonymisation techniques applied, “data can often be reverse engineered using machine learning to re-identify individuals”.
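To see why removing names is not enough, here is a minimal sketch of a simple linkage attack: records that carry only a pseudonym are joined with a separate, named dataset on a handful of shared attributes. This is not the method used in the NDR investigation or in the study quoted above (which relied on machine learning); it is a much simpler technique chosen for illustration, and all records, field names and values below are made up.

```python
from collections import defaultdict

# Attributes that are not names, but that both datasets happen to share.
QUASI_IDENTIFIERS = ("postcode", "birth_year", "gender")

# Hypothetical "pseudonymised" tracking data: no names, only a random-looking id.
pseudonymised_records = [
    {"pseudonym": "u-7f3a", "postcode": "N1 9AB", "birth_year": 1985, "gender": "F", "visited": "depression-test"},
    {"pseudonym": "u-21cc", "postcode": "N1 9AB", "birth_year": 1990, "gender": "M", "visited": "car-rental"},
    {"pseudonym": "u-9d10", "postcode": "E2 7XY", "birth_year": 1990, "gender": "M", "visited": "pizza-offers"},
]

# Hypothetical auxiliary data with names, e.g. from a marketing database.
public_records = [
    {"name": "Person A", "postcode": "N1 9AB", "birth_year": 1985, "gender": "F"},
    {"name": "Person B", "postcode": "E2 7XY", "birth_year": 1990, "gender": "M"},
    {"name": "Person C", "postcode": "E2 7XY", "birth_year": 1990, "gender": "M"},
]


def key(record):
    """Build the join key from the shared quasi-identifiers."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def reidentify(pseudonymised, public):
    """Return pseudonym -> name wherever the quasi-identifiers match exactly one person."""
    names_by_key = defaultdict(list)
    for person in public:
        names_by_key[key(person)].append(person["name"])

    matches = {}
    for record in pseudonymised:
        candidates = names_by_key.get(key(record), [])
        if len(candidates) == 1:  # a unique match re-identifies the record
            matches[record["pseudonym"]] = candidates[0]
    return matches


reidentified = reidentify(pseudonymised_records, public_records)
print(reidentified)  # {'u-7f3a': 'Person A'}
```

In this toy example only one record is re-identified, because the other two either have no match or match two people. The more attributes a dataset holds about each person, the more likely it is that a combination becomes unique.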

Third, we come across our old friend, Experian. Experian is a data broker that developed from credit referencing to also offer marketing data services. It holds and manages marketing data on 700 million people around the world. According to Experian, its tools are able to provide “Access to data about circa 51 million individual UK consumers living at residential addresses, with circa 30 million consumers available for prospecting purposes”. The data includes “500+ variables”, meaning that individuals' identities are linked to demographic, socio-economic and behavioural characteristics.

We are concerned by Experian's Audience Insights and Audience Extension which use online and device identifiers so that clients can recognise and target individuals, as well as the use of these products and others by the public sector and in political campaigns. This is why, in November 2018, we filed a series of complaints against data-brokers, ad-tech companies and credit referencing agencies, including Experian, challenging their data exploitation practices.

As the Facebook custom audience terms state:

If you are providing Hashed Data on behalf of an advertiser, you represent and warrant that you have the authority as agent to the advertiser to disclose and use such data on their behalf, and will bind the advertiser to these Terms.

Similarly, when we posed the same question to Universal Music Group (UMG), which is the company that we thought would be responsible for James Blake, who had appeared as one of the advertisers, UMG said that they are James Blake's record label, operating his official website, but have no control over his Facebook page or advertiser account. They also confirmed that they hold no personal data on us.

This left us wondering whether we had sent a DSAR to the wrong entity. We were sure we had not interacted with James Blake's social media or Facebook page, but we were also sure we had nothing to do with UMG either. And Facebook was not telling us anything more than that James Blake was among the advertisers who uploaded our personal data to the platform. This shows how tricky it can be to identify the company responsible for uploading data, and that's why we're asking Facebook to make changes.

Chapter VII: Ramble on, Facebook. Ramble on.

But then, something else happened. While checking out the new off-facebook activity tool, some of these companies appeared again as having reported our off-facebook activity to the social media platform. Here's the UMG off-facebook activity, for example:

Screenshot from Facebook's off-facebook activity tool, showing that UMG had reported our off-facebook activity to Facebook

UMG seemed to have reported our off-facebook activity dated 1 August 2019, just over 20 days before we actually submitted our DSAR (and after some internal investigation we found out that one of us had actually visited the website to check when Lana del Rey's (no judgment) latest album was gonna be released). UMG's response to PI, below, provides an explanation of the role of the Facebook pixel.

This and other examples, where companies deny having a requester’s data and/or uploading it to Facebook, raise many questions, including how companies are collecting our data; whether they consider they are doing this in an anonymised manner; if, how and why data may end up in the hands of third parties; whether this data is then combined with other data; and what ultimately leads to the data being uploaded to Facebook to target individual profiles with ads.

Unfortunately, we are not able to tell. Despite Facebook's attempts to bring more transparency for users, and we note that more changes have been made during the course of this research, it is doubtful whether the tools provided are actually fully effective and helpful. Most of these companies seem to have no idea they are engaging in a game of data exploitation, although they may be the ones to face the music.

Facebook merely provides users with a list of advertisers that have uploaded our personal data to the platform. Not much more information is provided. It's a starting point, but without being able to know the exact identifiers each advertiser might have uploaded and the exact contact details of the company and/or data protection officer, users are faced with great difficulty in exercising their rights. It's so hard to know what kind of data a company holds (including which identifiers would let a company match you with the data in their database), and it may be a struggle to contact the correct department or person. This needs to change. Now.

If you've managed to make it until the end of this piece, congratulations! Also, this might mean that you are as concerned as we are about AdTech's nasty data exploitation practices. Here's what you can do to hold the hidden data ecosystem to account!