How Apps on Android Share Data with Facebook - Report
A video presentation of the findings of this report, as presented at the 35th Chaos Communication Congress (35C3), is also available.
Previous research has shown that 42.55 percent of free apps on the Google Play store could share data with Facebook, making Facebook the second most prevalent third-party tracker after Google’s parent company Alphabet. In this report, Privacy International illustrates what this data sharing looks like in practice, particularly for people who do not have a Facebook account.
The question of whether Facebook gathers information about users who are not signed in or do not have an account was raised in the aftermath of the Cambridge Analytica scandal by lawmakers in hearings in the United States and in Europe. However, discussions of the tracking of non-users, as well as previous fines by Data Protection Authorities, often focus on the tracking that happens on websites. Much less is known about the data that the company receives from apps. For these reasons, in this report we raise questions about transparency and use of app data that we consider timely and important.
Facebook routinely tracks users, non-users and logged-out users outside its platform through Facebook Business Tools. App developers share data with Facebook through the Facebook Software Development Kit (SDK), a set of software development tools that help developers build apps for a specific operating system. Using the free and open source software tool called "mitmproxy", an interactive HTTPS proxy, Privacy International has analyzed the data that 34 apps on Android, each with an install base from 10 to 500 million, transmit to Facebook through the Facebook SDK.
All apps were tested between August and December 2018, with the last re-test happening between 3 and 11 December 2018. The full documentation, including the exact date each app was tested, can be found at https://privacyinternational.org/appdata.
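To make the method more concrete, the following is a minimal sketch of the kind of check one might run over requests captured with an intercepting proxy. It is not Privacy International's tooling. The host name, the request path and the form field names (`event`, `advertiser_id`) are assumptions chosen for illustration, and the sample requests are invented placeholders.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit

# Assumption: host that receives the SDK's event traffic.
FACEBOOK_HOSTS = {"graph.facebook.com"}


def summarize_request(url: str, body: str) -> Optional[dict]:
    """Summarize a captured request if it targets a Facebook host, else return None."""
    parts = urlsplit(url)
    if parts.hostname not in FACEBOOK_HOSTS:
        return None

    # Decode a URL-encoded form body into {field: [values]}.
    fields = parse_qs(body)

    def first(name: str) -> Optional[str]:
        values = fields.get(name)
        return values[0] if values else None

    return {
        "path": parts.path,
        "event": first("event"),                  # assumed field name
        "advertiser_id": first("advertiser_id"),  # assumed field name
    }


# Invented placeholder requests, not real captures.
captured = [
    ("https://graph.facebook.com/v0.0/0000000000/activities",
     "event=EXAMPLE_EVENT&advertiser_id=00000000-0000-0000-0000-000000000000"),
    ("https://example.org/api/search", "q=flights"),
]

summaries = [summarize_request(url, body) for url, body in captured]
```

Only the first placeholder request yields a summary; the second is ignored because its host is not in `FACEBOOK_HOSTS`.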
- We found that at least 61 percent of apps we tested automatically transfer data to Facebook the moment a user opens the app. This happens whether people have a Facebook account or not, or whether they are logged into Facebook or not.
- Typically, the first data to be transmitted automatically is event data telling Facebook that the Facebook SDK has been initialized, such as "App installed" and "SDK Initialized". This data reveals that a user is using a specific app, every single time that user opens the app.
- In our analysis, apps that automatically transmit data to Facebook share this data together with a unique identifier, the Google advertising ID (AAID). The primary purpose of advertising IDs, such as the Google advertising ID (or Apple’s equivalent, the IDFA), is to allow advertisers to link data about user behavior from different apps and web browsing into a comprehensive profile. If combined, data from different apps can paint a fine-grained and intimate picture of people’s activities, interests, behaviors and routines, some of which can reveal special category data, including information about people’s health or religion. For example, an individual who has installed the following apps that we have tested, "Qibla Connect" (a Muslim prayer app), "Period Tracker Clue" (a period tracker), "Indeed" (a job search app) and "My Talking Tom" (a children’s app), could potentially be profiled as likely female, likely Muslim, likely job seeker, likely parent.
- If combined, event data such as "App installed", "SDK Initialized" and "Deactivate app" from different apps also offers a detailed insight into the app usage behavior of hundreds of millions of people.
- We also found that some apps routinely send Facebook data that is incredibly detailed and sometimes sensitive. Again, this concerns data of people who are either logged out of Facebook or who do not have a Facebook account. A prime example is the travel search and price comparison app "KAYAK", which sends detailed information about people’s flight searches to Facebook, including: departure city, departure airport, departure date, arrival city, arrival airport, arrival date, number of tickets (including number of children), class of tickets (economy, business or first class).
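The linking role of the advertising ID described in the findings above can be illustrated with a small sketch. The records below are invented for illustration and are not data from our tests. The point is only that any party receiving events from several apps under the same identifier can group them trivially.

```python
from collections import defaultdict

# Invented records: (advertising_id, app_category, event_name).
events = [
    ("aaid-A", "prayer app", "SDK Initialized"),
    ("aaid-A", "period tracker", "App installed"),
    ("aaid-B", "job search app", "SDK Initialized"),
    ("aaid-A", "job search app", "SDK Initialized"),
    ("aaid-A", "prayer app", "SDK Initialized"),
]


def apps_by_advertising_id(records):
    """Map each advertising ID to the sorted list of distinct apps seen with it."""
    seen = defaultdict(set)
    for ad_id, app, _event in records:
        seen[ad_id].add(app)
    return {ad_id: sorted(apps) for ad_id, apps in seen.items()}


linked = apps_by_advertising_id(events)
```

No account, name or login is needed for this grouping. The shared identifier alone is enough to associate the three app categories with "aaid-A".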
Facebook places the sole responsibility on app developers to ensure that they have the lawful right to collect, use and share people’s data before providing Facebook with any data. However, the default implementation of the Facebook SDK is designed to automatically transmit event data to Facebook.
Since May 25, 2018, the day that the EU General Data Protection Regulation (GDPR) took effect, developers have been filing bug reports on Facebook’s developer platform, raising concerns that the Facebook SDK automatically shares data before apps are able to ask users to agree or consent. On June 28, 2018, Facebook released a voluntary feature that should allow developers to delay collecting automatically logged events until after they acquire user consent. The feature was launched 35 days after GDPR took effect and only works on the SDK version 4.34 and later.
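The idea of delaying automatically logged events until consent is obtained can be sketched as a simple gate. This is a toy model with hypothetical names (`ConsentGatedLogger`, `grant_consent`), not the Facebook SDK's API. In particular, holding events and sending them after consent is an assumption of the sketch; this report does not establish whether the real feature queues or discards events logged before consent.

```python
class ConsentGatedLogger:
    """Toy model: automatic events are not transmitted until consent is granted."""

    def __init__(self, send):
        self._send = send        # callable that transmits a single event
        self._consented = False
        self._pending = []       # events held locally before consent

    def log(self, event):
        if self._consented:
            self._send(event)
        else:
            # Held locally; nothing leaves the device at this point.
            self._pending.append(event)

    def grant_consent(self):
        """Record consent and transmit any events that were held back."""
        self._consented = True
        for event in self._pending:
            self._send(event)
        self._pending.clear()


sent = []  # stands in for the network
logger = ConsentGatedLogger(sent.append)
logger.log("SDK Initialized")
sent_before_consent = list(sent)
logger.grant_consent()
logger.log("App opened")
```

The contrast with the default behavior described above is that, without such a gate, the first event is transmitted as soon as the app starts, before any consent dialog can be shown.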
In response to this report, Facebook stated in an email to Privacy International on 28 December 2018: “Prior to our introduction of the ‘delay’ option, developers had the ability to disable transmission of automatic event logging data, except for a signal that the SDK had been initialized. Following the June change to our SDK, we also removed the signal that the SDK was initialized for developers that disabled automatic event logging.” (emphasis added).
This “signal” is the data that we observe in our findings. We therefore assume that, prior to the release of this voluntary feature, many apps that use the Facebook SDK in the Android ecosystem were not able to prevent or delay the SDK from automatically sharing the signal that it had been initialized. Such data communicates to Facebook that a user uses a particular app, when they are using it and for how long.
Without any further transparency from Facebook, it is impossible to know for certain how the data that we have described in this report is being used. This is particularly the case since Facebook has in the past been less than transparent about the ways in which it uses data of non-Facebook users.
Our findings also raise a number of legal questions. As this research was conducted in the UK, we have focused on the relevant EU framework, namely EU data protection law (the “GDPR”) and ePrivacy law (the ePrivacy Directive 2002/58/EC, as implemented by Member State laws), as well as competition law. An underlying theme is the responsibility of the various actors involved, including Facebook.