No boundaries: Exfiltration of personal data by session-replay scripts


Websites have long used third-party analytics scripts to collect information about how visitors use their sites. In November 2017, researchers at Princeton found that an increasing number of sites use "session replay" scripts that record every action the user performs while on the site, including mouse movements, keystrokes, scrolling behaviour, and the complete contents of the pages loaded. Users reasonably expect a site to receive typed data only after they have pressed the "submit" button, but these scripts collect every keystroke as it is typed, with no indication to the user that this is happening. The collected data is sent to third-party services that can replay the session as if it were live; the data cannot be anonymised. Sites say they use the scripts to gain insight into how users interact with their pages and to identify broken or confusing ones.
The study went on to examine seven of the top session replay companies: Yandex, FullStory (used by US drugstore giant Walgreens), Hotjar, UserReplay, Smartlook, Clicktale, and SessionCam. Their scripts are in use on 482 of the Alexa top 50,000 sites.
These recordings can leak an enormous amount of data to third parties, some of it sensitive, such as credit card details, passwords, and medical conditions. Some of the services nominally require publishers to redact personal data before sessions are submitted to them, but the researchers found that, because submission is automated, the redactions are imperfect and partial. In their tests, all displayed page content leaked. Of the ad-blocking lists examined, neither EasyList nor EasyPrivacy blocks FullStory, Smartlook, or UserReplay; EasyPrivacy has rules to block Yandex, Hotjar, Clicktale, and SessionCam. UserReplay is the only one of the seven services that allows publishers to disable data collection from users who have set the Do Not Track (DNT) flag in their browsers. The researchers scanned the configuration settings of the publishers in the Alexa top 1 million that use UserReplay and found that none honoured the DNT signal.
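Two points in the paragraph above can be illustrated with a minimal Python sketch. It is not taken from the study or from any vendor's code: the field names, the function names `redact_by_field_name` and `should_record`, and the sample form are all hypothetical. The only external fact it relies on is that a browser with Do Not Track enabled sends the request header `DNT: 1`. The sketch shows why redaction driven by a list of field names is only partial, and what honouring the DNT signal could look like.

```python
# Hypothetical list of field names a publisher has marked for redaction.
SENSITIVE_FIELDS = {"password", "card_number", "cvv"}


def redact_by_field_name(form: dict[str, str]) -> dict[str, str]:
    """Mask values only for fields whose name is on the redaction list."""
    return {
        name: "*" * len(value) if name in SENSITIVE_FIELDS else value
        for name, value in form.items()
    }


def should_record(headers: dict[str, str]) -> bool:
    """Return False when the request carries the Do Not Track header (DNT: 1)."""
    # Header names are case-insensitive, so normalise them before the lookup.
    normalised = {key.lower(): value.strip() for key, value in headers.items()}
    return normalised.get("dnt") != "1"


if __name__ == "__main__":
    # Hypothetical form: one listed field and one free-text field.
    form = {"password": "hunter2", "notes": "I take insulin daily"}
    print(redact_by_field_name(form))
    print(should_record({"DNT": "1"}))
```

In this sketch the password is masked, but the free-text `notes` field, which happens to contain a medical condition, passes through unchanged because its name is not on the list. That is the kind of gap that makes redaction "imperfect and partial" in practice.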
