Improperly anonymised taxi logs pose reidentification risks


In 2014, NYC Planning Labs' Chris Whong received, in response to a Freedom of Information request, a complete dump of historical trip and fare logs from New York City taxis, and made it public. The data, more than 20GB uncompressed and covering more than 173 million individual trips, included pickup and drop-off locations and times and other metadata, but also personally identifiable information about the drivers. Careful analysis enabled researchers to deanonymise the entire dataset, showing the importance of exercising care when making such releases.

Writer: Vijay Pandurangan
Publication: Medium

Related learning resources