Researchers release a dataset of 70,000 users' OkCupid profiles and call the data "public"


In 2016, Danish researchers Emil Kirkegaard and Julius Daugbjerg Bjerrekær released a dataset on the Open Science Framework that included details of almost 70,000 users of the online dating site OkCupid. The researchers created the dataset themselves by using software to scrape information from OkCupid's site, including user names (though not real names), ages, gender, religion, and personality traits, along with the answers to the questions the site asks new users in order to help identify potential matches. They did not ask either OkCupid or its users for permission to do this. A day after Vox reported on the dataset's existence, the Open Science Framework, a forum for researchers to share raw data in order to increase transparency and collaboration in social science, removed the posting.

Scraping and uploading the data violated the standard ethical code that social scientists typically follow. The researchers, challenged on Twitter, argued that the data was already public because it had been posted to OkCupid. OkCupid disagreed, calling the posting a clear violation of the site's terms of service as well as the US Computer Fraud and Abuse Act. During the brief period the dataset was available, it was downloaded more than 500 times.

Writer: Brian Resnick
Publication: Vox
