17 July 2015
1.7 billion "anonymous" comments from 5% of the internet
A sends:
All in one searchable database:
https://archive.org/details/2015_reddit_comments_corpus_sqlite
Cryptome: What will this be used for?
A:
It depends on who the user is. Law enforcement and private investigators
will use the information to try to:
1. Identify individuals based on behavioral analysis of comments, etc.
2. De-anonymize individuals and leverage this information on other platforms,
e.g. by checking identical/similar usernames and using the behavioral analysis
to predict other (online or offline) hangouts and activities in order to
build a more complete picture.
Sociologists and psychologists will use it to build behavioral models for
individuals acting as individuals and for ad-hoc groups of individuals without
any external organization, goal, etc.
Members of the public and historians will use it to find and examine public
figures and to better understand them. More importantly, the public should
use this database as a wake-up call that the driving force behind Big Data
isn't Big Brother - it's the masses. Between this, the Dark Net Market
archives, and some other releases in the last few weeks, it's becoming more
apparent that the "right to be forgotten" may be recognized by some governments,
but private individuals and researchers, not just megacorps, remain major
obstacles to it.
This is, simply put, the biggest example of open source SIGINT to date. The
fact that it was done legally and openly, and not as the result of a hack
or data leak, may make it seem less newsworthy - but if anything, that
makes it even more alarming to privacy advocates. It's not a one-off either;
it's just one of the biggest signposts we've seen so far.