TODO: Organize these somehow
theinfo
http://www.cs.technion.ac.il/~gabr/resources/data/ne_datasets.html
4 Universities Data Set
20Newsgroups
UniProt
Netflix Prize Dataset
WordNet
DBpedia
UCI Machine Learning Repository