Recommended: SMS Spam Collection Data Set

If you are interested in text mining, this is a good data set to start with. It consists of SMS text messages, one per line, each classified by a human as either spam or ham (a legitimate message).
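
As a first step with data like this, it can help to split each line into its label and its message and count the classes. The sketch below is a minimal example, not an official loader. It assumes each line has the form label, then a tab, then the message text, with the label being either "ham" or "spam"; check this against the file you download. The function name parse_sms_lines and the sample messages are made up for illustration.

```python
from collections import Counter


def parse_sms_lines(lines):
    """Split each 'label<TAB>message' line into a (label, message) pair.

    Assumes the label comes first and is separated from the message
    by a single tab character.
    """
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        # partition splits on the first tab only, so tabs inside
        # the message text are preserved
        label, _, text = line.partition("\t")
        if label not in ("ham", "spam"):
            raise ValueError(f"unexpected label: {label!r}")
        records.append((label, text))
    return records


# Made-up sample lines for illustration (not taken from the data set).
sample = [
    "ham\tSee you at lunch?\n",
    "spam\tYou won a prize! Reply now to claim.\n",
    "ham\tRunning late, start without me.\n",
]

records = parse_sms_lines(sample)
label_counts = Counter(label for label, _ in records)
print(label_counts)
```

With a local copy of the data, the same function could be given an open file handle instead of the sample list, since iterating over a text file yields one line at a time.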

Tiago A. Almeida, Jose Maria Gomez Hidalgo. SMS Spam Collection Data Set. Part of the UCI Machine Learning Repository. Available at https://archive.ics.uci.edu/ml/datasets/SMS+Spam+Collection.