Tag Archives: Text mining

Recommended: Cleaning Words with R: Stemming, Lemmatization & Replacing with More Common Synonym

This page is moving to a new website.

In many text mining or natural language processing applications, you will run into words that are very similar but are counted separately. An example might be the words win, winner, and winning. You can combine these words into a single category using stemming, which reduces each word to a common root. This blog post gives a nice overview of stemming.
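To make the idea concrete, here is a minimal sketch in Python (the recommended post itself works in R). The suffix list and the `toy_stem` function are hypothetical and deliberately tiny. This is not the Porter stemmer or any library's algorithm, just an illustration of how win, winner, and winning can collapse into one category.

```python
from collections import Counter

# Hypothetical, deliberately tiny suffix list (an assumption for illustration).
SUFFIXES = ("ing", "er", "s")


def toy_stem(word):
    """Strip one known suffix and undo consonant doubling ("winn" -> "win")."""
    word = word.lower()
    for suffix in SUFFIXES:
        stem = word[: -len(suffix)]
        # Only strip when enough of the word is left to be a plausible root.
        if word.endswith(suffix) and len(stem) >= 3:
            if stem[-1] == stem[-2] and stem[-1] not in "aeiou":
                stem = stem[:-1]
            return stem
    return word


words = ["win", "winner", "winning", "wins"]

# Without stemming, every surface form is its own category.
raw_counts = Counter(words)

# With stemming, all four forms are counted under the same root.
stem_counts = Counter(toy_stem(w) for w in words)
```

A real stemmer handles far more suffixes and exceptions. The point is only that counting happens on the root rather than on the surface form.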

PMean: Sentiment analysis of A Christmas Carol

I was at an interesting talk about sentiment analysis and decided to try something simple myself. Sentiment analysis is a text analytics method that compares the words in a text against lists of words with positive or negative sentiment. The relative frequency of the positive and negative words is a crude measure of the overall sentiment of the text. I ran a sentiment analysis on the text of the famous Charles Dickens novel, A Christmas Carol.
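The word-list comparison described above can be sketched roughly as follows, in Python. The two word lists and the `sentiment_score` function are made up for illustration. They are not the lexicon or the code used in the post, and a real analysis would use a published sentiment lexicon.

```python
import re

# Hypothetical mini-lexicons (assumptions, not a published word list).
POSITIVE = {"merry", "happy", "joy", "kind"}
NEGATIVE = {"cold", "miser", "grim", "dismal"}


def sentiment_score(text):
    """Return (positive hits - negative hits) / total words, or 0.0 if empty."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / len(tokens)
```

Scoring each chapter or paragraph separately with a function like this gives a crude view of how the sentiment shifts over the course of a text. The score ignores negation and context, which is why it is only a rough measure.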