Monthly Archives: February 2018

Recommended: A sampling of outstanding women in analytics

This page is moving to a new website.

This is a list (with single-paragraph descriptions) of 186 women who have accomplished great things in the area of Analytics. The accompanying article at the Forbes magazine website is very brief. The author of this list, Meta S. Brown, defines Analytics quite broadly, so the women have very diverse backgrounds and interests. I recognized only one name off the bat, Grace Wahba, an excellent researcher whom I unfortunately haven’t met. If I get a chance, I’ll include in a separate blog post a list of outstanding women in Analytics that I HAVE met. Meta Brown’s list includes links so you can find out more about these talented women.

PMean: Mixed up variable names in SAS

Some of my students in the Introduction to SAS class were having trouble reading in a tab-delimited text file, which is not too surprising, because some of the students in the Introduction to R class were having problems with the same file. Here are some details about the data set, the problems it caused, and a couple of ways that you could fix them.

Recommended: The history of Hadoop

If you want to understand big data, you need to understand Hadoop. Hadoop is the technology underlying many big data efforts. But most of the descriptions of Hadoop are jargon-laden and impenetrable to newcomers. Well, maybe just impenetrable to this newcomer. One great revelation to me was a historical note as to WHY there was a need to develop Hadoop: all those pages that had to be indexed by search engines at Google and Yahoo. So I went out to find more details. This article, with a ton of references throughout, is an excellent introduction to the precursors to Hadoop, the development of Hadoop itself, and the explosion of systems that used Hadoop as their foundation.

Recommended: Cleaning Words with R: Stemming, Lemmatization & Replacing with More Common Synonym

In many text mining or natural language processing applications, you will have problems with words that are very similar, but which are counted separately. An example might be the words win, winner, and winning. You can combine these words into a single category using stemming. This blog post gives a nice overview of stemming.
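
To make the idea concrete, here is a minimal sketch of stemming in Python. It is a toy suffix stripper written for illustration, not the method used in the recommended blog post and not a real algorithm such as the Porter stemmer. The suffix list and the function name `toy_stem` are assumptions made up for this example.

```python
from collections import Counter

# Hypothetical, deliberately tiny suffix list (an assumption for illustration).
SUFFIXES = ("ing", "er", "s")


def toy_stem(word):
    """Strip one common suffix and collapse a doubled final consonant."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Only strip when at least three letters would remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            stem = word[: -len(suffix)]
            # "winn" -> "win": undo the consonant doubling left behind by
            # "winning" or "winner" (vowels and l, s, z are left alone).
            if len(stem) >= 2 and stem[-1] == stem[-2] and stem[-1] not in "aeioulsz":
                stem = stem[:-1]
            return stem
    return word


# Words that would otherwise be counted separately fall into one category.
counts = Counter(toy_stem(w) for w in ["win", "winner", "winning", "wins"])
print(counts)
```

A real stemmer handles far more suffixes and exceptions, so treat this only as a picture of what "combining words into a single category" means.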

Recommended: Adherence to Methodological Standards in Research Using the National Inpatient Sample

I normally don’t recommend articles that are stuck behind paywalls, but this is an important article. It shows that 85% of a sample of research studies using the National Inpatient Sample database failed to follow at least one of seven well-documented practice recommendations of the Agency for Healthcare Research and Quality.

PMean: Sentiment analysis of A Christmas Carol

I was at an interesting talk about sentiment analysis and decided to try something simple myself. Sentiment analysis is a text analytics method that compares text data with a list of words with positive or negative sentiments. The relative frequency of the positive or negative words is a crude measure of the general sentiment of the text item. I ran a sentiment analysis on the text of the famous Charles Dickens novel, A Christmas Carol.
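
The word-list approach described above can be sketched in a few lines of Python. This is a minimal illustration, not the analysis from the full post: the two tiny word lists and the function name `sentiment_score` are assumptions invented for the example, whereas a real analysis would use a published sentiment lexicon.

```python
import re

# Hypothetical miniature lexicons (assumptions for illustration only).
POSITIVE = {"merry", "happy", "joy", "good"}
NEGATIVE = {"humbug", "dead", "cold", "misery"}


def sentiment_score(text):
    """Return (positive count - negative count) / total words.

    The result lies between -1 and 1; 0.0 is returned for empty text.
    """
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    positive = sum(word in POSITIVE for word in words)
    negative = sum(word in NEGATIVE for word in words)
    return (positive - negative) / len(words)


print(sentiment_score("A merry Christmas"))
print(sentiment_score("Bah humbug"))
```

Scoring a novel chapter by chapter with a function like this would give a rough picture of how the sentiment shifts over the course of the story.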

Recommended: An introduction to implementation science for the non-specialist

I’ve done a lot of work with Evidence-Based Health, but one big and largely unsolved problem is how to get health care professionals to change their practices once the evidence for these changes becomes obvious. If no one changes in the face of evidence, then all the effort to produce and critically appraise the evidence becomes worthless. A new field, implementation science, has been developed to find methods that encourage the adoption of new evidence-based practices. This paper outlines how implementation science is supposed to work and offers two real-world examples of implementation science studies.