I was at an interesting talk about sentiment analysis and decided to try something simple myself. Sentiment analysis is a text analytics method that compares text data with a list of words with positive or negative sentiments. The relative frequency of the positive or negative words is a crude measure of the general sentiment of the text item. I ran a sentiment analysis on the text of the famous Charles Dickens novel, A Christmas Carol. Continue reading
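The word-list comparison described above can be sketched in a few lines of R. This is a minimal illustration, not the analysis from the post: the two word lists, the `sentiment_score` function, and the sample sentence are all made up for this example, and a real analysis would presumably use a published sentiment lexicon and the full text of the novel.

```r
# Hypothetical, tiny word lists; a real analysis would use a published lexicon.
positive_words <- c("merry", "happy", "joy", "good", "love")
negative_words <- c("cold", "dark", "dead", "poor", "misery")

# Relative frequency of positive and negative words in a piece of text.
sentiment_score <- function(text, positive, negative) {
  # Lowercase, then split on anything that is not a letter or an apostrophe.
  words <- unlist(strsplit(tolower(text), "[^a-z']+"))
  words <- words[words != ""]
  n_pos <- sum(words %in% positive)
  n_neg <- sum(words %in% negative)
  c(positive = n_pos / length(words),
    negative = n_neg / length(words),
    net      = (n_pos - n_neg) / length(words))
}

# A made-up sentence, not a quotation from the novel.
passage <- "A merry and happy morning followed the cold night"
result <- sentiment_score(passage, positive_words, negative_words)
print(result)
```

The `net` value is the crude measure: positive when positive words outnumber negative ones, negative otherwise.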

# Category Archives: Statistics

# PMean: Reading text data into SAS

Some of my students in my Introduction to SAS class are having trouble with a particular data set. Here are some screen shots showing how SAS can read this file in several different ways. Continue reading

# PMean: Do you need to name your function arguments in R?

If you program anything in R, you’ll end up calling a lot of functions. You pass your data or your constants to these functions, and you can do it in one of two ways. You can either pass the data/constants in the order in which the function expects the arguments or you can match each data/constant value with a particular argument name. This came up in the context of a question: do I need to save everything using

save.image(file="foo.RData")

or can I save it with

save.image("foo.RData")? Continue reading
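The difference between matching by position and matching by name can be seen with a small stand-in function. The `describe` function below is hypothetical and only mimics the shape of a function whose first argument is `file`; it does not save anything.

```r
# A hypothetical two-argument function, used only for illustration.
describe <- function(file, compress = TRUE) {
  paste("file =", file, "| compress =", compress)
}

by_name     <- describe(file = "foo.RData")  # matched by argument name
by_position <- describe("foo.RData")         # matched by position
print(identical(by_name, by_position))

# With names, the order in which you supply the arguments does not matter.
named_any_order <- describe(compress = FALSE, file = "foo.RData")

# Without names, values are assigned in the order the function declares them.
positional <- describe("foo.RData", FALSE)
print(identical(named_any_order, positional))
```

Since `file` is the first argument in the documented signature of `save.image`, the two calls in the question should behave the same way; naming the argument mostly helps readability and guards against getting the order wrong.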

# PMean: Possible webinars on consulting

I’ve gotten an inquiry about teaching a couple of webinars. Nothing’s official yet, but let me outline these webinars here on my blog. If the offer becomes official, I will post an update here or in a new post. Continue reading

# PMean: My work on a CTSA grant

I’m on a Clinical and Translational Science Award (CTSA) research grant (5UL1TR000001-05, formerly 1U54RR031295-01A1), which is pretty cool. My name is even mentioned a few times in the grant. I thought that as I plan what I would do for this grant, I would see what the grant promised and write down what, exactly, those promises mean. As I talk with various people (especially Russ Waitman, who is supervising my work on this grant), I will revise and update my plans. Still, I thought it would be valuable to put some thoughts down now, both to help me focus on what I should be doing and to offer an early draft of those ideas to the various people that I will end up interacting with. Continue reading

# PMean: Charging by the hour versus charging by the project

I was talking about pricing models for consulting in an email exchange and I thought I’d extract some of those comments for this blog. When you are an independent consultant, you need to decide whether you will charge by the hour or charge a flat fee for the entire project. Continue reading

# PMean: Syllabus, Introduction to SAS, Spring 2018

I am teaching a class, Introduction to SAS (MEDB 5507). Here is the syllabus for Spring Semester 2018. Continue reading

# PMean: Syllabus for Introduction to R, Spring 2018

I am teaching a class, Introduction to R (MEDB 5505). Here is the syllabus for Spring Semester 2018. Continue reading

# PMean: SAS University. It’s SAS and it’s free

I am teaching a class, Introduction to SAS, that I helped design, but one where another faculty member did all the heavy lifting. I used to teach SAS classes, and I even helped organize a regional SAS conference, but stopped abruptly in 1998. So I’m relearning SAS, and one thing that is helping a lot is a product called SAS University, which allows you to use SAS for non-commercial purposes for free. Here’s how SAS University works. Continue reading

# PMean: Another big data publication

I dislike the term “big data” because it implies a class of problems that are immune to normal statistical considerations. I will admit that certain concepts, such as the p-value, become meaningless when you have millions of observations. But other concepts, like selection bias, become even more important for big data.
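One way to see the point about p-values is a quick simulation. This is a sketch under assumed numbers: the group size of one million and the true difference of 0.01 standard deviations are invented for illustration, and the variable names are arbitrary.

```r
set.seed(2018)

# Two simulated groups of a million observations each.
# The true difference in means is 0.01 standard deviations,
# far too small to matter in most practical settings.
n <- 1e6
group_a <- rnorm(n, mean = 0.00, sd = 1)
group_b <- rnorm(n, mean = 0.01, sd = 1)

test <- t.test(group_b, group_a)
observed_difference <- unname(test$estimate[1] - test$estimate[2])

print(observed_difference)  # estimated difference in means
print(test$p.value)         # p-value for the two-sample t-test
```

With samples this large, the test has enough power to flag a negligible difference, so a small p-value says little about whether the difference is important. The estimate and its confidence interval are more informative than the p-value alone.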

Anyway, I now have a second publication that is directly tied to the big data movement. Continue reading