Category Archives: Statistics

PMean: A biased sample of car speeds

Dear Professor Mean, I read a newspaper report about speed limits and how few people obeyed them. A reporter decided to collect some hard data and drove exactly at the speed limit (55 mph in this particular setting). The reporter noticed that nine cars passed his car for every car that he passed, and concluded that most people are breaking the speed limit. I’m wondering if this is really a valid way to collect data.

PMean: Sample size for a study of reproducibility

Dear Professor Mean: I am using a risk stratification tool for patients presenting to the ED with chest pain. This has been a well-validated tool in the ED, but I want to show that the scores are reproducible irrespective of the grade of doctor or assessment nurse calculating the score. I’m going to collect a convenience sample of patients presenting to the ED, and after I get informed consent, I will have those patients assessed separately by a triage-trained nurse, an intern doctor, a registrar and a consultant. I will calculate agreement using the intraclass correlation coefficient (ICC). My question is: How do I calculate the sample size in this context?

PMean: History of SPSS

I’m helping to put together three separate classes, Basic data management and analysis with R [SAS / SPSS]. As part of these classes, I need to discuss the history of these programs, because understanding that history will help you better understand the strengths and weaknesses of each statistical package. Here’s a brief history of SPSS.

PMean: What should go into a data codebook

Before you start your data entry, you should create a data codebook. If you don’t have a data codebook when you hand your data over to someone else, take the time to create one for their benefit and yours. The data codebook contains a description of your data set. There’s no standard form for a data codebook, and what you describe may depend on a variety of factors, such as the complexity of your data set, the number of people involved in data collection and data entry, and the number of people that you are likely to share your data with. Here are some of the elements that you should think about putting in a data codebook.

PMean: History of R

I’m helping to put together three separate classes, Basic data management and analysis with R [SAS / SPSS]. As part of these classes, I need to discuss the history of these programs, because understanding that history will help you better understand the strengths and weaknesses of each statistical package. Here’s a brief history of R.

PMean: Course proposal on setting up an independent consulting practice

I was asked to prepare a proposal for a short course about consulting for an upcoming Statistics conference. I had talked about this in an earlier blog post. Here is the official submission, which includes the course description, outline and objectives, some information about my qualifications to teach the class, and a summary of how this class fits into the theme of the conference. Here’s what I wrote.

PMean: Using BUGS within the R programming environment

I am giving a talk today for the Kansas City R Users group about BUGS (Bayesian inference Using Gibbs Sampling). I have already written extensively about BUGS and the interface to BUGS from within the R programming environment, and you can find these pages on my category page for Bayesian statistics. Here is a quick overview of why you might want to use BUGS and how you would use it. I’ve included links to the relevant pages on my website so you can explore this topic further on your own.