Someone wrote asking me about a variation of the “Rule of Three”. This rule says that if you observe zero events out of n, an approximate upper 95% confidence limit for the probability of the event is 3/n. So suppose you operated on 10 patients and none of them died after surgery. Then you would be 95% confident that the mortality rate is 30% (3/10) or less. This person asked “Suppose I repeatedly sample from a population and every patient in the sample was a G. How likely is it that the entire population is Gs?” This flips the problem around, and is equivalent to saying that the probability of survival is 1 − 3/n or greater (70% or greater for the 10 patients above). But this person wanted an estimate of the probability that the probability in the population is 1. Continue reading
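To make the arithmetic behind the Rule of Three concrete, here is a minimal sketch in Python. It is not taken from the original post. The function names are made up for illustration, and the "exact" limit assumes a simple binomial model, where the upper limit is the value of p that solves (1 − p)^n = 0.05.

```python
def rule_of_three_upper(n: int) -> float:
    """Approximate upper 95% limit for the event probability
    when zero events are observed in n trials (hypothetical helper)."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    # The approximation 3/n can exceed 1 for very small n, so cap it.
    return min(1.0, 3.0 / n)


def exact_upper(n: int, confidence: float = 0.95) -> float:
    """One-sided upper limit under a binomial model (an assumption):
    the p that solves (1 - p)**n = 1 - confidence."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    return 1.0 - (1.0 - confidence) ** (1.0 / n)


if __name__ == "__main__":
    # Compare the approximation with the exact limit for a few sample sizes.
    for n in (10, 30, 100):
        approx = rule_of_three_upper(n)
        exact = exact_upper(n)
        print(f"n={n:4d}  3/n={approx:.4f}  exact={exact:.4f}  "
              f"survival at least {1 - approx:.4f}")
```

For 10 patients the approximation gives 0.30, while the exact binomial limit is a bit smaller, so the Rule of Three errs on the conservative side. Neither number answers the question of how likely it is that the population probability is exactly 1.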

# Monthly Archives: July 2014

# PMean: An example of a simple sample size justification

Someone asked me for a sample size justification for a study involving a historical control group of 30 patients and a treatment group of unspecified size. I thought it would be nice to document the mechanics of this calculation here, as an example for future clients. It uses a program, Piface, developed by Russ Lenth for sample size calculations. Continue reading

# PMean: A brief abstract of my research interests

I attended a working group meeting for researchers interested in Chem bio Defense applications. They asked each attendee to write a brief abstract of their research interests and expertise to share with the others. This will help in identifying areas of possible collaboration. Here is what I wrote: Continue reading

# Recommended: Special issue–Using Big Data to Transform Care

The July 2014 issue of Health Affairs is devoted entirely to “big data”. The articles provide a general overview of big data, several applications of big data, big data and genomics, use of electronic health records, and ethical issues including privacy concerns. For now, at least, the articles are available for free to any user. Continue reading

# PMean: Simple longitudinal data sets to illustrate data management

I am working on a class that will teach basic data management and graphics using the R programming language, with parallel classes in SPSS and SAS. On the third or fourth day of the class, we will look at managing longitudinal data sets, as these require special skills. I wanted to find a couple of reasonably simple longitudinal data sets that were available on the web and that had at least a few missing values in them. Here are a couple of data sets that might work. Continue reading