# PMean: The IRB questions my sample size calculation

I got a question today from someone submitting a research protocol to the Institutional Review Board (IRB). The IRB had some concerns about the power calculation. In particular, they said “The IRB would like to know, how you set the parameters for the power calculation, such as effect size, alpha level. For effect size, you need to have some data to justify or should choose a conservative one.”

Part of this was due to an error in the submission of the protocol. We had specified a paired t-test rather than an independent-samples t-test, which is a major gaffe on my part. But the IRB was also pushing into some tricky territory, and I wanted to clear things up. Here is the response I suggested we share with the IRB. Continue reading
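To see how much the paired/independent distinction matters, here is a minimal sketch in Python using statsmodels. The effect size, alpha, and power values below are hypothetical placeholders, not the values from the actual protocol; the point is how far apart the two sample size answers land.

```python
from statsmodels.stats.power import TTestIndPower, TTestPower

# Hypothetical inputs, for illustration only (not the protocol's actual values)
effect_size = 0.5   # standardized difference (Cohen's d)
alpha = 0.05        # two-sided Type I error rate
power = 0.80        # desired power

# Independent-samples t-test: sample size per group
n_independent = TTestIndPower().solve_power(
    effect_size=effect_size, alpha=alpha, power=power)

# Paired t-test: number of pairs (effect size is on the difference scale)
n_paired = TTestPower().solve_power(
    effect_size=effect_size, alpha=alpha, power=power)

print(f"Independent samples: {n_independent:.1f} per group")
print(f"Paired design: {n_paired:.1f} pairs")
```

Note that a paired effect size is defined on the within-pair differences, so the same numeric value of d is not directly comparable across the two designs; specifying the wrong test changes both the formula and the meaning of the inputs.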

# Recommended: The random risks of randomised trials

This is an overview of some of the ethical controversies associated with randomization. It includes an interesting story of an early trial by Archie Cochrane that raised a lot of fuss at the time because it was attacking one of the “sacred cows” of medicine. Continue reading

# PMean: Stretching an already borderline sample size

I was working with a client who had a limited population of medical residents to draw from; it would be a struggle to get even 60 of them. The primary outcome was binary: whether a certain medical procedure was performed properly in a test setting. The intervention was special training on a model; the control was normal training without the model. Then I got a phone call asking what the power would be if we used three groups rather than two. I thought to myself, “Good grief!” You can’t say that to a client, of course, so here’s what I said. Continue reading
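To see why a third arm strains an already borderline design, here is a rough sketch using statsmodels. The success rates are made up for illustration, not taken from the client's study; the point is that splitting the same 60 residents across three arms shrinks every pairwise comparison.

```python
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

# Hypothetical success rates, for illustration only
es = proportion_effectsize(0.90, 0.60)  # intervention vs. control

analysis = NormalIndPower()
alpha = 0.05

# 60 residents split two ways: 30 per arm
power_two = analysis.power(effect_size=es, nobs1=30, alpha=alpha)

# 60 residents split three ways: only 20 per arm in each pairwise comparison
power_three = analysis.power(effect_size=es, nobs1=20, alpha=alpha)

print(f"Two arms:   power = {power_two:.2f}")
print(f"Three arms: power = {power_three:.2f}")
```

And that comparison is generous: with three groups you would usually also need a multiplicity adjustment, which drags the power down further.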

# PMean: Two data sets illustrating the analysis of continuous variables

I am looking for good teaching examples that will illustrate methods for univariate and bivariate analysis of continuous variables. Here are two publicly available data sets that I like. Continue reading
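As a sketch of the kind of analysis I want these examples to support, here is a short Python illustration on synthetic data. The height/weight variables are invented stand-ins, not drawn from the data sets mentioned above.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-in for a teaching data set: height (cm) and weight (kg)
height = rng.normal(170, 10, size=200)
weight = 0.9 * height - 90 + rng.normal(0, 8, size=200)

# Univariate analysis: location, spread, and quartiles
print(f"mean = {height.mean():.1f}, sd = {height.std(ddof=1):.1f}")
print(f"quartiles = {np.percentile(height, [25, 50, 75])}")

# Bivariate analysis: correlation and a least-squares line
r = np.corrcoef(height, weight)[0, 1]
slope, intercept = np.polyfit(height, weight, 1)
print(f"r = {r:.2f}, slope = {slope:.2f}")
```

A good teaching data set would support exactly this sequence: univariate summaries first, then a scatterplot with a correlation and fitted line.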

# PMean: Using statistical design principles to plan a Monte Carlo analysis – part 2

I’ve been working more on a Monte Carlo study of various Bayesian estimators, and it has me thinking about how certain principles that we statisticians use in experimental design could help not just with other people’s laboratory studies but with Monte Carlo studies, which are our own laboratories. This is a continuation of an earlier blog post. One important principle is variable transformation. We almost always conceptualize and analyze proportions on the logit scale, and this transformation can help a lot with Monte Carlo studies as well. Continue reading
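Here is a quick sketch of what the logit transformation buys you in this setting, assuming a simple binomial simulation; the sample size and true proportion are arbitrary choices for illustration.

```python
import numpy as np

def logit(p):
    return np.log(p / (1 - p))

rng = np.random.default_rng(1)

# 1,000 Monte Carlo replicates of an estimated proportion (n = 50, true p = 0.1)
n, p_true = 50, 0.10
successes = rng.binomial(n, p_true, size=1000)

# Add half a success and half a failure (a common continuity adjustment)
# so that logit() never sees an exact 0 or 1
p_adj = (successes + 0.5) / (n + 1)
logit_p = logit(p_adj)

# Raw proportions are bounded and skewed near 0 or 1; the logit scale is
# unbounded and closer to symmetric, which behaves better when you
# summarize results across replicates
print(f"proportion scale: mean = {p_adj.mean():.3f}")
print(f"logit scale:      mean = {logit_p.mean():.3f}")
```

Summarizing the replicates on the logit scale and back-transforming at the end typically gives better-behaved summaries than averaging raw proportions directly, especially when the true proportion is near 0 or 1.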

# PMean: Post hoc sample size calculations

Someone on the MEDSTATS email discussion group asked for references on post hoc power calculations, which, as we all know, are a very bad idea. But someone else offered a setting where a post hoc power calculation might make sense. It’s worth discussing, because what you really want in that setting is a post hoc sample size calculation. Continue reading
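To make the distinction concrete: a post hoc power calculation plugs the observed effect and the observed sample size back into the power formula and learns nothing new, while a post hoc sample size calculation asks what a future study would need. Here is a sketch of the latter, with a made-up observed effect size.

```python
from statsmodels.stats.power import TTestIndPower

# Hypothetical: a completed study observed a standardized effect of d = 0.35
observed_d = 0.35

# Post hoc *sample size* question: how many subjects per group would a
# new study need to detect this effect with 80% power at alpha = 0.05?
n_needed = TTestIndPower().solve_power(
    effect_size=observed_d, alpha=0.05, power=0.80)

print(f"About {n_needed:.0f} per group")
```

The forward-looking question is answerable and useful; recomputing the power of the study you already ran is just a transformation of the p-value you already have.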

# Recommended: More on Big Data Training for the Scientific Workforce

The United States National Institutes of Health is very interested in big data and has developed a working group, Big Data to Knowledge (BD2K). This blog post from Sally Rockey, the Deputy Director for Extramural Research, summarizes some of the recent activity of BD2K. Continue reading

# Recommended: Predicting clinical trial results based on announcements of interim analyses

If you’ve ever been involved with interim reviews of clinical trials on a DSMB (Data and Safety Monitoring Board), you will have been warned about the importance of confidentiality. There are two big reasons for this. First, leaking of interim trial results could lead to insider trading. News that the trial is going well would lead to a jump in stock prices, and news that the trial is going poorly would lead to a dip. If someone gets early news from the DSMB, they could profit from that inside information. Continue reading

# Quote: “Excel’s graphics can be great. The problem occurs …

…when people assume that the Excel output is enough. I think of all the research papers in economics where the authors must have spent dozens of hours trying all sorts of different model specifications, dozens of hours writing and rewriting the prose of the article, . . . and 15 minutes making the graphs.” Andrew Gelman, quoted at http://andrewgelman.com/2009/04/22/more_on_data_vi/.