Author Archives: pmean

PMean: Stretching an already borderline sample size

This page has moved to a new website.

PMean: Two data sets illustrating the analysis of continuous variables

This page has moved to a new website.

PMean: Using statistical design principles to plan a Monte Carlo analysis – part 2

I’ve been working more on a Monte Carlo study of various Bayesian estimators and it makes me think about certain principles that we statisticians use in experimental design that could help us not just with other people’s laboratory studies, but with Monte Carlo studies, which are our own laboratories. This is a continuation of an earlier blog post. One important principle is variable transformation. We almost always conceptualize and analyze proportions using the logit transformation, and this transformation can help a lot with Monte Carlo studies as well. Continue reading →
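As a rough illustration of why the logit scale helps when planning a simulation, the sketch below picks "true" proportions that are evenly spaced on the logit scale rather than on the raw 0 to 1 scale. The function names (`logit`, `inv_logit`, `logit_grid`) and the endpoints 0.01 and 0.99 are assumptions chosen for illustration; the excerpt does not show how the transformation is actually used in the study.

```python
import math


def logit(p):
    """Map a proportion in (0, 1) to the log-odds scale."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Map a log-odds value back to a proportion in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def logit_grid(p_low, p_high, n_points):
    """Return n_points proportions evenly spaced on the logit scale.

    Hypothetical helper: one way to choose the true proportions at which
    a Monte Carlo study evaluates an estimator.
    """
    lo, hi = logit(p_low), logit(p_high)
    step = (hi - lo) / (n_points - 1)
    return [inv_logit(lo + i * step) for i in range(n_points)]


grid = logit_grid(0.01, 0.99, 5)
print([round(p, 4) for p in grid])
```

A grid built this way puts more of its points near 0 and 1, where the behavior of estimators of a proportion tends to change fastest, and it is symmetric around 0.5.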

PMean: Post hoc sample size calculations

Someone asked on the MEDSTATS email discussion group for references on post hoc power calculation and, as we all know, calculating power after the fact is a very bad idea. But someone offered a setting where a post hoc power calculation might make sense. It’s worth discussing, because what you really would want in that setting is a post hoc sample size calculation. Continue reading →

Recommended: More on Big Data Training for the Scientific Workforce

The United States National Institutes of Health is very interested in big data and has developed a working group, Big Data to Knowledge (BD2K). This blog post from Sally Rockey, the Deputy Director for Extramural Research, summarizes some of the recent activity of BD2K. Continue reading →

Recommended: Predicting clinical trial results based on announcements of interim analyses

If you’ve ever been involved with interim reviews of clinical trials on a DSMB (Data and Safety Monitoring Board), you will have been warned about the importance of confidentiality. There are two big reasons for this. First, leaking of interim trial results could lead to insider trading. News that the trial is going well would lead to a jump in stock prices and news that the trial is going poorly would lead to a dip in stock prices. If someone gets early news from the DSMB, they could profit from that inside information. Continue reading →

Quote: “Excel’s graphics can be great. The problem occurs …

…when people assume that the Excel output is enough. I think of all the research papers in economics where the authors must have spent dozens of hours trying all sorts of different model specifications, dozens of hours writing and rewriting the prose of the article, . . . and 15 minutes making the graphs.” Andrew Gelman, quoted at http://andrewgelman.com/2009/04/22/more_on_data_vi/.

PMean: Estimating the efficiency of a completely randomized block design

I needed to look up a formula for estimating the relative efficiency of a completely randomized block design compared to a design without blocking. Continue reading →
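For readers who want something concrete, here is a minimal sketch of one commonly cited textbook estimate. It is not necessarily the formula the post settles on, and it omits the optional degrees-of-freedom correction. With b blocks, t treatments, and the block and error mean squares from the blocked analysis, the variance that a design without blocking would have had is estimated as ((b - 1) MS_blocks + b (t - 1) MS_error) / (bt - 1), and the relative efficiency is that quantity divided by MS_error. The function name and the example numbers are hypothetical.

```python
def relative_efficiency_rcbd(ms_blocks, ms_error, n_blocks, n_treatments):
    """Estimated efficiency of a blocked design relative to no blocking.

    Uses the mean squares from the blocked ANOVA table. Values above 1
    suggest that blocking reduced the error variance.
    """
    b, t = n_blocks, n_treatments
    # Estimated error variance had the same units been run without blocks.
    est_unblocked_variance = (
        (b - 1) * ms_blocks + b * (t - 1) * ms_error
    ) / (b * t - 1)
    return est_unblocked_variance / ms_error


# Made-up mean squares, for illustration only.
re = relative_efficiency_rcbd(ms_blocks=30.0, ms_error=6.0,
                              n_blocks=5, n_treatments=4)
print(round(re, 3))
```

When the block mean square equals the error mean square, the estimate is exactly 1, which matches the intuition that blocking bought nothing.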

PMean: Using statistical design principles to plan a Monte Carlo analysis

I want to run a Monte Carlo analysis of various Bayesian estimators to see how they perform when the prior distribution is “wrong”. I’m like everyone else: I just plunge in and start. But halfway through the Monte Carlo analysis, I realized that I could make my life easier and produce a better-quality Monte Carlo analysis if I used basic statistical design principles. Here’s a brief outline of some of these design principles. Continue reading →

PMean: Comparing two vectors with possible missing values

I want to compare two vectors in R that have the same length and identify where, if anywhere, the two vectors don’t match. It sounds like an easy thing to do, but missing values muddy things up quite a bit. Continue reading →
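The difficulty is that a comparison involving a missing value has no clear true or false answer, so a rule has to be chosen. The sketch below is a Python analogue and not the R code from the post: `None` stands in for a missing value, and the convention that two missing values count as a match while one missing value counts as a mismatch is an assumption. The function name `mismatch_positions` is hypothetical.

```python
def mismatch_positions(x, y):
    """Return the 0-based positions where two equal-length lists differ.

    Convention assumed here: missing (None) in both lists is a match;
    missing in only one list is a mismatch.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    mismatches = []
    for i, (a, b) in enumerate(zip(x, y)):
        if a is None and b is None:
            continue  # both missing: treated as matching
        if a is None or b is None or a != b:
            mismatches.append(i)
    return mismatches


x = [1, 2, None, 4, None]
y = [1, 3, None, None, 5]
print(mismatch_positions(x, y))  # [1, 3, 4]
```

A different convention, such as treating any position with a missing value as undetermined, would need a third return category rather than a simple list of positions.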