Tag Archives: Sample size

Recommended: Reporting and methodological quality of sample size calculations in cluster randomized trials could be improved: a review

The sample size justification for a cluster randomized trial is messy. It requires the use of an intra-class correlation or something similar (the authors use the term within-cluster correlation). In a review of 300 cluster randomized trials, the reviewers found that only about a third of the trials specified the within-cluster correlation. Even fewer compared this to the within-cluster correlation observed in the data. We need to do better. Continue reading →
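To see why the within-cluster correlation matters so much, here is a minimal sketch of the usual design effect adjustment, 1 + (m - 1) x ICC, where m is the number of subjects per cluster. It assumes equal cluster sizes, and the function names and the inputs (200 per arm, 20 per cluster, a correlation of 0.05) are hypothetical illustrations rather than values from the review.

```python
import math


def design_effect(cluster_size: float, icc: float) -> float:
    """Variance inflation from randomizing clusters instead of individuals."""
    return 1.0 + (cluster_size - 1.0) * icc


def inflate_sample_size(n_individual: int, cluster_size: int, icc: float) -> int:
    """Subjects per arm after accounting for clustering, rounded up."""
    return math.ceil(n_individual * design_effect(cluster_size, icc))


# Hypothetical inputs: 200 per arm under individual randomization,
# 20 subjects per cluster, assumed within-cluster correlation of 0.05.
n_per_arm = inflate_sample_size(200, 20, 0.05)
clusters_per_arm = math.ceil(n_per_arm / 20)
print(f"subjects per arm: {n_per_arm}, clusters per arm: {clusters_per_arm}")
```

Even a correlation this small nearly doubles the required sample size, which is why leaving it out of the justification is a real problem.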

Recommended: The number of subjects per variable required in linear regression analyses

There are several rules of thumb out there about how many subjects you need for a multiple linear regression model. Most of these rules look at the ratio of subjects per variable (SPV). If you have 100 subjects and 20 independent variables in your regression model, then the SPV is 5. This article comes to the surprising conclusion that an SPV of 2 is just fine. In other words, you could have 40 subjects and 20 independent variables and still be okay. This is independent of power considerations, by the way, but it still seems rather small to me. Read the paper yourself and let me know what you think. Continue reading →

PMean: Simulating power for a test of association in a two by two table

In an earlier blog post, I slogged through the calculation of power for a test of association in a two by two table. You can also approximate power using a simulation. It is done quite easily in R, but I want to show it in SPSS. Why? Just because. Continue reading →
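For readers who want the general idea without SPSS or R, here is a rough sketch of the simulation approach in Python, using only the standard library. It is not the code from the post. It assumes two equal-sized groups, uses the Pearson chi-squared statistic without a continuity correction, and the function names and the proportions in the usage line are hypothetical.

```python
import random
from statistics import NormalDist


def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-squared statistic (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # an empty margin gives no evidence of association
    return n * (a * d - b * c) ** 2 / denom


def simulated_power(p1, p2, n_per_group, alpha=0.05, n_sims=2000, seed=1):
    """Fraction of simulated two by two tables in which the test rejects."""
    rng = random.Random(seed)
    # A chi-squared variable with 1 df is a squared standard normal,
    # so its critical value is the squared two-sided normal quantile.
    critical = NormalDist().inv_cdf(1 - alpha / 2) ** 2
    rejections = 0
    for _ in range(n_sims):
        x1 = sum(rng.random() < p1 for _ in range(n_per_group))
        x2 = sum(rng.random() < p2 for _ in range(n_per_group))
        stat = chi_square_2x2(x1, n_per_group - x1, x2, n_per_group - x2)
        if stat > critical:
            rejections += 1
    return rejections / n_sims


# Hypothetical scenario: 30% versus 50% event rates, 100 subjects per group.
print(simulated_power(0.30, 0.50, 100))
```

Running the same function with equal proportions in both groups should return something close to alpha, which is a handy sanity check on the simulation.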

PMean: Calculating power for a test of association in a two by two table

A colleague was curious to see the formulas behind the power calculations done by many statistical software programs and online calculators. In particular, she wanted to see the formula used for power of the Chi-squared test of association in a two-dimensional contingency table. It gets pretty messy for anything larger than a two by two table, but even a two by two table is a bit tricky. Here is one mathematical approach that you can choose for a power calculation. Continue reading →
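As an illustration of what such a formula can look like, here is a sketch of one common normal approximation for comparing two independent proportions with equal group sizes. It is not necessarily the approach taken in the post. It uses the pooled proportion for the standard error under the null hypothesis and ignores the negligible chance of rejecting in the wrong direction. The function name and the inputs are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approx_power_two_proportions(p1, p2, n_per_group, alpha=0.05):
    """Normal-approximation power for a two-sided test of p1 versus p2."""
    z = NormalDist()
    p_bar = (p1 + p2) / 2
    # Standard error under the null (pooled) and under the alternative (unpooled).
    se_null = sqrt(2 * p_bar * (1 - p_bar) / n_per_group)
    se_alt = sqrt(p1 * (1 - p1) / n_per_group + p2 * (1 - p2) / n_per_group)
    z_crit = z.inv_cdf(1 - alpha / 2)
    return z.cdf((abs(p1 - p2) - z_crit * se_null) / se_alt)


# Hypothetical scenario: 30% versus 50% event rates, 100 subjects per group.
print(round(approx_power_two_proportions(0.30, 0.50, 100), 3))
```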

Recommended: PS: Power and Sample Size Calculation

Someone stopped by today with a power calculation and I asked what software they used. They showed me something I had not seen before, a program developed by the Department of Biostatistics at Vanderbilt University (more specifically, William Dupont and Walton Plummer). The Vanderbilt Biostatistics Department is run by Frank Harrell, so you can be pretty sure that anything that they develop will be high quality. Continue reading →

PMean: Validating a test of diabetes

Dear Professor Mean, I have a simple algorithm that determines whether a person is diabetic or not. I am planning on validating this algorithm, and I need to know how many patients I need to sample. Is there a formula I could use? Continue reading →
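One commonly used approach, sketched here under stated assumptions and not necessarily the answer given in the post, is to pick the sample size that estimates sensitivity and specificity to within a chosen margin, then scale up by the expected prevalence of diabetes in the sample. The function names and all the numbers in the usage line (expected sensitivity 0.90, specificity 0.80, prevalence 0.20, margin of 0.05) are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_for_proportion(expected, half_width, confidence=0.95):
    """Subjects needed so a Wald interval for a proportion has the given half-width."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z ** 2 * expected * (1 - expected) / half_width ** 2)


def n_for_validation(sens, spec, prevalence, half_width, confidence=0.95):
    """Total patients so both sensitivity and specificity meet the precision target."""
    n_cases = n_for_proportion(sens, half_width, confidence)     # diabetics needed
    n_noncases = n_for_proportion(spec, half_width, confidence)  # non-diabetics needed
    # Divide by the share of each group expected in the sample.
    return max(ceil(n_cases / prevalence), ceil(n_noncases / (1 - prevalence)))


# Hypothetical planning values, chosen only to show the mechanics.
print(n_for_validation(sens=0.90, spec=0.80, prevalence=0.20, half_width=0.05))
```

When prevalence is low, the sensitivity requirement usually drives the total, because you have to recruit many patients to find enough diabetics.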

Recommended: Requiring fuel gauges. A pitch for justifying impact evaluation sample size assumptions

This blog entry from the International Initiative for Impact Evaluation talks about a deficiency in many research proposals sent to that organization. They rely too much on standardized effect sizes, which are impossible to interpret and often misleading. The authors also criticize the Intraclass Correlation Coefficients (ICCs) that are included in the sample size justification for many cluster based or hierarchical research designs. The ICCs, they say, often seem to be pulled out of thin air. An ICC can be a hard number to get, so they suggest that you consider a range of ICCs in your calculations or that you run a pilot study. Continue reading →
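Considering a range of ICCs is easy to do in practice. The sketch below tabulates the number of clusters per arm across several plausible values, assuming equal cluster sizes and the standard design effect. The function name, the ICC values, the 200 subjects per arm, and the 20 subjects per cluster are all hypothetical.

```python
import math


def clusters_per_arm(n_individual, cluster_size, icc):
    """Clusters needed per arm after inflating by the design effect."""
    deff = 1 + (cluster_size - 1) * icc
    return math.ceil(n_individual * deff / cluster_size)


# Hypothetical inputs: 200 per arm if individually randomized, 20 per cluster.
for icc in (0.01, 0.02, 0.05, 0.10, 0.20):
    print(f"ICC = {icc:.2f}: {clusters_per_arm(200, 20, icc)} clusters per arm")
```

A table like this in a proposal shows reviewers how sensitive the design is to the ICC, instead of asking them to trust a single number.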

Recommended: Sample size of 12 per group rule of thumb for a pilot study

This study is (sadly) not available for free on the Internet, but it is still worth highlighting here. Steven Julious provides some justification for the use of twelve patients per group in a pilot study. This is a useful starting point for discussion, and it may serve as a useful lower bound. I would suggest that you consider the size of the larger trial that you are piloting. For a larger study that might require thousands or tens of thousands of patients, a pilot study of 12 patients per group is woefully inadequate. Continue reading →

PMean: An example of a simple sample size justification

Someone asked me for a sample size justification for a study involving a historical control group of 30 patients and a treatment group of unspecified size. I thought it would be nice to document the mechanics of this calculation here, as an example for future clients. It uses a program, Piface, developed by Russ Lenth for sample size calculations. Continue reading →
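With one group fixed at 30, a useful way to frame the problem is to ask what difference is detectable for each candidate size of the treatment group. The sketch below does this with a normal approximation for a continuous outcome, expressed in standard deviation units. It is not the Piface calculation from the post. The continuous outcome, the 80% power, the two-sided alpha of 0.05, and the function name are all assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def detectable_effect(n_control, n_treatment, alpha=0.05, power=0.80):
    """Smallest detectable mean difference, in SD units (normal approximation)."""
    z = NormalDist()
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return multiplier * sqrt(1 / n_control + 1 / n_treatment)


# The control group is fixed at 30; the treatment group sizes are hypothetical.
for n_treatment in (15, 30, 60, 120):
    print(n_treatment, round(detectable_effect(30, n_treatment), 2))
```

Because the 1/30 term never goes away, adding treatment patients yields diminishing returns: under these assumptions the detectable difference cannot drop much below half a standard deviation no matter how large the treatment group gets.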

Recommended: MLPowSim software

This site provides a description of a free software package, MLPowSim, that calculates power for complex random effects models. It was developed by the Centre for Multilevel Modelling, the same group that developed the MLwiN package for analysis of complex random effects models. Continue reading →