I have not had time to preview this software, but it looks very interesting. It takes large problems and converts them to a form suitable for parallel processing, not by changing the underlying algorithm, which would be very messy, but by splitting the data into subsets, analyzing each subset, and recombining the results. Such a method, “Divide and Recombine,” should work well for some analyses, but perhaps not so well for others. It is based on the R programming language. If I get a chance to work with this software, I’ll let you know what I think. Continue reading
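The divide-and-recombine idea can be sketched in a few lines. This is a toy Python illustration with made-up data, not the actual R-based software; the per-subset analysis step is the part that can run in parallel.

```python
# Toy divide-and-recombine sketch: estimate a mean by splitting the
# data into subsets, analyzing each subset independently (the step
# that parallelizes), and recombining the per-subset results.
data = list(range(1, 101))   # stand-in for a large dataset

k = 4                        # number of subsets
subsets = [data[i::k] for i in range(k)]         # divide

partials = [(sum(s), len(s)) for s in subsets]   # analyze each subset

total = sum(t for t, _ in partials)              # recombine
n = sum(c for _, c in partials)
print(total / n)   # 50.5, identical to the all-data mean here
```

For a mean the recombination is exact; for more complex analyses (e.g. regression coefficients) the recombined estimate is generally an approximation, which is presumably why the method suits some analyses better than others.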
Category Archives: Recommended
Recommended: Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD Statement
If you are writing up a paper that uses a complex regression model (complex meaning multiple independent variables), you need to document information that allows the reader to assess the quality of the predictions that your model would produce. This paper provides a checklist of things that you need to document in such a paper, and does for this type of research what the CONSORT guidelines do for randomized trials. Continue reading
Recommended: In search of justification for the unpredictability paradox
This is a commentary on a 2011 Cochrane Review that found substantial differences between studies that were adequately randomized and those that were not. The direction of the difference was not predictable, however, meaning that there was no consistent bias, on average, towards either overstating or understating the treatment effect. This led the authors of the Cochrane Review to conclude that “the unpredictability of random allocation is the best protection against the unpredictability of the extent to which non-randomised studies may be biased.” The authors of the commentary critique this conclusion on several grounds. Continue reading
Recommended: Requiring fuel gauges. A pitch for justifying impact evaluation sample size assumptions
This blog entry from the International Initiative for Impact Evaluation discusses deficiencies in many of the research proposals sent to that organization. The proposals rely too heavily on standardized effect sizes, which are impossible to interpret and often misleading. The authors also criticize the intraclass correlation coefficients (ICCs) included in the sample size justifications for many cluster-based or hierarchical research designs: the ICCs, they say, often seem to be pulled out of thin air. The ICC can be a hard number to obtain, so they suggest that you consider a range of ICCs in your calculations or that you run a pilot study. Continue reading
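One way to follow that advice is to carry the sample size calculation across a range of plausible ICCs using the standard design effect formula, DEFF = 1 + (m − 1) × ICC. The numbers below are hypothetical, purely for illustration.

```python
# Hypothetical illustration: how the assumed ICC inflates the sample
# size a cluster design needs, via the design effect.
n_srs = 200   # subjects needed under simple random sampling (assumed)
m = 20        # planned number of subjects per cluster (assumed)

for icc in (0.01, 0.05, 0.10):
    deff = 1 + (m - 1) * icc   # design effect
    print(f"ICC={icc:.2f}: DEFF={deff:.2f}, need about {round(n_srs * deff)}")
```

Even modest ICCs roughly double or triple the required sample size in this example, which is why a value pulled out of thin air matters.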
Recommended: What Teachers Should Know about the Bootstrap: Resampling in the Undergraduate Statistics Curriculum
Tim Hesterberg has been a long-time advocate of the use of the bootstrap. In this article, he provides a nice general overview of the bootstrap with examples of how it works in several common settings that might be covered in an introductory-level statistics class. Continue reading
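To give a flavor of the method, here is a generic percentile bootstrap for the mean (my own toy sketch with made-up data, not code from the article):

```python
import random

# Percentile bootstrap for the mean: resample with replacement,
# recompute the statistic each time, and read a confidence interval
# off the percentiles of the bootstrap distribution.
random.seed(1)                                     # for reproducibility
sample = [12, 15, 9, 20, 14, 11, 17, 13, 16, 10]   # made-up data
B = 2000                                           # bootstrap resamples

boot_means = []
for _ in range(B):
    resample = [random.choice(sample) for _ in sample]
    boot_means.append(sum(resample) / len(resample))

boot_means.sort()
lo, hi = boot_means[int(0.025 * B)], boot_means[int(0.975 * B) - 1]
print(f"approximate 95% CI for the mean: ({lo:.1f}, {hi:.1f})")
```

The appeal for teaching is that the same resampling loop works for medians, correlations, or any other statistic: only the line computing the statistic changes.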
Recommended: Communicating Statistical Findings to Consulting Clients Operating in a Decisionmaking Climate: Best and Worst Practices
There were a large number of excellent talks at the 2014 Joint Statistical Meetings. This session discussed practical issues associated with communication. Although I did not attend it, it looks pretty good, and the speakers have placed all their slides in a single location. Continue reading
Recommended: Special issue–Using Big Data to Transform Care
The July 2014 issue of Health Affairs is devoted entirely to “big data”. The articles provide a general overview of big data, several applications of big data, big data and genomics, the use of electronic health records, and ethical issues, including privacy concerns. For now, at least, the articles are available for free to any user. Continue reading
Recommended: MLPowSim software
This site provides a description of a free software package, MLPowSim, that calculates power for complex random effects models. It was developed by the Centre for Multilevel Modelling, the same group that developed the MLwiN package for the analysis of complex random effects models. Continue reading
Recommended: Comparisons within randomised groups can be very misleading
In studies with a baseline measurement, examining the decline exclusively within the treated group, or examining the decline in the treated group and then separately examining the decline in the control group, is a bad idea, note two famous statisticians in the British Medical Journal. They explain why you need instead to look at comparisons between the two groups, ideally with analysis of covariance. Continue reading
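A toy numeric illustration of their point (made-up numbers; the paper recommends analysis of covariance, which is only gestured at here by a simple difference in mean changes): both groups can show an impressive decline on their own, while the between-group comparison, the quantity that actually matters, is small.

```python
# Made-up (baseline, follow-up) pairs for a treated and a control group.
treated = [(140, 130), (150, 138), (145, 136)]
control = [(142, 133), (148, 139), (146, 138)]

def mean_change(group):
    """Average baseline-to-follow-up decline within a group."""
    return sum(b - f for b, f in group) / len(group)

# Both groups decline, so a within-group test in either arm can look
# impressive; the treatment effect is the *difference* between declines.
print(round(mean_change(treated), 2))                         # 10.33
print(round(mean_change(control), 2))                         # 8.67
print(round(mean_change(treated) - mean_change(control), 2))  # 1.67
```

Reporting only the first number (or the first two separately) would overstate what the treatment accomplished; the third number is the between-group comparison.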
Recommended: FDA: R OK for drug trials
This blog post reviews a presentation by Jae Brodowsky, a statistician with the U.S. Food and Drug Administration, that put to bed the rumor that the FDA will only accept submissions where the data analysis was done in SAS. The summary does note that the FDA has certain regulatory requirements for R (or any other statistical package, including SAS). Continue reading