Author Archives: pmean

Recommended: Why R is Bad for You

Arguing about R versus SAS often takes on a religious fervor, so I normally hesitate to recommend articles that trash one package or the other. But this one raises an interesting point which makes it worth reading. Note that “recommended” does not mean that I endorse these conclusions. But rather than bias you with my perception of the issue, just read this on your own. Continue reading

PMean: Extremely imbalanced multi-center trials

There was some recent discussion of issues with multi-center trials where one center dominates, contributing as much as 94% of all the patients. What does this do to the generalizability of the study? I wanted to summarize these comments here, because they relate to some of the issues I’m looking at right now in accrual models for multi-center trials. Continue reading

Recommended: ROSE: A package for binary imbalanced learning

Logistic regression and other statistical methods for predicting a binary outcome run into problems when the outcome being tested is very rare, even in data sets big enough to ensure that the rare outcome occurs hundreds or thousands of times. The problem is that attempts to optimize the model across all of the data will concentrate on fitting the negative cases, and could easily ignore and misclassify all or almost all of the positive cases since they constitute such a small percentage of the data. The ROSE package generates artificial balanced samples to allow for better estimation and better evaluation of the accuracy of the model. Continue reading
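
To make the imbalance problem concrete, here is a minimal Python sketch using made-up data (10,000 cases, 1% positive; these numbers are an assumption for illustration, not from the article). It shows that a rule which never predicts a positive still scores 99% accuracy while catching none of the rare cases. It then rebalances the data by plain random oversampling of the positives. That is a simpler stand-in, not what ROSE does: ROSE draws artificial cases from a smoothed bootstrap rather than repeating existing rows. The function names are hypothetical.

```python
import random

def accuracy(y_true, y_pred):
    """Share of cases where the prediction matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(y_true, y_pred):
    """Share of the true positives that were predicted positive."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == 1]
    return sum(preds_for_positives) / len(preds_for_positives)

def oversample_minority(labels, rng):
    """Return row indices for a balanced sample.

    Keeps every negative (label 0) and draws positives (label 1) with
    replacement until both classes are the same size. Assumes label 1
    is the rare class.
    """
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    return neg + [rng.choice(pos) for _ in neg]

# Hypothetical data: 100 positives among 10,000 cases.
y = [1] * 100 + [0] * 9900

# A "model" that ignores the rare outcome entirely.
always_negative = [0] * len(y)
print(accuracy(y, always_negative))  # 0.99
print(recall(y, always_negative))    # 0.0

# Rebalance by naive oversampling (not ROSE's smoothed bootstrap).
rng = random.Random(0)
balanced = oversample_minority(y, rng)
share = sum(y[i] for i in balanced) / len(balanced)
print(share)  # 0.5
```

A model fit to the balanced rows can no longer reach high accuracy by ignoring the positives, although its performance should still be judged on data with the original class proportions.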

PMean: How big is the stuff I’m working on

I have been working part-time on a project for the Great Plains Collaborative (GPC) under the direction of Russ Waitman and the gentle guidance of Dan Connolly, both at Kansas University Medical Center. I am hoping to submit a paper soon on the work I’ve done, but if you are curious about the size and scope of the electronic health records that I’ve been slinging around, this blog entry might help. Continue reading