I am working on a project that has a series of files in R and R Markdown, and I need to track down where a particular function was originally defined. In Unix-based systems, this is pretty easy to do. Continue reading
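The teaser above does not show the Unix approach the post goes on to describe, so what follows is only a minimal sketch of the same idea in Python. It assumes the function was defined with the usual R pattern `name <- function(...)` or `name = function(...)` at the start of a line. The helper names `definition_lines` and `find_definitions` are hypothetical and do not come from the post.

```python
import re
from pathlib import Path


def definition_lines(text, name):
    """Return 1-based line numbers where `name` is assigned a function."""
    # Assumption: definitions look like `name <- function(` or `name = function(`.
    pattern = re.compile(rf"^\s*{re.escape(name)}\s*(<-|=)\s*function\b")
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if pattern.search(line)
    ]


def find_definitions(root, name):
    """Search every .R and .Rmd file under `root` for a definition of `name`."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in {".r", ".rmd"}:
            text = path.read_text(encoding="utf-8", errors="replace")
            hits.extend((str(path), n) for n in definition_lines(text, name))
    return hits


if __name__ == "__main__":
    # Hypothetical usage: look for `my_fun` in the current directory tree.
    for filename, line_number in find_definitions(".", "my_fun"):
        print(f"{filename}:{line_number}")
```

This sketch only finds definitions, not calls, and it would miss a function created in other ways (for example with `assign()`).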
Category Archives: Statistics
PMean: Business essentials that you need to know before starting your career as an independent statistical consultant
I am giving a presentation on the business essentials that you need to know before you start a career in independent statistical consulting. Here is a summary of what I’ll be talking about. Continue reading
PMean: Simulating power for a test of association in a two by two table
In an earlier blog post, I slogged through the calculation of power for a test of association in a two by two table. You can also approximate power using a simulation. It is done quite easily in R, but I want to show it in SPSS. Why? Just because. Continue reading
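The post itself works in SPSS and mentions R. The sketch below uses Python instead, purely to illustrate the general idea of approximating power by simulation. It assumes two groups of equal size, a Pearson Chi-squared statistic without continuity correction, and a 5% significance level. The function names, the default of 2,000 replications, and the seed are illustrative choices, not taken from the post.

```python
import random

# Approximate upper 5% point of the Chi-squared distribution with 1 degree of freedom.
CRITICAL_VALUE = 3.841


def chi_squared_2x2(a, b, c, d):
    """Pearson Chi-squared statistic for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        # An empty row or column carries no evidence of association.
        return 0.0
    return n * (a * d - b * c) ** 2 / margins


def simulated_power(p1, p2, n_per_group, n_sims=2000, seed=1):
    """Estimate power as the share of simulated tables that reject the null."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        # Draw the number of events in each group from a binomial distribution.
        x1 = sum(rng.random() < p1 for _ in range(n_per_group))
        x2 = sum(rng.random() < p2 for _ in range(n_per_group))
        stat = chi_squared_2x2(x1, n_per_group - x1, x2, n_per_group - x2)
        if stat > CRITICAL_VALUE:
            rejections += 1
    return rejections / n_sims


if __name__ == "__main__":
    print(simulated_power(0.5, 0.3, 93))
```

Because the estimate is a proportion of random replications, it varies from seed to seed; increasing `n_sims` reduces that noise.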
PMean: Calculating power for a test of association in a two by two table
A colleague was curious to see the formulas behind the power calculations done by many statistical software programs and online calculators. In particular, she wanted to see the formula used for power of the Chi-squared test of association in a two-dimensional contingency table. It gets pretty messy for anything larger than a two by two table, but even a two by two table is a bit tricky. Here is one mathematical approach that you can use for a power calculation. Continue reading
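The teaser does not say which formula the post uses, so the sketch below shows one common textbook approximation rather than the post's own derivation. It treats the two by two table as a comparison of two independent proportions with equal group sizes, uses a normal approximation, and ignores the negligible chance of rejecting in the wrong direction. The function name and arguments are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approximate_power(p1, p2, n_per_group, alpha=0.05):
    """Normal-approximation power for comparing two independent proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    p_bar = (p1 + p2) / 2
    # Standard deviation scale under the null (pooled) and under the alternative.
    sd_null = sqrt(2 * p_bar * (1 - p_bar))
    sd_alt = sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    z = (abs(p1 - p2) * sqrt(n_per_group) - z_alpha * sd_null) / sd_alt
    return NormalDist().cdf(z)


if __name__ == "__main__":
    # Proportions of 0.5 and 0.3 with 93 per group give power of about 0.80.
    print(round(approximate_power(0.5, 0.3, 93), 2))
```

Other approaches, such as one based on the noncentral Chi-squared distribution, give similar but not identical answers.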
PMean: A rant about the “reader pays” model of publication
Something came up in our department about a predatory pay-for-access journal that was soliciting support. All the appropriate warnings were made (there’s a nice explanation of predatory open access publishing at Wikipedia, if you’re curious). But I felt that I had to make a strong defense of the value provided by legitimate open access publishers. Here’s a summary of what I wrote. Continue reading
PMean: How detailed should I make my data analysis plan?
Dear Professor Mean, I could use some advice on defining and following analysis plans for research proposals. I can see how in well-trodden research, where the nuances of the data are well understood, and reasonable distributional assumptions are already identified, a detailed analysis plan may be straightforward to develop. But what about cases where you collect observational data, and though you may have specific hypotheses in mind prior to collecting it, you can’t really pin down the most appropriate analysis until after you’ve done some exploration? Is there generally an acceptable amount of leeway in cases like these, where your analysis doesn’t follow the original plan to the dot, but it’s still designed to address the same question? Or must one be as specific and detailed in the plan as possible, and consider every contingency (e.g., if the distribution is noisy, switch to this non-parametric test)? Or does it really vary with whoever the reviewer is? Continue reading
PMean: Information about the Great Plains Collaborative Breast Cancer Study
This page is moving to a new website.
I’m starting a new project as an independent consultant. Normally, I talk very little on my blog about specific projects that I work on, but this work is for a group, the Great Plains Collaborative (GPC), that is open about almost every aspect of the work they do. That’s music to my ears. Anyway, the GPC is involved with several projects, and one that I might do some of my work on is the Great Plains Collaborative Breast Cancer Study. Here is some information about this study, culled from sources available to anyone on the Internet. Continue reading
PMean: Two grumpy editorials at NEJM
I don’t have time to follow the journals as closely as I should, but I was shocked to find two very nasty editorials in recent issues of the New England Journal of Medicine. They are sharply critical of open sharing of data and of quality improvement efforts. Continue reading
PMean: My research interests
I got an email from someone at UMKC with the title, Director of Undergraduate Research. She was
“looking through the abstract booklet from the Faculty Research Symposium sponsored by Lawrence Dreyfus’s office at the end of last semester, and I was really intrigued by your presentation on the likelihood ratio slide rule. That’s just the kind of innovative work that undergraduate students would like to be involved with, and you would be an awesome mentor for undergraduate researchers.”
Flattery always works with me, so I took her suggestion of setting up a faculty profile that undergraduate students at UMKC could review. Here’s what I put on that profile. Continue reading
PMean: Odds ratio with a zero in the table
Dear Professor Mean, I have a data set where 94 out of 100 patients with cancer have activation of a gene, while 0 out of 50 of the controls have activation. When I compute the odds ratio, I get (94*50) / (6*0) = 4700 / 0 = ???. What should I do? Continue reading
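The teaser stops before Professor Mean's answer, so the sketch below does not claim to reproduce it. It illustrates one widely used workaround, adding 0.5 to every cell when any cell is zero (often called the Haldane-Anscombe correction), using the counts from the question. The function name and the cell labels `a`, `b`, `c`, `d` are hypothetical.

```python
def odds_ratio(a, b, c, d, correction=0.5):
    """Odds ratio (a*d)/(b*c); adds `correction` to every cell if any cell is zero."""
    if 0 in (a, b, c, d):
        # Assumption: apply the correction to all four cells, not only the zero cell.
        a, b, c, d = (cell + correction for cell in (a, b, c, d))
    return (a * d) / (b * c)


if __name__ == "__main__":
    # Cases: 94 with activation, 6 without. Controls: 0 with activation, 50 without.
    print(round(odds_ratio(94, 6, 0, 50), 2))  # (94.5 * 50.5) / (6.5 * 0.5) = 1468.38
```

The corrected estimate is finite but still very imprecise, so it is best reported together with an interval estimate.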