Category Archives: Statistics

PMean: So you’re thinking about a retrospective chart review

If you are designing a retrospective chart review, you should talk to a statistician early in the process. There are lots of statistical issues that you must think about during the concept development phase of your research. Here is a broad overview of these issues. Continue reading →

PMean: Can you recommend an introductory book on Bayesian Statistics

Someone who recently graduated from our program emailed me asking for a recommendation for an introductory book on Bayesian statistics. It’s kind of a difficult request because the mathematics needed to understand Bayesian statistics is not trivial. Here’s what I recommended. Continue reading →

PMean: Introduction to R, Monday, August 8 through Friday, August 12, 2016

I will be teaching a one-semester-hour class at UMKC, Introduction to R (MEDB 5505), from Monday, August 8, 2016 through Friday, August 12, 2016. It runs from 9am to noon on all five days. This is part of a series of classes that cover a basic introduction to statistical packages: data import, data management, simple graphs, and simple descriptive statistics. The other classes (MEDB 5506 and MEDB 5507) cover SPSS and SAS.

Here are some details about this class. Continue reading →

PMean: Too many different prior choices for the hierarchical beta binomial model

I’m interested in studying how Bayesian hierarchical models work and I want to start with what seems like the simplest case, the hierarchical beta-binomial model. It turns out not to be that simple: there are too many choices for the hyperprior that you use in this setting. Continue reading →

PMean: Minimum standards for a github repository

Using a github repository is new to me, and I need to document some things that may be obvious to more experienced programmers. Here are the minimal documentation standards for a github repository, as far as I understand them. Continue reading →

PMean: Understanding the lasso regression model

I wrote a program in R Markdown that shows how the lasso regression model works. It has too many pictures to be easily ported to this blog, so I’ll share a link to a pdf file instead. You can also find the R Markdown code at my github repository.

PMean: Pretty good work for a part-timer

I had to provide some documentation for my supervisor on what I had done in the past year (May 2015 through April 2016). It’s a very brief summary, but it’s still pretty impressive, if I may say so. Continue reading →

PMean: Trying to save time in R ends up costing me time

Sometimes you end up trying to be so clever that you out-clever yourself. This happened to me today in R. I tried something that was supposed to save me time and it ended up costing me more time instead. Continue reading →

PMean: Examining the storage format for sparse matrices in R

I’ve been working with sparse matrices a bit for my work with the Greater Plains Collaborative. They are a very useful way of storing matrices where most of the entries are zero. This occurs quite often in medical data. There are thousands of medical procedures that you can torture your patients with, so any matrix that has indicator variables for every medical procedure will be quite big. Fortunately, both for us and for the patients, the number of procedures that a particular patient has to endure is quite a bit smaller. So for each row of the matrix, the number of non-zero entries will be very small, probably in the single digits. A sparse matrix will be much smaller because it stores only the locations and values of the non-zero entries. Here’s some R code that shows how this works. I have the code available at my new github site. Continue reading →
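
To make the storage idea concrete, here is a minimal sketch in plain Python. It is not the R code from the post. It builds a column-compressed layout, which is the layout used by the default sparse class in R's Matrix package, and it borrows the slot names `i`, `p`, and `x` from that class. The tiny patient-by-procedure matrix and the helpers `to_csc` and `get_entry` are hypothetical, made up for illustration only.

```python
def to_csc(dense):
    """Convert a dense list-of-rows matrix to column-compressed form.

    i: zero-based row index of each non-zero entry, column by column
    p: where each column starts in i and x (length = number of columns + 1)
    x: the non-zero values themselves
    """
    n_rows = len(dense)
    n_cols = len(dense[0]) if dense else 0
    i, p, x = [], [0], []
    for col in range(n_cols):
        for row in range(n_rows):
            value = dense[row][col]
            if value != 0:
                i.append(row)
                x.append(value)
        p.append(len(i))  # running count of non-zeros seen so far
    return {"i": i, "p": p, "x": x, "dims": (n_rows, n_cols)}


def get_entry(sparse, row, col):
    """Look up one entry; anything not stored is zero."""
    start, end = sparse["p"][col], sparse["p"][col + 1]
    for k in range(start, end):
        if sparse["i"][k] == row:
            return sparse["x"][k]
    return 0


# Hypothetical data: 3 patients (rows) by 6 procedures (columns),
# with a 1 wherever a patient had that procedure.
dense = [
    [0, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 1],
    [1, 0, 0, 0, 0, 0],
]

sparse = to_csc(dense)
print(sparse["i"])  # row indices of the non-zero entries
print(sparse["p"])  # column start positions
print(sparse["x"])  # non-zero values
```

In a toy example this small, the sparse form is no smaller than the dense one because of the overhead in `p`. The savings only show up when the matrix has thousands of columns and just a handful of non-zero entries per row.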

PMean: A megabyte is not a million bytes

Sometimes you forget things. Here’s an example. Continue reading →