Monthly Archives: May 2016

PMean: Too many different prior choices for the hierarchical beta binomial model

I’m interested in studying how Bayesian hierarchical models work and I want to start with what seems like the simplest case, the hierarchical beta-binomial model. It’s actually not that simple, it seems. There are too many choices for the hyperprior that you use in this setting. Continue reading →

PMean: Minimum standards for a GitHub repository

Using a GitHub repository is new to me, and I need to document some things that may be obvious to more experienced programmers. Here are, as far as I understand them, the minimal documentation standards for a GitHub repository. Continue reading →

PMean: Understanding the lasso regression model

I wrote a program in R Markdown that shows how the lasso regression model works. It has too many pictures to be easily ported to this blog, so I’ll share a link to a PDF file instead. You can also find the R Markdown code at my GitHub repository.

PMean: Pretty good work for a part-timer

I had to provide some documentation for my supervisor on what I had done in the past year (May 2015 through April 2016). It’s a very brief summary, but it’s still pretty impressive, if I may say so. Continue reading →

PMean: Trying to save time in R ends up costing me time

Sometimes you end up trying to be so clever that you out-clever yourself. This happened to me today in R. I tried something that was supposed to save me time and it ended up costing me more time instead. Continue reading →

PMean: Examining the storage format for sparse matrices in R

I’ve been working with sparse matrices a bit for my work with the Greater Plains Collaborative. They are a very useful way of storing matrices where most of the entries are zero. This occurs quite often in medical data. There are thousands of medical procedures that you can torture your patients with, so any matrix that has indicator variables for every medical procedure will be quite big. Fortunately, both for us and for the patients, the number of procedures that a particular patient has to endure is quite a bit smaller. So for each row of the matrix, the number of non-zero entries will be very small, probably in the single digits. A sparse matrix will be much smaller because it stores only the locations and values of the non-zero entries. Here’s some R code that shows how this works. I have the code available at my new GitHub site. Continue reading →
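The idea of storing only the non-zero entries can be sketched in a few lines. This is a Python illustration under stated assumptions, not the R code from the post: the toy patient-by-procedure matrix and the helper names `to_triplets` and `to_dense` are hypothetical, and real sparse matrix libraries use more compact layouts than a plain list of triplets.

```python
# Hypothetical toy data: each row is a patient, each column is an
# indicator for one medical procedure. Most entries are zero.
dense = [
    [0, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 1],
]


def to_triplets(matrix):
    """Return (row, column, value) triplets for the non-zero entries only."""
    return [
        (i, j, value)
        for i, row in enumerate(matrix)
        for j, value in enumerate(row)
        if value != 0
    ]


def to_dense(triplets, n_rows, n_cols):
    """Rebuild the full matrix from the triplets, filling the rest with zeros."""
    matrix = [[0] * n_cols for _ in range(n_rows)]
    for i, j, value in triplets:
        matrix[i][j] = value
    return matrix


triplets = to_triplets(dense)
dense_cells = len(dense) * len(dense[0])

print(f"cells in the dense matrix: {dense_cells}")
print(f"non-zero entries stored:   {len(triplets)}")
```

The saving grows with the number of columns: adding thousands of procedure columns adds thousands of cells per patient to the dense form, but adds nothing to the triplet list unless a patient actually had the procedure.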

PMean: A megabyte is not a million bytes

Sometimes you forget things. Here’s an example. Continue reading →
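The arithmetic behind the title can be checked quickly. This sketch assumes the post refers to the binary convention, where a kilobyte is 1,024 bytes and a megabyte is 1,024 kilobytes; the excerpt itself does not say which convention it has in mind, and the names below are illustrative.

```python
# Assumption: binary convention (1 KB = 1024 bytes, 1 MB = 1024 KB).
BYTES_PER_KB = 1024
BYTES_PER_MB = 1024 * BYTES_PER_KB


def bytes_to_mb(n_bytes):
    """Convert a byte count to megabytes under the binary convention."""
    return n_bytes / BYTES_PER_MB


print(BYTES_PER_MB)                       # 1048576
print(round(bytes_to_mb(1_000_000), 3))   # 0.954
```

Under this convention a million bytes falls a little short of one megabyte, which is the kind of discrepancy that shows up when you estimate object sizes by hand.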

PMean: Which R package should I use?

Working with R is great in that if anything has been done in statistics, there is an R package that will do it. The problem is that sometimes there are four packages that will do it. So when this happens, what do you do? I want to outline what I did recently when I needed to find a package to calculate Cronbach’s alpha. Continue reading →

PMean: So you’re thinking about a retrospective chart review

If you are designing a retrospective chart review, you should talk to a statistician early in the process. There are lots of statistical issues that you must think about during the concept development phase of your research. Here is a broad overview of these issues. Continue reading →