Category Archives: Statistics

PMean: What do you hate most about independent consulting

Someone on the Statistical Consulting forum mentioned that she is going to become an independent consultant when she graduates and wanted to find out from people who are currently in that position what the one thing is that they hate most. This email drew a lot of responses, including several from people who cautioned this woman about how difficult it is for a young person to become an independent consultant. Here are the thoughts I shared on the thing I hate most and on the issues with setting out on your own as an independent consultant early in your career. Continue reading →

PMean: Those darn commas in SQL

I should know better, but I made a rookie mistake with SQL that took me a long time to fix. It’s one of those detail-oriented things, and if you aren’t detail-oriented, you can’t call yourself a programmer. Continue reading →
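The excerpt does not say which comma caused the trouble. As an illustration only, here is one common comma slip in SQL: leaving out the comma between two column names, which turns the second name into an alias for the first without raising an error. This is a minimal sketch using Python's built-in sqlite3 module; the table and column names are hypothetical.

```python
import sqlite3

# Hypothetical table and column names, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (id INTEGER, age INTEGER)")
conn.execute("INSERT INTO patients VALUES (1, 42)")


def column_names(sql):
    """Run a query and return the names of the columns it produces."""
    cursor = conn.execute(sql)
    return [d[0] for d in cursor.description]


# With the comma: two columns, as intended.
with_comma = column_names("SELECT id, age FROM patients")

# Without the comma: SQL reads "id age" as "id AS age",
# so the query returns ONE column (the id values) labeled "age".
without_comma = column_names("SELECT id age FROM patients")

print(with_comma)
print(without_comma)
```

Because the second query is valid SQL, nothing fails. The only symptom is a result with one fewer column and a misleading label, which is why this kind of slip can take a while to spot.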

Recommended: A Tutorial on Loops in R

This is a very clear but also very detailed explanation of the for, while, and repeat loops, along with the concept of vectorization. A great resource for beginners. Continue reading →

Recommended: Oracle Dates and Times

I’m working with R and SQL. Some of the work uses SQLite and some uses Oracle. There are subtle differences between the two, and for that matter between any two database programs. While there are SQL standards, most packages have minor deviations or enhancements. Dates in Oracle represent one deviation. In particular, Oracle does not use the ISO 8601 standard date format (yyyy-mm-dd) by default. Here’s a nice overview of how to work with Oracle dates. Continue reading →
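To make the format difference concrete, here is a minimal sketch in Python that converts a date string from an Oracle-style display format into ISO 8601. It assumes the display format is DD-MON-RR, a common default that actually depends on the database's NLS settings, so check NLS_DATE_FORMAT on your own system. The function name is hypothetical.

```python
from datetime import date, datetime

# Assumption: the Oracle session displays dates as DD-MON-RR (e.g. 05-JAN-17).
# The real default depends on NLS settings and may differ on your system.
# Note: %b matches English month abbreviations only in an English/C locale.
ASSUMED_ORACLE_FORMAT = "%d-%b-%y"


def oracle_to_iso(text):
    """Convert a DD-MON-RR style string to an ISO 8601 (yyyy-mm-dd) string."""
    # Caveat: Python's %y and Oracle's RR both expand two-digit years,
    # but their century cutoffs are not identical for every year.
    return datetime.strptime(text, ASSUMED_ORACLE_FORMAT).date().isoformat()


iso = oracle_to_iso("05-JAN-17")
print(iso)  # 2017-01-05

# The ISO string round-trips cleanly into a date object.
print(date.fromisoformat(iso) == date(2017, 1, 5))  # True
```

When you control the query, it is usually safer to ask the database for an explicit format (for example with Oracle's TO_CHAR function) than to rely on the session default.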

PMean: What greedy means to a geek

I’ve run across the term “greedy” in several work-related contexts, so I thought it might be worth explaining what it means. Continue reading →

PMean: One small grant for me, one giant leap for Biostatisticians

I’m so busy these days that it is silly to take on anything new, but I found an opportunity for a small research grant that I might want to submit a proposal for. Continue reading →

PMean: By the skin of my teeth

I have to brag a bit. I’m working part-time at Kansas University Medical Center (along with a couple of other part-time jobs), and two weeks ago my boss asked me if I was interested in writing a paper on the data analyses I had been working on. It would be submitted to the AMIA 2017 Joint Summit on Translational Research and I’d be the first author. Continue reading →

PMean: Turning off large blocks of an R Markdown document

When you’re running a large and complicated program using R Markdown, you can use the cache chunk option to save a lot of time. With caching turned on, R Markdown notices when a program chunk has stayed the same and avoids running it again. I tend to avoid the cache option, though, because sometimes it fails to execute something that you want executed, even though it looks on the surface like nothing has changed. So I created some simple program chunks that allow me to explicitly turn off parts of the R Markdown program that I don’t need to evaluate at the time. Think of it as a manual cache.

It’s a very simple thing, but one which confounded me for a while, so I am writing about it here. That way I won’t forget six months down the road. Continue reading →

PMean: Merging in dplyr is a lot faster

At the Joint Statistical Meetings, I found out that the advantages of some of the new libraries for data manipulation (like dplyr and tidyr) go beyond just the flexibility of the new methods of data manipulation. These libraries produce code that is easier to read and that also runs a lot faster. I did not appreciate how much faster until I tried a test today. Continue reading →