Monthly Archives: September 2016

Recommended: Published methodological quality of randomized controlled trials does not reflect the actual quality assessed in protocols

When evaluating a series of research articles, you often have to assess the quality of the individual papers based on features such as the type of blinding. What do you do if the paper does not discuss these items? I have usually advocated a “no news is bad news” policy: if a paper does not mention blinding, assume that no blinding was done. It seems reasonable, but the paper by Mhaskar et al. provides empirical evidence that sometimes authors leave out information that would strengthen the credibility of their study. A similar paper is at https://www.ncbi.nlm.nih.gov/pubmed/22424985. Continue reading →

PMean: By the skin of my teeth

I have to brag a bit. I’m working part-time at Kansas University Medical Center (along with a couple other part-time jobs) and my boss asked me two weeks ago if I was interested in writing a paper on the data analyses I had been working on. It would be submitted to the AMIA 2017 Joint Summit on Translational Research and I’d be the first author. Continue reading →

PMean: Turning off large blocks of an R Markdown document

When you’re running a large and complicated program using R Markdown, you can use the cache chunk option to save a lot of time. With caching turned on, knitr notices when a program chunk has stayed the same and avoids running it again. I tend to avoid the cache option, though, because sometimes it fails to execute something that you want executed, even though it looks on the surface like nothing has changed. So I created some simple program chunks that allow me to explicitly turn off parts of the R Markdown program that I don’t need to evaluate at the time. Think of it as a manual cache.
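This excerpt does not show the chunks themselves, so what follows is only a minimal sketch of one way a manual switch like this could work, not necessarily the approach in the full post. It assumes knitr's `eval` chunk option and `knitr::opts_chunk$set()`. The chunk labels (`turn-off`, `slow-step`, `turn-on`) and the `Sys.sleep()` placeholder are hypothetical.

````markdown
```{r turn-off}
# From here on, chunks are skipped unless they set eval=TRUE themselves.
knitr::opts_chunk$set(eval = FALSE)
```

```{r slow-step}
# Placeholder for a slow computation; not evaluated while the switch is off.
Sys.sleep(60)
```

```{r turn-on, eval=TRUE}
# eval=TRUE in the header is needed so that this chunk itself runs.
knitr::opts_chunk$set(eval = TRUE)
```
````

To run the skipped block again, delete the two switch chunks or comment out the calls inside them. Any later chunk that depends on objects created in the skipped block will fail while the switch is off.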

It’s a very simple thing, but one which confounded me for a while, so I am writing about it here. That way I won’t forget six months down the road. Continue reading →

Recommended: Diverse Perspectives on a Flipped Biostatistics Classroom

This article is a synthesis of a panel discussion at the 2014 Joint Statistical Meetings on the flipped classroom. The article discusses it solely from the perspective of Biostatistics classes, though the authors offer some references for the flipped classroom in a more general setting. A flipped classroom is a course where the traditional didactic lectures are recorded and watched at home and the homework that would normally be done at home is done instead in the classroom. This homework in a Biostatistics class often takes the form of active learning in small groups, such as critiquing published research studies or conducting analyses on real-world data sets. The key component, according to the authors, is the in-class interactions during these assignments. Students learn from each other as they work in groups.

Now you could do active learning in a traditional course format. What a flipped classroom does is increase the emphasis on, and the amount of time spent in, active learning.

The common theme of the paper is that the flipped classroom has been successfully applied in a variety of settings. It is not a “one size fits all” approach, but rather can be adapted to the needs of the particular class. Some students may not like the flipped classroom format, and you shouldn’t underestimate the amount of time needed to prepare the videotaped lectures (one rule of thumb is ten hours of work for every hour of video). Still, the student reactions and the instructors’ perceptions of the flipped classroom are generally positive. Continue reading →

PMean: Merging in dplyr is a lot faster

At the Joint Statistical Meetings, I found out that the advantages of some of the new libraries for data manipulation (like dplyr and tidyr) go beyond just the flexibility of the new methods of data manipulation. These libraries produce code that is easier to read and that also runs a lot faster. I did not appreciate how much faster until I tried a test today. Continue reading →

Recommended: Organizing data in spreadsheets

I have some guidance on how to organize data (written back in 1999), but these guidelines are far superior. To be honest, you should use a database for anything even moderately complicated, but for those simple data sets where you can use a text file or a spreadsheet, Dr. Broman’s comments are very helpful. Continue reading →

Recommended: What I need from statisticians

This interview with Nate Silver was conducted shortly after his keynote address at the 2013 Joint Statistical Meetings. I was at those meetings, but was stuck in a class (a very good class by the way, but I still felt stuck) on software engineering for statisticians. This article summarizes the main points of Mr. Silver’s keynote address and adds some extra insights through an interview after the speech. The best part was the quote at the end.

When asked that “Data science is the term of the day. Do you think there is a difference between data science and statistics?” Silver replied, “I think data-scientist is a sexed up term for a statistician”, the reaction from the audience was for most, one of instantaneous laughter and applause. “Statistics is a branch of science. Data scientist is slightly redundant in some way and people shouldn’t berate the term statistician.”

If Nate Silver can say something this controversial, then maybe I shouldn’t be so bashful. Continue reading →