This page shows R code for handling tricky data sets where the response is known only to lie inside some interval.
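As a rough illustration of what "known to be inside some interval" looks like in code, here is a minimal sketch using `Surv()` from the survival package. The data values are invented, and the linked page may well use a different package or approach; this only shows one common way to encode such responses.

```r
# Minimal sketch: encoding interval-censored responses with the survival package.
# The numbers below are made up purely for illustration.
library(survival)

# Each response is known only to lie between `left` and `right`.
# With type = "interval2":
#   NA in `left`   -> left-censored (event happened before `right`)
#   NA in `right`  -> right-censored (event happened after `left`)
#   left == right  -> exactly observed
left  <- c(2, 5, NA, 7, 4)
right <- c(4, 9,  3, NA, 4)

y <- Surv(left, right, type = "interval2")
print(y)
```

The resulting object can then be passed as the response in a model formula, for example to `survreg()`, which accepts interval-censored responses.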
Michael Fay gave a short course on interval-censored data at the 2010 useR! meeting. The slides from this short course provide a nice overview of the complexities of this type of data.
This is a nice overview of how to use R to adjust survival curves in an observational study. It covers weighting and modeling with covariates and criticizes several approaches.
This package was mentioned at the most recent meeting of the Kansas City R Users Group and was too cute not to mention.
I’ve been using a LaTeX distribution (MiKTeX) for a couple of years, and it’s not bad. But when I heard about Yihui Xie’s R package, tinytex, I jumped at the opportunity to try it. Dr. Xie is the author of knitr, a package that makes it easy to create well-documented R programs where the code and the output are gracefully merged. He created this new package, tinytex, because he felt that the current LaTeX distributions had complex installation processes and forced you to choose between a minimal installation that couldn’t do anything useful and a full installation that was bloated with features you’d never use. I can’t say too much about the package yet, except that he is right that it is very easy to install. If I find out more, I’ll let you know.
I’m teaching a couple of classes, Introduction to R and Introduction to SAS, and I’m finding that students turn in homework in a variety of different ways. I’m fine with this up to a point, but I think that I should encourage a simple, uniform approach, because out in the real world, your boss or your clients will not appreciate a haphazard and disorganized approach. Here’s a suggested format for homework assignments that will (hopefully) get you in the practice of turning things in in an organized fashion.
I’m teaching an online workshop for The Analysis Factor on survival analysis. It’s not announced yet, and I have a LOT of work to do before it is ready. One thing that will save me time is that I am taking many of my examples from the excellent textbook, Applied Survival Analysis, Second Edition. One nice perk of this book is that the helpful folks at UCLA have taken every textbook example and written up code (with comments!) to reproduce the book’s results. With the exception of a few advanced methods in later chapters, where only one or two software packages have the right capability, the code is written in parallel in R, SAS, SPSS, and Stata. They also have links to the raw data at the publisher’s website, and datasets stored in SAS format and SPSS format. How nice! Browse around and you’ll find software code for all the examples in other popular statistics textbooks as well.
Warning! The R examples look like they are from the first edition, not the second edition. A small nitpick for an otherwise very nice resource.
I’ve gotten some helpful feedback that I need to encourage more interaction among students in the online classes, Introduction to R, Introduction to SPSS, and Introduction to SAS. Not just interactions of the students with the teacher, but interactions between the students.
In many online classes this is done by encouraging online discussion of the material in the class. This is not so easy, however, for these three classes. I can just imagine myself posting the following on Blackboard. “Tell me what you think about the read.csv function in R.”
There are a couple of ways, however, that make sense for technical classes like these.
I have helped develop and have taught (along with other faculty in our department) three one-credit-hour pass/fail classes: Introduction to R, Introduction to SPSS, and Introduction to SAS. These classes were developed back in 2014-2015, and they are in need of some serious updates. In this blog post, I will try to outline some of the updates that I think these classes need.
In many text mining or natural language processing applications, you will have problems with words that are very similar but are counted separately. An example might be the words win, winner, and winning. You can combine these words into a single category using stemming. This blog post gives a nice overview of stemming.
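To make the idea concrete, here is a toy suffix-stripping function in base R. It is only a sketch: the function name `toy_stem` and its short suffix list are invented for this illustration, and it is not the Porter algorithm or any other published stemmer. Real stemmers (for example, `wordStem()` in the SnowballC package) follow more careful rules and may not collapse these three words in quite the same way.

```r
# Toy stemmer for illustration only (hypothetical helper, not a real algorithm).
# It lowercases each word and strips one of a few hard-coded suffixes.
toy_stem <- function(words) {
  words <- tolower(words)
  # Remove a trailing "ning", "ner", "ing", or "er" if present.
  sub("(ning|ner|ing|er)$", "", words)
}

words <- c("win", "winner", "winning")
stems <- toy_stem(words)

# After stemming, the three separate words fall into a single category.
print(table(stems))
```

A hard-coded suffix list like this will mangle plenty of ordinary words ("dinner" becomes "din"), which is a good reason to rely on an established stemmer in real work.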