I’m teaching an online workshop for The Analysis Factor on survival analysis. It’s not announced yet, and I have a LOT of work to do before it is ready. One thing that will save me time is that I am taking many of my examples from the excellent textbook, Applied Survival Analysis, Second Edition. One nice perk of this book is that the helpful folks at UCLA have taken every textbook example and written up code (with comments!) to reproduce the book’s results. With the exception of a few advanced methods in later chapters, where only one or two software packages have the right capability, the code is written in parallel in R, SAS, SPSS, and Stata. They also have links to the raw data at the publisher’s website, and datasets stored in SAS format and SPSS format. How nice! Browse around and you’ll find software code for all the examples in other popular statistics textbooks as well.
Warning! The R examples look like they are from the first edition, not the second edition. A small nitpick for an otherwise very nice resource. Continue reading
I got this recommendation from a friend. IBM has a large number of free resources explaining things like cloud computing and blockchain. I’m most interested in their section on analytics. There’s a nice introduction, for example, to natural language processing. Continue reading
I attended several talks about R at the Joint Statistical Meetings and noted some interesting packages and other resources during these talks. I lost track of that list until recently, but the items on it are still relevant, so here they are. Continue reading
This is a series of videos and homework exercises that you can work on at your own pace. I have only viewed the outline for this, but anything from DataCamp comes highly recommended. Continue reading
This is the GitHub repository of Ben Baumer. He is one of the co-authors of “Modern Data Science with R” and the data and code from that book are available here. He also provides code and data for OpenWAR, an open source method for calculating a baseball statistic, Wins Above Replacement. Finally, there is an R library for extracting, transforming, and loading “medium” sized datasets into SQL. Medium here means multi-gigabyte sized files. Related to this are a couple of “medium” sized data sets from the Internet Movie Database and from the NYC CitiBike dataset. Continue reading
This paper talks about how to get students to think about large databases in an introductory class that normally uses “toy” problems with a few dozen rows of data. Continue reading
Through the effort of a team of statisticians with the American Statistical Association, the New York Times is producing a new resource for educators called “What’s Going On in This Graph?”. This is similar to another New York Times effort called “What’s Going On in This Picture?”
Every month the New York Times will publish a graph stripped of some key information and ask three questions: What do you notice? What do you wonder? and What do you think is going on in this graph?
The content will be suitable for middle school and high school students, but I suspect that even college students will find the exercise interesting.
The first graph will appear on September 19, and new graphs will follow on the second Tuesday of every month afterwards. Continue reading
The Teaching of Statistics in the Health Sciences (TSHS) section of the American Statistical Association has put together a set of resources for teachers including several very interesting datasets. Some of the resources are open to anyone, but others require a registration. Continue reading
As a community, we statisticians have known for a long time that we do not teach that introductory level class in Statistics as well as we should. This guideline lists the things and ways we SHOULD teach as well as things that we might think about leaving out. Continue reading
This is a very clear, but also very detailed explanation of the for, while, and repeat loops along with the concept of vectorization. A great resource for beginners. Continue reading
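For readers who have not seen these constructs side by side, here is a minimal sketch in R. It is my own illustration and is not taken from the linked resource. The vector `x` and the variable names are made up for the example. It computes the same sum of squares with a for loop, a while loop, a repeat loop, and a single vectorized expression.

```r
# Illustrative data only: the integers 1 through 5 (an assumption for this sketch).
x <- 1:5

# for loop: visit each element of x in turn.
total_for <- 0
for (value in x) {
  total_for <- total_for + value^2
}

# while loop: the condition is checked before every pass.
total_while <- 0
i <- 1
while (i <= length(x)) {
  total_while <- total_while + x[i]^2
  i <- i + 1
}

# repeat loop: no built-in condition, so it needs an explicit break.
total_repeat <- 0
i <- 1
repeat {
  if (i > length(x)) break
  total_repeat <- total_repeat + x[i]^2
  i <- i + 1
}

# Vectorized version: x^2 squares every element at once, sum() adds them up.
total_vec <- sum(x^2)

print(c(total_for, total_while, total_repeat, total_vec))
```

All four approaches give the same total. The vectorized line is usually the one to prefer in R, since it is shorter and avoids the bookkeeping of a loop counter.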