# Recommended: Practical advice for analysis of large, complex data sets

This is a nice compilation of issues that you should be concerned about. The examples are mostly drawn from things that interest Google, but you will find the advice useful no matter what type of data you work with. The advice is split into three broad categories: technical (e.g., look at your distributions), process (e.g., separate validation, description, and evaluation), and communication (e.g., data analysis starts with questions, not data or a technique).

# PMean: About those “awful” election predictions

If you were on Mars for the past few days, you may not have noticed that Donald Trump has won the election. There has been a lot of commentary lately about how bad the predictions about the U.S. election were, and someone mentioned that even Nate Silver at the FiveThirtyEight website put the probability of a Clinton win at 71%. I wrote a brief comment that predicting an event with 71% probability does not mean that your prediction was “wrong” if the other event occurs.
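The point about a 71% forecast can be checked with a small simulation. The sketch below is written in Python and is not taken from the original post. It assumes a perfectly calibrated forecaster who gives the favorite a 71% chance in many independent, hypothetical contests; the function name, the number of contests, and the seed are all illustrative choices.

```python
import random

def simulate_favorite_wins(p_favorite=0.71, n_contests=100_000, seed=2016):
    """Return the fraction of contests in which the favorite wins.

    Assumes each contest is independent and that the stated probability
    is exactly right (a perfectly calibrated forecast).
    """
    rng = random.Random(seed)
    # Each draw is one hypothetical contest; the favorite wins when the
    # uniform random number falls below the forecast probability.
    wins = sum(rng.random() < p_favorite for _ in range(n_contests))
    return wins / n_contests

share = simulate_favorite_wins()
print(f"Favorite won {share:.1%} of contests; underdog won {1 - share:.1%}")
```

Even when the forecast is exactly right, the underdog wins in roughly 29% of the simulated contests, so a single upset says very little about whether the forecast was any good.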

# PMean: A simple example of pipes in R

At the Joint Statistical Meetings this year, I learned a lot about recent developments in R, and not so recent developments that I was totally clueless about. One of those developments was the use of pipes in R. I wanted to show a simple example of how pipes can simplify your code.

# PMean: Misunderstanding autism

A friend of mine posted an inspiring story published in the Washington Post. Unfortunately, it did not inspire me, but rather made me worried about how often we misunderstand autism and how much trouble this causes. It’s not statistics, per se, but rather represents an example of how research on new approaches for patients with autism can end up being abusive.

# PMean: Measuring pixels in an R graph

I have an R cheat sheet, How Big Is Your Graph, that explains how to measure the size of various features of your graph in R. This blog post illustrates unit conversions. If you want to measure the length of a diagonal line segment in an R graph, you need to calculate the size of the plotting region in pixels, compare that to the range of the plotting region in the x and y directions, and then apply the Pythagorean Theorem.
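The arithmetic behind that recipe can be sketched in a few lines. The example below is in Python rather than R and is only an illustration, not the code from the post. The function name and the numbers are hypothetical. In R, the axis ranges would typically come from `par("usr")` and the region size from `par("pin")` (in inches) multiplied by the device resolution.

```python
import math

def segment_length_pixels(x0, y0, x1, y1, usr, region_px):
    """Length in pixels of a segment given in user (data) coordinates.

    usr       -- (xmin, xmax, ymin, ymax) range of the plotting region
    region_px -- (width, height) of the plotting region in pixels
    """
    xmin, xmax, ymin, ymax = usr
    width_px, height_px = region_px
    # Pixels per data unit differ between the two axes.
    px_per_x = width_px / (xmax - xmin)
    px_per_y = height_px / (ymax - ymin)
    # Convert each leg of the triangle to pixels, then apply Pythagoras.
    dx = (x1 - x0) * px_per_x
    dy = (y1 - y0) * px_per_y
    return math.hypot(dx, dy)

# Hypothetical plot: region of 400 x 300 pixels, x from 0 to 10, y from 0 to 100.
length = segment_length_pixels(0, 0, 10, 100,
                               usr=(0, 10, 0, 100), region_px=(400, 300))
print(length)  # 500.0
```

The conversion has to be done separately for each axis because a data unit in the x direction rarely covers the same number of pixels as a data unit in the y direction.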