Tag Archives: Big data

Quote: In a world where the price of calculation continues to decrease rapidly…

“In a world where the price of calculation continues to decrease rapidly, but the price of theorem proving continues to hold steady or increase, elementary economics indicates that we ought to spend a larger and larger fraction of our time on calculation.” John Tukey, as quoted in “Sunset Salvo”, The American Statistician 1986; 40(1): 72-76.

Recommended: How a Feel-Good AI Story Went Wrong in Flint

Building a great statistical model does no one any good if it ignores non-statistical issues. This story describes a machine learning model built to identify which houses in Flint, Michigan, were the best candidates for removal of lead pipes. The model worked fairly well, but ran up against problems such as individual city council members wanting to assure their constituents that enough was being done in their districts. I’m not sure what the actual moral of this story is, but it does serve as a warning to be careful when you are modeling data in a contentious area.

Recommended: Microsoft is creating an oracle for catching biased AI algorithms

Artificial Intelligence (AI) algorithms that are used for crime detection, loan approvals, and employee evaluations are considered by many to be objective, but they can carry many of the same prejudices and biases that human evaluators have. Given the opacity of many black-box approaches to AI, this could lead to serious problems with fairness and equity. This article discusses an admittedly imperfect approach by Microsoft to evaluate these AI algorithms using (surprise!) an AI algorithm. It flags situations where an algorithm appears to treat people unfairly based on race, gender, or age.

Recommended: Statistical and Machine Learning forecasting methods: Concerns and ways forward

At first glance, you might think that this article looks like a vindication of traditional statistics. Classical time series models (methods that were available in the 1960s) outperform newer machine learning forecasting models. Then, you might worry that the comparisons were unfair. But neither viewpoint is accurate. The classical time series models have certain structural advantages for certain types of problems, but you might be better off with machine learning if you use classical time series methods as a preprocessing step, such as de-seasonalizing your data. If nothing else, this article provides a nice overview of some of the major machine learning methods.
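
To make the idea of de-seasonalizing as a preprocessing step concrete, here is a minimal sketch in Python. It is not the procedure used in the article. The additive seasonal-mean adjustment, the function names, and the toy quarterly series are all my own assumptions, chosen only to keep the illustration short.

```python
from statistics import mean

def seasonal_offsets(series, period):
    """Additive seasonal offsets (an assumed, simple method):
    the mean at each position in the cycle minus the overall mean."""
    overall = mean(series)
    return [mean(series[i::period]) - overall for i in range(period)]

def deseasonalize(series, period):
    """Subtract the seasonal offset from every observation."""
    offsets = seasonal_offsets(series, period)
    adjusted = [y - offsets[i % period] for i, y in enumerate(series)]
    return adjusted, offsets

def reseasonalize(forecasts, offsets, start_index):
    """Add the seasonal offsets back to forecasts made on the adjusted scale.
    start_index is the time index of the first forecast."""
    period = len(offsets)
    return [f + offsets[(start_index + k) % period]
            for k, f in enumerate(forecasts)]

# Hypothetical quarterly data: a flat level of 10 plus a repeating pattern.
pattern = [2.0, -1.0, -3.0, 2.0]
series = [10.0 + pattern[i % 4] for i in range(12)]

adjusted, offsets = deseasonalize(series, period=4)

# A machine learning model would be trained on `adjusted` at this point.
# As a stand-in, forecast the next two quarters with the adjusted mean.
flat_forecast = [mean(adjusted)] * 2
final_forecast = reseasonalize(flat_forecast, offsets, start_index=len(series))
```

The forecasting model only ever sees the adjusted series, so it does not have to learn the seasonal pattern on its own. The seasonal offsets are added back at the end to put the forecasts on the original scale.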