“The most challenging thing in the world is not to learn fancy technologies, but control your own wild heart.” Yihui Xie, as quoted in Appendix C of Authoring Books with R Markdown.
“The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from some data.” John Tukey, as quoted in “Sunset Salvo.” The American Statistician 1986, 40(1), 72-76.
Building a great statistical model does no one any good if it doesn’t pay attention to non-statistical issues. This story talks about a machine learning model to identify which houses in Flint, Michigan, were the best candidates for removal of lead pipes. The model worked fairly well, but came up against problems like individual city council members wanting to assure their constituents that enough was being done in their district. I’m not sure what the actual moral of this story is, but it does serve as a warning to be careful when you are modeling data in a contentious area. Continue reading
I’m giving a talk on Wednesday, August 22, titled “Business essentials for starting an independent consulting practice.” Here are a few details about this talk. Continue reading
I’m planning to give a talk on “The Dark Side of Data Science” and I’m hoping to get some interesting references and articles from my colleagues. Here is a first draft of my abstract, with a few references that I am already familiar with. Continue reading
This article starts with a nice anecdote about how being dismissive of what someone else is saying ends up hurting you. It also provides a nice structure, POWER, for organizing consulting meetings. POWER stands for Prepare, Open, Work, End, and Reflect. This article was a basis for some of the content in an interesting webinar on consulting. Continue reading
Someone asked me about what sort of contract to use with a new client. This person did not need a very detailed contract, but said that a handshake would not suffice. Here’s what I suggested. Continue reading
An email discussion about the appropriate percentage effort on research grants has produced a lot of interesting comments. One person raised an interesting question. The typical data analysis, he claimed, might involve a few hours reviewing the input data set, a few hours conducting the analysis, and a few hours preparing a statistical summary, but even after a generous estimate of the work at each of these steps, he could only come up with 22 hours of effort, which corresponds roughly to 1% FTE. I wrote back describing some of the things that might occur before the data analysis that might add time to this effort. Continue reading
I heard a story a long time ago, and I don’t remember who told it to me and I’m probably getting all the details wrong, but I wanted to try to recreate the story from memory because it illustrates one of the perils of blind reliance on statistical models to identify “important” variables. Continue reading
I’ve gotten an inquiry about teaching a couple of webinars. Nothing’s official yet, but let me outline these webinars here on my blog. If the offer becomes official, I will post an update on this blog post or in a new post. Continue reading