PMean: Calculating statistics that have limitations

Someone asked what you should do if you get a request to compute a p-value on a post hoc analysis. In general, any statistic computed on a post hoc analysis is likely to be biased, but a p-value is especially troublesome as it is much more prone to misuse or misinterpretation. Should you refuse to calculate this p-value? Here’s what I said.

As I get older, I get more mellow about things like this. Calculate the p-value and then warn the client to be sure to mention in any report or publication that the hypothesis was generated post hoc. Stress that failure to state that this was a post hoc hypothesis leaves him/her open to a charge of fraud. Be sure that your client knows and appreciates that publishing anything that mentions your name without your prior review also constitutes fraud.

The sin in research is not in publishing something with lots of limitations. The sin is in pretending that it is something more than it really is. As long as the report clearly states the post hoc nature of the hypothesis and discusses the limitations associated with it, you should be fine. Use phrases like “this is an exploratory effort” or “the results need to be verified in a separate data set” or whatever. Some naive people will ignore your limitations, but if we all refused to publish articles that might be misinterpreted by naive people who ignore limitations, nothing would get published.

If you find out after the fact that something was published with your name on it and without your review, you have plenty of options to pursue. Write to the journal and talk to the ethics or compliance officer at the place where your client works.

I’d only turn down the work if I thought this person was a loose cannon who wouldn’t mind getting in trouble with everyone involved in the process. Most people are comfortable with reasonable requests to review things. Most people will also respect your request to leave your name out if you can’t agree on appropriate language in the report.