<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>PMean &#187; Human side of statistics</title>
	<atom:link href="http://blog.pmean.com/tag/human-side-of-statistics/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.pmean.com</link>
	<description>A blog about statistics, evidence-based medicine, and research ethics</description>
	<lastBuildDate>Sat, 22 Jun 2024 17:10:26 +0000</lastBuildDate>
	<language>en-US</language>
		<sy:updatePeriod>hourly</sy:updatePeriod>
		<sy:updateFrequency>1</sy:updateFrequency>
	<generator>https://wordpress.org/?v=3.7.41</generator>
	<item>
		<title>Recommended: 12 things I wish I’d known before starting as a Data Scientist</title>
		<link>http://blog.pmean.com/twelve-things/</link>
		<comments>http://blog.pmean.com/twelve-things/#comments</comments>
		<pubDate>Wed, 06 Mar 2019 21:04:07 +0000</pubDate>
		<dc:creator><![CDATA[pmean]]></dc:creator>
				<category><![CDATA[Recommended]]></category>
		<category><![CDATA[Human side of statistics]]></category>

		<guid isPermaLink="false">http://blog.pmean.com/?p=1782</guid>
		<description><![CDATA[This page is moving to a new website.]]></description>
				<content:encoded><![CDATA[<p>This page is moving to a <a href="http://new.pmean.com/twelve-things/">new website</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.pmean.com/twelve-things/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Quote: The most challenging thing in the world&#8230;</title>
		<link>http://blog.pmean.com/fancy-technologies/</link>
		<comments>http://blog.pmean.com/fancy-technologies/#comments</comments>
		<pubDate>Fri, 01 Feb 2019 22:27:57 +0000</pubDate>
		<dc:creator><![CDATA[pmean]]></dc:creator>
				<category><![CDATA[Quote]]></category>
		<category><![CDATA[Human side of statistics]]></category>

		<guid isPermaLink="false">http://blog.pmean.com/?p=1736</guid>
		<description><![CDATA[This page is moving to a new website.]]></description>
				<content:encoded><![CDATA[<p>This page is moving to a <a href="http://new.pmean.com/fancy-technologies/">new website</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.pmean.com/fancy-technologies/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Quote: The combination of some data &#8230;</title>
		<link>http://blog.pmean.com/combination-of-some-data/</link>
		<comments>http://blog.pmean.com/combination-of-some-data/#comments</comments>
		<pubDate>Sun, 20 Jan 2019 15:06:08 +0000</pubDate>
		<dc:creator><![CDATA[pmean]]></dc:creator>
				<category><![CDATA[Quote]]></category>
		<category><![CDATA[Human side of statistics]]></category>

		<guid isPermaLink="false">http://blog.pmean.com/?p=1708</guid>
		<description><![CDATA[This page is moving to a new website. &#8220;The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from some data.&#8221; John Tukey, as quoted in &#8220;Sunset Salvo.&#8221; The American Statistician 1986, 40(1), 72-76.]]></description>
				<content:encoded><![CDATA[<p>This page is moving to a <a href="http://new.pmean.com/combination-of-some-data/">new website</a>.</p>
<p>&#8220;The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from some data.&#8221; John Tukey, as quoted in &#8220;Sunset Salvo.&#8221; The American Statistician 1986, 40(1), 72-76.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.pmean.com/combination-of-some-data/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Recommended: How a Feel-Good AI Story Went Wrong in Flint</title>
		<link>http://blog.pmean.com/ai-story/</link>
		<comments>http://blog.pmean.com/ai-story/#comments</comments>
		<pubDate>Wed, 09 Jan 2019 20:03:22 +0000</pubDate>
		<dc:creator><![CDATA[pmean]]></dc:creator>
				<category><![CDATA[Recommended]]></category>
		<category><![CDATA[Big data]]></category>
		<category><![CDATA[Ethics in research]]></category>
		<category><![CDATA[Human side of statistics]]></category>

		<guid isPermaLink="false">http://blog.pmean.com/?p=1690</guid>
		<description><![CDATA[This page has moved to a new website.]]></description>
				<content:encoded><![CDATA[<p>This page has moved to a <a href="http://new.pmean.com/ai-story/">new website</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.pmean.com/ai-story/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>PMean: Business essentials for starting an independent consulting practice</title>
		<link>http://blog.pmean.com/business-essentials-2/</link>
		<comments>http://blog.pmean.com/business-essentials-2/#comments</comments>
		<pubDate>Fri, 17 Aug 2018 17:31:26 +0000</pubDate>
		<dc:creator><![CDATA[pmean]]></dc:creator>
				<category><![CDATA[Statistics]]></category>
		<category><![CDATA[Human side of statistics]]></category>
		<category><![CDATA[Professional details]]></category>

		<guid isPermaLink="false">http://blog.pmean.com/?p=1581</guid>
		<description><![CDATA[This page is moving to a new website.]]></description>
				<content:encoded><![CDATA[<p>This page is moving to a <a href="http://new.pmean.com/business-talk-2018/">new website</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.pmean.com/business-essentials-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>PMean: The Dark Side of Data Science</title>
		<link>http://blog.pmean.com/dark-side/</link>
		<comments>http://blog.pmean.com/dark-side/#comments</comments>
		<pubDate>Mon, 13 Aug 2018 20:18:54 +0000</pubDate>
		<dc:creator><![CDATA[pmean]]></dc:creator>
				<category><![CDATA[Statistics]]></category>
		<category><![CDATA[Big data]]></category>
		<category><![CDATA[Human side of statistics]]></category>

		<guid isPermaLink="false">http://blog.pmean.com/?p=1570</guid>
		<description><![CDATA[This page is moving to a new website.]]></description>
				<content:encoded><![CDATA[<p>This page is moving to a <a href="http://new.pmean.com/dark-side-talk-2018/">new website</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.pmean.com/dark-side/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Recommended: How to be more effective in your professional life</title>
		<link>http://blog.pmean.com/effective-consulting/</link>
		<comments>http://blog.pmean.com/effective-consulting/#comments</comments>
		<pubDate>Tue, 10 Jul 2018 18:06:11 +0000</pubDate>
		<dc:creator><![CDATA[pmean]]></dc:creator>
				<category><![CDATA[Statistics]]></category>
		<category><![CDATA[Human side of statistics]]></category>

		<guid isPermaLink="false">http://blog.pmean.com/?p=1547</guid>
		<description><![CDATA[This page is moving to a new website.]]></description>
				<content:encoded><![CDATA[<p>This page is moving to a <a href="http://new.pmean.com/effective-consulting/">new website</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.pmean.com/effective-consulting/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>PMean: What goes into a contract for a consultation</title>
		<link>http://blog.pmean.com/consulting-contract/</link>
		<comments>http://blog.pmean.com/consulting-contract/#comments</comments>
		<pubDate>Sun, 24 Jun 2018 22:06:20 +0000</pubDate>
		<dc:creator><![CDATA[pmean]]></dc:creator>
				<category><![CDATA[Statistics]]></category>
		<category><![CDATA[Human side of statistics]]></category>

		<guid isPermaLink="false">http://blog.pmean.com/?p=1532</guid>
		<description><![CDATA[This page is moving to a new website. Someone asked me about what sort of contract to use with a new client. This person did not need a very detailed contract, but said that a handshake would not suffice. Here&#8217;s what I suggested. I have never used a contract, unless the client draws one up. If [&#8230;]]]></description>
				<content:encoded><![CDATA[<p>This page is moving to a <a href="http://new.pmean.com/consulting-contract/">new website</a>.</p>
<p>Someone asked me about what sort of contract to use with a new client. This person did not need a very detailed contract, but said that a handshake would not suffice. Here&#8217;s what I suggested.<span id="more-1532"></span></p>
<p>I have never used a contract, unless the client draws one up. If you want a contract, talk to a lawyer, and make sure it addresses the following issues.</p>
<p>- your consulting rate and payment schedule</p>
<p>- the scope of work</p>
<p>- who maintains copyright and/or patent rights over any material produced</p>
<p>- how liability is handled if a third party sues both you and your client</p>
<p>- how disputes over contract details should be handled</p>
<p>- conditions under which either party can sever the contract</p>
<p>- what terms of the contract remain in force after severance</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.pmean.com/consulting-contract/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>PMean: What are we doing to justify all that time we&#8217;re budgeting?</title>
		<link>http://blog.pmean.com/budgeting-time/</link>
		<comments>http://blog.pmean.com/budgeting-time/#comments</comments>
		<pubDate>Thu, 17 May 2018 06:29:00 +0000</pubDate>
		<dc:creator><![CDATA[pmean]]></dc:creator>
				<category><![CDATA[Statistics]]></category>
		<category><![CDATA[Human side of statistics]]></category>

		<guid isPermaLink="false">http://blog.pmean.com/?p=1512</guid>
		<description><![CDATA[This page is moving to a new website. An email discussion about the appropriate percentage effort on research grants has produced a lot of interesting discussions. One person raised an interesting question. The typical data analysis, he claimed, might involve a few hours reviewing the input data set, a few hours conducting the analysis and a [&#8230;]]]></description>
				<content:encoded><![CDATA[<p>This page is moving to a <a href="http://new.pmean.com/budgeting-time/">new website</a>.</p>
<p>An email discussion about the appropriate percentage effort on research grants has produced a lot of interesting discussions. One person raised an interesting question. The typical data analysis, he claimed, might involve a few hours reviewing the input data set, a few hours conducting the analysis and a few hours preparing a statistical summary. Even with a generous estimate of the work at each of these steps, he could only come up with 22 hours of effort, which corresponds roughly to a 1% FTE. I wrote back describing some of the things that might occur before the data analysis that might add time to this effort.<span id="more-1512"></span></p>
<p>I have to respond to the estimate that a typical effort on a research grant might involve, at most, 22 hours of effort. To be honest, there are settings where this happens for me, but more often than not, I put in a lot more than 22 hours of effort. This includes:</p>
<p>* helping write documents for IRB, and responding to questions from the IRB,</p>
<p>* preparation of data collection systems,</p>
<p>* review of questionnaires,</p>
<p>* development of randomization tables,</p>
<p>* regular and ongoing reviews of data as it is being collected,</p>
<p>* discussions about small refinements and sometimes major changes in the protocol during the study.</p>
<p>This all comes before the data analysis. Once the data analysis is done, however, there is still more work to prepare the manuscript. I find that the estimates of 4 to 6 hours are a gross underestimate, mostly because the people I work with have a lot of trouble writing and I end up reviewing and revising their work a lot.</p>
<p>It is also fairly common to see two or three distinct publications coming from the same research grant. I&#8217;m sure that I left out a lot of things here, but even with just the things that I did mention, it would certainly require much more than 22 hours of effort.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.pmean.com/budgeting-time/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>PMean: And the least important variable is&#8230;</title>
		<link>http://blog.pmean.com/least-important-variable/</link>
		<comments>http://blog.pmean.com/least-important-variable/#comments</comments>
		<pubDate>Thu, 29 Mar 2018 01:13:03 +0000</pubDate>
		<dc:creator><![CDATA[pmean]]></dc:creator>
				<category><![CDATA[Statistics]]></category>
		<category><![CDATA[Human side of statistics]]></category>
		<category><![CDATA[Linear regression]]></category>

		<guid isPermaLink="false">http://blog.pmean.com/?p=1438</guid>
		<description><![CDATA[This page is moving to a new website. I heard a story a long time ago, and I don&#8217;t remember who told it to me and I&#8217;m probably getting all the details wrong, but I wanted to try to recreate the story from memory because it illustrates one of the perils of blind reliance on statistical [&#8230;]]]></description>
				<content:encoded><![CDATA[<p>This page is moving to a <a href="http://new.pmean.com/least-important-variable/">new website</a>.</p>
<p>I heard a story a long time ago, and I don&#8217;t remember who told it to me and I&#8217;m probably getting all the details wrong, but I wanted to try to recreate the story from memory because it illustrates one of the perils of blind reliance on statistical models to identify &#8220;important&#8221; variables.<span id="more-1438"></span></p>
<p>A statistician was asked to analyze some data about an industrial process, and there were about a dozen independent variables that affected the outcome. So the statistician did some sort of stepwise regression or R-squared calculation and came up with an ordering for all the independent variables. The most important variable was the one with the largest correlation or the first variable entered in the stepwise model (I&#8217;m not sure which, but the point is the same either way). The second most important variable was the one with the second largest correlation or the second variable entered in the stepwise model.</p>
<p>The statistician reviewed each variable in order starting with the most important variable. It was rather dull, of course, until the statistician got to the bottom of the list. He proclaimed &#8220;and the least important variable is the amount of water in the raw material.&#8221;</p>
<p>At this point the engineers in the room burst into laughter. It turns out that water was the most important variable. If you had even a small amount of water in the raw material, the entire production process would explode. The engineers spent a huge amount of effort to keep the water down to a level that was barely measurable.</p>
<p>If a variable has very little variability in it by design, you cannot expect to see a large correlation. This is sometimes called a restriction of range problem. The SAT test for college applicants has an upper bound of 2400, and some high-end universities may end up admitting only students scoring 2350 or higher. That&#8217;s a very narrow range, and if it turns out that the SAT scores at this place are a poor predictor of future performance (like GPAs or graduation rates), that may be more a function of the very narrow range of students that were admitted than anything else.</p>
<p>Now whenever I hear a story like this, I think of the preacher&#8217;s quote &#8220;There but for the grace of God go I&#8221; that has been attributed to John Bradford. I bet I&#8217;ve said stuff even stupider than what this statistician is supposed to have said. It is just dumb luck, or God looking out for me, or something else that keeps me from being the one publicly humiliated.</p>
<p>How do you avoid saying something so stupid that everyone laughs at you? Well, the obvious answer is to talk to one of the engineers first and show them what you are going to say. Better for that one engineer to laugh at you in private than to have a whole room of engineers laugh at you in public.</p>
<p>What do you do if you recognize that you have a restriction of range problem? Well, first, drop the correlations in favor of a regression model. A linear regression model is not perfect, but it is a lot better than a correlation coefficient in this situation. Second, draw lots of graphs. Third, talk to the experts. Fourth, disclose the restriction of range as a possible limitation to your findings. A restriction of range means that you may be trying to extrapolate beyond the range of your data (your graphs will help show this) and this type of extrapolation will often require making untestable assumptions.</p>
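<p>Here is a small simulation sketch of the restriction of range problem. It is not from the original story. The true slope of 2, the noise level, the cutoff of 1.5, and the function names are all assumptions chosen for illustration. The idea is to generate data with a known linear relationship, keep only the cases above a cutoff on the predictor (much like admitting only high scorers), and compare the correlation and the regression slope before and after the restriction.</p>

```python
import random
import statistics


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def ols_slope(xs, ys):
    """Least-squares slope from regressing y on x."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def simulate(n=20000, true_slope=2.0, noise_sd=1.0, cutoff=1.5, seed=1):
    """Compare correlation and slope on the full data and on a restricted range.

    All default values are illustrative assumptions, not estimates from real data.
    """
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [true_slope * x + rng.gauss(0.0, noise_sd) for x in xs]

    # Restriction of range: keep only the cases at or above the cutoff on x.
    kept = [(x, y) for x, y in zip(xs, ys) if x >= cutoff]
    rx = [x for x, _ in kept]
    ry = [y for _, y in kept]

    return {
        "full_r": pearson_r(xs, ys),
        "restricted_r": pearson_r(rx, ry),
        "full_slope": ols_slope(xs, ys),
        "restricted_slope": ols_slope(rx, ry),
        "n_restricted": len(rx),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(name, round(value, 3))
```

<p>Under these assumptions, the correlation in the restricted data comes out noticeably smaller than in the full data, while the slope stays close to the assumed true value of 2, though it is estimated from far fewer cases. The simulation only works this neatly because the relationship really is linear everywhere. With real data you cannot check that outside the restricted range, which is the untestable assumption mentioned above.</p>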
<p>And if anyone knows the source of this story or can point me to a reference, I would be forever in your debt.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.pmean.com/least-important-variable/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
