
{{comic
| number    = 1347
| date      = March 26, 2014
| title     = t Distribution
| image     = t_distribution.png
| titletext = If data fails the Teacher's t test, you can just force it to take the test again until it passes.
}}

==Explanation==
The {{w|Student's t-distribution}} is a family of {{w|probability distribution|probability distributions}} used in statistics to draw conclusions about a population average from a small sample. "Student" was the pseudonym of {{w|William Sealy Gosset}}, an employee of the Guinness brewery, who published it under that name in 1908.

A Student's t-distribution resembles the symmetric bell curve of a normal distribution, but has "fatter tails"; thus, the one shown in the comic is roughly the right shape. A "Teacher's" t-distribution is a pun made up by Randall.
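The "fatter tails" can be checked numerically. The following is a minimal sketch, not part of the comic: it evaluates the standard density formulas for both curves, and the choice of 3 degrees of freedom and the function names <code>t_pdf</code> and <code>normal_pdf</code> are assumptions made for illustration.

```python
import math


def t_pdf(x: float, df: float) -> float:
    """Density of Student's t-distribution with `df` degrees of freedom."""
    # Work in log space (lgamma) so large df values do not overflow.
    log_norm = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def normal_pdf(x: float) -> float:
    """Density of the standard normal distribution."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


if __name__ == "__main__":
    # df=3 is an arbitrary small value chosen for illustration.
    for x in (0.0, 1.0, 2.0, 3.0, 4.0):
        print(f"x={x:.0f}  t(df=3)={t_pdf(x, 3):.4f}  normal={normal_pdf(x):.4f}")
```

Near the center the t curve is slightly lower than the normal curve, while far from the center it is higher; as the degrees of freedom grow, the two curves become nearly indistinguishable.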

The comic plays on the name "Student", the pseudonym of the distribution's creator, versus "Teacher". The idea is that a "Teacher's" distribution would be more complex and would be used to fit data when the Student's distribution wasn't sophisticated enough. In reality, a distribution as complex as the one shown in the comic would have many parameters, and in practice would probably lead to overfitting and/or bias. Thus, the comic (and the title text) can be seen as making fun of the idea that more complex is always better, or perhaps of the idea that a statistician's job is to use ever more sophisticated tools to force the data to yield a "publishable" result, rather than to use the simplest appropriate tool and let the chips fall where they may.

[[Cueball]] tries to "fit" a distribution to the data on the paper. This is the usual jargon for modeling data as coming from some underlying probability distribution, and the comic puns on the physical meaning of "fit". In the second panel, Cueball decides that the Student's t-distribution does not fit his data well (the data failed the Student's t-test) and goes to fetch the more complex Teacher's t-distribution instead (the Teacher's t-test, which the data is not allowed to keep failing). Note that a "test" is what statisticians apply to data to see whether it fits some distribution, but it is also another word for "examination".

The Student's t-distribution relates the average of a small sample to the "true" population average, under the assumptions (unobjectionable in many contexts) that such a "true" value exists and that the samples are independent and normally distributed with equal variance. As such, unless the data on Cueball's paper contain many small groups which somehow radically violate these assumptions, there is no way Cueball's data could falsify the t-distribution. In particular, a single number (the average of one group) or a small set of numbers (the averages of several groups) will never make a nice smooth curve, but an average statistician would see that as normal statistical noise that would even out over time, not as a reason to prefer a complex, spiky curve such as the supposed "Teacher's" distribution. But of course, Cueball's access to a secret, cooler-looking distribution makes him more badass than a mere average statistician... or does it?
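The relationship between a sample average and the population average is usually expressed through the one-sample t statistic. The sketch below assumes the standard textbook formula; the function name <code>t_statistic</code> and the sample values are made up for illustration and do not come from the comic.

```python
import math


def t_statistic(sample, mu0):
    """One-sample t statistic for the hypothesis that the true mean is mu0."""
    n = len(sample)
    mean = sum(sample) / n
    # Unbiased sample variance (divides by n - 1).
    variance = sum((x - mean) ** 2 for x in sample) / (n - 1)
    standard_error = math.sqrt(variance / n)
    return (mean - mu0) / standard_error


if __name__ == "__main__":
    # Hypothetical measurements, invented for this example.
    sample = [2.0, 4.0, 6.0]
    print(t_statistic(sample, mu0=3.0))
```

Under the assumptions listed above, this statistic follows a Student's t-distribution with one fewer degree of freedom than the number of samples, so a small sample's scatter is already accounted for without needing a spikier curve.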

Ironically, the Teacher's t-distribution shows equal variance, itself proving the appropriateness of the Student's t-distribution.

The title text plays on the word "test". The first part of the sentence refers to a hypothetical "Teacher's t-test", which would be used in a statistical context to test the significance of some observation, as opposed to the real "Student's t-test", which is used to determine whether the averages of two sets of data differ by a statistically significant amount. The second part of the sentence refers to students being allowed to retake tests (or exams) until they pass, or to teachers who force students to take a test again and again until they pass. The resulting sentence may refer to statistical fallacies, such as the (conscious or unconscious) manipulation of observations or experiments to lend statistical significance to a false claim.
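The effect of retesting until something passes can be simulated. The sketch below is one possible reading of the title text, not something shown in the comic: it assumes each "retake" is a fresh sample of 10 values drawn from a population whose true average really is zero, and it counts how often at least one attempt looks significant at the 5% level. The function names, sample size, trial count and seed are all arbitrary choices for illustration.

```python
import math
import random

# Two-sided 5% critical value of Student's t with 9 degrees of freedom (n = 10).
T_CRIT_DF9 = 2.262


def t_statistic(sample, mu0=0.0):
    """One-sample t statistic for the hypothesis that the true mean is mu0."""
    n = len(sample)
    mean = sum(sample) / n
    variance = sum((x - mean) ** 2 for x in sample) / (n - 1)
    return (mean - mu0) / math.sqrt(variance / n)


def significant_eventually(rng, max_attempts, n=10):
    """Retest with fresh null data until a 'significant' result appears."""
    for _ in range(max_attempts):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if abs(t_statistic(sample)) > T_CRIT_DF9:
            return True
    return False


def false_positive_rate(max_attempts, trials=2000, seed=1347):
    """Fraction of simulated studies that report a spurious effect."""
    rng = random.Random(seed)
    hits = sum(significant_eventually(rng, max_attempts) for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    print("one attempt:    ", false_positive_rate(1))
    print("up to 20 tries: ", false_positive_rate(20))
```

With a single attempt the rate should land near the nominal 5%, while allowing up to 20 attempts should push it toward 1 − 0.95<sup>20</sup>, or roughly 64%, even though there is no real effect to find.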

==Transcript==
:[A physical bell-curve-shaped object labeled "Student's t distribution" is resting on a table. Cueball is working with it and a piece of paper.]
:Cueball: hmm 
:[Cueball looks at the piece of paper.]
:Cueball: ...nope.
:[Cueball picks up the object and begins to walk off the panel with it.]
:[Cueball comes back onto the panel, now carrying an object shaped like a much more complex curve, with many symmetric spikes and dips, labeled "Teacher's t distribution".]

{{comic discussion}}
[[Category:Comics featuring Cueball]]
[[Category:Statistics]]
[[Category:Puns]]