Talk:2533: Slope Hypothesis Testing
In the line "Randall has repeatedly made comics about this hopeful error", should specific examples be provided? I know /882 is one, but I'm blanking on any others. 18.104.22.168 10:21, 26 October 2021 (UTC)
- Definitely, otherwise it would not be very useful. --22.214.171.124 13:10, 26 October 2021 (UTC)
I imagine that the problem here is that the errors are not independent. I can't find anything else wrong with this, but I feel like there's something obvious I'm not seeing. They might revoke my statistics degree if I miss something big here, hehe.--Troy0 (talk) 03:06, 26 October 2021 (UTC)
- The scores are clearly the one score they originally (sometime prior to the expanded test) received. Either that or multiple tests with the same exam questions without having given them enough feedback to change their answer-scheme at all. The volumes are probably a "good go at screaming" on demand, belying any obvious "test result -> thus intensity of scream" (what might be expected if the scream(s) of shock/joy/frustration were recorded immediately upon hearing a score).
- What they have here is a 1D distribution of scream-ability/tendency (which was originally a single datum), arbitrarily set against test scores. (Could as easily have been against shoe-size, father's income-before-tax, a single dice-roll, etc.)
- Whether there was an original theory that grades correlated with intensity of vocalisation is perhaps a valid speculation, but clearly the design of the test is wrong. Too few data points, in the first instance, and the wrong way to increase them once they discover that original failing.
- The true solution is to recruit more subjects. (And justify properly whether it's the intensity of spontaneous result-prompted evocations or merely the general ability to be loud that is the quality they wish to measure. Either could be valid, but it's not obvious that the latter is indeed the one that they meant to measure.) 126.96.36.199 04:21, 26 October 2021 (UTC)
- It's pretty straightforward. This is a simple linear regression, Y = α + βX + ε, where α and β are parameters and ε is a random variable (the error term). Their point estimates for α and β are correct. But their confidence intervals (and thus p-values) are wrong, because they are based on a false assumption. They constructed their intervals assuming ε was normally distributed, which it clearly is not. ε will always be approximately normally distributed if the central limit theorem applies, but it does not apply here. The central limit theorem requires that the samples be independent and identically distributed. Here, they are identically distributed, but they are not remotely independent. After all, the same people were selected over and over again. Therefore ε will probably not be normally distributed (it isn't even close), and the confidence intervals (and so p-values) are wrong. 188.8.131.52 09:10, 26 October 2021 (UTC)
- I agree. The second speaker starts to say "I said, are you sure--", which is the start of Cueball's last line. I think this is intended to be Cueball and Megan trying to talk about the results while the students are still screaming. TomW1605 (talk) 06:45, 26 October 2021 (UTC)
- It could also be the case that their hypothesis was true and they failed so badly at statistics that their voices are inaudible now.
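- To make the regression point above concrete, here is a minimal sketch of what happens when the same subjects are counted repeatedly as if they were independent. Everything in it is an assumption for illustration: the four score/volume pairs are invented (they are not read off the comic), the helper `slope_and_t` is a hypothetical name, and each repeat scream is taken to be exactly as loud as the first, which is the limiting case of perfectly dependent errors. The slope estimate does not move, but the naive standard error shrinks, so the t statistic for the slope grows by a factor of sqrt((k*n - 2) / (n - 2)) with no new information added.

```python
import math

def slope_and_t(xs, ys):
    """Ordinary least squares fit of y = a + b*x.

    Returns (b, t), where t is the slope divided by its naive standard
    error, i.e. the standard error that assumes independent errors.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    se_b = math.sqrt(rss / (n - 2) / sxx)
    return b, b / se_b

# Invented data: one exam score and one scream volume per student.
scores = [55.0, 70.0, 80.0, 95.0]
volumes = [80.0, 90.0, 84.0, 95.0]

# "Expanded" data set: every student screams k times, and each scream is
# (by assumption) identical to the first, then treated as a new data point.
k = 10
scores_rep = [x for x in scores for _ in range(k)]
volumes_rep = [y for y in volumes for _ in range(k)]

b_orig, t_orig = slope_and_t(scores, volumes)
b_rep, t_rep = slope_and_t(scores_rep, volumes_rep)

print(f"original: n={len(scores)}, slope={b_orig:.4f}, t={t_orig:.2f}")
print(f"repeated: n={len(scores_rep)}, slope={b_rep:.4f}, t={t_rep:.2f}")
print(f"t inflation factor: {t_rep / t_orig:.3f}")
```

  Real repeat screams would vary a little, so the inflation would not match this formula exactly, but the direction is the same: the extra rows describe the same four people, so they should not be allowed to narrow the confidence interval as if they were new subjects.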