# Talk:882: Significant

Those lazy scientists are playing Minecraft instead of curing cancer! Lynch 'em! __Davidy__^{22}`[talk]` 00:35, 11 January 2013 (UTC)

- But I heard that Minecraft cures cancer... ! Investigate! <off: cheers from active group, boos from the control group> 178.99.81.144 19:31, 30 April 2013 (UTC)
- You know this experiment isn't conducted properly when you know you're in the control group. Troy (talk) 05:24, 4 March 2014 (UTC)

Um, I take it that whoever explained this comic can't tell the difference between < and >, as the fact that the confidence was changed wasn't mentioned in the article... 76.246.37.141 23:19, 20 September 2013 (UTC)

- Yes, I also figured this out today: for green, p is lower than 0.05; for the other colors there is only a statement that p is NOT lower than 0.05. The newspaper ignored the remaining 19 panels and reported just the 95% confidence. The article is marked as incomplete; it needs a major rewrite.--Dgbrt (talk) 19:12, 3 October 2013 (UTC)

This explanation seems to misinterpret α. α is the chance of rejecting a true null hypothesis, a false positive. The 5% here is α. The correct interpretation of it is that if the null hypothesis is true, there is a 5% chance that we will mistakenly reject it. P in "P<0.05" is the chance that, if the null hypothesis is true, we would get a result as extreme as, or more extreme than, the result we got from this experiment. **α is not the chance that, given our current data, the null hypothesis is true. We wish to know what that is, but we do not know.** 108.162.215.72 08:52, 16 May 2014 (UTC)

In layman's terms, the comic appears to misrepresent what "95% confidence" (p <0.05) means. The statistic "p < 0.05" means that when we find a correlation based on data, that correlation will be a false positive fewer than 5 percent of the time. In other words, when we observe the correlation in the data, that correlation actually exists in the real world at least 19 out of 20 times. It **does not** mean that 1 out of every 20 tests will produce a false positive. This comic displays a pretty significant failure in understanding of Bayesian mathematics. The 5% chance isn't a 5% chance that any test will produce a (false) positive; it's a 5% chance that a statistical positive is a false positive. 108.162.219.196 (talk) *(please sign your comments with ~~~~)*

- No, you are deeply mistaken. The comic and the comment above you are correct in saying that if the null hypothesis holds, 1 out of every 20 tests will produce a false positive: this is by definition of the p-value. The ratio of true positives to false positives can range anywhere from 0 to infinity, and there is unfortunately no way to predict it. 108.162.229.121 09:46, 27 January 2015 (UTC)
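The definition defended in the reply above ("if the null hypothesis holds, 1 out of every 20 tests will produce a false positive") can be checked numerically. Here is a stdlib-only Python sketch (all names, sample sizes, and the known-variance simplification are mine, purely for illustration) that runs many two-group experiments where the null hypothesis is true and counts how often they come out "significant" at p < 0.05:

```python
import math
import random

random.seed(0)

def p_value_two_sample_z(xs, ys):
    """Two-sided p-value for equal means, assuming both samples are
    drawn from N(mu, 1) -- a simplification so we stay in the stdlib."""
    n = len(xs)
    z = (sum(xs) / n - sum(ys) / n) / math.sqrt(2 / n)
    phi = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))  # standard normal CDF
    return 2 * (1 - phi)

N_TRIALS, N = 5_000, 100
false_positives = sum(
    p_value_two_sample_z(
        [random.gauss(0, 1) for _ in range(N)],  # "jelly bean" group
        [random.gauss(0, 1) for _ in range(N)],  # control group, same distribution
    ) < 0.05
    for _ in range(N_TRIALS)
)
print(f"false positive rate: {false_positives / N_TRIALS:.3f}")  # ≈ 0.05
```

With the null hypothesis true in every trial, about 5% of experiments still cross the p < 0.05 threshold, which is exactly what α promises (and all it promises).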

The explanation appears somewhat confused, as correctly noted in a couple of comments above. The most common misunderstanding of p-values is that they represent how likely it is that the observed correlation (or observed unequal outcomes, or apparent trend) came from chance. That is not what they represent - they represent the probability that results at least as extreme as those observed would have arisen by chance: 1) in a fictional world where chance was the only potential cause of the correlation/inequality/trend (a world in which the null hypothesis was true) AND 2) only one hypothesis was being tested. In the real world, other factors may be more or less plausible as explanations, and it takes judgement, not stats, to determine how likely it is that chance is the best explanation. The green jelly beans theory fails in terms of biological plausibility, so it is >99% likely to be a chance observation (regardless of the p-value). Also, given the large number of hypotheses being tested, the probability of at least one of them producing a p-value <0.05 is much greater than 5%; indeed, with 20 simultaneous hypotheses, we would expect about one to be significant at the p<0.05 level, on average. There is a huge difference between the prospective probability of a single hypothesis satisfying the p<0.05 threshold, and the probability of being able to find a retrospective hypothesis for which p<0.05. This is a case of post hoc cherry picking - the newspaper's emphasis on green jellybeans is post hoc, with the colour of interest chosen after the results were already in. 108.162.250.163 (talk) *(please sign your comments with ~~~~)*
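The post-hoc cherry-picking described above is easy to simulate. The sketch below (stdlib-only; the constants are illustrative, and it relies on the fact that under a true null hypothesis with a continuous test statistic the p-value is uniform on [0, 1]) repeats the full 20-color study many times with no real effect anywhere, then reports the "best" color each time:

```python
import random

random.seed(882)

N_STUDIES = 10_000
N_COLORS = 20
ALPHA = 0.05

# Under a true null hypothesis, each color's p-value is modeled
# as a single uniform draw on [0, 1].
studies_with_hit = 0
for _ in range(N_STUDIES):
    p_values = [random.random() for _ in range(N_COLORS)]
    if min(p_values) < ALPHA:  # post hoc: headline the "best" color
        studies_with_hit += 1

rate = studies_with_hit / N_STUDIES
print(f"fraction of studies with >=1 'significant' color: {rate:.3f}")
```

Roughly 64% of such studies hand the newspaper a headline, matching the closed form 1 − 0.95^20 ≈ 0.642.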

Is the "e" in "News" supposed to look like an epsilon (and the "w" a rotated epsilon)? 108.162.250.222 15:00, 15 December 2014 (UTC)

- It's probably just a stylistic thing. GrandPiano (talk) 04:00, 28 January 2015 (UTC)

In the comic, they mention that there is a link between green jelly beans and acne. However, assuming there to be no real link, there is about a 50% chance that this "link" is just a spurious 95%-confidence result that green jelly beans help with acne. Mulan15262 (talk) 03:14, 12 November 2015 (UTC)

This comic was referenced in the book "How Not to be Wrong" by Jordan Ellenberg. SilverMagpie (talk) 20:41, 21 January 2017 (UTC)

One thing I've gleaned from this is that they apparently opened a bag of Jelly Bellys or Gimball's and tested them in whatever order. I say this because they hit colors you'd never see in the smaller-palette brands of jelly beans (brown, teal, salmon) before some very common colors (red, yellow, black, green). If it were me, I would probably have started with a smaller-palette brand, since their colors affect *everyone* who eats jelly beans, and not just the ones who go for the gourmet brands. Nyperold (talk) 12:58, 6 July 2017 (UTC)

This kind of error is why you use ANOVA. 162.158.63.238 20:21, 29 October 2018 (UTC)

Want to reiterate that using "95% confidence" for statistical significance means having a threshold of <.05 for the p-value, and the p-value is the probability noise alone would have generated a change this big or bigger. If all you ran all day long were a/a tests (randomly assign people to two groups but give them the exact same experience) then 5% of your tests would be stat sig for any given metric. However, the chance of at least one false positive over 20 tests is only 64% (1-.95^20), not 100%. But of course, you also might get MORE than 1 false positive in 20 experiments, so the expected value for the NUMBER of false positives after 20 experiments IS 1. If that hurt your brain, welcome to probability!
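The two numbers in the comment above answer different questions, so it may help to spell out the arithmetic (plain Python, constants as stated in the comment):

```python
ALPHA = 0.05   # significance threshold for each test
N_TESTS = 20   # number of independent a/a tests (or jelly bean colors)

# P(at least one false positive) -- complement of "all 20 tests stay negative"
p_at_least_one = 1 - (1 - ALPHA) ** N_TESTS

# E[number of false positives] -- linearity of expectation, one per test
expected_hits = N_TESTS * ALPHA

print(f"P(at least one false positive) = {p_at_least_one:.3f}")  # 0.642
print(f"Expected false positives       = {expected_hits:.2f}")   # 1.00
```

The 64% is the chance of one *or more* false positives; the expected *count* is still exactly one, because the occasional run with two or three false positives balances out the runs with none.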

### Academic Citation

I don't know if it's interesting to anyone, but in a 2000 paper my coauthor (Sun) and I (Cuthbert) just decided to cite the comic rather than get into a whole academic discussion of the problems of multiple hypothesis testing: [1]. Cheers, Michael (172.70.214.84 21:25, 26 July 2024 (UTC))