1574: Trouble for Science

Revision as of 13:16, 8 September 2015 by Jv (talk | contribs) (clarified the nature of the published critiques in the first paragraph, highlighted the likely intent to contrast scientific vs. general public understanding of them)
Title text: Careful mathematical analysis demonstrates small-scale irregularities in Gaussian distribution

Explanation

This explanation may be incomplete or incorrect: More details in each article, especially the one about antibodies and rodents.

The comic highlights the fact that several well-publicized scientific critiques have recently been published that raise questions about some commonly accepted scientific methods. For scientists, these critiques serve as reminders of the dangers of overconfidence in any method, hopefully leading those who have naively accepted results to remember that any scientific conclusion is by its very nature tentative and limited by methodological reliability. However, popular-press reporting of these papers may give a general public of modest scientific literacy the impression that science is in trouble, as implied by the title. Some of these methodological issues and shortcomings are well known in the scientific community, but the methods remain, for better or worse, the best toolkit science has at its disposal today. The last (fictional) headline exaggerates the trend: it suggests that Bunsen burners in fact have a cooling effect, which is absurd, but which would overturn one more fundamental scientific belief if it were true.

The titles of five scientific articles are shown:

Many commercial antibody-based immunoassays are unreliable

This sentence is true. See Kebaneilwe Lebani, Antibody Discovery for Development of a Serotyping Dengue Virus NS1 Capture Assay, 2014. In this PhD thesis, 11 references are given.

Problems with the p-value as an indicator of significance

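The p-value is the probability of observing a result at least as extreme as the one actually observed, assuming that only chance is at work (that is, assuming the null hypothesis is true). If the p-value is below a threshold level (α, usually 5%, or 1% to be more conservative), the result is called statistically significant and the observed effect is commonly taken to "exist". The conventional 5% value for α was proposed by Fisher and is arbitrary.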

The use of p-values as a measure of statistical significance is frequently criticized, for example in Hubbard and Lindsay. Randall has demonstrated this problem in the past in 882: Significant.
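As a rough illustration of how a p-value is obtained, the following sketch computes an exact two-sided p-value for a coin-flip experiment. The fair-coin null hypothesis, the function name `binomial_p_value` and the example of 60 heads in 100 flips are assumptions chosen for illustration; none of them come from the comic or from the cited papers.

```python
from math import comb


def binomial_p_value(heads: int, flips: int) -> float:
    """Exact two-sided p-value under the null hypothesis of a fair coin."""
    # Probability of every possible head count if the coin is fair (p = 0.5).
    probs = [comb(flips, k) * 0.5 ** flips for k in range(flips + 1)]
    observed = probs[heads]
    # Add up all outcomes that are at least as unlikely as the observed one.
    # The small tolerance guards against floating-point ties.
    return sum(p for p in probs if p <= observed * (1 + 1e-9))


if __name__ == "__main__":
    # Hypothetical experiment: 60 heads in 100 flips.
    print(binomial_p_value(60, 100))
```

With 60 heads in 100 flips the result is roughly 0.057, just above the conventional 5% threshold. That a marginally different count would flip the verdict from "not significant" to "significant" is one of the criticisms of treating the threshold as a bright line.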

Overfeeding of laboratory rodents compromises animal models

Keenan et al. make this case.

Replication study fails to reproduce many published results

A replication study is designed to reproduce the results of a previous study by applying the same methods with different subjects and experimenters. Reproducing the results increases confidence in the original findings and helps establish that they carry over to other, similar areas of study.

Randall is probably referring to this recent study: http://www.nature.com/news/over-half-of-psychology-studies-fail-reproducibility-test-1.18248
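One common back-of-the-envelope argument for why replications can fail can be sketched as follows. The function name and all three input numbers (share of tested hypotheses that are true, statistical power, and α) are hypothetical values picked for illustration, not figures from the linked study.

```python
def expected_replication_rate(prior_true: float, power: float, alpha: float) -> float:
    """Share of originally significant findings expected to be significant again."""
    true_pos = prior_true * power             # real effects that reached significance
    false_pos = (1 - prior_true) * alpha      # null effects significant by chance alone
    ppv = true_pos / (true_pos + false_pos)   # fraction of significant findings that are real
    # A real effect is detected again with probability `power`,
    # a false positive only with probability `alpha`.
    return ppv * power + (1 - ppv) * alpha


if __name__ == "__main__":
    # Assumed scenario: 30% of tested hypotheses are true, 50% power, alpha = 5%.
    print(expected_replication_rate(prior_true=0.3, power=0.5, alpha=0.05))
```

Under these assumed inputs only about 41% of significant findings would be expected to reach significance again, even with no misconduct involved. The model is deliberately crude: it ignores publication bias, flexible analysis and differences between original and replication samples.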

Controlled trials show Bunsen burners make things colder

This is a joke, although the effect is possible at very high temperatures. If putting something over a Bunsen burner flame (which is between 1000 K and 2000 K) makes it colder, there is probably a methodological error. However, if the object were already much hotter than the flame (more than 2000 K), heat would flow from the object to the flame as their temperatures equalized, and the object would cool.
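The equalization argument can be sketched with Newton's law of cooling, in which an object's temperature drifts toward that of its surroundings at a rate proportional to the difference. This is a simplified model under stated assumptions: the flame temperature of 1500 K (picked from the 1000 K to 2000 K range above), the rate constant and the function name are all hypothetical, and radiative losses, which matter a great deal at these temperatures, are ignored.

```python
def simulate_temperature(start_k: float, flame_k: float = 1500.0,
                         rate_per_s: float = 0.1, seconds: float = 60.0,
                         dt: float = 0.1) -> float:
    """Return the object's temperature (in kelvin) after `seconds` in the flame."""
    temp = start_k
    for _ in range(round(seconds / dt)):
        # Explicit Euler step: move a small fraction of the way toward the flame temperature.
        temp += rate_per_s * (flame_k - temp) * dt
    return temp


if __name__ == "__main__":
    print(simulate_temperature(300.0))   # room-temperature object warms up
    print(simulate_temperature(3000.0))  # object hotter than the flame cools down
```

In this model both objects end up close to the flame temperature: the 300 K object is heated, while the 3000 K object is indeed "made colder" by the burner.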

Careful mathematical analysis demonstrates small-scale irregularities in Gaussian distribution

This is another joke, as the Gaussian probability distribution function is a very smooth curve, the well-known "bell curve". This smoothness is due to the fast decay of its Fourier transform (the characteristic function), which rules out the existence of "local" or "small-scale" irregularities.
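The smoothness argument can be written out explicitly. Here μ and σ denote the mean and standard deviation of the distribution; these symbols are introduced only for this illustration.

```latex
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right),
\qquad
\varphi(t) = \int_{-\infty}^{\infty} e^{itx} f(x)\,dx
           = \exp\!\left(i\mu t - \frac{\sigma^2 t^2}{2}\right),
\qquad
|\varphi(t)| = e^{-\sigma^2 t^2 / 2}
```

Because |φ(t)| decays faster than any power of t, the density f has continuous derivatives of every order, which leaves no room for small-scale irregularities.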

Transcript

[Five panels, each with the top part of a scientific article, where only the title is readable. Below each title, the list of authors, the subheadings and the text are drawn as unreadable wiggles.]
Many Commercial Antibody-Based Immunoassays Are Unreliable
Problems With the p-Value as an Indicator of Significance
Overfeeding of Laboratory Rodents Compromises Animal Models
Replication Study Fails to Reproduce Many Published Results
Controlled Trials Show Bunsen Burners Make Things Colder



Discussion

Sentence case, or down style, is one method, preferred by many print and online publications and recommended by the Publication Manual of the American Psychological Association. The only two rules are the two rules mentioned above: Capitalize the first word and all proper nouns. Everything else is in lowercase. http://www.dailywritingtips.com/rules-for-capitalization-in-titles/ 173.245.50.154 12:30, 7 September 2015 (UTC)

Problems with the p-value as an indicator of significance

The p-value alone can never be an indicator of significance. However, it is still often used as the only indicator, because a full set of parameters (including sample size, test setup, etc.) can't easily be packed into a single number. There's a nice article in Nature about this problem: [1] I can also recommend a story about (ab-)using hacked p-values to get maximum publicity. I hope this helps :-) --141.101.105.183 12:41, 7 September 2015 (UTC)

In this section, I really want to reword the p-value explanation that "one can assume that the event observed 'exists'." Except where it's an event indirectly observed through a chained effect (unseeable gas molecules observed through Brownian motion, unstable particles through detection of their decay particles, prehistoric meteorite impact through a geological/chemical fingerprint, etc) I think it should be more that "this (directly observed) event was directly linked to the presumed cause rather than spontaneous and random, at least w.r.t. the presumed cause being tested". But writing it better than I did just now. 141.101.99.114 19:36, 7 September 2015 (UTC)

I think the joke is that these newspapers are talking about how bad science is, and yet they manage to come up with a stupid story about Bunsen burners, presumably being too scientifically illiterate to know the problem. Timband (talk) 12:55, 7 September 2015 (UTC) Although reading the other comments, it's a much better joke if the Bunsen Burner story is actually true, because that makes all of them about journalists not realising that they are highlighting their own ignorance. Timband (talk) 16:05, 7 September 2015 (UTC)

See Significant for another comic on p-values.--Henke37 (talk) 14:22, 7 September 2015 (UTC)

One journal, Basic and Applied Social Psychology (vol. 37 pages 1–2, 2015), went so far as to ban p-values entirely. So, anti-p-value sentiment does seem to be on the rise. --scjphysicist (talk) 01:10, 12 September 2015 (UTC)

Controlled trials show Bunsen burners make things colder

Actually, I can easily imagine a way to use a Bunsen burner to make something colder. Involving an unlit Bunsen burner that has been placed in the freezer for a couple hours, for example. Nowhere in the headline is there any mention of a flame. --Svenman (talk) 12:59, 7 September 2015 (UTC)

Actually, there was a (badly formatted and badly placed, probably therefore now removed) comment on the explanation page earlier which pointed out that feeding a Bunsen burner from a propane bottle will cause the pressure, and therefore the temperature, in the bottle to decrease. That is a lot less contrived than my original idea. --Svenman (talk) 13:37, 7 September 2015 (UTC)
That was me. Trying to get my 2 cents in on my phone before I forgot. http://www.propane101.com/propaneregulatorfreezing.htm as an example. Mattiep (talk) 13:45, 7 September 2015 (UTC)
Thermodynamics actually doesn't guarantee that a lit Bunsen burner always heats up a cold object. It just tells us that the probability of it doing so is so high that you can trust any number of controlled trials to be unable to find a counterexample. --Gunterkoenigsmann (talk) 12:09, 29 December 2020 (UTC)
Correct me if I'm wrong here, but doesn't the burning flame from a Bunsen burner cause the temperatures of the flame and the target object to equalize? Sure, in most cases that results in a temperature increase in the target object, but I don't see why that would be true in all high temperature cases. The comment about "reducing the rate of heat loss in 2000K+ temp objects" would only be true if the gas (assuming any atmosphere at all) surrounding the target object was cooler than the flame from the Bunsen burner. This gets worse in a perfect vacuum. If a 5000K object was in a perfect vacuum and somebody set a lit Bunsen burner (assuming the tip had an oxygen source) to spray across the target object, then the flame would get hotter as it touched the hotter object and the object would cool as the two temperatures attempted to equalize. No reduction of heat loss would happen. Can we remove the comment about "reducing the rate of heat loss in 2000K+ temp objects"? Harodotus (talk) 22:20, 7 September 2015 (UTC).
Found an article backing up my previous comment and, lacking any objection for several hours, reversed the note in the article.[2] Harodotus (talk) 23:58, 7 September 2015 (UTC)
Bunsen burners hasten the heat death of the universe, making things colder generally. Showing that in "controlled trials" seems like a challenge for a type 2 civilization, though. 198.41.241.73 08:30, 8 September 2015 (UTC)

I think the joke is in the wording of the headlines. The fact that a replication study fails to reproduce can be seen as a contradiction. Overfeeding rodents leads to fat rodents. This compromises their ability to function as animal (runway) models. I haven't figured out the other ones yet. But that's 'cause I'm dumb :-). Alva. 141.101.104.80 (talk) (please sign your comments with ~~~~)

It's way simpler than that - The joke is that people outside of sciences (with no understanding really of how to science) will report basically anything that sounds shocking or exciting, especially if it proves those nerdy, scary scientists wrong! So Randall gives us a bunch of possible headlines that to a layman read like real, scary news about science, but to scientists this is stuff that is generally well known and understood. The last one is just taking it a step further for credulous news editors - They've been lying to us all this time! 13:33, 7 September 2015 (UTC)
I think it's even simpler than that: the title is "Trouble for Science" and it shows a series of misleading headlines about misleading (i.e.: invalidated) scientific studies. The implication is "Trouble for Journalism". 173.245.54.87 14:21, 7 September 2015 (UTC)
I agree. All of the titles are poorly written. All immunoassays are antibody-based, so saying many commercial antibody-based immunoassays are unreliable is redundant, implying they have no idea what an immunoassay is. Problems with the p-value as an indicator of significance implies that there is some significant error in the use of a tool to measure significance of error, which leads one to wonder how they figured that out. If you don't know what a p-test is, the title is paradoxical. The last title would make someone assume that the controlled trials are using turned on bunsen burners to make things colder, but could mean almost anything, such as a bunsen burner being turned off the entire time, or a bunsen burner placed inside of a freezer, or even that people consider using bunsen burners in an experiment makes the experiment cool (or sweet or groovy or whatever). 173.245.56.155 (talk) (please sign your comments with ~~~~)
I would appreciate someone adding info about what an immunoassay is. Teleksterling (talk) 22:53, 8 September 2015 (UTC)

I generally agree, but would say if you DO know what a p-test is, the title is paradoxical. If you don't know what a p-test is, the title is meaningless. Miamiclay (talk) 07:05, 8 September 2015 (UTC)

This comic may be in reference to Monsanto's latest ailments. 173.245.52.112 (talk) (please sign your comments with ~~~~)

Replication study fails to reproduce many published results
Upon reading that specific headline, the rational behavior would be to question the veracity of all the other headlines before and after. I could see a paper picking up on that sensationalist-looking headline and ignoring the fact it casts doubt on whatever else they published. Ralfoide (talk) 14:56, 8 September 2015 (UTC)

Maybe I'm missing something obvious, but what is the irony in the first headline? Djbrasier (talk) 00:54, 9 September 2015 (UTC)

From [3]: "When a substance undergoes a phase transition (changes from one state of matter to another) it usually either takes up or releases energy. For example, when water evaporates, the kinetic energy expended as the evaporating molecules escape the attractive forces of the liquid is reflected in a decrease in temperature. The amount of energy required to induce the transition is more than the amount required to heat the water from room temperature to just short of boiling temperature, which is why evaporation is useful for cooling." That could explain the Bunsen burner making things colder (i.e. having less kinetic energy).

About Gaussian irregularities. Using a computer and floating point numbers, someone would see irregularities on a Gaussian distribution. That amounts to sampling the curve with a small but finite precision. Computing the value at any given point could lead to rounding errors, which would be seen as irregularities. 108.162.219.118 (talk) (please sign your comments with ~~~~)

That's like saying a crack in your telescope glass has revealed new stars. 108.162.229.134 23:20, 11 September 2015 (UTC)

Gregory Chaitin makes a case for using experimentally observed mathematical relations to increase the expressiveness of mathematics beyond the limits of purely deductive axiomatic methods. If this trend is adopted, it might conceivably develop that a set of foundations that support what would then be known as the "normal distribution" could have significant irregularities which would result in either adoption of this new effect, or changing the foundational proposition from which the effect is derived, or both. Randall's headline may be predictive of the type of thing that may be seen as more mathematicians explore conjectures aided by computer computations using numeric and symbolic congruences. [Comet] 20:51, 9 September 2015 (UTC)

I think everyone is over-thinking this comic. In each headline, the question is "Well if that's the case, how did they prove it?" In other words, every test would have most likely made use of the technique that they studied in the study.

Antibodies-I don't know anything about this topic, so I can't explain the irony that I hypothesize to be there.

P-values-Presumably the researchers started with the null hypothesis that p-values are a good indicator of significance. They then disproved it with p<0.05.

Lab rats-They proved that animal studies are compromised. They undoubtedly used animals to conduct this experiment

Replication study-They couldn't replicate the results. To show that this is a robust phenomenon, other researchers should be able to replicate their results.

Bunsen burners-In their controlled experiment, they found that bunsen burners cool things down. But since bunsen burners are the heat-source of choice for many scientific investigations, they were probably the control heat source as well as the test.

Gaussian curve-The bell curve has irregularities in it. Assuming that these irregularities are independent, their effect is modelled by a Gaussian curve (ie the average irregularity in the faulty Gaussian curve will form a Gaussian distribution per the central limit theorem)

In each case, the joke is that the study results discredit the method that would have been used to prove the result. CAS 173.245.55.149 23:37, 11 September 2015 (UTC)

There's another interpretation. All of these articles are headlines in newspapers. Reporters will only bother to write and publish news articles about highly controversial or exciting results, framed in the most inflammatory way, regardless of their reliability or applicability. So we have carnival barkers in the news media cherry-picking and misrepresenting results they really don't understand.

But most scientists are also dependent on having a steady stream of published, novel results so they can get their grant money from the government. Which means "sexy" results that are publishable and impactful- i.e. worthy of mention in the non-scientific press. So of course we have sloppy methods and irreproducible results-- those are the methods most likely to produce the kind of excitingly counter-intuitive results that get published and catch the notice of the mainstream media. Disciplined labs that publish properly vetted results will hit dry periods when their results are unexciting or their theories don't check out, and their grant money will dry up, and they will fall apart. 108.162.237.171 14:34, 15 September 2015 (UTC)

I think the bunsen burner part might be a reference to a demonstration a teacher once did. I can't find the reference, but when her students came in she showed them a metal plate next to a lit bunsen burner. The students observed that the side closest to the flame was colder, and she asked them to write down what they thought was going on. They wrote non-answers like, "because of heat conduction," and none of them came anywhere close to guessing the correct answer, which was simply that the teacher turned the metal plate around just before they came in. Shanek (talk) 16:46, 15 September 2015 (UTC)

I figured that this comic was mostly making a joke about how often newspapers describe things as "Trouble for Science!"... when most of the things being reported are merely niggles in one narrow area of one scientific field. Whereas this is a list of things which actually *would be* "trouble for science" in that they would invalidate huge areas of scientific "knowledge". A few of them are real, most are not. 108.162.216.77 06:52, 23 September 2015 (UTC)

A Bunsen burner could be used to drive an absorption chiller (https://en.wikipedia.org/wiki/Absorption_refrigerator). In that case it could be said to indirectly "make things colder." 172.68.35.73 (talk) (please sign your comments with ~~~~)