Editing 2599: Spacecraft Debris Odds Ratio


Latest revision Your text
Line 8: Line 8:
  
 
==Explanation==
 
==Explanation==
This comic is a misunderstanding of statistics very similar to that of [[1252: Increased Risk]]. It suggests that going outside for more than 5 hours per day significantly increases your risk of head injury from falling spacecraft, and advises to limit outside activity to avoid this risk.  
+
{{incomplete|Created by an EVENS RATIO - Explain how the 280% from the title text comes about. Do NOT delete this tag too soon.}}
 +
This comic is a misunderstanding of statistics very similar to that of [[1252: Increased Risk]]. It explains that going outside for more than 5 hours significantly increases your risk of head injury from falling spacecraft, and advises to limit outside activity to avoid this risk.  
  
The data are apparently based on a {{w|Monte Carlo Method|Monte Carlo simulation}}, a computational method that uses input values randomly drawn from a given distribution and which repeats that calculation many times; the distribution of the outputs is then analyzed. This method is used to determine the possible outcomes (and their respective probabilities) for a given scenario. Basically, instead of doing hard math to calculate the outcomes you let a computer repeat the scenario for a huge number of different input values and watch what happens.
+
However, since the odds of being hit in the head by (any part of) a falling spacecraft are astronomically low to begin with, {{Citation needed}} quadrupling it or more still results in a negligible probability. The horizontal error bars for times greater than 4 hours are marked with asterisks to indicate they are significantly different from the reference value at 0 hours, as indeed those error bars don't overlap the vertical line for the 0-hours reference value. The graph originally referred to "hours spent outdoors" but later was edited to specify "hours spent outdoors per day" (as opposed to, for example, hours spent outdoors in one's lifetime).
In this case, the study might have consisted of defining the baseline probability of spacecraft debris falling from the sky in a given time frame (say, 1% every minute) as well as the probability that it is heavy enough to break through the roof (say, also 1%), translating this to the output of a random number generator (e.g. "1" means "space debris falls in direction of head and can break through the roof", 2-100 means "space debris falls in direction of head  but can't go through a roof" and values 101-10000 mean "no danger from space debris"), adding another random number generator to simulate the distributions for "person is outside X hours of the day", then drawing numbers repeatedly from both distributions and calculating the outcome for each instance.
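The procedure described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the method behind the comic. The two 1% figures are the made-up values from the paragraph above. The names <code>simulate_hits</code> and <code>odds_ratio</code>, the one-minute time step, and the rule that a person is outside in any given minute with probability <code>hours_outside / 24</code> are hypothetical choices.

```python
import random

# Made-up values from the paragraph above, not real-world figures.
ROLL_MAX = 10_000        # the generator draws integers 1..10000
ROLL_THROUGH_ROOF = 1    # 1: debris heads for the head and can break through a roof
ROLL_DEBRIS = 100        # 2..100: debris heads for the head but a roof stops it


def simulate_hits(hours_outside, minutes=200_000, seed=0):
    """Count simulated head injuries over `minutes` one-minute steps."""
    rng = random.Random(seed)
    p_outside = hours_outside / 24   # assumption: chance of being outside in any minute
    hits = 0
    for _ in range(minutes):
        roll = rng.randint(1, ROLL_MAX)
        outside = rng.random() < p_outside
        if roll == ROLL_THROUGH_ROOF:
            hits += 1                 # injured whether inside or outside
        elif roll <= ROLL_DEBRIS and outside:
            hits += 1                 # only injured when no roof is in the way
    return hits


def odds_ratio(hits, reference_hits, minutes):
    """Odds of injury relative to the reference group (0 hours outside)."""
    odds = hits / (minutes - hits)
    reference_odds = reference_hits / (minutes - reference_hits)
    return odds / reference_odds


minutes = 200_000
reference = simulate_hits(0, minutes)
for hours in (2, 6, 12):
    hits = simulate_hits(hours, minutes)
    print(hours, "hours outside: odds ratio", round(odds_ratio(hits, reference, minutes), 1))
```

With these made-up inputs the ratios tend to come out far larger than the ones in the comic; the sketch is only meant to show the mechanism of drawing random numbers repeatedly and counting outcomes.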
 
  
Doing a Monte Carlo simulation for a hypothetical and rare scenario like this can make sense: it is so rare for humans to be struck by spacecraft debris that an absurdly large sample size, involving tens of millions of participants over several decades, would be necessary to obtain significant experimental data.
+
{{w|Error bar}}s are graphical representations of the variability of data and used on graphs to indicate the error or uncertainty in a reported measurement.
However, the statistical analysis and presentation of the data is horribly misleading and sensationalizing. The comic essentially pokes fun at the way that data can be misrepresented and exaggerated using an example that people  would realize is absurdly unlikely.  
 
  
The results are presented not as an overall probability but rather as an {{w|Odds_ratio|odds ratio}}. The odds of an outcome are p/(1 − p), and the odds ratio is defined as (odds that A happens in presence of B)/(odds that A happens in absence of B). For an event as rare as this one the odds are practically equal to the probability, so the ratio here is approximately p(space debris head injury after Xh spent outside and 24-Xh inside)/p(space debris head injury after 24h spent inside). The resulting value tells you how much more likely an outcome becomes if you do (or have) B. E.g. the bottom line of the graph in the comic means that spending 11+ hours outside will make it 3 times as likely to get a head injury from space debris compared to not being outside at all.
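Strictly speaking, an odds ratio compares odds, p/(1 − p), rather than raw probabilities; for very rare events the two are almost the same. A minimal sketch of the difference, using hypothetical probabilities chosen only for illustration (none of them come from the comic):

```python
import math


def odds(p):
    """Convert a probability into odds, p / (1 - p)."""
    return p / (1 - p)


def odds_ratio(p_exposed, p_reference):
    return odds(p_exposed) / odds(p_reference)


def risk_ratio(p_exposed, p_reference):
    return p_exposed / p_reference


# Hypothetical probabilities, chosen only for illustration.
rare = (3e-9, 1e-9)      # a very unlikely event
common = (0.6, 0.2)      # a frequent event

print(risk_ratio(*rare), odds_ratio(*rare))      # both very close to 3
print(risk_ratio(*common), odds_ratio(*common))  # about 3 versus about 6
```

For the rare event the ratio of probabilities and the ratio of odds agree; for the frequent event they differ by a factor of two.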
+
Presenting the data by hour brackets hides the data distribution inside each bracket. If the data were presented hour by hour, and not by groups of hours, they may show a different threshold of increased risk or no threshold (odds ratio could be linear).
However, while odds ratios can be useful, they tend to hide the scale of a probability. For example, 0.00000000002%/0.00000000001% = 2: the outcome became twice as likely, but the probability only rose by 0.00000000001 percentage points. And since the odds of being hit in the head by (any part of) a falling spacecraft are [https://www.livescience.com/33511-falling-nasa-satellite-uars-risk.html astronomically ([[559: No Pun Intended|no pun intended]]) low to begin with], even quadrupling it still results in a negligible probability.
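The arithmetic of that example can be checked in a few lines. This sketch assumes nothing beyond the two percentages quoted above; the variable names are arbitrary.

```python
import math

# Probabilities in percent, copied from the example above.
p_before = 0.00000000001
p_after = 0.00000000002

ratio = p_after / p_before         # relative change
difference = p_after - p_before    # absolute change, in percentage points

print(f"ratio: {ratio:.1f}")
print(f"absolute change: {difference:.11f} percentage points")
```

The ratio looks dramatic, while the absolute change is far too small to matter.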
 
  
The choice of hour brackets instead of a linear time scale is suspicious. Monte Carlo simulations involve a huge number of computations; the scientists should have more than enough data to plot the odds ratio for every additional hour spent outside. Moreover, each hour bracket has a different size - why didn't they use a regular binning like e.g. 1-3, 4-6, 7-9, 10-12? One might suspect that they wanted to conceal inconsistencies and that the underlying data points by themselves don't look nearly as convincing.
+
The graph and error bars are based on a {{w|Monte Carlo Method|Monte Carlo simulation}}, a type of computational algorithm that uses repeated random sampling to estimate the likelihood of a range of results; see, for instance, this article about [https://www.ibm.com/cloud/learn/monte-carlo-simulation Monte Carlo simulations]. This may also indicate that the entire study was conducted via a Monte Carlo simulation and that no real data was collected, adding to the absurdity of the claim that more time spent outside could lead to an increased risk of head injuries due to falling spacecraft.
Moreover, range-based groups of any kind should never be analyzed as if they were independent categories. Spending 5 hours outside is not intrinsically different from spending 1 hour outside - the 5-hour-mark (presumably) doesn't suddenly turn humans into space-debris magnets. The likelihood of space debris falling down at any given moment stays the same and the cumulative (i.e. summed-up) probability should increase at a constant rate. Instead of comparing every hour bracket to the same baseline reference, each should be compared to the next-lowest value.
 
  
The error bars (the lines extending from the points in the graph) are HUGE compared to the effect they measured. Error bars define the range in which the true value might be - here, for 2-4 hours the true value could be an increase by 2, or a small DEcrease of the probability.  
+
The specific reference to falling spacecraft is likely inspired by events happening around the time of this comic's release (March 2022). Around a month before this was posted, the head of the Russian space agency, {{w|Roscosmos}}, warned that sanctions against Russia (mostly those over the {{w|2022 Russian invasion of Ukraine}}) could result in the {{w|International Space Station}} crashing. Since the Russian section of the space station is the one that provides propulsion (although it is built to rely on the power generated by the other sections), this was taken seriously and as of when this was posted, {{w|NASA}} was trying to come up with alternative stabilization strategies in case the situation worsened. There was also a recent [https://www-uol-com-br.translate.goog/tilt/noticias/redacao/2022/03/17/parte-do-foguete-spacex-e-encontrada-por-morador-do-pr.htm?_x_tr_sl=auto&_x_tr_tl=en&_x_tr_hl=pt-BR&_x_tr_pto=wapp report] of some 600 kg of space rocket debris found in Brazil.
  
[[1429: Data|The data]] are shown on a [[1162: Log Scale|log scale]]. Logarithmic scales are used when you have both very small and very large values and want to depict their relative differences in a single plot without making the small values look like zero or cutting off the large values. The data shown here do not have huge differences - there is no good reason for using a log scale. However, the log scale is conveniently chosen to make the error bars look like they have the same length. They do not. The error bar for the last data point is actually twice as large as that for the first data point.
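Why bars of different sizes can look equally long on a log scale can be illustrated with a short sketch. The two error bars below are hypothetical numbers picked so that one is twice as wide as the other; they are not values read off the comic.

```python
import math

# Hypothetical error bars (lower, upper), not values read off the comic.
first_bar = (0.5, 2.0)
last_bar = (1.0, 4.0)


def linear_width(bar):
    """Width of the bar in the units of the data."""
    lower, upper = bar
    return upper - lower


def log_length(bar):
    """Length the bar is drawn with on a log-scaled axis."""
    lower, upper = bar
    return math.log10(upper) - math.log10(lower)


print(linear_width(first_bar), linear_width(last_bar))  # 1.5 and 3.0
print(log_length(first_bar), log_length(last_bar))      # practically identical lengths
```

On a log axis the drawn length depends only on the ratio of the upper to the lower end, so both bars span the same distance even though one covers twice the range.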
+
The title text makes a similar joke. The relative increase in the chance of death by bear attack from going outside is greater than the relative decrease in the chance of death by cardiovascular disease from getting out to exercise, but it is incorrect to combine them in this way: cardiovascular disease has a much higher starting chance of death, so reducing it by 30% has a much more significant effect on overall life expectancy than quadrupling the very small chance of death by bear attack.
  
 
+
The "280% increase" of the title text is also an error, though perhaps not for reasons that are obvious at first (for instance, the correct calculation is not "300% − 30% = 270%"). To "increase by 300%" means multiplying the probability by (1 + 3.0) = 4.0, while to "decrease by 30%" means multiplying by (1 − 0.3) = 0.7. Combining these means multiplying by both, for an overall change of 4.0 × 0.7 = 2.8, or 280%. However, this result means the risk has increased ''to'' 280% of its old value, not ''by'' 280%. And in any case, it is ''still'' not valid to simply combine two changes in wildly different risks like this.
The title text continues the misuse of statistics by insinuating that a 30% decrease of cardiovascular disease resulting from going outside (and exercising) is outweighed by a simultaneous 300% increase of risk of being killed by a bear. As shown in [[1102: Fastest-Growing]], the percentage increase/decrease alone of something has little meaning; the context of the original size is needed to evaluate how impressive the change really is. And in this case, the probability of dying from a cardiovascular disease is much, MUCH higher than the probability of being attacked and killed by a bear, so the moderate decrease of the former has much more impact on one's overall life expectancy than even a huge increase of the latter (unless you live in an area that has many bears, in which case your best bet is to take appropriate precautions rather than to never go outside at all).
 
 
 
The "280% increase" of the title text is also an error, though perhaps not for reasons you might assume at first glance (the correct calculation is not "300% − 30% = 270%"). To "increase by 300%" means to add 300% on top of the original 100% (=400%, so multiplied by 4), while to "decrease by 30%" means to remove 30% from the original 100% (=70%, so multiplied by 0.7). Combining these (which is very very wrong!) would mean multiplying by both, for an overall change of 4.0 × 0.7 = 2.8, or 280%. However, this should be read as an increase ''to'' 280% of its old value, not ''by'' 280% (you started at 100% and added 180%). But this is a very, very wrong way of doing the math because these are probabilities of very different things with very different scales (if you threw out 30% of your dishware but in that same period also acquired 3 toothpicks on top of your original 1 toothpick, would you say that your kitchen stuff increased by 180%?). The correct way of combining the two probabilities would be to translate them onto the same scale - the overall chance of death - which would be done by multiplying each value with its probability of happening at all. For example, if the chance of dying from cardiovascular disease was 50% and the chance of being killed by a bear was 0.1%, the overall chance of dying from either would be the sum, 50.1%. Both probabilities are affected by going outside; the new chances are now 50%*0.7=35% and 0.1%*4=0.4% and the combined chance of dying from either is now 35.4% - a significant DEcrease from the original 50.1%.
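The two calculations above can be put side by side in a short sketch. The baseline chances (50% and 0.1%) are the made-up values from the example above, not real mortality figures, and the variable names are arbitrary.

```python
import math

# Relative changes from the title text.
bear_factor = 1 + 3.00       # "increase by 300%" -> multiply by 4
cardio_factor = 1 - 0.30     # "decrease by 30%"  -> multiply by 0.7

# The (invalid) shortcut: multiply the two factors together.
naive_combined = bear_factor * cardio_factor

# Made-up baseline chances of death from the example above.
p_cardio = 0.50
p_bear = 0.001

# Put both risks on the same scale before combining them.
before = p_cardio + p_bear
after = p_cardio * cardio_factor + p_bear * bear_factor

print(f"naive combined factor: {naive_combined:.1f}")
print(f"chance of dying from either: {before:.1%} -> {after:.1%}")
```

The naive factor suggests a large increase, while the properly weighted total goes down.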
 
 
 
 
 
The specific reference to falling spacecraft is likely inspired by events happening around the time of this comic's release (March 2022). Around a month before this was posted, the head of the Russian space agency, {{w|Roscosmos}}, warned that sanctions against Russia (mostly those over the {{w|2022 Russian invasion of Ukraine}}) could result in the {{w|International Space Station}} crashing. Since the Russian section of the space station is the one that provides propulsion (although it is built to rely on the power generated by the other sections), this was taken seriously and as of when this was posted, {{w|NASA}} was trying to come up with alternative stabilization strategies in case the situation worsened. There was also a recent [https://www-uol-com-br.translate.goog/tilt/noticias/redacao/2022/03/17/parte-do-foguete-spacex-e-encontrada-por-morador-do-pr.htm?_x_tr_sl=auto&_x_tr_tl=en&_x_tr_hl=pt-BR&_x_tr_pto=wapp report] of some 600 kg of space rocket debris found in Brazil.
 
  
 
==Transcript==
 
==Transcript==
Line 42: Line 33:
 
:X-axis: 1 2 3 4 5
 
:X-axis: 1 2 3 4 5
  
:[The Y-axis is not scaled; there are no ticks or lines. Instead it just gives five labels from top to bottom. Above those labels there is an arrow pointing to the top one with a label above explaining the axis.]
+
:[The Y-axis is not scaled; there are no ticks or lines. Instead it just gives five labels from top to bottom. Above those labels there is an arrow pointing to the top one with a label above explaning the axis.]
:Hours spent outdoors per day
+
:Hours spent outdoors
 
:Y-axis:  
 
:Y-axis:  
 
::0 (ref)
 
::0 (ref)
Line 57: Line 48:
 
:[Below the panel there is a caption:]
 
:[Below the panel there is a caption:]
 
:Our new study suggests that spending more than 5 hours outside significantly increases your risk of head injury from spacecraft debris, so try to limit outdoor activities to 4 hours or less.
 
:Our new study suggests that spending more than 5 hours outside significantly increases your risk of head injury from spacecraft debris, so try to limit outdoor activities to 4 hours or less.
 
==Trivia==
 
*In the [https://www.explainxkcd.com/wiki/images/archive/d/d5/20220329223238%21spacecraft_debris_odds_ratio.png original version] of the comic, the Y-axis label said "Hours spent outdoors", but the comic was later changed to specify "Hours spent outdoors ''per day''", which makes more sense. When the updated image was uploaded, it had a much larger size than normal, because Randall posted the same file for both the normal "double size" image and the "regular" size. This had happened before with [[2576: Control Group]], see that comic's [[2576: Control Group#Trivia|trivia section]]. This resulted in the problem that the comic broke the boundaries on [https://xkcd.com xkcd.com].
 
*This comic's title text ("That's a 280% increased") has a typo.
 
  
 
{{comic discussion}}
 
{{comic discussion}}
Line 68: Line 55:
 
[[Category:Space]]
 
[[Category:Space]]
 
[[Category:Animals]] <!-- bears title text-->
 
[[Category:Animals]] <!-- bears title text-->
[[Category:Comics edited after their publication]]
 
