 
| number    = 2652
 
 
| date      = July 29, 2022
 
| title    = Proxy Variable  
 
| image    = proxy_variable.png
 
 
| titletext = Our work has produced great answers. Now someone just needs to figure out which questions they go with.
 
 
}}
 
 
__NOTOC__
 
  
 
==Explanation==
 
In this comic, [[Hairy]] is discussing the use of a proxy variable with [[Cueball]]. In statistics, a {{w|proxy variable}} is used as a stand-in for one or more other variables that are difficult to measure. To be useful, a proxy variable must be correlated with what it is intended to represent. For example, a drug might aim to reduce deaths from a slow-acting disease. Testing whether it reduces deaths might take many years, so researchers might test for a proxy outcome instead, such as whether the drug appears to mitigate loss of bone density or cell damage. Similarly, physicians use blood pressure as one of many proxies for cardiovascular health.
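
As a rough illustration of the check Cueball asks about, the sketch below computes the Pearson correlation between a target variable and two candidate proxies. The function name <code>pearson_r</code> and all of the numbers are made up for this example; in a real study the target would be the hard-to-observe quantity, measured directly in at least a small validation sample.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations (covariance numerator) and squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data, invented purely for illustration.
target = [1.0, 2.0, 3.0, 4.0, 5.0]       # the variable that is hard to observe
good_proxy = [2.1, 3.9, 6.2, 7.8, 10.1]  # rises and falls with the target
poor_proxy = [3.0, 1.0, 4.0, 1.0, 3.0]   # shows no linear relation to the target

r_good = pearson_r(target, good_proxy)
r_poor = pearson_r(target, poor_proxy)
print(f"good proxy r = {r_good:.3f}, poor proxy r = {r_poor:.3f}")
```

A coefficient near 1 or -1 suggests the proxy tracks the target linearly, while a value near 0 suggests it does not. Even a strong correlation does not by itself show that the proxy answers the intended question.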
  
Hairy dismisses the question of whether they are studying the right variable as too expensive to answer. This is deeply ironic, and thus satirical, because good {{w|experiment design}} requires attention to the robustness of every part of an experiment, even when the expense may be prohibitive. This comic might be referring to the recent discovery of [https://www.science.org/content/article/potential-fabrication-research-images-threatens-key-theory-alzheimers-disease nearly two decades] of allegedly fraudulent {{w|Alzheimer's disease}} research supporting a mistaken proxy hypothesis.
  
Choosing the wrong proxy variable might make the research misleading or irrelevant or, as the title text suggests, make it answer the wrong question. Separating correlation from {{w|Causality|causation}} is necessary when interpreting proxy variable results, to make sure that the question they answer is known. Mere correlation, as opposed to {{w|Causal analysis|authentic causation}}, yields weaker results. {{w|Exploratory causal analysis}} can assist with finding useful proxy variables, but it is difficult for the layperson to interpret and can be misleading: even if it is performed correctly, a {{w|combinatorial explosion}} of possible proxy variables can make traditional {{w|statistical significance}} analysis fail, requiring {{w|F-score}}s or similar measures. The history of pharmaceutical research is largely a graveyard of failed proxy hypotheses; that is one of the reasons for [https://clinicaltrials.gov/ct2/manage-recs/fdaaa experiment registration regulations].
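
To illustrate why screening many candidate proxies undermines ordinary significance testing, here is a minimal sketch. It assumes the tests are independent, and it uses the Bonferroni correction, a standard adjustment that is not the measure named in the paragraph above. The function names and the figures (20 candidates, a 0.05 threshold) are made up for the example.

```python
def familywise_error_rate(alpha, m):
    """Chance of at least one false positive among m independent tests,
    each run at significance level alpha, when no real effect exists."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_threshold(alpha, m):
    """Per-test significance level that keeps the family-wise rate at or below alpha."""
    return alpha / m


alpha = 0.05      # conventional per-test significance level (assumed)
candidates = 20   # hypothetical number of proxy variables screened

fwer = familywise_error_rate(alpha, candidates)
corrected = bonferroni_threshold(alpha, candidates)
print(f"chance of a spurious 'significant' proxy: {fwer:.2f}")
print(f"Bonferroni per-test threshold: {corrected:.4f}")
```

Under these assumptions, screening 20 unrelated candidates at the usual threshold gives roughly a 64% chance that at least one looks significant by accident, which is why the threshold has to tighten as the number of candidates grows.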
The title text's notion of having an answer without knowing the actual question could also be a reference to the classic comedy science fiction novel {{w|The Hitchhiker's Guide to the Galaxy|The Hitchhiker's Guide to the Galaxy}}, in which Earth turns out to be a supercomputer built to figure out the question for the answer "42."
 
 
 
=== Examples of noteworthy proxy variables ===
 
 
 
<!-- recap -->* Loss of bone density or damage to cells for toxicity
 
* Blood pressure for cardiovascular health
 
* Amyloid markers for Alzheimer's disease
 
 
 
* Local temperature for global warming severity
 
* GDP growth for development (demolishing a hospital adds to GDP but subtracts from development)
 
* Money supply size for price inflation (see e.g. the {{w|paradox of thrift}})
 
* {{W|Carbonic anhydrase}} expression for carbon sequestration
 
* Asphalt production for carbon sequestration
 
* Proportion renewable energy for carbon reduction (see {{w|Jevons paradox}})
 
* Dialytic {{w|desalination}} for carbon sequestration[https://drive.google.com/file/d/0B73LgocyHQnfV1Q4VE45RmFFeFlPSDlKalctVS1nRlYyY3lR/view?usp=sharing&resourcekey=0-3YeR9jAkROsI0YLf4_07GQ][https://drive.google.com/file/d/14igVdhaIhrbHVTN5lI3XfxgNWPsvjNa7]
 
* {{w|Bacillus thuringiensis israelensis}} application for mosquito abatement
 
* Indoor carbon dioxide levels for air quality and ventilation
 
 
 
==Transcript==
 
:[Cueball is looking at Hairy, who points a pointer at a poster. On the poster there is a line graph at the top and below that a candlestick chart. The line graph appears to show a time series with a question mark inside an ellipsoid at the end of the curve. The candlestick chart shows a box-and-whiskers plot comparing two variables. There is no readable text except the question mark. Hairy's stick points just below the line chart.]
 
:Hairy: We want to study this variable, but it's too hard to observe.
 
:Poster: ?
 
 
 
:[In a slim panel only Hairy and the poster are shown. His pointer now points to the left variable in the box-and-whiskers plot.]
 
:Hairy: So we're studying this proxy variable.
 
:Poster: ?
 
 
 
:[Back to Cueball and Hairy with the poster out of frame. Hairy holds the pointer down by his side.]
 
:Cueball: Is it correlated with the other variable?
 
:Hairy: Look, we don't have the funding to answer every little question.
 
  
 
{{comic discussion}}
 
 
[[Category:Comics featuring Cueball]]
 
[[Category:Comics featuring Hairy]]
 
[[Category:Science]]
 
[[Category:Line graphs]]
 
[[Category:Charts]]
 
[[Category:Scientific research]]
 
