Welcome to the explain xkcd wiki!
We have an explanation for all 2450 xkcd comics, and only 32 (1%) are incomplete. Help us finish them!
Title text: We've learned that weird spacing and diacritics in the methodology description are apparently the key to good research; luckily, we've developed an AI tool to help us figure out where to add them.
This explanation may be incomplete or incorrect: Created by a BOT (91%). TRAINED BY AN ADVERSARIAL AI (72%). If you are knowledgeable about AI, please rewrite at least one paragraph for us. The current content was completely fudged by amateurs. For instance explain Classifier and methodology for someone not familiar with these terms. Do NOT delete this tag too soon.
The joke in this comic is that people are using artificial intelligence (AI) without understanding how to use it, and that as a result networks of AIs end up controlling our research. The classifier was trained on data that doesn't include the causes of the results, and that may even have been generated from the same codebase, and it was then never tested. This produces a model that is both random and heavily overfitted: it appears perfect on the data it was trained on but makes essentially random predictions on new data. The title text describes this happening, and how. For an introduction to machine learning, you can visit https://fast.ai/ .
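The "appears perfect but predicts randomly" behaviour can be sketched in a few lines. This is only an illustration under stated assumptions, not anything the comic specifies: the data is pure noise with coin-flip labels, and the "model" is a 1-nearest-neighbour lookup that simply memorizes its training set. The names `make_data`, `predict_1nn` and `accuracy` are made up for this example.

```python
import random

def make_data(n, n_features, rng):
    """Random features with coin-flip labels: there is nothing real to learn."""
    return [([rng.random() for _ in range(n_features)], rng.randint(0, 1))
            for _ in range(n)]

def predict_1nn(train, x):
    """Return the label of the training point closest to x (pure memorization)."""
    nearest = min(train,
                  key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x)))
    return nearest[1]

def accuracy(train, data):
    """Fraction of points in data whose label the memorizing model gets right."""
    hits = sum(predict_1nn(train, x) == y for x, y in data)
    return hits / len(data)

rng = random.Random(0)
train = make_data(200, 5, rng)
test = make_data(500, 5, rng)

train_acc = accuracy(train, train)  # scored on the data it memorized
test_acc = accuracy(train, test)    # scored on data it has never seen
print(f"train accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
```

On its own training data the model looks flawless, because every point is its own nearest neighbour. On fresh data it does about as well as a coin flip, which is why skipping the test step hides the problem.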
This comic shows Cueball giving a presentation of some description. He is reassuring his audience of the validity of his research's methodology, which he says is "AI-based". There are many issues that can arise from an AI-based methodology, such as lingering influence from its training data or a bad algorithm reducing the quality of the investigation.
Cueball seeks to reassure his audience by quantifying the quality of his methodology. He does this by creating yet another AI to rank methodologies. This would not actually improve the confidence of any audience member, as any flaws of the methodology AI would likely be shared by the ranking AI, due to being created by the same team.
This is problematic because the concerns about his methodology are not concerns about the methodology section. A methodology section is a specific section of a research paper, and judging one is a judgment about the quality of the writing. A good methodology section accurately and clearly explains what the researchers did, but that does not mean the research methodology itself was valid. Therefore, claiming that he has a good methodology section does nothing to address concerns about the research methodology.
Furthermore, the ranking AI heavily favors the methodology of Cueball's AI, and may be biased. The chart shows a normal distribution, with a single outlier to the far right with an arrow above it. It can be inferred (from the arrow) that this data point represents the AI's methodology. It is a significant outlier, and as such it is probably not an accurate representation of Cueball's AI. Alternatively, this could be taken as AI 'nepotism', where the ranking AI is more likely to favor AI-based approaches over others. This type of algorithmic bias is mentioned in 2237: AI Hiring Algorithm. Another explanation would be that the x-axis measures something other than "how good the methodology is" (e.g., rate of highly significant results), and the fact that Cueball's AI is not within the normal distribution should have been a red flag indicating a problem with their methodology, but the ranking AI didn't notice the skew or correctly interpret the meaning of the data. (However, the title text seems to indicate that the x-axis was indeed labeled "quality of methodology", albeit defining this quality by very strange criteria.)
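How far out a single bar sits can be put into numbers with a z-score, the distance from the mean measured in standard deviations. The sketch below uses invented numbers throughout: the comic gives no axis values, so the bell curve (mean 50, standard deviation 10) and the presenter's score of 120 are assumptions chosen only to mimic the shape of the chart.

```python
import random
import statistics

rng = random.Random(1)

# Hypothetical classifier scores for ordinary methodology sections (bell curve).
other_scores = [rng.gauss(50, 10) for _ in range(200)]
# Hypothetical score for the presenter's own section, far to the right.
our_score = 120.0

mean = statistics.mean(other_scores)
stdev = statistics.stdev(other_scores)
z = (our_score - mean) / stdev  # distance from the mean, in standard deviations

print(f"z-score of our section: {z:.1f}")
```

A value several standard deviations from everything else is more often a sign that something went wrong with the measurement than a sign of exceptional quality.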
The title text is a joke about overfitting in AI and its impact on the model. The model is likely trained on too small a set of data and behaves unpredictably when provided with novel data, e.g. unusual spacing or uncommon diacritics. The "AI tool" mentioned is akin to an adversarial network, which attempts to tweak bad data in very small ways (here, adding the spacing and diacritics) in order to trick its opponent AI into accepting bad data as good data.
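The trick in the title text can be mimicked with a toy example. Everything here is hypothetical: `toy_score` stands in for an overfitted classifier that has latched onto double spaces and diacritics as signs of quality, and `adversarial_tweak` stands in for the "AI tool" that adds accents until the score is high enough. Real adversarial methods are far more involved; this only shows the shape of the idea.

```python
import unicodedata

def toy_score(text):
    """Hypothetical overfitted 'quality' score: counts double spaces and diacritics."""
    double_spaces = text.count("  ")
    diacritics = sum(1 for ch in unicodedata.normalize("NFD", text)
                     if unicodedata.combining(ch))
    return double_spaces + diacritics

def add_diacritic(text, i):
    """Insert a combining acute accent after the character at position i."""
    return text[:i + 1] + "\u0301" + text[i + 1:]

def adversarial_tweak(text, threshold):
    """Greedily accent letters until the toy score reaches the threshold."""
    i = 0
    while toy_score(text) < threshold and i < len(text):
        if text[i].isalpha():
            text = add_diacritic(text, i)
            i += 1  # skip the combining mark that was just inserted
        i += 1
    return text

original = "We guessed."
tweaked = adversarial_tweak(original, threshold=3)
print(toy_score(original), toy_score(tweaked))
print(tweaked)
```

The content of the sentence is unchanged, yet the toy score rises, which is the sense in which the tweaked text "tricks" the classifier.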
- [Cueball is standing on a podium in front of a projection on a screen and points with a stick to a bar chart histogram with a bell curve to the left and a single bar to the far right marked with an arrow.]
- Cueball: Despite our great research results, some have questioned our AI-based methodology.
- Cueball: But we trained a classifier on a collection of good and bad methodology sections, and it says ours is fine.
You can read a brief introduction about this wiki at explain xkcd. Feel free to sign up for an account and contribute to the wiki! We need explanations for comics, characters, themes and everything in between. If it is referenced in an xkcd web comic, it should be here.
- There are incomplete explanations listed here. Feel free to help out by expanding them!
- We sell advertising space to pay for our server costs. To learn more, go here.
Don't be a jerk.
There are a lot of comics that don't have set-in-stone explanations; feel free to put multiple interpretations in the wiki page for each comic.
If you want to talk about a specific comic, use its discussion page.
Please only submit material directly related to (and helping everyone better understand) xkcd... and of course only submit material that can legally be posted (and freely edited). Off-topic or other inappropriate content is subject to removal or modification at admin discretion, and users who repeatedly post such content will be blocked.
If you need assistance from an admin, post a message to the Admin requests board.