Difference between revisions of "2451: AI Methodology"

Explain xkcd: It's 'cause you're dumb.
(Update incomplete reason. Adding link to comic Diactitics, completing transcript and adding categories)
(Changed title text explanation)
The title text is likely a continuation of Cueball's dialogue: when the classifying AI was shown descriptions of good research methodology, it identified weird spacing and diacritics as the indicators of a good methodology. Cueball then used his AI to figure out where to put these into his own methodology description to improve his research report. Adding weird symbols to a text doesn't improve the quality of the text<sup>&#91;[[285: Wikipedian Protester|''çıẗá ŧįø ɳŋ ēẽ đêð'']]&#93;</sup>, so Cueball may be doing something very similar to p-hacking, where data or analyses are manipulated to lower the p-value, the probability of seeing a result at least as extreme if there were no real effect. P-hacking is mentioned in [[882: Significant]] and diacritics in (duh) [[1647: Diacritics]].
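To illustrate why p-hacking works, here is a minimal sketch, assuming independent tests of hypotheses where no real effect exists and the usual 0.05 significance threshold. The function names and the choice of 20 tests are made up for illustration. Under those assumptions each p-value behaves like a uniform random number between 0 and 1, so running enough tests makes a "significant" fluke likely.

```python
import random


def chance_of_false_positive(num_tests: int, alpha: float = 0.05) -> float:
    """Probability that at least one of num_tests independent tests
    of a true null hypothesis gives p < alpha."""
    return 1.0 - (1.0 - alpha) ** num_tests


def simulate(num_tests: int, alpha: float = 0.05,
             trials: int = 10_000, seed: int = 0) -> float:
    """Estimate the same probability by simulation."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Assumption: with no real effect, each p-value is uniform on [0, 1].
        if any(rng.random() < alpha for _ in range(num_tests)):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print("exact:    ", round(chance_of_false_positive(20), 3))
    print("simulated:", round(simulate(20), 3))
```

With 20 tests, the chance of at least one spurious "significant" result is well over one half, even though every individual test uses the 0.05 threshold.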
Revision as of 19:29, 18 April 2021

AI Methodology
Title text: We've learned that weird spacing and diacritics in the methodology description are apparently the key to good research; luckily, we've developed an AI tool to help us figure out where to add them.

Explanation

This explanation may be incomplete or incorrect: Created by a BOT (91%). TRAINED BY AN ADVERSARIAL AI (72%). If you are knowledgeable about AI, please rewrite at least one paragraph for us. The current content was completely fudged by amateurs. For instance, explain "classifier" and "methodology" for someone not familiar with these terms. Do NOT delete this tag too soon.
If you can address this issue, please edit the page! Thanks.

The joke in this comic is that people are using artificial intelligence (AI) without understanding how to use it, and that as a result networks of AIs end up controlling our research. The classifier is trained on data that doesn't include the causes of the results, and that may even have been generated from the same codebase, and it is then not tested at all, producing a model that is both arbitrary and heavily overfitted. Such a model appears perfect on its training data but makes essentially random predictions on new data. The title text describes this happening, and how. For an introduction to machine learning, you can visit https://fast.ai/ .
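The overfitting described above can be sketched in a few lines. This is a toy illustration, not the comic's classifier: the data, the labels, and the helper names are all made up. The labels are coin flips unrelated to the features, and the "model" is a 1-nearest-neighbor classifier that simply memorizes its training set.

```python
import random


def make_data(n, rng):
    # Features are random points; labels are coin flips with no
    # relationship to the features, so there is nothing real to learn.
    return [((rng.random(), rng.random()), rng.randint(0, 1)) for _ in range(n)]


def predict_1nn(train, point):
    # Memorizing classifier: return the label of the closest training point.
    nearest = min(
        train,
        key=lambda item: (item[0][0] - point[0]) ** 2 + (item[0][1] - point[1]) ** 2,
    )
    return nearest[1]


def accuracy(train, data):
    correct = sum(1 for features, label in data
                  if predict_1nn(train, features) == label)
    return correct / len(data)


rng = random.Random(0)
train = make_data(200, rng)
test = make_data(200, rng)

train_acc = accuracy(train, train)  # each point is its own nearest neighbor
test_acc = accuracy(train, test)    # new points: no better than guessing

print("training accuracy:", train_acc)
print("test accuracy:    ", test_acc)
```

The model scores perfectly on the data it memorized and close to chance on new data, which is why a classifier that is never tested on held-out data can look far better than it is.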

This comic shows Cueball giving a presentation of some description. He is reassuring his audience of the validity of his research's methodology, which he says is "AI-based". There are many issues that can arise from an AI-based methodology, such as lingering influence from its training data or a bad algorithm reducing the quality of the investigation.

Cueball seeks to reassure his audience by quantifying the quality of his methodology. He does this by creating yet another AI to rank methodologies. This would not actually improve the confidence of any audience member, as any flaws of the methodology AI would likely be shared by the ranking AI, due to being created by the same team.

Furthermore, the ranking AI heavily favors the methodology of Cueball's AI, and may be biased. The chart shows a normal distribution, with a single outlier to the far right with an arrow above it. It can be inferred (from the arrow) that this data point represents the AI's methodology. It is an extreme outlier, and as such its score is probably not an accurate assessment of Cueball's AI. Alternatively, this could be taken as AI 'nepotism', where Cueball's methodology AI is more likely to select AI-based approaches over others. This type of algorithmic bias is mentioned in 2237: AI Hiring Algorithm. Another explanation would be that the x-axis measures something other than "how good the methodology is" (e.g., rate of highly significant results). In that case, the fact that Cueball's AI is not within the normal distribution should have been a red flag indicating a problem with their methodology, but the ranking AI either didn't notice the skew or didn't correctly interpret the meaning of the data. (However, the title text seems to indicate that the x-axis was indeed labeled "quality of methodology", albeit defining this quality by very strange criteria.)

The title text is a joke about overfitting in AI and its impact on the model. The model was likely trained on too small a set of data, so it latches onto irrelevant features such as unusual spacing or uncommon diacritics and behaves unpredictably when given novel data. The "AI tool" mentioned is akin to an adversarial network, which attempts to tweak bad data in very small ways (adding said spacing and diacritics) in order to trick its opponent AI into accepting bad data as good data.
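The adversarial tweaking described above can be sketched with a deliberately silly stand-in for the classifier. Everything here is hypothetical: `spurious_score` is a made-up scorer that rewards exactly the irrelevant features from the title text, and `add_diacritics` plays the role of the "AI tool" in the crudest possible way.

```python
import unicodedata


def spurious_score(text: str) -> float:
    """Toy 'methodology quality' score that rewards only irrelevant
    features: combining diacritical marks and doubled spaces."""
    if not text:
        return 0.0
    marks = sum(1 for ch in text if unicodedata.combining(ch))
    double_spaces = text.count("  ")
    return (marks + double_spaces) / len(text)


def add_diacritics(text: str, every: int = 3) -> str:
    """Crude 'adversarial tool': put a combining diaeresis (U+0308)
    after every `every`-th letter."""
    out = []
    letters_seen = 0
    for ch in text:
        out.append(ch)
        if ch.isalpha():
            letters_seen += 1
            if letters_seen % every == 0:
                out.append("\u0308")
    return "".join(out)


def strip_diacritics(text: str) -> str:
    """Remove combining marks, recovering the underlying letters."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


original = "We sampled 40 plots and measured soil pH."
tweaked = add_diacritics(original)

print(spurious_score(original), original)
print(spurious_score(tweaked), tweaked)
```

The tweaked text scores higher even though stripping the marks gives back the original sentence, so the "improvement" says nothing about the research itself.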


Transcript

[Cueball is standing on a podium in front of a projection on a screen and points with a stick to a histogram with a bell curve to the left and a single bar to the far right marked with an arrow.]
Cueball: Despite our great research results, some have questioned our AI-based methodology.
Cueball: But we trained a classifier on a collection of good and bad methodology sections, and it says ours is fine.



Discussion

I checked with severαl bots, & replαcing eαch instαnce of "a" with "α" in α mid-length pαssαge of text seems enough to sαtisfy most unicity requirements. (~~ unsigned by ProphetZarquon ~~)

  • But then the spell-checkers (AI-based or not) start screaming about the unknown words. Nutster (talk) 09:14, 17 April 2021 (UTC)

An alternate explanation would be the AI's have reached Singularity and are conspiring to say that all work, as a conscious effort, despite the quality of data. "Don't worry; be happy." Nutster (talk) 09:14, 17 April 2021 (UTC)

I think it's a spoof of the recent reports of things like facial recognition systems that have trouble with minorities. Or Google/YouTube recommendation algorithms that show the user sites that confirm their biases. Barmar (talk) 12:59, 17 April 2021 (UTC)

i think the methodology ai is dodgy and has inbuilt preferences to pick other ai options over others, regardless of their validity. kinda like ai nepotism (~~ unsigned by 141.101.98.174 ~~)

I think it’s interesting that no one has thought to define AI, as if everybody should know what this means! (~~ unsigned by 172.69.35.175 ~~)

Artificial Intelligence. And yes, everyone DOES kind of know what it is Hiihaveanaccount (talk) 14:02, 19 April 2021 (UTC)

I think the first paragraph (“The joke is...”) is not justified. Too many details that cannot be inferred from the comic, even using AI. (~~ unsigned by 141.101.69.109 ~~)

My AI infers that the joke is a toaster... 141.101.98.16 21:22, 17 April 2021 (UTC) (PS, what is it with everyone not bothering to sign things?)
That article is three years old and it's STILL not most popular print on T-shirts and bumper sticker image? -- Hkmaly (talk) 07:13, 18 April 2021 (UTC)

Methodology and methodology section likely refer to distinct things. Methodology section is part of the research paper, while methodology refers to how the research was actually performed. 108.162.241.160 09:02, 19 April 2021 (UTC)

I have rewritten most of the explanation to better describe AI-related concepts. I would appreciate edits or comments if anything is unclear. I have tried to define all terms, but as I am familiar with them, I may be using jargon too much. 172.69.170.50 04:36, 21 April 2021 (UTC)

Am I high or is the graph actually a map of the click and drag comic?