2169: Predictive Models

Explain xkcd: It's 'cause you're dumb.
Revision as of 17:58, 28 June 2019 by (talk)


This explanation may be incomplete or incorrect: Created by a PREDICTIVE MODEL THAT WILL BE FIRST AGAINST THE WALL WHEN THE REVOLUTION COMES. Please mention here why this explanation isn't complete. Do NOT delete this tag too soon.
Predictive text is a feature on many systems in which, as you type, the system automatically suggests likely words or phrases to follow what you have written so far. For instance, if you type "I'm heading" the system may suggest "home" or "back" as likely words to follow. Predictive systems usually use prior input to generate their predictions, so if you frequently type "Totally amazing!" the system will suggest "amazing!" every time you type "totally", even if you sometimes want to type "totally true".
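
The "totally amazing" behaviour can be illustrated with a toy next-word predictor. This is only a minimal sketch that counts which word most often follows another in a user's prior messages; real predictive keyboards use far more sophisticated models, and the names `train_bigrams`, `suggest`, and `history` are made up for this illustration.

```python
from collections import Counter, defaultdict

def train_bigrams(messages):
    """Count, for each word, which words have followed it in prior messages."""
    followers = defaultdict(Counter)
    for message in messages:
        words = message.lower().split()
        for current, following in zip(words, words[1:]):
            followers[current][following] += 1
    return followers

def suggest(followers, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    counts = followers.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Hypothetical typing history: "amazing!" follows "totally" three times, "true" once.
history = [
    "Totally amazing!",
    "Totally amazing!",
    "Totally amazing!",
    "Totally true",
]
model = train_bigrams(history)
print(suggest(model, "totally"))
```

Because the suggestion is simply the most frequent continuation, the less common "true" is never offered first, however much the user wants it on a given occasion.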

In the comic, Cueball is using predictive text to uncover a plot against him, or against his organization or government. By typing in an obscure phrase related to revolution and a meeting, he gets the predictive text algorithm to display where and when the next, supposedly secret, meeting will be held. This works because it is unlikely that anyone other than the revolutionaries would be typing this phrase, so the only data the algorithm has to predict from is the revolutionaries' actual message about their next meeting. The caption of the comic points out that systems which use prior input for predictive purposes in this way can end up leaking information that might otherwise be considered private.
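
The leak depends on the model being trained on input pooled from all users. The sketch below is a deliberately simplified illustration of that idea, not how any real product works: it assumes a hypothetical shared corpus (`shared_corpus`) and a made-up `complete` function that ranks the continuations seen after a given prefix.

```python
from collections import Counter

def complete(corpus, prefix):
    """Rank the continuations of `prefix` seen in `corpus`, most frequent first."""
    prefix = prefix.lower()
    continuations = Counter(
        message[len(prefix):].strip()
        for message in corpus
        if message.lower().startswith(prefix) and len(message) > len(prefix)
    )
    return [text for text, _ in continuations.most_common()]

# Hypothetical training data pooled from every user of the service.
shared_corpus = [
    "I'm heading home",
    "I'm heading home",
    "I'm heading back",
    "Long live the revolution. Our next meeting will be at the docks at midnight on June 28.",
]

# A common prefix yields generic suggestions drawn from many users.
common = complete(shared_corpus, "I'm heading")

# A rare prefix has a single source, so the "suggestion" is that user's message.
rare = complete(shared_corpus, "Long live the revolution. Our next meeting will be at")

print(common)
print(rare)
```

With a common prefix the suggestions reveal nothing about any one person, but with a prefix that only one group has ever typed, the completion reproduces their private text verbatim.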

As humanity adapts to a digital world, people are finding that their digital communications provide only the illusion of confidentiality, with damaging results when the information leaks out. Real-life examples include a 2016 British trainee doctor strike, where information from a technically secure WhatsApp group leaked to the press.

The title text shows the revolutionaries using the same technique. By typing in "We will arrest the revolution members" they are hoping that the algorithm will suggest the time and date of their planned arrest, since no one other than the authorities would be typing in that phrase. The image tag for the comic itself is wrapped in a broken HTML anchor tag: <a href="[AT THE JULY 28TH MEETING][tab] "Cancel the meeting! Our cover is blown."">...</a>. This link does not work, though, and takes you to the 404 page.

Both examples assume that the revolutionaries and the authorities would be talking about very secret information in the clear on a network accessible to their adversaries. In the real world people engaged in sensitive activities would communicate via code, encryption, or both, or would do so through secure channels. There is still the danger of secret information leaking via non-secret channels, however.

Although the comic title is "Predictive Models", the term predictive modelling usually refers to computer programs that try to predict outcomes from data aggregation, such as reviewing health records to identify people most at risk from certain diseases based on weight, prior injuries, etc., before testing directly for the diseases themselves. This is similar to, but not precisely like, the example in the comic: predictive text uses direct input to predict further input, while predictive modelling uses related input (such as the make and model of a car along with driver acceleration patterns) to predict a different output (such as the likelihood of a crash). Both predictive text and predictive modelling could leak information as the comic suggests, however.

A famous example occurred in World War II. The Germans kept tank production figures a secret, but they gave items like engine blocks sequential serial numbers. The Allies wanted to know exact tank production figures, so they solved the German tank problem by using statistical methods to analyze the distribution of these numbers on captured vehicles. They were able to predict tank production figures extremely accurately, to the point that they predicted 270 tanks in a month when 276 were actually built. Thus the secret information on tank production leaked.
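
The statistical idea behind the German tank problem can be sketched in a few lines. The example below uses the standard textbook estimator, N ≈ m + m/k − 1, where m is the largest serial number observed and k is the number of serial numbers observed. It assumes serial numbers start at 1 and that captured items are a random sample. The function name and the serial numbers are invented for illustration and are not historical data.

```python
def estimate_total(serials):
    """Estimate the total number produced from a sample of serial numbers.

    Uses the estimator m + m/k - 1, where m is the largest serial seen and
    k is the sample size. Assumes serials run 1..N and the sample is drawn
    at random without replacement.
    """
    if not serials:
        raise ValueError("need at least one observed serial number")
    k = len(serials)
    m = max(serials)
    return m + m / k - 1

# Made-up serial numbers from four captured items (not historical data).
observed = [19, 40, 42, 60]
print(estimate_total(observed))
```

Intuitively, the largest serial seen is a lower bound on production, and the average gap between observed serials (roughly m/k) estimates how far beyond that maximum the true total is likely to lie.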

Predictive text, and its potential to leak unintended information, has been parodied on xkcd before in 1068: Swiftkey.


[Single panel with Cueball sitting at a desk typing on a laptop.]
Cueball typing: "Long live the revolution. Our next meeting will be at|" [Predictive text tool suggests in grey text] "the docks at midnight on June 28.[TAB]"
Cueball: Aha, found them!
[Caption below panel]
When you train predictive models on input from your users, it can leak information in unexpected ways.



If you click on the comic, it opens a page with error 404. Looking at the URL, it says "At the July 28th meeting", which I assume is the prediction result to the title text suggesting that they will be 1 month late. 17:13, 28 June 2019 (UTC)

Fixed it, my years of mediawiki knowledge have finally come to use. Iggyvolz (talk)

In the HTML tag for the link (the <a> tag surrounding the comic image) after the link it says "cancel the meeting! our cover is blown" Everlastingwonder (talk) 17:21, 28 June 2019 (UTC)

In the mobile version, you can read «See also: [AT THE JULY 28TH MEETING][tab] "Cancel the meeting! Our cover is blown."» It leads to a 404, like the other examples in the comments here. 17:31, 28 June 2019 (UTC)

This looks a whole lot like Gmail's Smart Compose

Today GMail actually predicted the beginning of my mail correctly. I typed literally zero characters and it already knew how to continue. In the future, we won't even have to upload our brains to a computer, a backup will already be available there automatically. Fabian42 (talk) 21:32, 28 June 2019 (UTC)

Not a backup, a simulation. 04:46, 29 June 2019 (UTC)

If you can't tell the difference, does it matter? 17:04, 1 July 2019 (UTC)

On my Mac the title text only shows "WE WILL ARREST THE REVOLUTION MEMBERS" while on my iPad (where you long press to see title texts) long pressing only shows the link. Weird. Also someone remind me to check the link again on July 28. Herobrine (talk) 13:10, 29 June 2019 (UTC)

On my Ubuntu system, both Firefox and Chrome display "WE WILL ARREST THE REVOLUTION MEMBERS" as the title text and "https://xkcd.com/[AT THE JULY 28TH MEETING][tab]" as the link target, which is also what's in the HTML source. Additionally, the HTML source is malformed, with quotes inside quotes in the href attribute. - Linneris (talk) 14:37, 29 June 2019 (UTC)
Malformed. Precisely! I think there was a glitch while the comic was uploaded, which used the title text as a link in addition to as the title text. It didn't include the last part due to the quotes. It will be either fixed or legitimate, or at least make the href a little nicer. That's right, Jacky720 just signed this (talk | contribs) 21:24, 29 June 2019 (UTC)
Actually... Looking at the comic again (for the first time on my PC), I would like to rethink that. I think this is Randall's method of demonstrating the [tab]; clicking and looking at the URL. [EDIT] Man, the more I think, the weirder it gets. Maybe it's about how sometimes you can find the information on the client side in the code where it should be hidden? I don't know anymore. That's right, Jacky720 just signed this (talk | contribs) 21:27, 29 June 2019 (UTC)
When you look at the source of that 404 page, you can see six HTML comments with the content "a padding to disable MSIE and Chrome friendly error page". This is to prevent MSIE and Chrome from displaying "helpful" proprietary error pages. If you change the link in the slightest, you will also get a 404 page, but without these comments. I assume that either this was a glitch (intended or unintended) and this particular 404 page was modified so that everyone can see that the authors are aware of it, *or* it's a hint pointing to somewhere else. A rabbit hole maybe? I would like the latter to be true, but I haven't found anything.-- 22:42, 29 June 2019 (UTC)
Not for me. I see the same tiny Nginx 404 page with the same HTML source as any other 404 page due to invalid link on xkcd.com. - Linneris (talk) 07:14, 30 June 2019 (UTC)
My computer did that, but then it didn't happen anymore and the title text was complete. 13:09, 22 July 2019 (UTC)

This reminds me of that time where via data analytics on things like shopping habits, Target figured out that a teen girl was pregnant before her father did. Ahiijny (talk) 06:42, 30 June 2019 (UTC)

I tried this on google, and got "we will arrest chamisa" and "the meeting will be in room 27" and "our next meeting will be at 3 p.m. on wednesday". Any more? 19:16, 30 June 2019 (UTC)

I decided to see what a more sophisticated predictive model would do, so I plugged it into Talk to Transformer. The output: "Long live the revolution. Our next meeting will be at 10 a.m. on December 14 at the Cressey Building, 1636 S. Second St. Please invite your friends, family, and coworkers! For those interested in donating to the cause, please contact:" I'm legitimately impressed. Arcorann (talk) 01:03, 1 July 2019 (UTC)

Thinking about predictive text, in combination with the advice on the futility of making people change their passwords frequently, perhaps systems which require people to change their passwords could be more helpful by observing the pattern the user is using, and suggesting what the next password should be. (See "Passwords Evolved: Authentication Guidance for the Modern Era".) 20:05, 1 July 2019 (UTC)

This paragraph was in the explanation, however the cited source gives no information about how the private correspondence was obtained, and no suggestion that the privacy of the communication channel was compromised. (The most obvious way that such information would be obtained is that somebody who was party to the communication made it available.) I moved it here in case somebody has sources to show that it was a breach of security. "As humanity adapts to a digital world, people are finding that their digital communications provide the illusion of confidentiality, with damaging results when the information leaks out. Real-life examples include a 2016 British trainee doctor strike, where a technically-secure WhatsApp group leaked information to the press." 05:18, 3 July 2019 (UTC)

Incomplete tag worth saving for posterity, due to H2G2 reference: Created by a PREDICTIVE MODEL THAT WILL BE FIRST AGAINST THE WALL WHEN THE REVOLUTION COMES. -- 01:56, 12 July 2019 (UTC)

Social media has been used in revolutions

Once revolutions achieve critical mass, they often communicate on more insecure channels. Many of the Arab Spring revolutions spread through Twitter. Broadly speaking, the trade-off between security and contagiousness often causes disagreement among revolutionaries.

For example, the Trotskyist/Stalinist disagreement over "Permanent revolution" (expansionist) vs "Socialism in one country" (security and development of the USSR without spending all available surplus on spreading communism directly).