2169: Predictive Models
Title text: WE WILL ARREST THE REVOLUTION MEMBERS [AT THE JULY 28TH MEETING][tab] "Cancel the meeting! Our cover is blown."
- When the image is clicked the "Not Available" xkcd post opens up: https://xkcd.com/[AT%20THE%20JULY%2028TH%20MEETING][tab]
In the comic, Cueball uses predictive text to uncover a plot against his organization or government. Instead of learning only from his own typing, the system draws on input from all users. By typing an obscure phrase about a revolution and a meeting, he gets the predictive text algorithm to display where and when the next supposedly secret meeting will be held, based on other users' input. This works because hardly anyone other than the revolutionaries would type this phrase, so the only data the algorithm has to predict from is the revolutionaries' actual message about their next meeting. The caption points out that systems which use prior input for prediction in this way can end up leaking information that would otherwise be considered private. (However, this method may produce outdated information. On June 29, 2019, typing "Long live the revolution. Our next meeting will be at" into Google gave the predicted completion "long live the revolution. our next meeting will be at comic con 2018", which would be useless to anyone looking for revolutionaries, because Comic-Con 2018 was already over.)
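The leak described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the `SharedAutocomplete` class, its methods, and the sample messages are all hypothetical, and real predictive-text systems are far more sophisticated. It shows the one property the comic relies on: when a prefix is rare, the suggestion can only come from the few users who typed it.

```python
from collections import Counter


class SharedAutocomplete:
    """Toy completion model trained on messages pooled from ALL users (hypothetical)."""

    def __init__(self):
        self._messages = []

    def train(self, message):
        # Every user's message goes into the same shared store.
        self._messages.append(message)

    def complete(self, prefix):
        # Tally what followed this exact prefix in any stored message.
        continuations = Counter(
            m[len(prefix):]
            for m in self._messages
            if m.startswith(prefix) and len(m) > len(prefix)
        )
        if not continuations:
            return None
        # Suggest the most frequent continuation.
        return continuations.most_common(1)[0][0]


model = SharedAutocomplete()
for text in [
    "See you at lunch",
    "See you at lunch",
    "See you at the gym",
    # The only message that starts with the "revolution" prefix:
    "Long live the revolution. Our next meeting will be at the docks at midnight on June 28",
]:
    model.train(text)

# A common prefix yields a bland, majority suggestion.
common = model.complete("See you at")
# A rare prefix echoes back the single private message that matches it.
rare = model.complete("Long live the revolution. Our next meeting will be at")
print(repr(common))
print(repr(rare))
```

For the common prefix the suggestion reflects many users, so no individual message is exposed. For the rare prefix only one message matches, so the model simply repeats it to whoever types the prefix.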
As humanity adapts to a digital world, people are finding that their digital communications provide the illusion of confidentiality, with damaging results when the information leaks out. Real-life examples include a 2016 British trainee doctor strike, where a technically-secure WhatsApp group leaked information to the press.
The title text shows the revolutionaries using the same technique. By typing in "We will arrest the revolution members" they are hoping that the algorithm will suggest the time and date of their planned arrest, since no one other than the authorities would be typing in that phrase. Pressing the key [tab] to autocomplete that text produces "WE WILL ARREST THE REVOLUTION MEMBERS [AT THE JULY 28TH MEETING]", and the revolutionaries then say "Cancel the meeting! Our cover is blown." The revolutionaries have apparently made the serious mistake of holding secret meetings on regular, predictable dates (such as the 28th day of each month, the last date guaranteed to exist in any month of the Gregorian Calendar), and the authorities have successfully figured this out, either through the predictive-text attack or by other means.
Both examples assume that the revolutionaries and the authorities would discuss highly secret information in the clear on a network accessible to their adversaries. In the real world, people engaged in sensitive activities would communicate in code, with encryption, or both, or would use channels they believe to be secure. Secret information can still leak through non-secret channels, however.
Side-channel attacks use information gained from the implementation of a system to deduce supposedly protected information. A famous example occurred in World War II. The Germans kept tank production figures a secret, but they gave items like engine blocks sequential serial numbers. The Allies wanted to know exact tank production figures, so they solved the German tank problem by using statistical methods to analyze the distribution of these numbers on captured vehicles. They were able to predict tank production figures extremely accurately, to the point they predicted 270 tanks in a month when 276 were actually built. Thus the secret information on tank production leaked.
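The statistical idea can be sketched with the standard textbook estimator for the German tank problem, N ≈ m + m/k − 1, where m is the largest serial number observed and k is the number of serial numbers observed. The sketch assumes serial numbers run from 1 to N without gaps and that the captured items are a random sample; the function name and the sample numbers are illustrative and are not historical data.

```python
def estimate_total(serials):
    """Estimate the total number produced from a sample of observed serial numbers.

    Uses the standard frequentist estimator m + m/k - 1, assuming serials
    are numbered 1..N without gaps and sampled at random without replacement.
    """
    k = len(serials)
    if k == 0:
        raise ValueError("need at least one observed serial number")
    m = max(serials)
    # The largest serial seen, plus the average gap between observations.
    return m + m / k - 1


# Illustrative sample only (not real wartime data).
observed = [19, 40, 42, 60]
estimate = estimate_total(observed)
print(estimate)  # 74.0
```

Intuitively, the largest serial number seen is a lower bound on production, and the average spacing between observed numbers suggests how far beyond that bound the true total is likely to lie.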
Although the comic is titled "Predictive Models", the term predictive modelling usually refers to computer programs that try to predict outcomes from aggregated data, such as reviewing health records to identify the people most at risk from certain diseases based on weight, prior injuries, etc., before testing directly for the diseases themselves. This is similar to, but not quite the same as, the example in the comic: predictive text uses direct input to predict further input, while predictive modelling uses related input (such as the make and model of a car along with the driver's acceleration patterns) to predict a different output (such as the likelihood of a crash). Both predictive text and predictive modelling could leak information as the comic suggests, however.
Predictive text, and its potential to leak unintended information, was previously parodied on xkcd in 1068: Swiftkey.
- [Cueball is sitting in an office chair at a desk typing on a laptop. Above him is the text he writes along with what the predictive text tool suggests, the latter in grey text. The TAB at the end is in a small frame.]
- Cueball typing: Long live the revolution. Our next meeting will be at| the docks at midnight on June 28 [tab]
- Cueball: Aha, found them!
- [Caption below the panel:]
- When you train predictive models on input from your users, it can leak information in unexpected ways.
- Clicking on the comic takes you to this page: https://xkcd.com/[AT%20THE%20JULY%2028TH%20MEETING][tab], which as of this moment only shows "404 Not Found".
- The anchor actually contains invalid HTML <a href=" [AT THE JULY 28TH MEETING][tab] "Cancel the meeting! Our cover is blown."">. This would suggest that Randall didn't intend this behaviour.
- It is also possible that Randall may add what he intends to add at a later date, most likely July 28, the date mentioned in the title text. In this case the page will likely remain this way until then.
- Some browsers show only the first part of the title text, "WE WILL ARREST THE REVOLUTION MEMBERS". For example, Firefox version 66 on Windows does this, and some versions of Firefox and Chrome evidently do likewise on GNU/Linux.