Editing 2169: Predictive Models

Latest revision Your text
Line 8: Line 8:
  
 
==Explanation==
 
{{w|Predictive text}} is a feature on many systems where, as you type, the system automatically suggests likely words or phrases to follow what you have written to that point. For instance, if you type "I'm heading" the system may suggest "home" or "back" as likely words to follow. Predictive systems usually use prior input to generate their predictions, so if you frequently type "Totally amazing!" the system will suggest "amazing!" every time you type "totally", even if you sometimes go on to type "totally true".
 
  
In the comic, [[Cueball]] is using predictive text to uncover a plot against his organization/government, but instead of using only his personal input, the system is using input from ''all'' users. By typing in an obscure phrase related to revolution and a meeting, he gets the predictive text algorithm to display where and when the next supposedly secret meeting will be held, based on other users' input. This works because it is unlikely that anyone other than the revolutionaries would be typing this phrase, so the only data the algorithm has to predict from is the revolutionaries' actual message about their next meeting. The caption of the comic is pointing out that systems which use prior input for predictive purposes in this way can end up leaking information that might otherwise be considered private. (However, this method may produce outdated information. On June 29, 2019, typing in Google "Long live the revolution. Our next meeting will be at" gave the predicted completion "long live the revolution. our next meeting will be at comic con 2018", which would not be useful information to anyone looking for revolutionaries, because Comic-Con 2018 was already over.)
 
  
The title text shows the revolutionaries using the same technique. By typing in "We will arrest the revolution members" they are hoping that the algorithm will suggest the time and date of their planned arrest, since no one other than the authorities would be typing in that phrase. Pressing the [tab] key to autocomplete that text produces "WE WILL ARREST THE REVOLUTION MEMBERS [AT THE JULY 28TH MEETING]", and the revolutionaries then say "Cancel the meeting! Our cover is blown." The revolutionaries have apparently made the serious mistake of holding secret meetings on regular, predictable dates (such as the 28th day of each month, the last date guaranteed to exist in any month of the Gregorian Calendar), and the authorities have successfully figured this out, either through the predictive-text attack or by other means.
+
{{w|Predictive text}} is a feature on many systems where, as you type, the system automatically suggests likely words or phrases to follow what you have written to that point. For instance, if you type "I'm heading" the system may suggest "home" or "back" as likely words to follow. Predictive systems usually use prior input to generate their predictions, so if you frequently type "Totally amazing!" the system will suggest "amazing!" every time you type "totally", even if you sometimes want to type "totally true".
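The "totally amazing" behaviour can be illustrated with a very small word-pair counter. This is only a sketch under simple assumptions: real predictive keyboards are far more elaborate, and the function names <code>train</code> and <code>suggest</code> and the sample history are invented for this illustration.

```python
from collections import Counter, defaultdict

def train(history):
    """Count which word follows each word in the user's past messages."""
    following = defaultdict(Counter)
    for message in history:
        words = message.lower().split()
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1
    return following

def suggest(following, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    counts = following.get(word.lower())
    return counts.most_common(1)[0][0] if counts else None

# Hypothetical typing history: "amazing!" follows "totally" three times, "true" once.
history = ["Totally amazing!", "Totally amazing!", "Totally amazing!", "Totally true"]
model = train(history)
print(suggest(model, "totally"))
```

Because the suggestion is simply the most common follower, the rarer "totally true" never wins, which is the behaviour described above.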
  
Both examples assume that the revolutionaries and the authorities would be talking about very secret information in the clear on a network accessible to their adversaries. In the real world, people engaged in sensitive activities would communicate via code, encryption, or both, or would do so through what they believe to be secure channels. There is still the danger of secret information leaking via non-secret channels, however. {{w|Side-channel attack|Side-channel attacks}} use information gained from the implementation of a system to deduce supposedly protected information. A famous example occurred in World War II. The Germans kept tank production figures a secret, but they gave items like engine blocks sequential serial numbers. The Allies wanted to know exact tank production figures, so they solved the {{w|German tank problem}} by using statistical methods to analyze the distribution of these numbers on captured vehicles. They were able to estimate tank production figures extremely accurately, to the point that they predicted 270 tanks in a month when 276 were actually built. Thus, the secret information on tank production leaked.
+
In the comic, [[Cueball]] is using predictive text in Gmail to uncover a plot against his organization/government, but instead of using only his personal input, the system is using input from ''all'' users. By typing in an obscure phrase related to revolution and a meeting, he gets the predictive text algorithm to display where and when the next supposedly secret meeting will be held, based on other users' input. This works because it is unlikely that anyone other than the revolutionaries would be typing this phrase, so the only data the algorithm has to predict from is the revolutionaries' actual message about their next meeting. The caption of the comic is pointing out that systems which use prior input for predictive purposes in this way can end up leaking information that might otherwise be considered private. (However, this method may produce outdated information. On June 29, 2019, typing in Google "Long live the revolution. Our next meeting will be at" gave the predicted completion "long live the revolution. our next meeting will be at comic con 2018", which would not be useful information to anyone looking for revolutionaries, because Comic-Con 2018 was already over.)
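The leak described above can be sketched with a toy completion function that draws on a corpus shared by all users. Everything here is an assumption for illustration: the <code>complete</code> function, the corpus, and the meeting place are made up, and no real autocomplete service is claimed to work this way.

```python
from collections import Counter

def complete(corpus, prefix):
    """Return the most common continuation of `prefix` among all users' messages."""
    prefix = prefix.lower()
    continuations = Counter(
        message[len(prefix):].strip()
        for message in corpus
        if message.lower().startswith(prefix) and len(message) > len(prefix)
    )
    return continuations.most_common(1)[0][0] if continuations else None

# Hypothetical shared corpus: common phrases have many authors, the rare one has a single author.
corpus = [
    "I'm heading home",
    "I'm heading back",
    "I'm heading home",
    "Long live the revolution. Our next meeting will be at the old mill at noon",
]

print(complete(corpus, "I'm heading"))
print(complete(corpus, "Long live the revolution. Our next meeting will be at"))
```

For the common prefix the suggestion blends many users' habits, but for the obscure prefix the only matching message is the secret one, so the completion reproduces it verbatim.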
  
Some systems require frequent password changes, in an effort to limit danger from a password being discovered. However, people respond by choosing passwords in patterns, so it is easy to predict what subsequent passwords will be, given old ones, thus defeating the purpose of requiring frequent changes. [https://www.troyhunt.com/passwords-evolved-authentication-guidance-for-the-modern-era/ Passwords Evolved: Authentication Guidance for the Modern Era]
+
The title text shows the revolutionaries using the same technique. By typing in "We will arrest the revolution members" they are hoping that the algorithm will suggest the time and date of their planned arrest, since no one other than the authorities would be typing in that phrase. Pressing the [tab] key to autocomplete that text produces "WE WILL ARREST THE REVOLUTION MEMBERS [AT THE JULY 28TH MEETING]", and the revolutionaries then say "Cancel the meeting! Our cover is blown." The revolutionaries have apparently made the serious mistake of holding secret meetings on regular, predictable dates (such as the 28th day of each month, the last date guaranteed to exist in any month of the Gregorian Calendar), and the authorities have successfully figured this out, either through the predictive-text attack or by other means.
  
Although the comic title is "Predictive Models", the term {{w|Predictive modelling}} usually refers to computer programs that try to predict outcomes from data aggregation, such as reviewing health records to identify people most at risk from certain diseases based on weight, prior injuries, etc., before testing directly for the diseases themselves. This is similar to but not precisely like the example in the comic, since predictive text is using direct input to predict further input, while predictive modelling is using related input (such as make and model of a car along with driver acceleration patterns) to predict a different output (such as likelihood of a crash). Both predictive text and predictive modelling could leak information as the comic suggests, however. Predictive text and its potential to leak unintended information have been parodied on xkcd before in [[1068: Swiftkey]].
+
Both examples assume that the revolutionaries and the authorities would be talking about very secret information in the clear on a network accessible to their adversaries. In the real world, people engaged in sensitive activities would communicate via code, encryption, or both, or would do so through what they believe to be secure channels. There is still the danger of secret information leaking via non-secret channels, however.
 +
 
 +
{{w|Side-channel attack|Side-channel attacks}} use information gained from the implementation of a system to deduce supposedly protected information. A famous example occurred in World War II. The Germans kept tank production figures a secret, but they gave items like engine blocks sequential serial numbers. The Allies wanted to know exact tank production figures, so they solved the {{w|German tank problem}} by using statistical methods to analyze the distribution of these numbers on captured vehicles. They were able to estimate tank production figures extremely accurately, to the point that they predicted 270 tanks in a month when 276 were actually built. Thus, the secret information on tank production leaked.
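As a rough illustration of the statistics involved, the textbook estimator for the German tank problem takes the largest observed serial number ''m'' from a sample of ''k'' items and estimates the total as ''m'' + ''m''/''k'' − 1. The sketch below assumes serial numbers run from 1 upward and are sampled uniformly without replacement; the sample values are made up, and this is not claimed to be the exact procedure the Allied analysts followed.

```python
def estimate_total(serials):
    """Estimate how many items exist from a sample of observed serial numbers.

    Assumes serials are numbered 1..N and sampled uniformly without replacement.
    Uses the standard estimate m + m/k - 1 (m = largest serial, k = sample size).
    """
    k = len(serials)
    m = max(serials)
    return m + m / k - 1

# Hypothetical serial numbers read off four captured vehicles.
captured = [19, 40, 42, 60]
print(estimate_total(captured))  # 60 + 60/4 - 1 = 74.0
```

The idea is that the largest serial seen is probably a little short of the true total, so it is nudged upward by the average gap between observed serials.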
 +
 
 +
Some systems require frequent password changes, in an effort to limit danger from a password being discovered. However, people respond by choosing passwords in patterns, so it is easy to predict what subsequent passwords will be, given old ones, thus defeating the purpose of requiring frequent changes. [https://www.troyhunt.com/passwords-evolved-authentication-guidance-for-the-modern-era/ Passwords Evolved: Authentication Guidance for the Modern Era]
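One such pattern, bumping a number at the end of the old password, is easy to model. The sketch below is purely illustrative: <code>guess_next</code> and the sample passwords are hypothetical, and real users follow many other patterns as well.

```python
import re

def guess_next(password):
    """Guess a successor password by incrementing any trailing number.

    If there is no trailing number, guess that the user simply appends "1".
    """
    match = re.search(r"(\d+)$", password)
    if not match:
        return password + "1"
    digits = match.group(1)
    # Keep the same width where possible, so "07" becomes "08" rather than "8".
    bumped = str(int(digits) + 1).zfill(len(digits))
    return password[:match.start()] + bumped

print(guess_next("Hunter07"))  # Hunter08
print(guess_next("Spring9"))   # Spring10
```

An attacker who learns one old password can therefore try a handful of such guesses and stand a good chance of hitting the current one.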
 +
 
 +
Although the comic title is "Predictive Models", the term {{w|Predictive modelling}} usually refers to computer programs that try to predict outcomes from data aggregation, such as reviewing health records to identify people most at risk from certain diseases based on weight, prior injuries, etc., before testing directly for the diseases themselves. This is similar to but not precisely like the example in the comic, since predictive text is using direct input to predict further input, while predictive modelling is using related input (such as make and model of a car along with driver acceleration patterns) to predict a different output (such as likelihood of a crash). Both predictive text and predictive modelling could leak information as the comic suggests, however.
 +
 
 +
Predictive text and its potential to leak unintended information have been parodied on xkcd before in [[1068: Swiftkey]].
  
 
==Transcript==
 
Line 36: Line 42:
 
{{comic discussion}}
 
  
[[Category:Artificial Intelligence]]
 
 
[[Category:Comics featuring Cueball]]
 
