1683: Digital Data

The mouseover text is seemingly addressed to a reader in the future who will only be able to access xkcd through a {{w|digital archive}}. Digital information might not degrade with time, but it can't be displayed properly without knowledge of its encoding. As new encodings and file formats are developed and old ones abandoned, the webpage format of the comic might no longer be supported, and future users would need special archives to view content from today's world. The mouseover text contains what look like {{w|mojibake|garbage characters}}, which typically result from data being interpreted according to a {{w|character encoding}} different from the one used to encode it. In this case, the characters are the result of encoding the string [https://ftfy.vercel.app/?s=%C3%A2%E2%82%AC%C5%93If+you+can+read+this%2C+congratulations%C3%A2%E2%82%AC%E2%80%9Dthe+archive+you%C3%A2%E2%82%AC%E2%84%A2re+using+still+knows+about+the+mouseover+text%C3%A2%E2%82%AC%C2%9D%21 <tt>“If you can read this, congratulations—the archive you’re using still knows about the mouseover text”!</tt>] using {{w|UTF-8}} (which represents non-{{w|ASCII}} {{w|Unicode}} characters as multibyte sequences) and then interpreting the resulting bytes as the still commonly used {{w|Windows-1252}} encoding (which uses only one byte per character and assigns the non-ASCII byte values to a limited selection of extra letters and symbols such as "â" or "€"). Each curly quote, apostrophe, and dash in the original therefore turns into three separate characters. This shows that degradation of digital data through conversions isn't restricted to images. Furthermore, as screen navigation moves away from the mouse toward touch, voice recognition, and modes still to be implemented, mouseover text will itself become archaic.
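
The round trip described above can be reproduced with a short Python sketch. The helper names <tt>decode_cp1252_lenient</tt> and <tt>encode_cp1252_lenient</tt> are made up for this illustration and are not part of any library. One assumption needs stating: Python's built-in <tt>cp1252</tt> codec leaves five byte values (including 0x9D, which occurs in the closing quote) undefined, so the helpers fall back to the control character with the same number, which is how web browsers treat those bytes. This is only one plausible way the garbling could have been produced, not a record of how the comic's text was actually made.

```python
# Minimal sketch: UTF-8 text misread as Windows-1252, and the reverse repair.
# The two helper functions are hypothetical names invented for this example.

def decode_cp1252_lenient(data: bytes) -> str:
    """Decode bytes as Windows-1252; bytes the codec leaves undefined
    (0x81, 0x8D, 0x8F, 0x90, 0x9D) become the control character of the
    same value, as browsers do."""
    chars = []
    for b in data:
        try:
            chars.append(bytes([b]).decode("cp1252"))
        except UnicodeDecodeError:
            chars.append(chr(b))
    return "".join(chars)


def encode_cp1252_lenient(text: str) -> bytes:
    """Inverse of decode_cp1252_lenient: characters that Windows-1252
    cannot encode are written as their Latin-1 byte instead."""
    out = bytearray()
    for ch in text:
        try:
            out += ch.encode("cp1252")
        except UnicodeEncodeError:
            out += ch.encode("latin-1")
    return bytes(out)


# The intended string, with curly quotes, dash and apostrophe written as
# escapes so the source file's own encoding cannot interfere.
original = (
    "\u201cIf you can read this, congratulations\u2014the archive "
    "you\u2019re using still knows about the mouseover text\u201d!"
)

# Step 1: encode correctly as UTF-8 (each special character -> 3 bytes).
utf8_bytes = original.encode("utf-8")

# Step 2: misinterpret those bytes as Windows-1252 (1 byte -> 1 character).
garbled = decode_cp1252_lenient(utf8_bytes)

# Step 3: undo the damage by reversing the two steps.
recovered = encode_cp1252_lenient(garbled).decode("utf-8")

print(garbled)
print(recovered)
```

The repair in step 3 only works because no information was lost: every byte was mapped to exactly one character. Had the wrong decoder replaced unknown bytes with a placeholder such as "?", the original text could not be reconstructed.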
 
==Explanation of the Joke==
The joke in this strip is that digital information degrades in ways that differ from, but are just as significant as, the ways real-world objects degrade. Digital objects can be reprocessed and lose fidelity, and, just as importantly, as technology evolves old information may no longer be displayed the way it was originally intended. For an example, browse old webpages on the [http://archive.org Wayback Machine]. The title text implicitly refers to the problem of badly encoded characters, which arises, for instance, when content is wrongly labeled as UTF-8 or extended ASCII because the encoding was only implicit when the content was created. Material we create digitally today will almost certainly suffer the same fate, and people looking back will not understand why their technology cannot automatically determine the correct encoding. There is also the irony that digital content depends on technology: in a future where we lose that technology, or can no longer produce electricity, we would have no way of retrieving information from a computer hard disk, and all of it would be lost forever.

So who feels stupid for printing a copy of the internet now? We are actually saving it for future generations.
  
 
==Transcript==
 
