2143: Disk Usage
 
| title    = Disk Usage
| image    = disk_usage.png
| titletext = Menu -> Manage -> [Optimize space usage, Encrypt disk usage report, Convert photos to text-only, Delete temporary files, Delete permanent files, Delete all files currently in use, Optimize menu options, Download cloud, Optimize cloud, Upload unused space to cloud]
}}

==Explanation==
Many personal computers provide a way to obtain a graphical breakdown of how their storage space is being used, most commonly by representing the filesystem as a pie chart in which each slice represents the proportion of the total storage space being taken up by a particular item.
  
In this comic, Randall has illustrated the usage of his hard disk drive in just such a way, although as is common for him, the items in his hard drive start off seemingly normal and become increasingly strange:
  
{| class="wikitable"
! Item !! Explanation
 
|-
 
|width=20%|'''Photos'''
 
|width=80%|Digital photographs are a common item to be stored on a hard disk; many people take lots of photographs with their smartphone or a camera, and will commonly transfer them to a disk drive for safekeeping, editing and/or organizing. With the high resolution of modern cameras and the ease of taking photos, it is common for photo collections to consume a significant amount of disk space.
 
|-
 
|'''Good photos'''
 
|On the flipside, the ease of taking photographs means that it is very easy to take ''bad'' photographs, particularly as most people are not experienced at photography. The pie chart is rather bluntly indicating that of the many photographs Randall has taken, only a vanishingly small fraction of them are actually good.
 
|-
 
|'''Documents'''
 
|On a file system, "Documents" is generally used as a catch-all term for the user's personal files.
 
|-
 
|'''Everything you've streamed since 2017'''
 
|Streaming refers to accessing audio or video content on the Internet without downloading the entire media file first; it is instead played while it's being retrieved. An example of streaming is watching a YouTube video. Assuming a weekly 2-hour live stream (at 4 Mbit/s) between 2017-01-01 and 2019-04-29, these recordings would be roughly 435 GB in size. If these files take up 6% of all the used disk space, the full amount of used space would be roughly 7 TB, which is plausible, given the [https://www.anandtech.com/show/10106/western-digital-introduces-its-consumer-helium-drives rise of 10 TB hard disks in 2016].

It might also be referring to temporary media files that were stored on the disk while they were being "streamed" for viewing or listening from the Internet and never deleted when done.
|-
 
|'''A single five-year-old PowerPoint presentation'''
 
|Almost a tenth of the entire disk space is taken up by a single file, a presentation made five years ago in {{w|Microsoft PowerPoint}}. It's unclear why Randall has kept this file or why it is so huge - possibly it is important to him for some reason, or perhaps he can't bear the thought of throwing information away, regardless of how much storage it requires.
 
  
While it's possible that the file may genuinely be long or detailed enough to require so much space, it could also be that the file is bloated due to PowerPoint's strategy of [http://www.pptfaq.com/FAQ00062_Why_are_my_PowerPoint_files_so_big-_What_can_I_do_about_it-.htm converting compressed graphics to full-resolution bitmaps for historical cross-platform compatibility]. This has been known to result in PowerPoint decks that are much larger than the sum of their component files.
|-
 
|'''"System"'''
 
|These are files related to the computer's {{w|Operating System|operating system}}. While these files will generally show up on a disk usage analysis, it is generally recommended to leave them alone, as they may be critical to the computer's operation. A well-known trolling tactic involves tricking unsuspecting users into deleting their critical system files (e.g. the "System32" folder on Windows), which renders the operating system unusable. The quotation marks around the label may highlight how nebulous the term is: the user typically has no idea what the system files actually are.
 
|-
 
|'''Unused'''
 
|A computer storage corollary of {{w|Parkinson's law}} says that data expands to fill the space available for storage. As such, this sliver representing the unused portion of the storage device will always be tiny.
 
|-
 
|'''"Cache"'''
 
|The operating system and other programs often keep copies of data they've used or downloaded in case they need to use that data again; such data is usually stored in cache files.  Often these can be deleted without too much ill effect, but some programs have different ways of deleting their own cache files.
 
|-
 
|'''"Other"'''
 
|People attempting to organize their files will often end up creating a directory called "Other" or "Misc" for any files that they could not categorize. On Randall's hard disk, this "Other" directory takes up a significant amount of disk space, indicating that either his categorization system isn't working very well, or he doesn't have the discipline to properly maintain his file organization. Alternatively, this could be a category defined by the usage report, which would include anything it can't categorize - often a strangely large portion of the files.
 
 
 
Another possible explanation is that folder names like "Other", "Cache" and "System" refer to storing porn while trying to hide this fact by using innocuous folder names, hence the quotation marks.
 
|-
 
|'''Why are there two full backups of my phone from 2015 deep in a settings folder?'''
 
|A settings folder is a directory that usually contains configuration data for a program, but could also potentially contain other data relevant to that program's operation. A phone backup program might store a backup of a phone to this location as part of its operation.
 
|}
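
The streaming estimate in the table above can be checked with a little arithmetic. The following is only a rough sketch: it assumes the figures quoted in the table (one 2-hour stream per week at 4 Mbit/s, and a slice of about 6%), and the function and constant names are made up for illustration.

```python
from datetime import date

# Assumptions taken from the table entry: one 2-hour stream per week at 4 Mbit/s,
# and a pie slice of roughly 6% (estimated from the image).
BITRATE_MBPS = 4
HOURS_PER_STREAM = 2
SLICE_FRACTION = 0.06


def stream_size_gb(hours, mbps):
    """Size of one recording in decimal gigabytes (1 GB = 10**9 bytes)."""
    bits = hours * 3600 * mbps * 1_000_000
    return bits / 8 / 1_000_000_000


def total_streamed_gb(start, end):
    """Total size of weekly recordings between two dates (fractional weeks allowed)."""
    weeks = (end - start).days / 7
    return weeks * stream_size_gb(HOURS_PER_STREAM, BITRATE_MBPS)


streamed = total_streamed_gb(date(2017, 1, 1), date(2019, 4, 29))
# If the slice is 6% of the used space, scale up to get the total used space.
used_disk_tb = streamed / SLICE_FRACTION / 1000
print(f"streamed: {streamed:.0f} GB, implied used space: {used_disk_tb:.1f} TB")
```

Under these assumptions each stream is 3.6 GB, the recordings total a little over 400 GB, and the implied used space is about 7 TB. Changing the bitrate or the estimated slice size shifts the result proportionally.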
 
 
 
Alarmingly, the "Unused" portion of the pie chart is extremely small, which means the disk is nearly full with very little remaining capacity. Users don't usually worry about what is using space on their computer disk until they get an alert about the disk running out of space - this is likely when a user would resort to viewing this type of graph to figure out what they can delete to free up disk space.
 
  
 
The title text references the management UI of a hypothetical disk cleaning utility. The following options are mentioned in its menu:
 
{| class="wikitable"
! Option !! Explanation
|-
|width=20%|'''Optimize space usage'''
|width=80%|A common nondescript phrase often found in such tools, this option would presumably perform actions to increase the amount of available space. One such way could be to use {{w|data deduplication|deduplication}} to delete files that contain duplicated or redundant data.
|-
|'''Encrypt disk usage report'''
|{{w|Disk encryption}} is a common security measure to prevent unwanted parties from reading the contents of a hard drive unless they know the passphrase. This option, however, would encrypt the disk usage report itself, which is not very helpful as the report simply contains output that the user requested. That said, the report can reveal potentially personal things a person has on their hard drive, so this may suggest that the unusual disk usage is embarrassing enough that the user wants to encrypt the report to prevent other people from reading it.
|-
|'''Convert photos to text-only'''
|Text files are typically much smaller than images, as a typical image requires a lot more information to represent it than the usual use case for a plain text document. Therefore, on the surface this option seems like it could be a potential disk space optimization. However, there is no general way to convert photographs to text, nor is it clear what this would mean, nor would it likely be desirable to do so. The most space-efficient image-to-text conversion would be to replace the photo file with a text file containing a short description of the photo, for example using an AI algorithm like [https://www.captionbot.ai/ CaptionBot]. However, most people would consider the loss of visual information to be unacceptable. An alternative would be to convert the photos into {{w|ASCII art}}, by converting regularly sampled blocks of pixels to ASCII characters that closely approximate the general shape and shade of those pixels. This would result in a low-fidelity impression of the photo, which would indeed significantly reduce the file size, but would still likely be considered an unacceptable degradation of the photo. Another possibility is to use {{w|optical character recognition}} (OCR) to automatically transcribe the text in the image; however, this only works on images containing text and can produce garbage output when applied to non-textual images, which most photographs are. Finally, it could be that the tool turns image files into text files by changing the {{w|filename extension}} to .txt, but this would not save any space, and would only make the files more difficult to open as the operating system might fail to recognize them as images.
|-
|'''Delete temporary files'''
|This is a real option. Temporary files usually contain ephemeral data used by programs as part of their operation. For example, cache files are usually temporary: they contain data which is being kept locally to speed up access and avoid the need to refetch or recalculate data. Deleting this data is not a problem, as it can simply be reacquired if needed. Temporary files are often not deleted automatically, so deleting them can save a significant amount of disk space.
|-
|'''Delete permanent files'''
|Playing on the previous option, a "permanent file" is a made-up term, as no files are permanent (they can always be deleted). However, if we interpret a permanent file to mean the opposite of a temporary file, this would refer to the user's documents, pictures, etc., plus all their operating system files. You would not want to delete these.
|-
|'''Delete all files currently in use'''
 
|Some operating systems, notably Windows, will typically lock files while a program is actively accessing them, which prevents other programs from modifying or deleting the file until it is released. This prevents clashes that can occur when two processes are modifying the same data. If you attempt to delete a file locked in this way, you'll usually be warned that the file is currently in use, and likely be prevented from doing so. Unix-like systems such as Linux generally do allow an open file's name to be removed, but the data itself persists until the last program using it closes the file. This option, however, apparently gives you the specific power to delete ''only'' the files that are in use, which would most definitely result in data loss or program crashes, including perhaps even the program doing the deleting, making it effectively single-use. Deleting all open files would be catastrophic, especially if it included system utilities and the kernel. If this hypothetical disk usage program is capable of deleting all files in use anywhere on the planet, it would be considerably worse{{citation needed}} (and given some of the options, that possibility can't be ruled out).
 
|-
 
|'''Optimize menu options'''
 
|This is a play on the first entry, except this time it optimizes the menu options themselves. It is unclear what exactly this would entail. One way to optimize a menu could be to put the most useful options at the top to reduce the amount of time needed to find them.
 
|-
 
|'''Download cloud'''
 
|Cloud refers to {{w|cloud storage}}, which is storage space on remote machines that can be requisitioned on demand. Cloud providers usually have a huge amount of available storage space in dedicated datacenters to meet the needs of their clients; thus, the entire cloud would be many orders of magnitude too large to fit on a typical desktop computer, let alone to download. That said, in [[908: The Cloud]], the cloud is depicted as (ultimately) running on a single desktop-sized server in [[Black Hat]]'s house, by making heavy use of caching.
 
|-
 
|'''Optimize cloud'''
 
|Again, it is not clear how an entire cloud storage system would be optimized, and in any case, it would not be the job of a simple disk space usage utility to optimize a cloud provider's data storage for them. Alternatively, perhaps this option is how the aforementioned Black Hat is able to run the cloud on his desktop server.
 
|-
 
|'''Upload unused space to cloud'''
 
|This option is nonsensical, as you cannot upload disk space; you can only upload the data contained within that space. If it were somehow possible to upload disk space itself (and doing so simultaneously removed the space from your disk), this would result in your disk having less space available, which is the opposite of what a disk cleaner utility is supposed to do. If, on the other hand, the option uploads the data contained within unused disk space, this is possible but problematic for a different reason: unused disk space often contains actual data that was previously deleted. This is because deletion typically doesn't erase data from existence; it simply frees the space used by the data, and removes any reference to the data from the file system. The data itself is, in most cases, still there, and will remain there until something else claims the disk space. Data recovery tools take advantage of this to "undelete" data by recovering it from the unused space. If you were to upload your unused space to the cloud, it may contain information that you wanted to remain deleted.
 
|}
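
The ASCII art approach mentioned under "Convert photos to text-only" can be illustrated with a small sketch. This is not how any real tool necessarily works: it assumes the photo has already been reduced to a grid of grayscale values (0-255), and the character ramp, block size and function name are arbitrary choices made for the example.

```python
# Characters ordered from darkest to lightest; this ramp is an arbitrary choice.
RAMP = "@#*+-:."


def to_ascii(pixels, block=2):
    """Convert a 2D list of grayscale values (0-255) to ASCII art.

    Each block x block group of pixels is averaged and mapped to one character.
    """
    height, width = len(pixels), len(pixels[0])
    rows = []
    for y in range(0, height, block):
        line = []
        for x in range(0, width, block):
            # Collect the pixels of this block, clipping at the image edges.
            cells = [pixels[j][i]
                     for j in range(y, min(y + block, height))
                     for i in range(x, min(x + block, width))]
            mean = sum(cells) / len(cells)
            # Scale the mean brightness to an index into the character ramp.
            index = min(int(mean / 256 * len(RAMP)), len(RAMP) - 1)
            line.append(RAMP[index])
        rows.append("".join(line))
    return "\n".join(rows)


# A tiny synthetic 4x8 "photo": dark on the left, bright on the right.
photo = [[0, 0, 90, 90, 180, 180, 255, 255] for _ in range(4)]
art = to_ascii(photo)
print(art)
```

Here 32 pixel values collapse into two lines of four characters each, which shows both the space saving and how much visual information is thrown away.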
 
  
 
==Transcript==
 
 
:[This comic shows a pie chart with 10 slices, each with a label and a line pointing to these ten different sized slices. There is a caption above the chart:]
 
:Disk Space Usage Report  
  
:[The labels on each slice are given in anti-clockwise order starting from the 12 o'clock position. The percentages are estimated from the image and are noted in the square brackets before the transcript:]
:[18%]: Photos
:[1%]: Good Photos
 
:[3%]: Documents
 
:[6%]: Everything you've streamed since 2017
 
:[9%]: A single five-year-old PowerPoint presentation
 
:[21%]: "System"
 
:[2%]: Unused
 
:[9%]: "Cache"
 
:[23%]: "Other"
 
:[8%]: Why are there two full backups of my phone from 2015 deep in a settings folder?
 
  
 
{{comic discussion}}
 
 
[[Category:Computers]]
 
 
[[Category:Pie charts]]
 