2143: Disk Usage

Explain xkcd: It's 'cause you're dumb.
Title text: Menu -> Manage -> [Optimize space usage, Encrypt disk usage report, Convert photos to text-only, Delete temporary files, Delete permanent files, Delete all files currently in use, Optimize menu options, Download cloud, Optimize cloud, Upload unused space to cloud]

Explanation

Many personal computers provide a way to obtain a graphical breakdown of how their storage space is being used, most commonly by representing the filesystem as a pie chart in which each slice represents the proportion of the total storage space being taken up by a particular item.
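
As a rough illustration of how such a breakdown could be computed, the following minimal sketch adds up file sizes under each top-level folder and converts the totals to percentages. The function names and the demo folder names are made up for this example, and it measures apparent file sizes rather than the space actually allocated on disk, so real tools may report different numbers.

```python
import os
import tempfile


def usage_by_top_level(root):
    """Return {top-level entry name: total bytes} for everything under root."""
    totals = {}
    for entry in os.scandir(root):
        if entry.is_dir(follow_symlinks=False):
            size = 0
            # Sum the apparent size of every file below this folder.
            for dirpath, _dirnames, filenames in os.walk(entry.path):
                for name in filenames:
                    size += os.path.getsize(os.path.join(dirpath, name))
        else:
            size = entry.stat(follow_symlinks=False).st_size
        totals[entry.name] = size
    return totals


def as_percentages(totals):
    """Convert byte totals to each entry's share of the whole, in percent."""
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {name: 0.0 for name in totals}
    return {name: 100 * size / grand_total for name, size in totals.items()}


# Demonstration on a throwaway directory tree (hypothetical folder names).
with tempfile.TemporaryDirectory() as root:
    for folder, size in [("Photos", 750), ("Documents", 250)]:
        os.makedirs(os.path.join(root, folder))
        with open(os.path.join(root, folder, "data.bin"), "wb") as handle:
            handle.write(b"\0" * size)
    shares = as_percentages(usage_by_top_level(root))

for name, share in sorted(shares.items()):
    print(f"{name}: {share:.0f}%")
```

Each resulting percentage corresponds to one slice of the pie chart.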

In this comic, Randall has illustrated the usage of his hard disk drive in just such a way, although as is common for him, the items in his hard drive start off seemingly normal and become increasingly strange:

Item Explanation
Photos Digital photographs are a common item to be stored on a hard disk; many people take lots of photographs with their smartphone or a camera, and will commonly transfer them to a disk drive for safekeeping, editing and/or organizing. With the high resolution of modern cameras and the ease of taking photos, it is common for photo collections to consume a significant amount of disk space.
Good photos On the flipside, the ease of taking photographs means that it is very easy to take bad photographs, particularly as most people are not experienced at photography. The pie chart is rather bluntly indicating that of the many photographs Randall has taken, only a vanishingly small fraction of them are actually good.
Documents On a file system, "Documents" is generally used as a catch-all term for the user's personal files.
Everything you've streamed since 2017 Streaming refers to accessing audio or video content on the Internet without downloading the entire media file first; the content is instead played while it is being retrieved. Watching a YouTube video is one example. Assuming one weekly two-hour live stream at 4 Mbps between 2017-01-01 and 2019-04-29, these recordings would come to roughly 425GB. If these files take up 6% of all the used disk space, the total used space would be roughly 7TB, which is plausible, given the rise of 10TB hard disks in 2016.

It might also be referring to temporary media files that were stored on the disk while it was being "streamed" for viewing or listening from the Internet and never deleted when done.

A single five-year-old PowerPoint presentation Almost a tenth of the entire disk space is taken up by a single file, a presentation made five years ago in Microsoft PowerPoint. It's unclear why Randall has kept this file or why it is so huge. Possibly it is important to him for some reason, or perhaps he can't bear the thought of throwing information away, regardless of how much storage it requires.

While it's possible that the file may genuinely be long or detailed enough to require so much space, it could also be that the file is bloated due to PowerPoint's strategy of converting compressed graphics to full-resolution bitmaps for historical cross-platform compatibility. This has been known to result in PowerPoint decks that are much larger than the sum of their component files.

"System" This would be files related to the computer's operating system. While these files will generally show up on a disk usage analysis, it is generally recommended to leave them alone, as they may be critical to the computer's operation. A well-known trolling tactic involves tricking unsuspecting users into deleting their critical system files (e.g. the "System32" folder on Windows), which renders the operating system unusable.
Unused A corollary of Parkinson's law for computer storage says that data expands to fill the space available for storage. As such, this sliver representing the unused portion of the storage device will always be tiny.
"Cache" The operating system and other programs often keep copies of data they've used or downloaded in case they need to use that data again; such data is usually stored in cache files. Often these can be deleted without too much ill effect, but some programs have different ways of deleting their own cache files.
"Other" People attempting to organize their files will often end up creating a directory called "Other" or "Misc" for any files that they could not categorize. On Randall's hard disk, this "Other" directory takes up a significant amount of disk space, indicating that either his categorization system isn't working very well, or he doesn't have the discipline to properly maintain his file organization. Alternatively, this could be a category defined by the usage report, which would include anything it can't categorize - often a strangely large portion of the files.

Another possible explanation is that folder names like "Other", "Cache" and "System" refer to storing porn while trying to hide this fact by using innocuous folder names, hence the quotation marks.

Why are there two full backups of my phone from 2015 deep in a settings folder? A settings folder is a directory that usually contains configuration data for a program, but could also potentially contain other data relevant to that program's operation. A phone backup program might store a backup of a phone to this location as part of its operation.
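
The back-of-the-envelope estimate for the "Everything you've streamed since 2017" slice can be reproduced with a few lines of arithmetic. This is only a sketch under the assumptions stated in that entry (one two-hour stream per week at 4 Mbps, a 6% slice); counting only full weeks and using decimal gigabytes are additional assumptions made here. With these choices the result comes out at about 436 GB and 7.3 TB, slightly above the 425GB quoted above but in the same range.

```python
from datetime import date

# Assumptions taken from the estimate above.
START = date(2017, 1, 1)
END = date(2019, 4, 29)
HOURS_PER_STREAM = 2
BITRATE_BITS_PER_S = 4_000_000   # 4 Mbps
SLICE_FRACTION = 0.06            # share of used space, estimated from the chart


def streamed_gigabytes(start, end):
    """Total size, in decimal GB, of one weekly stream between two dates."""
    full_weeks = (end - start).days // 7          # assumption: full weeks only
    bytes_per_stream = HOURS_PER_STREAM * 3600 * BITRATE_BITS_PER_S / 8
    return full_weeks * bytes_per_stream / 1e9


def implied_used_terabytes(slice_gb, fraction):
    """Total used space, in decimal TB, if slice_gb is the given fraction of it."""
    return slice_gb / fraction / 1000


streamed = streamed_gigabytes(START, END)
used = implied_used_terabytes(streamed, SLICE_FRACTION)
print(f"streamed: {streamed:.0f} GB, implied used space: {used:.1f} TB")
```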

Alarmingly, the "Unused" portion of the pie chart is extremely small, which means the disk is nearly full with very little remaining capacity. Users don't usually worry about what is using space on their computer disk until they get an alert about the disk running out of space - this is likely when a user would resort to viewing this type of graph to figure out what they can delete to free up disk space.
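
One thing a user hunting for space might look for is exact duplicates, such as the two phone backups in the chart. The following is a minimal sketch of that idea, grouping files by a SHA-256 hash of their contents. The function name and the demo file names are hypothetical, and the sketch reads each file fully into memory, which a real tool would avoid for large files.

```python
import hashlib
import os
import tempfile
from collections import defaultdict


def find_duplicates(root):
    """Group files under root by content hash; return groups with more than one file."""
    by_digest = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            by_digest[digest].append(path)
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]


# Demonstration on a throwaway directory with two identical "backups".
with tempfile.TemporaryDirectory() as root:
    files = [("backup1.bin", b"phone"), ("backup2.bin", b"phone"), ("notes.txt", b"hi")]
    for name, data in files:
        with open(os.path.join(root, name), "wb") as handle:
            handle.write(data)
    duplicates = find_duplicates(root)

print([[os.path.basename(p) for p in group] for group in duplicates])
```

Only the two identical files are reported as a group; the third file has different contents and is left out.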

The title text references the management UI of a hypothetical disk cleaning utility. The following options are mentioned in its menu:

Option Explanation
Optimize space usage A common nondescript phrase often found in such tools, this option would presumably perform actions to increase the amount of available space. One such way could be to use deduplication to delete files that contain duplicated or redundant data.
Encrypt disk usage report Disk encryption is a common security measure to prevent unwanted parties from reading the contents of a hard drive unless they know the passphrase. This option, however, would encrypt only the disk usage report itself, which at first seems unhelpful, as the report simply contains output that the user requested. Then again, the report can reveal potentially personal things stored on the hard drive, so the option may suggest that the disk usage is embarrassing enough that the user wants to keep other people from reading it.
Convert photos to text-only Text files are typically much smaller than images, as a typical image requires a lot more information to represent it than the usual use case for a plain text document. Therefore, on the surface this option seems like it could be a potential disk space optimization. However, there is no general way to convert photographs to text, nor is it clear what this would mean, nor would it likely be desirable to do so. The most space-efficient image-to-text conversion would be to replace the photo file with a text file containing a short description of the photo, for example using an AI algorithm like CaptionBot. However, most people would consider the loss of visual information to be unacceptable. An alternative would be to convert the photos into ASCII art, by converting regularly sampled blocks of pixels to ASCII characters that closely approximate the general shape and shade of those pixels. This would result in a low-fidelity impression of the photo, which would indeed significantly reduce the file size, but would still likely be considered an unacceptable degradation of the photo. Another possibility is to use optical character recognition (OCR) to automatically transcribe the text in the image; however, this only works on images containing text and can produce garbage output when applied to non-textual images, which most photographs are. Finally, it could be that the tool turns image files into text files by changing the filename extension to .txt, but this would not save any space, and would only make the files more difficult to open as the operating system might fail to recognize them as images.
Delete temporary files This is a real option. Temporary files usually contain ephemeral data used by programs as part of their operation. For example, cache files are usually temporary: they contain data which is being kept locally to speed up access and avoid the need to refetch or recalculate it. Deleting this data is therefore not a problem, as it can simply be reacquired if needed. Temporary files are often not deleted automatically, so deleting them can save a significant amount of disk space.
Delete permanent files Playing on the previous option, a "permanent file" is a made-up term, as no files are permanent (they can always be deleted). However, if we interpret a permanent file to mean the opposite of a temporary file, this would refer to the user's documents, pictures, etc., plus all their operating system files. You would not want to delete these.
Delete all files currently in use Operating systems will often lock files while a program is actively accessing them, which prevents other programs from modifying the file until it is unlocked. This prevents clashes that can occur when two processes are modifying the same data. If you attempt to delete a file locked in this way, you'll usually be warned that the file is currently in use, and likely be prevented from doing so. This option, however, apparently gives you the specific power to delete only the files that are in use, which would most definitely result in data loss or program crashes, including perhaps even the program doing the deleting, making it effectively single-use. Windows generally refuses to delete a file that another program has open, while Unix-like systems such as Linux usually allow it but keep the file's data on disk until the last program using it closes it, so no space is freed right away. Deleting all open files would be catastrophic, especially if it included system utilities and the kernel. If this hypothetical disk usage program is capable of deleting all files in use anywhere on the planet, it would be considerably worse[citation needed] (and given some of the options, that possibility can't be ruled out).
Optimize menu options This is a play on the first entry, except this time it optimizes the menu options themselves. It is unclear what exactly this would entail. One way to optimize a menu could be to put the most useful options at the top to reduce the amount of time needed to find them.
Download cloud Cloud refers to cloud storage, which is storage space on remote machines that can be requisitioned on demand. Cloud providers usually have a huge amount of available storage space in dedicated datacenters to meet the needs of their clients; thus, the entire cloud would be many orders of magnitude too large to fit on a typical desktop computer. That said, in 908: The Cloud, the cloud is depicted as (ultimately) running on a single desktop-sized server in Black Hat's house, by making heavy use of caching.
Optimize cloud Again, it is not clear how an entire cloud storage system would be optimized, and in any case, it would not be the job of a simple disk space usage utility to optimize a cloud provider's data storage for them. Alternatively, perhaps this option is how the aforementioned Black Hat is able to run the cloud on his desktop server.
Upload unused space to cloud This option is nonsensical, as you cannot upload disk space; you can only upload the data contained within that space. If it were somehow possible to upload disk space itself (and that doing so simultaneously removed the space from your disk), this would result in your disk having less space available, which is the opposite of what a disk cleaner utility is supposed to do. If, on the other hand, the option uploads the data contained within unused disk space, this is possible but problematic for a different reason: unused disk space often contains actual data that was previously deleted. This is because deletion typically doesn't erase data from existence; it simply frees the space used by the data, and removes any reference to the data from the file system. The data itself is, in most cases, still there, and will remain there until something else claims the disk space. Data recovery tools take advantage of this to "undelete" data by recovering it from the unused space. If you were to upload your unused space to the cloud, it may contain information that you wanted to remain deleted.
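
The ASCII-art reading of "Convert photos to text-only" described above can be sketched as follows: each block of pixels is averaged and replaced by one character of roughly matching shade. This is a toy illustration only; the character ramp, the function name and the tiny hand-made "photo" (a grid of grayscale values from 0 to 255) are assumptions made for the example, not part of any real tool.

```python
# Characters ordered from darkest to lightest (an arbitrary ramp chosen here).
RAMP = "@%#*+=-:. "


def to_ascii(pixels, block_w=2, block_h=2):
    """Convert a 2D list of grayscale values (0-255) into lines of ASCII art.

    Each block_w x block_h block of pixels is averaged and mapped to one character.
    """
    lines = []
    for top in range(0, len(pixels), block_h):
        rows = pixels[top:top + block_h]
        line = []
        for left in range(0, len(pixels[0]), block_w):
            block = [value for row in rows for value in row[left:left + block_w]]
            mean = sum(block) / len(block)
            index = min(int(mean / 256 * len(RAMP)), len(RAMP) - 1)
            line.append(RAMP[index])
        lines.append("".join(line))
    return lines


# A tiny 4x8 "photo": dark on the left, bright on the right.
photo = [[0, 0, 90, 90, 170, 170, 255, 255]] * 4
for line in to_ascii(photo):
    print(line)
```

Here 32 pixel values are reduced to 8 characters, which shows both the space saving and how much visual detail is thrown away.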

Transcript

[This comic shows a pie chart with ten slices of different sizes, each with a label connected to it by a line. There is a caption above the chart:]
Disk Space Usage Report
[The labels on the slices are given in anti-clockwise order starting from the 12 o'clock position. The percentages are estimated from the image and are noted in the square brackets before each label:]
[18%]: Photos
[1%]: Good Photos
[3%]: Documents
[6%]: Everything you've streamed since 2017
[9%]: A single five-year-old PowerPoint presentation
[21%]: "System"
[2%]: Unused
[9%]: "Cache"
[23%]: "Other"
[8%]: Why are there two full backups of my phone from 2015 deep in a settings folder?



Discussion

Seems fine to me! 172.69.62.40 20:54, 29 April 2019 (UTC)

Finally managed to contribute something again! It's 00:00 now, so I'll pick this up tomorrow if noone else has by then -- //gir.st/ (talk) 21:56, 29 April 2019 (UTC)

I don't see what's alarming on ratio between good and bad photos. With digital cameras, actually choosing which photos are good takes more time than taking them in first place, so its often skipped. -- Hkmaly (talk) 22:47, 29 April 2019 (UTC)

Why do I feel so seen?!? Explain THAT! 162.158.255.88 23:52, 29 April 2019 (UTC)

I think the menu option about "Download Cloud" in the title text is referring to the general concept of the cloud - in other words, downloading the "entire" cloud, not their own personal cloud storage! Ianrbibtitlht (talk) 00:36, 30 April 2019 (UTC)

Why is there an IP editor bolding random letters? RandomIsocahedron (talk) 02:13, 30 April 2019 (UTC)

Look at the bolded letters. It's the guy who plasters 'soon the truth will be revealed' everywhere again. 162.158.114.94 09:58, 30 April 2019 (UTC)
I totally missed the message contained in the bold letters! I guess the truth was not revealed to me! Ianrbibtitlht (talk) 12:55, 30 April 2019 (UTC)

I've replaced the table with a description list, as per the Editor FAQ. It's obvious that one is the "item" and the other its "explanation." -- //gir.st/ (talk) 17:04, 30 April 2019 (UTC)

Thanks, I was just thinking I should probably do that :) Hawthorn (talk) 17:09, 30 April 2019 (UTC)


I think that "Optimize menu options" relates to the facility in older MS Office whereby it "hid" menu options that you hadn't used for "a while"; in theory "optimising" the menu and only showing you the options you used recently but in reality if you didn't use one of the programs for a long time you could open it and, so helpfully, find all your menu options gone. Not sure what to word in the explanation tho'. Nobby (talk) 07:15, 1 May 2019 (UTC)

FYI about the "Convert photos to text-only" part: Sometimes, when downloading pictures from internet the file is somehow saved as .txt and double click will successfully open as such (although it will look like gibberish as-is). If opened with a photo-viewing program that detects files with wrong extensions (IrfanView for example) the file can be renamed and opened as the photo file (.jpg usually) without loss of information. This can be used to hide photos not intended to be seen by other people.

"Convert photos to text-only" What about UUEncoding & UUDecoding? When newsgroups were much bigger, it was a popular way to transmit images. These Are Not The Comments You Are Looking For (talk) 04:36, 6 May 2019 (UTC)

that's an encoding, not a conversion. (this section, imo, already has way too many things. i'd trim it down to 1) ascii-art and 2) captionbot-like services) -- //gir.st/ (talk) 22:10, 7 May 2019 (UTC)

I mean he's got more free space than I do and I've got no videos, no documents, less than 5MB of photos, only about 1MB of which is actually on my computer's internal storage, and less than 2GB of applications, when you're using a laptop with 29.1GB max storage and no capability to format drives as internal for some reason or to increase said storage, you really start to hate system, and you wish that your internal storage could be replaced with that one SD you have that has more than 10 times the storage of your entire computer. 10:46, 26 June 2019 (UTC+12)

Randall has been known to use Mac computers (see comics referencing OS X "say" command, "Trash" folder, etc.) and the default MacOS disk storage display ( menu -> About This Mac -> Storage) is notorious for displaying huge wodges of used storage space with only the vague labels "System" or "Other", which can end up being infuriating when trying to clear out a hard drive. I'm pretty sure that's where this comic comes from. 172.68.66.63 11:10, 9 June 2022 (UTC)