Difference between revisions of "Main Page"

Explain xkcd: It's 'cause you're dumb.
(Latest comic)
Line 11:
  <span style="float:right;">[{{fullurl:1091: Curiosity|action=edit}} '''edit this explanation!''']</span>
  <br clear="right">
- {{LATESTCOMIC}}
+ {{:{{LATESTCOMIC}}}}
  </div>

Revision as of 23:07, 7 August 2012


Welcome to the explain xkcd wiki! We currently have 12 comic explanations. Come and add yours!

Latest comic

Bloom Filter
Title text: Sometimes, you can tell Bloom filters are the wrong tool for the job, but when they're the right one you can never be sure.

Explanation

This explanation may be incomplete or incorrect: PROBABLY CREATED - Please change this comment when editing this page. Do NOT delete this tag too soon.

The comic refers to a Bloom filter, a data structure that supports approximate membership queries and cardinality estimation using a bounded amount of memory. That is, after a series of objects has been added to the Bloom filter, it can be queried to see whether another given object has already been added, with a chance of a false positive that depends on the size of the filter relative to the number of objects added. Alternatively, the Bloom filter can be queried for an approximate count of the objects that have been added so far.

A Bloom filter uses a large bit array and a number of hash functions that produce indexes into this array. When a value is added to the set, it is hashed with each function, and the corresponding bits in the array are set to 1. To test whether a value is in the set, you hash it with all the functions and check whether all the corresponding bits are 1. If they are, the value may be in the set, but it can also be a false positive: each of those bits may have been set by some other value in the set (with reasonable hash functions, typically a different value for each bit). But if any of the bits is 0, you know for sure the value is not in the set. The higher the ratio between the size of the bit array and the number of elements in the set, the lower the false positive rate (10 bits per element gives about 1% false positives).
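
The add and test steps described above can be sketched in Python. This is a minimal illustration, not a production implementation: the class name `BloomFilter`, its parameters, and the use of salted SHA-256 to stand in for several independent hash functions are all assumptions made for the example. The helper `expected_false_positive_rate` uses the standard approximation (1 - e^(-kn/m))^k for m bits, n elements and k hash functions.

```python
import hashlib
import math


class BloomFilter:
    """Toy Bloom filter: a bit array plus k salted hash functions (illustrative)."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        if num_bits < 1 or num_hashes < 1:
            raise ValueError("num_bits and num_hashes must be positive")
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [False] * num_bits

    def _indexes(self, value: str):
        # Assumption: salting SHA-256 with the hash number approximates
        # k independent hash functions well enough for a demonstration.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{value}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value: str) -> None:
        # Set the bit chosen by each hash function.
        for index in self._indexes(value):
            self.bits[index] = True

    def might_contain(self, value: str) -> bool:
        # False is definite; True only means "probably in the set".
        return all(self.bits[index] for index in self._indexes(value))


def expected_false_positive_rate(num_bits: int, num_items: int, num_hashes: int) -> float:
    """Standard approximation (1 - e^(-k*n/m))^k of the false positive rate."""
    return (1.0 - math.exp(-num_hashes * num_items / num_bits)) ** num_hashes


if __name__ == "__main__":
    characters = BloomFilter(num_bits=1000, num_hashes=7)
    for name in ("cueball", "ponytail", "megan"):
        characters.add(name)
    print(characters.might_contain("cueball"))  # always True for added values
    # 10 bits per element with 7 hash functions: a bit under 1%.
    print(expected_false_positive_rate(10_000, 1_000, 7))
```

With 10 bits per element the approximation comes out just under 1%, which matches the figure quoted above, provided the number of hash functions is chosen close to the optimum (about 7 here).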

The joke in the comic is that Cueball has a 1-bit Bloom filter. When the set is empty, it accurately reports that any value is not in the set. But as soon as anything is added to the set, that single bit is set, and since everything hashes to that index, every query is answered with "probably", so the false positive rate is very large. Similarly, the cardinality estimate is (correctly) 0 initially, but after the first addition the estimate is "somewhere between 1 and infinity", which is not a terribly useful estimate.

There's also no point in having multiple hash functions for a 1-bit filter, since there's only one possible hash value.
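
The one-bit case can be sketched the same way. The class `OneBitBloomFilter` and its method names are hypothetical, and the count estimate assumes the commonly used estimator n ≈ -(m/k) · ln(1 - X/m), where X is the number of set bits; the explanation above does not say which estimator is meant. With m = k = 1 and the bit set, the logarithm diverges, which is one way to read "somewhere between 1 and infinity".

```python
import math


class OneBitBloomFilter:
    """Degenerate Bloom filter with a single bit, as in the comic (illustrative)."""

    def __init__(self) -> None:
        self.bit = False

    def add(self, value: object) -> None:
        # Any hash taken modulo 1 is 0, so every value maps to the same bit
        # and extra hash functions would add nothing.
        index = hash(value) % 1
        if index == 0:
            self.bit = True

    def might_contain(self, value: object) -> bool:
        # After the first add, the answer is "yeah, probably" for everything.
        return self.bit

    def estimated_count(self) -> float:
        # Assumed estimator: -(m / k) * ln(1 - X / m) with m = k = 1.
        # X = 0 gives 0; X = 1 gives ln(0), which diverges to infinity.
        return math.inf if self.bit else 0.0


if __name__ == "__main__":
    cueball = OneBitBloomFilter()
    print(cueball.might_contain("anything"))  # False: definitely not in the set
    cueball.add("one value")
    print(cueball.might_contain("something never added"))  # True: a false positive
    print(cueball.estimated_count())  # inf
```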

The title text mirrors this asymmetry. A Bloom filter is always accurate when it says that an element is not in the set, just as you can sometimes tell for sure that Bloom filters are the wrong tool for the job. But because of false positives, you can never be sure that an element actually is in the set, just as you can never be sure when Bloom filters are the right tool.

Transcript

This transcript is incomplete. Please help editing it! Thanks.
[Ponytail holds out her hand to Cueball, who is holding a paper with a 1 on it.]
Ponytail: Does your set contai-
Cueball: Yeah, probably.
[Caption below the panel:]
One-Bit Bloom Filter


New here?

Feel free to sign up for an account and contribute to the explain xkcd wiki! We need explanations for comics, characters, themes, memes and everything in between. If it is referenced in an xkcd web comic, it should be here.

  • List of all comics contains a complete table of all xkcd comics so far and the corresponding explanations. The red links (like this) are missing explanations. Feel free to help out by creating them!

Rules

Don't be a jerk. Many comics don't have set-in-stone explanations, so feel free to put multiple interpretations in the wiki page for each comic.

If you want to talk about a specific comic, use its discussion page.

Please only submit material directly related to—and helping everyone better understand—xkcd... and of course only submit material that can legally be posted (and freely edited.) Off-topic or other inappropriate content is subject to removal or modification at admin discretion, and users posting such are at risk of being blocked.

If you need assistance from an admin, feel free to leave a message on their personal discussion page. The list of admins is here.

Explain xkcd logo courtesy of User:Alek2407.