Difference between revisions of "1700: New Bug"

Revision as of 05:02, 29 June 2016

New Bug
Title text: There's also a unicode-handling bug in the URL request library, and we're storing the passwords unsalted ... so if we salt them with emoji, we can close three issues at once!


(Cueball sits in front of his computer.)

Cueball: Can you take a look at the bug I just opened?

Off-panel: Uh oh.

Off-panel: Is this a normal bug, or one of those horrifying ones that prove your whole project is broken beyond repair and should be burned to the ground?

Cueball: It's a normal one this time, I promise.

Off-panel: OK, what's the bug?

Cueball: The server crashes if a user's password is a resolvable URL.

Off-panel: I'll get the lighter fluid.


I'm new. For the explanation: A bug, as in a computer (programming) bug, can be reported and tracked, and many systems allow collaboration on reporting and tracking problems (bugs) in code and their solutions. Cueball reported a bug he found in the code, which presumably causes the server (the program he wrote as part of his project) to try to read passwords as URLs before storing them. This could expose serious security vulnerabilities such as cross-site scripting attacks, and since handling passwords and user account information usually requires a lot of code, it would be difficult to fix. That is why the character off-panel suggests burning the project down: it would be much easier, and would solve any security problems much more quickly, than fixing the bug. The title text refers to Cueball's horrid solution to a horrid problem: instead of fixing whatever causes the server to read passwords as URLs, he can exploit a known bug in the program that reads URLs, which cannot handle a particular way of representing text in binary form (Unicode), by adding a few characters to each user's password that the URL-reading program can't read. This would also "salt" the user's password, a security technique that makes passwords harder to figure out when they are stored properly. Cueball thinks this would solve the original problem and two other problems at the same time, the second being that users' passwords aren't salted (a security problem). The third solved problem is difficult to deduce.  Zyzygy 05:40, 29 June 2016 (UTC)

The third bug is the unicode handling, which would need to be solved in order to salt passwords with emoji, since these are Unicode-only characters. Although I'm not sure if salting with emoji really increases security, since as a rule I'd say nobody uses emoji in their passwords. 06:34, 29 June 2016 (UTC)
Password: 👍🐎🔋Π 10:11, 29 June 2016 (UTC)
That is a really funny password, but is it strong enough? :-) --Kynde (talk) 20:22, 29 June 2016 (UTC)
Actually, the fact that nobody uses emoji in their passwords is the reason salting with emoji is MORE effective. Salting doesn't really increase the security of a single password, but it does increase the security of the whole password database. Without salt, you can hash some string, like 123456, and check the whole database for users having that as their password. If every password is salted with a different emoji, this strategy will not work: while you KNOW which emoji is used (the salt is stored unhashed with the password hash), it's always different, so you need to compute a new hash for every line in the password database. Hashing takes MUCH more time than just comparing strings. And how is it even more effective? Because someone might actually get multiple databases and search for entries with the same salt, hoping there will be enough of them to be worth it. And an emoji salt likely wouldn't be so common ... -- Hkmaly (talk) 09:54, 29 June 2016 (UTC)
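To make the cost argument above concrete, here is a minimal sketch. The toy users, passwords, and emoji salts are made up for illustration, and plain SHA-256 is used only to keep the example short (a real system would use a deliberately slow password-hashing function). It counts how many hashes an attacker needs to test one guess against an unsalted and a salted database.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Hash a string (encoded as UTF-8) and return the hex digest."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Hypothetical toy data, not taken from the comic.
passwords = {"alice": "123456", "bob": "hunter2", "carol": "123456"}
salts = {"alice": "👍", "bob": "🐎", "carol": "🔋"}

# Unsalted: only the hash is stored.
unsalted_db = {user: sha256_hex(pw) for user, pw in passwords.items()}
# Salted: the salt is stored in the clear next to the hash of salt + password.
salted_db = {
    user: (salts[user], sha256_hex(salts[user] + pw))
    for user, pw in passwords.items()
}


def crack_unsalted(db, guess):
    """One hash, then cheap string comparisons against every row."""
    target = sha256_hex(guess)
    hits = sorted(user for user, digest in db.items() if digest == target)
    return hits, 1


def crack_salted(db, guess):
    """The guess must be re-hashed with each row's own salt."""
    hits, hashes = [], 0
    for user, (salt, digest) in db.items():
        hashes += 1
        if sha256_hex(salt + guess) == digest:
            hits.append(user)
    return sorted(hits), hashes


print(crack_unsalted(unsalted_db, "123456"))
print(crack_salted(salted_db, "123456"))
```

Both searches find the same users, but the salted one costs one hash per row instead of one hash in total, which is the whole point of the salt.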

From (long rant)

Two comments: first, the explanation on password salting is incorrect. The current version says "Salting passwords increases security by adding random data to the passwords which primarily helps defend against dictionary attacks.". Password salting only protects against particular kinds of (common) attacks in specific situations. Most importantly, it is designed to protect passwords only in the event of a database breach, when a malicious user has gained direct access to the database itself. Password salting provides no protection when brute force attacks (aka dictionary attacks) are directed at the application itself, as the application automatically takes the salt into account when hashing. Instead, proper password salting randomizes the hash for each password, ensuring that if two users have the same password, they will not have the same hash. This makes it much more difficult to guess passwords through attack vectors like lookup tables, reverse lookup tables, and rainbow tables. However, because the salt has to be stored with the password hash (otherwise the application would not be able to make sense of the hash itself), password salting does not secure passwords against dictionary attacks even in the event that a malicious user has managed to acquire the database itself. I will update the explanation with a brief description of what password salts do.
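As a rough illustration of the paragraph above, the sketch below uses Python's standard `hashlib.pbkdf2_hmac` with a random per-user salt. The function names, the 16-byte salt length, and the iteration count are assumptions chosen for the example, not anything stated in the comic or the explanation.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor, for illustration only


def hash_password(password, salt=None):
    """Return (salt, digest). A fresh random salt is drawn when none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected):
    """Re-hash with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


# Two users pick the same password but get different stored hashes.
salt_a, hash_a = hash_password("correct horse battery staple")
salt_b, hash_b = hash_password("correct horse battery staple")
print(hash_a != hash_b)
print(verify_password("correct horse battery staple", salt_a, hash_a))
```

Note that verification only works because the salt is stored alongside the hash, which is also why an attacker holding the database can still run a dictionary attack one row at a time.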
Finally, I think there is a big misunderstanding throughout this explanation. In a web services context the "server" (referenced in the comic) is a very different thing from the application that a programmer builds. A server can refer to either the computer itself or the software that is responsible for responding to web requests and executing the actual application. In a professional context, the application (which is what Cueball would be building) would never be referred to as the "server". It is possible that this is a misuse of terminology on the part of Cueball or Randall, but I suspect that the term was used properly and intentionally. The reason is that if Cueball's application is crashing the *server*, it takes the level of incompetence up to completely new (and unusual) levels, in much the same way as he has done in the past. Normally the programming language used to build the application, the software hosting the application, and the operating system itself have a number of safeguards in place to ensure that if an application misbehaves, the only thing that crashes is the application itself. For Cueball's application to break through all those safeguards and crash the server itself (either the operating system or the web server software) would require Cueball to have developed a program that operates *well* outside the bounds of normal procedures. Just for reference, as someone who has been building web software for over 15 years, I wouldn't even know where to start to crash the server from within an application. It would probably have to involve either exploiting a previously unknown bug in the programming language or some *very* poorly designed system calls. Special:Contributions/ 15:11, 29 June 2016‎ (Remember to sign your comments)
If by 'server' you mean Apache, it is not completely unexpected that a sloppily coded extension to Perl or PHP could crash part of the server, and I still maintain mod_perl code that does 'fancy' stuff. I wouldn't be surprised if Cueball still writes web applications the way he did when mod_perl was the hot stuff. Today it is a common setup to have a chain of servers: nginx in front for SSL termination, maybe an application-level firewall filtering out spooky requests, then Varnish for caching and load balancing, and finally the application server running the actual web application, with all layers implementing the HTTP protocol. Which of these is 'the server'? At the least, it is often easy for the application developer to make the last server in the chain unresponsive (i.e. crashed). Pmakholm (talk) 12:08, 1 July 2016 (UTC)
Writing an extension to your language of choice would probably be a good way of crashing the server, although I would say that writing extensions to the language itself is not a common thing for most people to do. I suppose I'm arguing from experience (which is not always accurate), but in years of PHP and Python programming I've never once had to write a language extension, nor did I ever need to for my very complicated thesis work. So I would say that the general point still stands: if Cueball is crashing any part of the server, he is doing things very wrong or at least very differently. Cmancone (talk) 12:58, 1 July 2016 (UTC)
mod_perl (and its like) are not language extensions (as they do not extend the language) but plugins that extend the capabilities of the server (so that it is able to execute Perl). 22:03, 5 July 2016 (UTC)

Regarding those last two points: sorry for my long unsigned rant. Didn't realize I wasn't logged in. Still haven't figured out how to sign comments. Gonna try it this time. Cmancone (talk) 16:51, 29 June 2016 (UTC)

You made it ;-) --Kynde (talk) 20:22, 29 June 2016 (UTC)

Explanation says "There is no reason for password handling code to access urls" but that is somewhat wrong. Password handling code frequently performs heuristics on the password to assess its strength, for example checking whether part of the password is a dictionary word. Similar heuristics could be used to check that the password is not a URL, such as "xkcd.com", applying DNS and other internet resources as an extension of the concept of "dictionary". Spongebob (talk) 16:11, 29 June 2016 (UTC)

I think http://xkcd.com/correct/horse/battery/staple/ would be a perfectly fine password, even though it is also a URL, but a heuristic that just looks at the length of the password and whether it contains only alphanumeric characters would probably be fooled. Trying to detect the scheme used to generate the password could be helpful in choosing a relevant heuristic for deciding the password strength. On the other hand, I would consider it very bad to actually test whether the URL is resolvable in any way that leaks information about the password to the outside. Pmakholm (talk) 11:11, 1 July 2016 (UTC)
Leaking password information to the outside would be bad, but then again any implementation of a password URL resolving scheme would just add emoji salting. 22:03, 5 July 2016 (UTC)

Currently the explanation says: "Finally, emoji will often include unicode characters, which means that, if one can effectively salt passwords with emoji, then the passwords should be able to be stored in unicode (although that *probably* doesn't require anything outside the Base Multilingual Plane, so that might not need full unicode support after-all)." I'm fairly convinced that this doesn't make sense and is incorrect. Regardless of what character encoding the password is in, hashing will convert the entire thing into binary. This binary is then typically stored as a base64-encoded string in the database. Ergo, it doesn't matter whether the original password strings were in Unicode or not: they will be stored in the database as ASCII (or binary), not Unicode. I'm going to go ahead and remove this comment from the explanation. I'm pretty certain that there isn't enough information in the comic to figure out why salting passwords with emoji would fix a unicode-handling bug in the URL request library. So I suspect that there is no explanation there: either Cueball is entirely confused and his statement makes no sense, or there is simply not enough information given to help us understand why this solution might fix the problem. However, I'm not going to make any updates to the explanation about this yet, because perhaps I'm missing something someone else will notice. Cmancone 12:50, 29 June 2016 (ETC)
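The storage claim above can be checked with a short sketch. The function name `stored_form` is hypothetical, and SHA-256 merely stands in for whatever hash the system might use. The emoji salt is encoded to UTF-8 bytes before hashing, and the stored value is plain ASCII regardless of what characters went in.

```python
import base64
import hashlib


def stored_form(password: str, salt: str) -> str:
    """Hash salt + password and return the digest as a base64 ASCII string."""
    raw = (salt + password).encode("utf-8")  # text becomes bytes here
    digest = hashlib.sha256(raw).digest()    # 32 raw bytes
    return base64.b64encode(digest).decode("ascii")


record = stored_form("http://xkcd.com/", "👍")
print(record)
print(record.isascii())

# The emoji itself needs 4 bytes in UTF-8, because U+1F44D lies outside
# the Basic Multilingual Plane.
print(len("👍".encode("utf-8")), hex(ord("👍")))
```

So the database column never has to hold Unicode text, although the code that builds the bytes to be hashed does have to handle the emoji correctly.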

I suspect that the salting with emoji is to ensure that the password does not resolve as a URL (since the library cannot understand the encoding), and thus the issue of the crash is resolved, the issue of unsalted passwords is solved, and the issue of the unicode handling bug is "solved" by virtue of it now being a feature relied on by the system. 20:50, 29 June 2016 (UTC)

I removed from the explanation the discussion about how Cueball's system might be checking passwords to see if they are resolvable URLs as a check against weak passwords. The problem with this explanation is that if that is where the crash was happening, then salting the password with emoji would not fix the crash. For salting to fix the crash (as Cueball suggests it will) requires that the crash be happening during the hashing process, not during password validation. The reason is that password validation is performed on the original password itself, while only hashing happens on the salted password. If the bug is happening while checking passwords for strength, then Cueball's suggestion of adding a salt will not fix the crash at all. It could be that Cueball is simply completely wrong about everything, but I think it makes more sense to go with an explanation where the title text didn't just get everything wrong. Cmancone (talk) 13:12, 30 June 2016 (UTC)

In the nomenclature I am familiar with, mailto:[email protected] is a URL. If you accept that mailto (and other protocols) are also URLs, some of the description is untrue. Fortunately the untrue bits are also unnecessary and can be deleted or generalized. -- 04:51, 1 July 2016 (UTC)

Definitely a sequel to 1084. 108.162.221.92 08:27, 1 July 2016 (UTC)

The explanation of 'Resolvable URL' is very confusing and mostly wrong; it mixes up technologies like HTTP and DNS and confuses FQDNs with URLs. Admittedly, though, the term 'resolvable URL' is a bit of a misnomer by itself. URLs are typically not resolved: they contain an FQDN that is resolved via DNS to an IP(v6) address, and optionally a port. The remainder of the URL can be used to identify a resource on that server, but how this is done and signaled is quite application/protocol dependent (and shouldn't be called 'resolving'). So if you hit a 404, the FQDN actually resolved but the HTTP resource could not be found. A non-resolvable URL would give a browser error like 'unknown host'. 09:59, 1 July 2016 (UTC)
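A small sketch of the distinction drawn above, using Python's standard `urllib.parse.urlsplit`. The helper name `split_for_resolution` and the example URLs are only illustrative, and no network lookup is performed. It separates the part a DNS lookup would see (the host name) from the parts that only the contacted server interprets.

```python
from urllib.parse import urlsplit


def split_for_resolution(url: str) -> dict:
    """Split a URL into the name DNS would resolve and the server-side parts."""
    parts = urlsplit(url)
    return {
        "dns_name": parts.hostname,  # the only piece a DNS query would carry
        "port": parts.port,          # None when the URL does not name one
        "resource": parts.path,      # interpreted by the server, not by DNS
        "query": parts.query,
    }


print(split_for_resolution("http://xkcd.com/correct/horse/battery/staple/"))
print(split_for_resolution("https://example.org:8443/a?b=c"))
# Without a scheme and "//", the whole string is parsed as a path,
# so there is no host name to resolve at all.
print(split_for_resolution("xkcd.com"))
```

A 404 would therefore be a failure of the "resource" part after the "dns_name" part had already resolved, which matches the 'unknown host' versus 404 distinction in the comment above.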

Feel free to adjust the text as desired. I think in this case it is best to understand "Resolvable URL" as Cueball meant it, which (I think) is as any valid URL you might stick in your web browser. I don't think he meant any of the more technical possible definitions. In that sense a resolvable URL would be something that points to a server, potentially followed by a resource on the server. At least, that is how I would think to describe it. Feel free to give it a go. -- Cmancone (talk) (please sign your comments with ~~~~)

Around the time this comic was published, Firefox linked on its start page to https://advocacy.mozilla.org/en-US/encrypt/codemoji/2, where a webapp called Codemoji is available. With this webapp one can use emojis to scramble messages with a Caesar-cipher-like method. This might be the reason for the emoji reference in the title text. 09:58, 4 July 2016 (UTC)

A good addition could be that the passwords were most probably stored in plaintext, which is even worse. Jonsku99 (talk) 12:30, 26 July 2016 (UTC)

I think the real horror of this bug has not been addressed in the description above: in order to check whether a password is a resolvable domain name, it has to be sent to a domain name server, and the connection to the next domain name server is traditionally completely unencrypted. So the system's ability to check whether the password is a resolvable URL implies that all passwords might be known to everyone who has access to a piece of the internet infrastructure. Things cannot get much worse for a sysadmin than this, except perhaps for later finding the leaked passwords for sale in a public place, that is. Gunterkoenigsmann (talk) 07:10, 15 October 2016 (UTC)

Correct me if I'm wrong, but doesn't DNS resolution only send the domain name to the DNS server, not the entire URL? 21:31, 24 July 2021 (UTC)