1676: Full-Width Justification

Revision as of 20:08, 28 September 2018 by (talk) (→Explanation)
Title text: Gonna start bugging the Unicode consortium to add snake segment characters that can be combined into an arbitrary-length non-breaking snake.


Strategies for full-width justification
[Below the caption is a column of six boxes, each showing a different "strategy" for justification, which is labeled beside it. In this transcript the label is written first and the text below it. The top and bottom lines of text in each box are cut off in the middle, but since they can still be read they are transcribed anyway. Only in the hyphenation box does an extra word appear at the end. In the last box, with snakes, a snake is drawn to cover the entire space from the end of "between" to the right border.]
Giving up
their famous paper
on the relationship
between
deindustrialization
and the growth of
Letter spacing
their famous paper
on the relationship
b  e   t   w   e  e   n
deindustrialization
and the growth of
Hyphenation
their famous paper
on the relationship
between deindus-
trialization and the
growth of ecological
Stretching
their famous paper
on the relationship
between [the letters stretched horizontally to fill the line]
deindustrialization
and the growth of
Filler
their famous paper
on the relationship
between crap like
deindustrialization
and the growth of
Snakes
their famous paper
on the relationship
between 🐍 [a snake filling the gap]
deindustrialization
and the growth of


  • The full text (with alternate changes) reads:
...their famous paper on the relationship between [crap like]/[ 🐍 ] deindustrialization and the growth of [ecological]...
  • An approach not depicted is to treat justification as part of a global typesetting strategy, which allows words to move between lines even where this is not locally optimal. In a case like this, its net effect is to pull words down from the previous line to serve as filler. This approach is used by TeX.
  • In Arabic, it is common to stretch the lines connecting letters as a relatively elegant and satisfying resolution to this problem. This trick is called "kashida" (كشيدة). There does in fact exist a Unicode character, U+0640 (ـ), to help with this: using it to extend "كشيدة" would result in something like "كـــــشـــيـــدة" (which, incidentally, looks a lot like a snake).
  • Jim Chapman, developer of Windows 10 e-reader app Freda, has implemented snake-justification in the app, now available on the Windows Store. For best results, use the 'settings' screen to switch 'hyphenation' to 'no', 'use snakes' to 'yes', and choose a large font size (33 or so). Then pick a book with long words and justified text, and read it in a narrow window.
  • The comic has been discussed on the Unicode Mailing List.
  • The typesetting system SILE implemented snake justification on the same day the comic was published.
  • "Line Fillers" depicting animals (including snakes) were widely used in medieval book art.
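
The global approach mentioned in the list above can be sketched in Python using the comic's own sentence. This is only an illustration and not TeX's actual algorithm, which uses a richer model of stretchable spaces, penalties, and hyphenation. The function names, the 20-character line width, and the cost function (squared trailing space on every line but the last) are all assumptions made for this example.

```python
def greedy_lines(words, width):
    """First-fit breaking: put as many words as possible on each line."""
    lines, current = [], []
    for word in words:
        candidate = current + [word]
        if current and len(" ".join(candidate)) > width:
            lines.append(current)
            current = [word]
        else:
            current = candidate
    if current:
        lines.append(current)
    return lines


def optimal_lines(words, width):
    """Global breaking: minimise the summed squared trailing space
    over every line except the last (a simplified cost, assumed here)."""
    n = len(words)
    best = [float("inf")] * (n + 1)  # best[i]: minimal cost of setting words[i:]
    nxt = [n] * (n + 1)              # nxt[i]: where the line starting at i ends
    best[n] = 0
    for i in range(n - 1, -1, -1):
        length = -1
        for j in range(i + 1, n + 1):
            length += len(words[j - 1]) + 1  # width of words[i:j] with single spaces
            if length > width and j > i + 1:
                break                        # line overflows; stop extending it
            slack = max(width - length, 0)
            cost = 0 if j == n else slack ** 2
            if cost + best[j] < best[i]:
                best[i] = cost + best[j]
                nxt[i] = j
    lines, i = [], 0
    while i < n:
        lines.append(words[i:nxt[i]])
        i = nxt[i]
    return lines


text = ("their famous paper on the relationship between "
        "deindustrialization and the growth of").split()
for name, breaker in (("greedy", greedy_lines), ("global", optimal_lines)):
    print(name)
    for line in breaker(text, 20):
        print("   ", " ".join(line))
```

Under these assumptions, the first-fit version strands "between" on a line by itself, while the global version breaks earlier ("their famous" / "paper on the" / "relationship between"), so no line is left with a large gap.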


I added the emoji snake. Is emoji snake the same as a Unicode snake would be? Azule (talk) 05:46, 4 May 2016 (UTC)

I assumed Unicode snakes would use three different characters: a head, a body segment, and a tail. Your solution is good, but objectively not perfect compared to what's shown in the comic.
So what would be the optimal snake transcription method here? A parenthetical aside saying "A drawing of a snake stretches to the right end of the line."? Or should we just blackmail the Unicode consortium again? ~AgentMuffin
The correct solution is obviously to include a 16 Mpixel image of a snake. Henke37 (talk) 07:41, 4 May 2016 (UTC)
Emoji full snake is already in Unicode as Azule knows. &#x1f40d = 🐍
Segmented snake needs at least three characters: head, e.g. °, body, e.g. ~, and tail, e.g. โ—.
Three segment snake: °~โ—
Four segment snake: °~~โ—
Demro (talk) 12:45, 4 May 2016 (UTC)
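
The head/body/tail idea from this thread can be sketched in a few lines of Python. No snake-segment characters exist in Unicode, so the characters below are placeholders: the head and body follow the comment above, the tail `>` is a stand-in chosen for this example, and the function names `snake` and `snake_justify` are hypothetical.

```python
def snake(length, head="°", body="~", tail=">"):
    """Build a snake exactly `length` characters long from three segment types."""
    if length < 3:
        raise ValueError("a segmented snake needs a head, a body segment and a tail")
    return head + body * (length - 2) + tail


def snake_justify(line, width):
    """Pad `line` to `width` with a snake; leave it alone if the gap is too small."""
    gap = width - len(line) - 1  # one space separates the text from the snake
    if gap < 3:
        return line
    return line + " " + snake(gap)


print(snake(3))
print(snake(4))
print(snake_justify("between", 20))
```

Because the body segment repeats, the snake can be made any length of three or more, which is what an "arbitrary-length" snake in the title text would require.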

Could the title text also be a reference to the snake in umwelt? Azule (talk) 05:46, 4 May 2016 (UTC)

Amazon is notorious for being bad at this. Here's a somewhat related Computerphile video. Eno (talk) 06:32, 4 May 2016 (UTC)

Also, funnily enough, the filler text and the snakes were used in medieval (hand-written) manuscripts. Although it's not a snake but usually a nondescript wriggle that could only pass as a snake when you're squinting really hard. For filler text it's usually low-content words like "truly", "verily", "indeed", "without fail", "in truth" or stuff like that. So it's really an old problem with no satisfactory solution developed in hundreds of years... 08:19, 4 May 2016 (UTC)

This practice of filling the line with a dingbat carried on into the days of handset letterpress (i.e. up until the early 1900's), although it gradually became more whimsical and so less frequent in serious works. 12:28, 4 May 2016 (UTC)

In practice you reformulate. Not necessarily insert filler words, but just reorder the sentence enough that justification works. That is assuming the automated justification doesn't work, which will try a combination of multiple methods like word-spacing, letter-spacing and hyphenation. Imagine hyphenating at "de-" instead, but adding a little bit extra letter space in "between", and almost double normal word space between "between" and "de-". 08:20, 4 May 2016 (UTC)

Reformulating can only be done with the (tacit or explicit) permission of the author. There are situations where rewording would not be allowed. 12:28, 4 May 2016 (UTC)
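
The combination of word spacing and letter spacing mentioned earlier in this thread can be sketched as follows. This is a rough Python illustration for monospaced text, not how any real typesetting engine works; the function names and the 20-character width are assumptions, and hyphenation is left out.

```python
def word_space(words, width):
    """Distribute the extra spaces between words so the line is `width` wide."""
    gaps = len(words) - 1
    spaces = width - sum(len(w) for w in words)
    if gaps < 1 or spaces < gaps:
        return " ".join(words)  # nothing to stretch, or the line is already too long
    share, remainder = divmod(spaces, gaps)
    out = ""
    for i, w in enumerate(words[:-1]):
        out += w + " " * (share + (1 if i < remainder else 0))
    return out + words[-1]


def letter_space(word, width):
    """Spread a single word across `width` columns by padding between letters."""
    gaps = len(word) - 1
    extra = width - len(word)
    if gaps < 1 or extra <= 0:
        return word
    share, remainder = divmod(extra, gaps)
    out = ""
    for i, ch in enumerate(word[:-1]):
        out += ch + " " * (share + (1 if i < remainder else 0))
    return out + word[-1]


def justify(words, width):
    """Use word spacing when possible, letter spacing for a lone word."""
    if len(words) == 1:
        return letter_space(words[0], width)
    return word_space(words, width)


for line in (["on", "the", "relationship"], ["between"]):
    print("|" + justify(line, 20) + "|")
```

A lone "between" falls back to letter spacing, which reproduces the effect shown in the comic's "Letter spacing" box.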

While the Arabic part is interesting, I don't feel it to be very relevant here. 09:11, 4 May 2016 (UTC)

It is relevant because it is yet another solution (useful only in Arabic). Demro (talk) 12:47, 4 May 2016 (UTC)
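
For the curious, the kashida approach with U+0640 can be sketched in Python. This is a simplified illustration: it only avoids placing a kashida after letters that do not connect to the following letter, whereas real Arabic typography has further preferences about where stretching looks best. The function names and the (incomplete) list of non-connecting letters are assumptions made for this example.

```python
TATWEEL = "\u0640"  # ARABIC TATWEEL, the kashida character
HAMZA = "\u0621"    # joins to neither neighbour
# Letters that do not connect to the following letter (simplified list):
# hamza, alef forms, waw forms, teh marbuta, dal, thal, reh, zain.
NO_FORWARD_JOIN = set(
    "\u0621\u0622\u0623\u0624\u0625\u0627\u0629\u062F\u0630\u0631\u0632\u0648"
)


def is_arabic_letter(ch):
    """True for characters in the basic Arabic letter range, excluding tatweel."""
    return "\u0620" <= ch <= "\u064A" and ch != TATWEEL


def add_kashida(word, extra):
    """Lengthen `word` by `extra` tatweels, spread over the joins that allow it."""
    slots = [
        i for i in range(len(word) - 1)
        if is_arabic_letter(word[i]) and word[i] not in NO_FORWARD_JOIN
        and is_arabic_letter(word[i + 1]) and word[i + 1] != HAMZA
    ]
    if extra <= 0 or not slots:
        return word
    share, remainder = divmod(extra, len(slots))
    fill = {pos: share + (1 if k < remainder else 0) for k, pos in enumerate(slots)}
    return "".join(ch + TATWEEL * fill.get(i, 0) for i, ch in enumerate(word))


kashida = "\u0643\u0634\u064A\u062F\u0629"  # the word "kashida" in Arabic script
print(add_kashida(kashida, 9))
```

In the word "kashida" itself only the first three letters connect forward, so the tatweels are shared among those three joins and none is placed between the final two letters.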

Sorry, how do I add a [citation needed] in superscript? Transuranium (talk)Transuranium

The "snake" option is actually less out there than the current explanation indicates. Snakes proper were not necessarily the go-to, but the same general strategy (decorative filling) was used heavily in illuminated manuscripts in the medieval period. 14:36, 4 May 2016 (UTC)

Came here just to say that. The current explanation needs reworking because that's actually one of the oldest ways of dealing with text justification. Check for example the Book of Kells 20:15, 4 May 2016 (UTC)
Modified the explanation accordingly. 21:44, 4 May 2016 (UTC)

"the Unicode consortium is very specific about which characters are added[citation needed], and always require a good reason[citation needed] before adding a character or set of characters to the standard." Seriously? Then what are all the emoji pages added for? U+1F459 (Bikini) 👙, for example... 04:05, 5 May 2016 (UTC)

Emoji were added because Japanese cellphones had introduced them with wild success. A stable standard was badly needed, and the Unicode Consortium, whose job it is to make such standards, complied, after some hesitation. 17:55, 9 May 2016 (UTC)
In case of bikini, I would suspect the gender of Unicode consortium members is the reason ... -- Hkmaly (talk) 17:52, 5 May 2016 (UTC)

I suspect that U+13192 (EGYPTIAN HIEROGLYPH I009A) is actually a "snake building" character in the sense that it is a horned viper coming out of a building. I do not however have easy access to a copy of the original source reference (Gardiner’s "Supplement to the Catalogue of the Egyptian Hieroglyphic Printing Type Showing Acquisitions to December 1953") that was the basis for adding this character in Unicode 5.2. Poslfit (talk) 20:19, 10 May 2016 (UTC)

Found a list online and have updated the main text accordingly. Poslfit (talk) 20:53, 10 May 2016 (UTC)

I changed "Hyphenation is also confusing as it often leaves two partial non-words" with "Hyphenation is confusing in English because its spelling requires full-word recognition". In many (if not most) languages two partial non-words can be easily read. The hyphenation problem is probably unique to English. 13:06, 5 May 2016 (UTC)

In most languages, the cases where the hyphenation will be confusing will be rare. In English, the cases where the hyphenation will NOT be confusing will be rare. -- Hkmaly (talk) 17:52, 5 May 2016 (UTC)
On the contrary, it will generally result in non-words (and hence difficulty reading) regardless of which language you're writing in. Unless maybe you're dealing with logographs, e.g. in written Chinese languages. Flipping Mackerel (talk) 03:32, 6 May 2016 (UTC)

For hyphenation, would it make sense to also talk about the case where it creates new words that can be offensive? E.g. therapist -> the-rapist 22:37, 9 May 2016 (UTC)

Letter Spacing in German

Hi there...

I guess the statement concerning letter spacing being not available in German isn't (wasn't ever) entirely accurate.

Since the demise of black letter type, letter spacing has become obsolete and is nowadays used merely to emphasise surnames or city names in administrative paperwork. But even in the old days of German black letter usage, letter spacing was also used to achieve justification. If something was to be emphasised in such a line, the spaces would've been even larger, maintaining a certain ratio between regular letter spaces and emphasised letter spaces.

However, since letter spacing is as uncommon in German typing as black letters are, it may be used for justification without any concern. In order to emphasise certain words, italic, bold or underlined text is the means of choice.

Personally, I prefer letter spacing and hyphenation combined, although snakes seem to be the real deal! 14:29, 29 July 2016 (UTC)