Difference between revisions of "1676: Full-Width Justification"

(Explanation: this is not limited to narrow spaces, happens in any multiline bit of text)
(Trivia)

Revision as of 06:43, 4 May 2016

Full-Width Justification
Title text: Gonna start bugging the Unicode consortium to add snake segment characters that can be combined into an arbitrary-length non-breaking snake.

Explanation

This explanation may be incomplete or incorrect: hasty & impatient placeholder. Still an early draft; needs citations, fact-checking, and it also needs the Wikipedia links to be fixed.
If you can address this issue, please edit the page! Thanks.

The comic refers to an irritating problem in laying out text to fit from edge to edge: justification. Sometimes, as before a long word like "deindustrialization", there is no universally good way to make the typography work. Making text look good and remain easily legible is difficult, especially in a narrow space, and the biggest issue is how to handle words that are too long or too short to fit nicely in the space left on a line.

The comic shows several solutions to this problem, some realistic and others less so, but each unsatisfying. "Giving up" is ugly, leaving a line break which doesn't fit with the rest; hyphenating is visually confusing and hard to read ("deindus-" looks like an independent, unfamiliar word, pronounced "deign-duss"); stretching is unnatural, probably hard to code or render, unfamiliar and quite ugly; adding "filler" words, a radical solution, makes the writing worse (in the case of the example, making the tone too informal); and adding a meaningless snake image, just long enough to fill the extra space, is a novel (and quite bizarre) solution which probably wouldn't actually be used by a serious typographer.
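The most conventional strategy, widening the gaps between words until the line reaches both margins, can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: a monospaced font (so each character is one column wide), and a hypothetical helper name `justify_line` that does not come from the comic or from any library.

```python
def justify_line(words, width):
    """Pad a line to exactly `width` columns by widening the gaps between words.

    Assumes a monospaced font, so one character equals one column.
    """
    if len(words) == 1:
        # Nothing to stretch: this is the "giving up" case, left-aligned.
        return words[0].ljust(width)

    total_chars = sum(len(w) for w in words)
    gaps = len(words) - 1
    spaces = width - total_chars
    if spaces < gaps:
        raise ValueError("words do not fit in the requested width")

    # Every gap gets `base` spaces; the leftmost `extra` gaps get one more.
    base, extra = divmod(spaces, gaps)
    parts = []
    for i, word in enumerate(words[:-1]):
        parts.append(word + " " * (base + (1 if i < extra else 0)))
    parts.append(words[-1])
    return "".join(parts)


if __name__ == "__main__":
    # With only two words on the line, the single gap becomes very wide,
    # which is the kind of ugly result the comic is poking fun at.
    print(repr(justify_line(["relationship", "between"], 30)))
```

With few words on a line, all the slack ends up in one or two gaps, which is why real layout engines usually combine this with hyphenation and letter spacing.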

The title text suggests that in order to facilitate this last method of "solving" the problem, the Unicode consortium, the organization in charge of the common text standard Unicode, should add "snake-building characters", similar to the ones already available for constructing boxes [add note about that here?], allowing variable-length snakes to be used as filling. This suggestion is quite ridiculous; the Unicode consortium is very specific about which characters are added [citation needed], and always requires a good reason before adding a character or set of characters to the standard. Thus, while humorous, Randall's suggestion would likely be rejected.

Note that in Arabic, it is common to stretch the lines connecting letters as a relatively elegant and satisfying resolution to this problem. This trick is called "kashida" (كشيدة) and is explained and illustrated here.

Transcript

This transcript is incomplete. Please help editing it! Thanks.

Strategies for full-width justification

[Below this headline is a column of boxes, each showing a different "strategy" which is annotated beside it.]

Giving up

Letter spacing

Hyphenation

Stretching

Filler

Snakes

Trivia

  • The full text (with alternate changes) reads:
...their famous paper on the relationship between [crap like]/[ 🐍 ] deindustrialization and the growth of ecological...



Discussion

I added the emoji snake. Is emoji snake the same as a Unicode snake would be? Azule (talk) 05:46, 4 May 2016 (UTC)

I assumed Unicode snakes would use three different characters: a head, a body segment, and a tail. Your solution is good, but objectively not perfect compared to what's shown in the comic.
So what would be the optimal snake transcription method here? A parenthetical aside saying "A drawing of a snake stretches to the right end of the line."? Or should we just blackmail the Unicode consortium again? ~AgentMuffin
The correct solution is obviously to include a 16 Mpixel image of a snake. Henke37 (talk) 07:41, 4 May 2016 (UTC)
Emoji full snake is already in Unicode as Azule knows. &#x1f40d = 🐍
Segmented snake needs at least three characters: head, e.g. °, body, e.g. ~, and tail, e.g. ◝.
Three segment snake °~◝
Four segment snake: °~~◝
Demro (talk) 12:45, 4 May 2016 (UTC)
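The head/body/tail idea above can be sketched in Python. This is a rough illustration only: the function names `segmented_snake` and `fill_with_snake` are made up for this example, the three stand-in characters are the ones suggested in the comment above (no real snake segment characters exist in Unicode), and the width calculation assumes each character takes one column.

```python
import unicodedata


def segmented_snake(length, head="°", body="~", tail="◝"):
    """Build a snake of `length` characters from a head, repeated body, and tail."""
    if length < 3:
        raise ValueError("a segmented snake needs at least head, body and tail")
    return head + body * (length - 2) + tail


def fill_with_snake(text, width):
    """Pad `text` to `width` columns with a snake, after one separating space."""
    gap = width - len(text) - 1
    if gap < 3:
        return text  # not enough room for even the shortest snake
    return text + " " + segmented_snake(gap)


if __name__ == "__main__":
    print(segmented_snake(3))             # three segment snake
    print(segmented_snake(4))             # four segment snake
    print(fill_with_snake("between", 20))
    # The existing whole-snake emoji, for comparison:
    print(unicodedata.name("\U0001F40D"))
```

Because the body segment is simply repeated, the snake can be made any length of three or more, which is the "arbitrary-length" property the title text asks for. Keeping it non-breaking would be up to the renderer.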

Could the title text also be a reference to the snake in umwelt? Azule (talk) 05:46, 4 May 2016 (UTC)

Amazon is notorious for being bad at this. Here's a somewhat related Computerphile video. Eno (talk) 06:32, 4 May 2016 (UTC)

Also, funnily enough, the filler text and the snakes were used in medieval (hand-written) manuscripts. Although it's not a snake but usually a nondescript wriggle that could only pass as a snake when you're squinting really hard. For filler text it's usually low-content words like "truly", "verily", "indeed", "without fail", "in truth" or stuff like that. So it's really an old problem with no satisfactory solution developed in hundreds of years... 162.158.85.93 08:19, 4 May 2016 (UTC)

This practice of filling the line with a dingbat carried on into the days of handset letterpress (i.e. up until the early 1900s), although it gradually became more whimsical and so less frequent in serious works. 108.162.241.123 12:28, 4 May 2016 (UTC)

In practice you reformulate. Not necessarily by inserting filler words, but just by reordering the sentence enough that justification works. That is assuming the automated justification doesn't work, which will try a combination of multiple methods like word-spacing, letter-spacing and hyphenation. Imagine hyphenating at "de-" instead, but adding a little bit of extra letter space in "between", and almost double the normal word space between "between" and "de-". 162.158.114.222 08:20, 4 May 2016 (UTC)

Reformulating can only be done with the (tacit or explicit) permission of the author. There are situations where rewording would not be allowed. 108.162.241.123 12:28, 4 May 2016 (UTC)

While the Arabic part is interesting, I don't feel it to be very relevant here. 108.162.249.156 09:11, 4 May 2016 (UTC)

It is relevant because it is yet another solution (useful only in Arabic). Demro (talk) 12:47, 4 May 2016 (UTC)

Sorry, how do I add a [citation needed] in superscript? Transuranium (talk)Transuranium


The "snake" option is actually less out there than the current explanation indicates. Snakes proper were not necessarily the go-to, but the same general strategy (decorative filling) was used heavily in illuminated manuscripts in the medieval period. 162.158.214.217 14:36, 4 May 2016 (UTC)

Came here just to say that. The current explanation needs reworking because that's actually one of the oldest ways of dealing with text justification. Check for example the Book of Kells 162.158.203.141 20:15, 4 May 2016 (UTC)
Modified the explanation accordingly. 162.158.214.217 21:44, 4 May 2016 (UTC)

"the Unicode consortium is very specific about which characters are added[citation needed], and always require a good reason[citation needed] before adding a character or set of characters to the standard." Seriously? Then what are all the emoji pages added for? U+1F459 (Bikini) 👙, for example... 108.162.221.98 04:05, 5 May 2016 (UTC)

Emoji were added because Japanese cellphones had introduced them with wild success. A stable standard was badly needed, and the Unicode Consortium, whose job it is to make such standards, complied, after some hesitation. 108.162.219.10 17:55, 9 May 2016 (UTC)
In case of bikini, I would suspect the gender of Unicode consortium members is the reason ... -- Hkmaly (talk) 17:52, 5 May 2016 (UTC)

I suspect that U+13192 (EGYPTIAN HIEROGLYPH I009A) is actually a "snake building" character in the sense that it is a horned viper coming out of a building. I do not however have easy access to a copy of the original source reference (Gardiner’s "Supplement to the Catalogue of the Egyptian Hieroglyphic Printing Type Showing Acquisitions to December 1953") that was the basis for adding this character in Unicode 5.2. Poslfit (talk) 20:19, 10 May 2016 (UTC)

Found a list online and have updated the main text accordingly. Poslfit (talk) 20:53, 10 May 2016 (UTC)

I changed "Hyphenation is also confusing as it often leaves two partial non-words" with "Hyphenation is confusing in English because its spelling requires full-word recognition". In many (if not most) languages two partial non-words can be easily read. The hyphenation problem is probably unique to English. 108.162.221.13 13:06, 5 May 2016 (UTC)

In most languages, the cases where the hyphenation will be confusing will be rare. In English, the cases where the hyphenation will NOT be confusing will be rare. -- Hkmaly (talk) 17:52, 5 May 2016 (UTC)
On the contrary, it will generally result in non-words (and hence difficulty reading) regardless of which language you're writing in. Unless maybe you're dealing with logographs, e.g. in written Chinese languages. Flipping Mackerel (talk) 03:32, 6 May 2016 (UTC)

For hyphenation, would it make sense to also talk about the case where it creates new words which can be offensive? E.g. therapist -> the-rapist. 108.162.228.137 22:37, 9 May 2016 (UTC)

Letter Spacing in German

Hi there...

I guess the statement concerning letter spacing being not available in German isn't (wasn't ever) entirely accurate.

Letter spacing has become obsolete since the demise of black letter type and is nowadays merely used to emphasise surnames or city names in administrative paperwork. But even in the old days of German black letter usage, letter spacing was also used to achieve justification. If something was to be emphasised in such a line, the spaces would've been even larger, maintaining a certain ratio between regular letter spaces and emphasised letter spaces.

However, since letter spacing is as uncommon in German typing as black letters are, it may be used for justification without any concern. In order to emphasise certain words, italic, bold or underlined text is the means of choice.

Personally, I prefer letter spacing and hyphenation combined, although snakes seem to be the real deal! 162.158.85.141 14:29, 29 July 2016 (UTC)