1306: Sigil Cycle

Revision as of 22:48, 20 October 2023 by 162.158.186.70 (talk) (Transcript: g+)
Title text: The cycle seems to be 'we need these symbols to clarify what types of things we're referring to!' followed by 'wait, it turns out words already do that.'

Explanation

In computer programming, a variable is a way of storing information temporarily, for use later in the program. There are different types of variables, called data types, such as integers, strings, characters, and booleans, all of them holding different types of information. Integers hold whole numbers, strings hold text, and so on. Variables traditionally have names that identify their purpose, and a programmer should usually be able to infer from this variable name what type of variable it is. For example, if you want to store the name of the customer in a catalogue service, you might store the text in a string variable called "NameOfCustomer". Because it is fairly clear that names are made up of text, it is logical that this variable would be a string variable - if you didn't have any other information about it.
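As a minimal sketch in Python (the variable names and values below are made up for illustration and do not come from the comic), a descriptive name often lets a reader guess the data type without any extra marker:

```python
# Hypothetical catalogue data; names and values are illustrative only.
name_of_customer = "Ada Lovelace"   # a name is text, so a string
items_in_basket = 3                 # a count is a whole number, so an integer
is_repeat_customer = True           # a yes/no fact, so a boolean

# Ask the runtime what type each variable actually holds.
print(type(name_of_customer).__name__)
print(type(items_in_basket).__name__)
print(type(is_repeat_customer).__name__)
```

Here the types reported by the runtime match what the names suggest, which is the point the paragraph above makes about "NameOfCustomer".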

A sigil in computer programming is a symbol attached to a variable name, usually at the start. It is an alternative method of telling someone who is reading the program code what data type the variable is. Rather than relying on logic, then, to know that NameOfCustomer is a string, you might use a sigil "$" before the variable name, as in $NameOfCustomer, which would specify that the variable can hold text. Sigils can also specify the scope of a variable, which refers to where the variable can be used in a program, and which parts of the program can access that variable. Sigils are useful in some ways because you don't have to refer to previous program code or find where the variable is declared (created) to know what data type it is. They also provide some level of typing in languages that do not explicitly declare the type of the variable.
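The idea can be sketched in Python, which has no sigils of its own. The helper name split_sigil and the one-entry lookup table are assumptions made for this example; the table follows the "$ means text" convention from the paragraph above rather than the rules of any particular language:

```python
def split_sigil(identifier):
    """Split an identifier into (sigil, name); the sigil is '' if there is none."""
    # Treat a leading character that is not a letter, digit or underscore as a sigil.
    if identifier and not (identifier[0].isalnum() or identifier[0] == "_"):
        return identifier[0], identifier[1:]
    return "", identifier

# Hypothetical convention taken from the example above: "$" marks text.
SIGIL_MEANING = {"$": "string"}

sigil, name = split_sigil("$NameOfCustomer")
print(sigil, name, SIGIL_MEANING.get(sigil, "unknown"))
```

A reader (or a tool) can recover the intended type from the first character alone, without looking up where the variable was declared.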

Programming languages differ in how they name and type variables, although some languages use the same variable types under different names. The following are the programming languages and services referenced in the comic, and how each of them uses leading (or trailing) symbols.

QBASIC
Variables of type string end with the $ symbol. Other symbols are used as well (% for integers, ! for single-precision, # for double-precision and, in some versions of BASIC, & for long integers). However, the usual QBASIC program uses only the $ symbol, because a variable with no symbol defaults to single-precision, which is fine for most numeric uses.
C++
Pronounced "see plus plus." Variables are just words with regular letters. It is the name of the language itself that includes symbols.
bash
Bash is typically thought of as a Unix shell rather than a full-featured programming language. However, the shell command syntax is rich enough to write simple (and sometimes really complex) programs called shell scripts. In this language, all variable dereferences start with the symbol $.
Perl
In Perl, the initial character provides the context of the variable. Scalars (text, numbers and also references to data) start with the $ character. An @ is for an array. With %, it is a hash (a loose non-sequential array, or 'dictionary' lookup). Functions can be given a preceding &, but rarely need this in straightforward use. You can use the variables $temp, @temp, %temp and &temp simultaneously and independently. There is also the * (not in a mathematical sense) which identifies a 'glob', a way to fuse or use all those types (and more!) in 'interesting' ways if you have a yen to.
A block, with {} surrounding some other suitable statement(s), can potentially be typed to (re)interpret the context within. If you have a $reference which currently points to an @array, @{$reference} will let you use it as a direct array. But in simple cases, like that, this can often be shortened to @$reference, as alluded to by the "@$PERL" of the comic. (Just as $$reference would be a valid way to dereference the $reference when it points to $scalar... or even to $anotherReference that itself points to a %hash, in which case you could even use %$$reference for 'direct' access to that. Perl can be complicated, if you let it!)
Python
Variables are just words with regular letters.
Google
Once upon a time, Google added a social network called "Google+" (pronounced "Google plus") to its many offerings. On this network, accounts were identified and "mentioned" (linked in a message, and sent a notification) with a + prefix. For example, Randall was "+Randall Munroe". Google+ has been defunct since 2019, but it was active and growing in 2013 when this comic was posted.
Twitter
Twitter account IDs are identified by the leading symbol @. When an account is "mentioned" in a tweet using @, it triggers smart behavior. For example, account owners can configure Twitter to forward tweets that mention them. This feature was not present in the early days of Twitter.
Hashtags
In 2007 Twitter users began a convention that a # sign (whose many names include the "hash") can be prepended to words to mark them as keywords. Twitter could then be searched for those words. In 2009 Twitter recognized the existence of hashtags and began hyperlinking them. Some other microblogging services followed suit. Google+ eventually added hashtag support as did Facebook.
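The entries above can be collected into a small lookup table. This is only an illustrative Python sketch: the dictionary layout and the function name sigil_meaning are assumptions, and the meanings are copied from the descriptions above rather than from any language specification:

```python
# Sigil meanings as described in the entries above (not a complete list).
SIGILS = {
    "QBASIC": {"$": "string", "%": "integer",
               "!": "single-precision", "#": "double-precision"},
    "bash": {"$": "variable dereference"},
    "Perl": {"$": "scalar", "@": "array", "%": "hash", "&": "function"},
    "Google+": {"+": "account mention"},
    "Twitter": {"@": "account mention", "#": "hashtag"},
}

def sigil_meaning(context, symbol):
    """Look up what a symbol means in a given language or service."""
    return SIGILS.get(context, {}).get(symbol, "no special meaning")

print(sigil_meaning("QBASIC", "%"))
print(sigil_meaning("Perl", "%"))
print(sigil_meaning("Python", "$"))
```

The same symbol can mean different things depending on context: % marks an integer in QBASIC but a hash in Perl, while C++ and Python have no entries at all because their variable names are plain words.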

As is noted by the comic, the use of sigils to indicate types of variables varies between programming languages, from strict enforcement in languages like Perl, to their complete absence in languages like C++ (but see Hungarian Notation). The comic notes that the use of sigils seems to be cyclic, especially if you count things like hashtags as extensions of the pattern.

The title text describes the two competing influences responsible for the cycle: The first impulse finds sigils useful to elucidate the type of the variable, especially when variable names are not very descriptive, while the latter impulse notes that descriptive variable names are much more useful for that purpose, especially in extensible languages where the built-in types form only a small part of the type system.

Transcript

A sinusoidal curve is shown.
Y axis: Odds that the words I type will start with some weird symbol
X axis: Time
Data labels: [at first peak] $QBASIC, [at first trough] C++, [at second peak] $BASH, @$PERL, [at second trough] PYTHON, [at third peak] +GOOGLE, @TWITTER, #HASHTAGS



Discussion

Shouldn't it be QBASIC$ (or QBASIC%), since in Basic the sigil is attached to the end of variable names? --173.245.53.108 13:19, 20 December 2013 (UTC)

Could not find where categories can be added, so here's a list of suitable categories: Charts, Computers, Comics presenting a compromise, Internet, Programming 173.245.53.180 13:32, 20 December 2013 (UTC)

This comic de-emphasizes the value of sigils. It's very ironic that Randall chose C++, a language with symbols, to exemplify plain words. And C is a reason for not naming technologies after letters. Same with X. You have to search for "C programming language" or "X window system." It's very helpful to distinguish things with unique sigils, especially in this current age where we depend on full-text search. Just look at my login ID, tbc. I have been tbc on the Internet since 1981. But I eventually had to go by tbc0 (e.g. on Twitter) because tbc isn't unique enough. Google was named after 10^100 (an incomprehensibly large number reflecting their ambition). But that number is spelled googol. They own their spelling. Brilliant. Consider examples: iMac, iPhone iPad, Yahoo (a little weak), Facebook (they own that word). It's all about branding. Google Kleenex or Xerox and you'll see that they're excellent sigils. The problem is, those terms have become generic. Their brand is a little weaker for it. Finally, on Twitter, @ and # unleash powerful features. — tbc (talk) 15:01, 20 December 2013 (UTC)

C++ uses symbols, but it doesn't use one to denote that an identifier is a variable (like PHP) or the type of an identifier (like early BASIC, Perl, and arguably Twitter). And when I search for X, it's either X11 (the protocol) or Xorg (the widely used server implementation). And Barney Google had it first. --Tepples (talk) 15:55, 20 December 2013 (UTC)

Any way we can expand on the history of programming (if applicable)? Did these languages become popular in a certain order, or were they developed as a response to one another? Or is this comic simply Randall's journey through programming, not specifically tied to the popularity (or development) of certain coding languages? -- 108.162.216.227

They pretty much appeared in the order listed. I don't think they represent Randall's experience (or really anything else); the differences in how they handle variable names/types is mostly a function of their different purpose, and Randall picked those specific examples simply to fit the timeline (e.g. sh and ksh have the same syntax as bash, but since they came before QBasic they would break the pattern). 108.162.236.13 (talk) (please sign your comments with ~~~~)

The Google mention isn't explained well enough imo. Instead of just saying "they have a service called google plus", it should be told how the + sign is used throughout the service, like every other instance in the article. I may do the edit myself, but it's not likely. 141.101.98.237 15:26, 20 December 2013 (UTC)


"Ironically, it is the name if the language itself that includes symbols."

It's not very ironic. Variable names don't include symbols, but commands do. This statement should be rewritten.

int c = 0;

c++;

c += 1;

c = c + 1; 173.245.52.215 (talk) (please sign your comments with ~~~~)

I find it ironic that "C++" in a statement would be interpreted as "C" and only post-incremented (i.e. only incremented when next referenced). Meaning "C++" is effectively the same as "C", in its own context. They should have named it "++C", if they wanted to indicate that it was itself improved upon the original value of C. ;) 141.101.99.229 16:37, 20 December 2013 (UTC)
This is an incorrect interpretation of the statement c++. c++ as a standalone statement, on a line by itself, will result in c being exactly one greater than before the statement (the value stored in that memory location will indeed be one greater); using prefix or postfix ++ in this context is functionally equivalent and most people just prefer using the postfix version. Where the distinction between the prefix and postfix versions comes into play is in more complex statements where the operator's return value is not ignored. For example,
int c = 1;
int x = c++;
x will be initialized to 1 because the postfix ++ operator returns the value of c before it was incremented, but the value stored in c will be 2 regardless of further reference. If, instead you initialized x using the prefix version, ++c, x would be 2 because the prefix version of ++ returns the incremented result. (Side note: it's often considered bad practice to rely on the return value of the increment and decrement operators.) 108.162.219.227 20:58, 20 December 2013 (UTC)
When not specifically using the post or pre incrementing nature of c++/++c, and just using it as shorthand for c = c + 1, then ++c is demonstrably superior to c++ as there are 2 fewer machine code operations involved 141.101.99.41 (talk) (please sign your comments with ~~~~)
No, I stand by what I say. I actually agree with your code, but freely parsing "I will use C++ for this project", as a phrase (at least the first time you utter it) might so easily be a statement that gives a direct result equal to "I will use C for this project". (It helps to have just the right geeky sense of humour, of course.) 141.101.99.229 21:56, 20 December 2013 (UTC)
Oh, I assure you, I am quite geeky. I could, for instance, argue that you're mixing the grammars of English and C++, a natural language and context sensitive language. 108.162.219.227 22:21, 20 December 2013 (UTC)
Personally, I see no problem. When you start programming in C++, you are writing code which is effectively C. Only when you program in C++ longer time, the code will improve. -- Hkmaly (talk) 12:13, 21 December 2013 (UTC)
Wrong, as "I will use C++" actually does mean "I will use C++", because the moment you finished uttering it (command break), C indeed becomes one point greater ;) 108.162.222.43 06:29, 24 December 2013 (UTC)
Regarding the name of the language, Bjarne Stroustrup himself has said, "Connoisseurs of C semantics find C++ inferior to ++C." Elsbree (talk) 07:03, 26 December 2013 (UTC)
It's still not ironic that the name includes symbols. I removed the word 'ironically', it doesn't make sense. 173.245.52.215 (talk) (please sign your comments with ~~~~)

Extending the first comment above: Since the strip is known for being rather technically strict, it's odd that it says "word ... will START with", yet QBASIC variables END with symbols, and Google+ ENDS with a symbol.108.162.216.216 18:11, 20 December 2013 (UTC)

That's not a problem with Google, because the sigil comes at the beginning there. But it's a problem with QBASIC, all right. —TobyBartels (talk) 05:01, 21 December 2013 (UTC)

Although C++ doesn't force you to use sigils, by convention programmers would still use sigils. Conventionally, variable names were named nCount, or fCost. The first character in the variable name indicated the data type. This convention was extended by Visual C++, and it started naming interfaces starting with I. Eventually, this convention fell by the wayside because IDEs started getting smarter and you would get code complete and some sort of information via a tooltip that eliminated the need for the Sigil --173.245.56.24 18:16, 20 December 2013 (UTC)

Hungarian Notation (and similar schemes) aren't "sigils" (according to wiktionary, a sigil in this context is non-alphanumeric, and the comic would seem to imply this also). --108.162.219.186 22:45, 20 December 2013 (UTC).

I think this explanation could do with some better explanation of the programming concepts it describes. Not every xkcd reader will be familiar with programming languages. --Mynotoar (talk) 21:20, 20 December 2013 (UTC)

I've expanded the introduction for now to more fully explain programming languages and variables - it wasn't very clear to non-programmers - but I think the rest could use some work too. --Mynotoar (talk) 18:29, 21 December 2013 (UTC)

If "C++" "started" with a symbol, then I would agree that it is ironic that it appears in the graph in the position that it does. Since it does not, however, I must dispute your use of the word "ironic". 108.162.238.117 03:14, 21 December 2013 (UTC)

How could 'see plus plus' be pronounced any other way? 141.101.98.239 11:15, 23 December 2013 (UTC)

'see add add'? --Mynotoar (talk) 22:33, 26 December 2013 (UTC)
'sea-cross squared'? 173.245.52.215 (talk) (please sign your comments with ~~~~)
Dispute about the explanation

The explanation is very misleading. Why on earth does the explanation begin with a big chunk of talk about variables? The comic strip is entirely about probability that a word you encounter will begin with some sigil. Therefore, the explanation should be about WHY the chart is plotted the way it is -- why does QBASIC have such a high probability, and why C++ does not. Everything else will just confuse anyone who comes to this page.--108.162.231.238 15:36, 8 January 2014 (UTC)

I did move this discussion to the bottom where it belongs to; new statements should not be posted at the top. And back to your comment: Sigil (computer programming) is very well explained at the beginning, read the Wiki article. --Dgbrt (talk) 21:21, 8 January 2014 (UTC)
Technically, the strip is about the probability that a word you type will begin with some sigil. Since there's a chance you'll be programming as you type, there's sense in explaining the programming context. I'll make it clearer by laying out my unbiased explanation of the strip: Early on, Randall programmed in QBASIC, so the words he typed then had a higher chance of containing sigils. Later on, he programmed in C++, so the chances decreased (in my opinion, they did not reach zero due to directives). Later on, he programmed in Perl and wrote Bash scripts, so the chances increased. Later on, he programmed in Python, so the chances decreased again. Later on, he used Google+, Twitter and hashtags in general, so the chances increased again. 108.162.219.125 03:10, 9 February 2015 (UTC)