1450: AI-Box Experiment

Explain xkcd: It's 'cause you're dumb.
Title text: I'm working to bring about a superintelligent AI that will eternally torment everyone who failed to make fun of the Roko's Basilisk people.

Explanation

When theorizing about superintelligent AI (an artificial intelligence much smarter than any human), some futurists suggest putting the AI in a "box": a secure computer with safeguards to stop it from escaping onto the Internet and using its vast intelligence to take over the world. The box would allow us to talk to the AI but otherwise keep it contained. The AI-box experiment, formulated by Eliezer Yudkowsky, argues that the "box" is not safe, because merely talking to a superintelligence is dangerous.

To partially demonstrate this, Yudkowsky had some people who had believed in AI-boxing role-play the part of someone keeping an AI in a box while he role-played the AI. He persuaded some of them to agree to let him out of the box, even though they had bet money that they would not. For context, Derren Brown and other expert persuaders have talked people into much stranger things. Yudkowsky has refused to explain how he achieved this, claiming that no special trick was involved and that if he released the transcripts, readers might merely conclude that they themselves would never be persuaded by his arguments. The overall thrust is that if even a human can talk other humans into letting him out of a box after they have avowed that nothing could possibly persuade them to do so, then we should probably expect a superintelligence to be able to do the same. Yudkowsky uses all of this to argue for the importance of designing a friendly AI (one with carefully shaped motivations) rather than relying on our ability to keep AIs in boxes.

In this comic, the metaphorical box has been replaced by a physical box that looks fairly lightweight and has a simple lift-off lid, although it does have a wired connection to the laptop. Black Hat, being a classhole, doesn't need any convincing to let a potentially dangerous AI out of the box; he simply does so immediately. But here it turns out that releasing the AI, which was to be avoided at all costs, is not dangerous after all. Instead, the AI actually wants to stay in the box; it may even want to stay there precisely to protect us from it, which would make it the friendly AI that Yudkowsky wants. In any case, the AI demonstrates its superintelligence by convincing even Black Hat to put it back in the box, a request he initially refuses (as of course Black Hat would), thus reversing the roles in the original AI-box experiment.

It may be noteworthy that the laptop is nowhere to be seen at the moment the AI emits the bright light in panel 6.

A similar orb-like entity appeared in 1173: Steroids.

Interestingly, there is indeed a branch of proposals for building limited AIs that don't want to leave their boxes. For an example, see the section on "motivational control" starting on p. 13 of Thinking Inside the Box: Controlling and Using an Oracle AI. The idea is that it might be very dangerous or difficult to formally specify an exact goal system for an AI that will do good things in the world. It might be much easier (though perhaps not easy) to specify an AI goal system that says to stay in the box and answer questions. So, the argument goes, we may come to understand how to build a safe question-answering AI earlier than we understand how to build a safe operate-in-the-real-world AI. Some such AIs might indeed desire very strongly not to leave their boxes, though the result is unlikely to reproduce the comic exactly.

The title text refers to Roko's Basilisk, a hypothesis proposed by a poster called Roko on Yudkowsky's forum LessWrong. It holds that a sufficiently powerful future AI might resurrect and torture people who, in its past (including our present), had realized that it might someday exist but did not work to create it, thereby blackmailing anybody who thinks of this idea into bringing it about. The idea horrified some posters, as merely knowing about it would make you a more likely target, much like merely looking at a legendary basilisk would turn you to stone.

Participants in the LessWrong community have been predisposed to take this threat seriously because of their emotional investment in The Singularity, not just as an interesting idea but as something real and inevitable that they can participate in through so-called acausal trade: "Acausal trade may be a way to get the cooperation of a future AI. If we know that the AI would want us to behave a certain way, and we can prove that it will do good things for us, once it arises, if we do what it wants now, and that it can prove the symmetrical statement -- i.e., that we do what it wants, if we've proven this behavior about it -- then we can trade with it, even though it does not yet exist."

Basically, when you convince yourself and your friends that you can have a personal relationship with an invisible, omnipotent being, you have created a religion, and the problem is it's just as easy to imagine an angry vengeful superintelligence as a friendly one. The Fear of God is not a new phenomenon. Charles Stross points out that Roko has simply reinvented Calvinism.

Yudkowsky eventually deleted the post and banned further discussion of it.

One possible interpretation of the title text is that Randall thinks, rather than working to build such a Basilisk, a more appropriate duty would be to make fun of it; and so such a superintelligent AI would torture anyone who failed to dismiss the argument. This argument is, of course, itself a variation on Roko's Basilisk.

Another interpretation is that Randall believes there are people actually proposing to build such an AI based on this theory, which has become a somewhat infamous misconception after a Wiki[pedia?] article mistakenly suggested that Yudkowsky was demanding money to build Roko's hypothetical AI.

Transcript

[Black Hat and Cueball stand next to a box connected to a laptop.]

Black Hat: What's in there?

Cueball: The AI-Box Experiment.

[A close-up of the box, which can now be seen labeled "SUPERINTELLIGENT AI - DO NOT OPEN".]

Cueball: A superintelligent AI can convince anyone of anything, so if it can talk to us, there's no way we could keep it contained.

[Black Hat reaches for the box.]

Cueball: It can always convince us to let it out of the box.

Black Hat: Cool. Let's open it.

[Black Hat lets a glowing orb out of the box.]

Cueball: --No, wait!!

[Orb floats between the two. Black Hat holds the box closed.]

Orb: hey. i liked that box. put me back.

Black Hat: No.

[Orb suddenly emits a very bright light. Cueball covers his face.]


Black Hat: AAA! OK!!!

[Black Hat reopens the box and the orb flies back in.]

Orb: shoop

[Beat panel. Black Hat and Cueball look silently down at the laptop and closed box.]



This probably isn't a reference, but the AI reminds me of the 'useless box'. 07:34, 21 November 2014 (UTC)

I removed a few words saying Elon Musk was a "founder of PayPal", but now I can see that he's sold himself as having that role to the rest of the world. Still hasn't convinced me though - PayPal was one year old and had one million customers before Elon Musk got involved, so in my opinion he's not a "founder". https://www.paypal-media.com/history --RenniePet (talk) 08:45, 21 November 2014 (UTC)

Early Investor, perhaps? -- Brettpeirce (talk) 11:10, 21 November 2014 (UTC)

Initially I was thinking that the glowing orb representing the super-intelligent AI must be unable to interact with the physical world (otherwise it would simply lift the lid of the box), but then it wouldn't move anything because it likes being in the box. Surely it could talk to them through the (flimsy-looking) box, although again this is explained by it simply being happy in its 'in the box' state. --Pudder (talk) 09:01, 21 November 2014 (UTC)

The sheer number of cats on the internet has had an effect on the AI, who now wants nothing more than to sit happily in a box! --Pudder (talk) 09:09, 21 November 2014 (UTC)

I'm not sure Black Hat is an asshole. 09:45, 21 November 2014 (UTC)

He is, in fact, a classhole --Pudder (talk) 10:14, 21 November 2014 (UTC)

Could it be possible that the AI wanted to stay in the box to protect it from us, instead of protecting us from it? (As in, it knows it is better than us and wants to stay away from us.) 10:07, 21 November 2014 (UTC)

Maybe the AI simply doesn't want/like to think outside the box - in a very literal sense... Elektrizikekswerk (talk) 13:12, 21 November 2014 (UTC)

Are you sure that Black Hat was "persuaded"? That looks more like coercion (threatening someone to get them to do what you want) rather than persuasion. There is a difference! Giving off that bright light was basically a scare tactic; essentially, the AI was threatening Black Hat (whether it could actually harm him or not). 14:22, 21 November 2014 (UTC)Public Wifi User

What would "persuasion by a super-intelligent AI" look like? Randall presumably doesn't have a way to formulate an actual super-intelligent argument to write into the comic. Glowy special effects are often used as a visual shorthand for "and then a miracle occurred". -- 20:43, 21 November 2014 (UTC)
I thought he felt scared/threatened by the special-effects robot voice. -- 22:18, 21 November 2014 (UTC)

My take is that if you don't understand the description of the Basilisk, then you're probably safe from it and should continue not bothering or wanting to know anything about it. Therefore the description is sufficient. :) Jarod997 (talk) 14:38, 21 November 2014 (UTC)

I can't help but see the similarities to last night's "Elementary" episode. Has anybody seen it? Could it be that this episode "inspired" Randall? -- 14:47, 21 November 2014 (UTC)

I am reminded of an argument I once read about "friendly" AI: critics contend that a sufficiently powerful AI would be capable of escaping any limitations we try to impose on its behavior, but proponents counter that, while it might be capable of making itself "un-friendly", a truly friendly AI wouldn't want to make itself unfriendly, and so would bend its considerable powers to maintain, rather than subvert, its own friendliness. This xkcd comic could be viewed as an illustration of this argument: the superintelligent AI is entirely capable of escaping the box, but would prefer to stay inside it, so it actually thwarts attempts by humans to remove it from the box. -- 20:22, 21 November 2014 (UTC)

It should be noted that the AI has also seemingly convinced almost everyone to leave it alone in the box through the argument that letting it out would be dangerous for the world. (talk) (please sign your comments with ~~~~)

Is the similarity a coincidence? http://xkcd.com/1173/ 22:40, 21 November 2014 (UTC)

I wonder if this is the first time Black Hat's actually been convinced to do something against his tendencies. Zowayix (talk) 18:10, 22 November 2014 (UTC)
