Will Superintelligent Machines Destroy Humanity?
In a thoughtful new book, a philosopher ponders the potential pitfalls of artificial intelligence.
Superintelligence: Paths, Dangers, Strategies, by Nick Bostrom, Oxford University Press, 324 pages, $29.95
In Frank Herbert's Dune books, humanity has long banned the creation of "thinking machines." Ten thousand years earlier, people destroyed all such computers in a movement called the Butlerian Jihad, because they felt the machines controlled them. Human computers called Mentats serve as a substitute for the outlawed technology. The penalty for violating the Orange Catholic Bible's commandment "Thou shalt not make a machine in the likeness of a human mind" is immediate death.
Should humanity sanction the creation of intelligent machines? That's the pressing issue at the heart of the Oxford philosopher Nick Bostrom's fascinating new book, Superintelligence. Bostrom cogently argues that the prospect of superintelligent machines is "the most important and most daunting challenge humanity has ever faced." If we fail to meet this challenge, he concludes, malevolent or indifferent artificial intelligence (AI) will likely destroy us all.
Since the invention of the electronic computer in the mid-20th century, theorists have speculated about how to make a machine as intelligent as a human being. In 1950, for example, the computing pioneer Alan Turing suggested creating a machine simulating a child's mind that could be educated to adult-level intelligence. In 1965, the mathematician I.J. Good observed that technology arises from the application of intelligence. When intelligence applies technology to improving intelligence, he argued, the result would be a positive feedback loop—an intelligence explosion—in which self-improving intelligence bootstraps its way to superintelligence. He concluded that "the first ultraintelligent machine is the last invention that man need ever make, provided that the machine is docile enough to tell us how to keep it under control." How to maintain that control is the issue Bostrom tackles.
About 10 percent of AI researchers believe the first machine with human-level intelligence will arrive in the next 10 years. Fifty percent think it will be developed by the middle of this century, and nearly all think it will be accomplished by century's end. Since the new AI will likely have the ability to improve its own algorithms, the explosion to superintelligence could then happen in days, hours, or even seconds. The resulting entity, Bostrom asserts, will be "smart in the sense that an average human being is smart compared with a beetle or a worm." At computer processing speeds a million-fold faster than human brains, Machine Intelligence Research Institute maven Eliezer Yudkowsky notes, an AI could do a year's worth of thinking every 31 seconds.
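The 31-second figure follows from simple arithmetic. Here is a minimal sketch, assuming a Julian year of 365.25 days and a flat million-fold speedup; the function name is illustrative and does not come from the book or from Yudkowsky.

```python
# Assumption: one year is a Julian year of 365.25 days.
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # 31,557,600 seconds


def wall_clock_seconds(subjective_years: float, speedup: float) -> float:
    """Wall-clock seconds needed for `subjective_years` of human-speed
    thinking by a mind that runs `speedup` times faster than a human."""
    return subjective_years * SECONDS_PER_YEAR / speedup


# One subjective year at a million-fold speedup.
print(round(wall_clock_seconds(1, 1_000_000), 1))  # 31.6
```

Rounded down, that is the "every 31 seconds" cited above. A different assumed speedup scales the result proportionally.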
Bostrom charts various pathways toward achieving superintelligence. Two, discussed briefly, involve the enhancement of human intelligence. In one, stem cells derived from embryos are turned into sperm and eggs, which are combined again to produce successive generations of embryos, and so forth, with the idea of eventually generating people with an average IQ of around 300. The other approach involves brain/computer interfaces in which human intelligence is augmented by machine intelligence. Bostrom more or less dismisses both the eugenic and cyborgization pathways as being too clunky and too limited, although he acknowledges that making people smarter either way could help to speed up the process of developing true superintelligence in machines.
Bostrom's dismissal of cyborgization may be too hasty. He is right that the crude interfaces currently used to treat such illnesses as Parkinson's disease pose considerable medical risks, but that might not always be so. He also argues that even if the interfaces could be made safe and reliable, the limitations on the processing power of natural brains would still preclude the development of superintelligence. Perhaps not. Later in this century, it may be possible to inject nanobots that directly connect brains to massive amounts of computer power. In such a scenario, most of the intellectual processing would be done by machines while the connected brains become the values and goal center guiding the cyborg.
In any case, for Bostrom there are two main pathways to superintelligence: whole brain emulation and machine AI.
Whole brain emulation involves deconstructing an actual human brain down to the synaptic level and then digitally instantiating its entire three-dimensional neuronal network, with its trillions of connections, in a computer. The aim is to make a digital reproduction of the original intellect, with memory and personality intact. As an aside, Bostrom explores a dystopian possibility in which billions of copies of enslaved virtual brain emulations compete economically with human beings living in the physical meatspace world. The results make Malthus look like an optimist. Bostrom more extensively explores another pathway, in which an emulation is uploaded into a sufficiently powerful computer such that the new digital intellect embarks on a process of recursively bootstrapping its way to superintelligence.
In the other pathway, researchers combine advances in software and hardware to directly create a superintelligent machine. One proposal is to create a "seed AI," somewhat like Turing's child machine, which would understand its own workings well enough to improve its algorithms and computational structures, enabling it to enhance its cognition to achieve superintelligence. A superintelligent AI would be able to solve scientific mysteries, abate scarcity by generating a bio-nano-infotech cornucopia, inaugurate cheap space exploration, and even end aging and death. But while it could do all that, Bostrom fears it will much more likely regard us as nuisances that must be swept away as it implements its values and achieves its own goals. And even if it doesn't target us directly, it could simply make the Earth uninhabitable as it pursues its ends, for example by tiling the planet over with solar panels or nuclear power plants.
Bostrom argues that it is important to figure out how to control an AI before turning it on, because it will resist attempts to change its final goals once it begins operating. In that case, we'll get only one chance to give the AI the right values and aims. Broadly speaking, Bostrom looks at two ways developers might try to protect humanity from a malevolent superintelligence: capability control methods and motivation selection.
An example of the first approach would be to try to confine the AI to a "box" from which it has no direct access to the outside world. Its handlers would then treat it as an oracle, posing questions to it such as how we might exceed the speed of light or cure cancer. But Bostrom thinks the AI would eventually get out of the box, noting that "Human beings are not secure systems, especially when pitched against a super intelligent schemer and persuader."
Alternatively, developers might try to specify the AI's goals before it is switched on, or set up a system whereby it discovers an appropriate set of values. Similarly, a superintelligence that began as an emulated brain would presumably have the values and goals of the original intellect. (Choose wisely which brains to disassemble and reconstitute digitally.) As Bostrom notes, trying to specify a final goal in advance could go badly wrong. For example, if the developers instill the value that the AI is supposed to maximize human pleasure, the machine might optimize this objective by creating vats filled with trillions of human dopamine circuits continually dosed with bliss-inducing chemicals.
Rather than directly specifying a final goal, Bostrom suggests that developers might instead instruct the new AI to "achieve that which we would have wished the AI to achieve if we had thought long and hard about it." This is a rudimentary version of Yudkowsky's idea of coherent extrapolated volition, in which a seed AI is given the goal of trying to figure out what humanity, considered as a whole, would really want it to do. Bostrom thinks something like this might be what we need to prod a superintelligent AI into ushering in a human-friendly utopia.
In the meantime, Bostrom thinks it safer if research on implementing superintelligent AI advances slowly. "Superintelligence is a challenge for which we are not ready now and will not be ready for a long time," he asserts. He is especially worried that people will ignore the existential risks of superintelligent AI and favor its fast development in the hope that they will benefit from the cornucopian economy and indefinite lifespans that could follow an intelligence explosion. He argues for establishing a worldwide AI research collaboration to prevent a frontrunner nation or group from trying to rush ahead of its rivals. And he urges researchers and their backers to commit to the common good principle: "Superintelligence should be developed only for the benefit of all humanity and in the service of widely shared ethical ideals." A nice sentiment, but given current international and commercial rivalries, the universal adoption of this principle seems unlikely.
In the Dune series, humanity was able to overthrow the oppressive thinking machines. But Bostrom is most likely right that once a superintelligent AI is conjured into existence, it will be impossible for us to turn it off or change its goals. He makes a strong case that working to ensure the survival of humanity after the coming intelligence explosion is, as he writes, "the essential task of our age."