A Tentative Typology of AI-Foom Scenarios
“If a foom-like explosion can quickly make a once-small system more powerful than the rest of the world put together, the rest of the world might not be able to use law, competition, social norms, or politics to keep it in check.” — Robin Hanson
As Robin Hanson recently discussed, there is a lack of clarity about what an “AI Foom” looks like, or how likely it is. He says “In a prototypical “foom,” or local intelligence explosion, a single AI system…” and proceeds to describe one possibility. I’d like to explore a few more, and to discuss in a bit more detail what qualifies as a “foom.” This is not intended as a full exploration, or as a prediction; it merely captures my current thinking.
First, it’s worth briefly stating the assumptions made here:
- Near Term — Human-level AI is possible in the near term, say within 30 years.
- No Competitive Apocalypse — A single system will be created first, and other groups will not have resources sufficient to quickly build another system with similar capabilities.
- Unsafe AI — The AI launched will not have a well-bounded and safe utility function, and will find something to maximize other than what humanity would like.
These assumptions are not certainties, and they are not the subject of this discussion; I will simply condition the rest of the discussion on them. Debating them is reasonable, but should happen elsewhere.
What’s (enough for) a “foom”?
With preliminaries out of the way, what would qualify as a “foom,” an adaptation or change that makes the system “more powerful than the rest of the world put together”?
Non-Foom AI X-Risk
There are a few scenarios that lead more directly to existential risk, without passing through a stage of gathering power. (Beyond listing them, I will not discuss these here. Also, the names given to scenarios do not imply anything about the beliefs of the people they are named after.)
a) Accidental Paperclipping — The goals specified allow the AI system to do something destructive that is irreversible or goes unnoticed. The AI is not sufficiently risk-aware or intelligent to avoid doing so.
b) Purposeful Paperclipping — The goals specified allow the AI system to achieve them, or to attempt to do so, by taking some destructive action directly, one that is irreversible or not easily noticed in time.
c) Yudkowskian Simplicity-foom — There are relatively simple methods of vastly reducing the complexity of the systems the AI needs to deal with, allowing the system to better pursue its goals. At near-human or human intelligence levels, one or more of those methods becomes feasible. (These might include designing viruses, nano-assemblers, or other systems that could wipe out humanity.)
Fooms
There are a few possibilities I would consider for how an AI could become immensely powerful:
a) Yudkowskian Intelligence-foom — The AI is sophisticated enough to make further improvements to itself, and quickly moves from human-level intelligence to super-Einstein levels and beyond. It can then make advances in physics, chemistry, biology, etc. that make it capable of arbitrarily dangerous behaviors.
b) Hansonian Em-foom — The AI can make small, efficient copies of or variations on itself rapidly and cheaply, and is unboxed (or unboxes itself). These human-level AIs can run on little enough hardware, or run enough faster than humans, that they can rapidly amass resources and hack, exploit, or buy their way to direct control of financial and then physical resources.
c) Machiavellian Intelligence-foom — The AI surreptitiously manipulates political systems, and amasses power directly or indirectly by manipulating individual humans. (Perhaps the AI gains resources and control via blackmail of specific individuals, who are unaware on whose behalf they operate.) The resulting control can prevent coordinated action against the AI, and allow it to gather resources to achieve its unstated, nefarious true goal.
d) Asimovian Psychohistory-foom — The AI can build predictive models of human reactions good enough to manipulate humanity over the medium and long term. (This differs from a Machiavellian-foom only in that it relies on models of humans and predictive power rather than humanlike manipulation.)
This is almost certainly not a complete or comprehensive list, and I would be grateful for additional suggestions. What it does allow is a discussion of what makes the various types of foom likely, and consideration of which might be pursued.
AI Complexity and Intelligence Range
The first critical question in distinguishing these scenarios is the complexity of intelligence. I won’t try to estimate this, but others are researching and discussing it. Here, complexity refers to something akin to computational complexity: the difficulty of running an artificial intelligence of a given capacity. If emulating a small mammal’s brain is possible, but increasing an AI’s intelligence from there to human level requires an exponential increase in complexity and computing speed, we will say intelligence is very complex; if it requires only a doubling, it is not. (I assume computational complexity matters here, and that there are no breakthroughs in hardware, quantum computing, or computational complexity theory.)
The related question is the range of intelligence. If beyond-human-level AI is not possible given the techniques used to achieve human-level intelligence, or requires an exponential or even a large polynomial increase in computing power, we will consider the range small; even if intelligence is not bounded, there are near-term limits. Moore’s law (if it continues) implies that the speed of AI thought will increase, but not quickly. Alternatively, if the techniques used to achieve human-level AI can be extended easily to create even more intelligent systems by adding hardware, the range is large. This gives us a simplified set of possibilities.
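As a purely illustrative sketch (and not a prediction), the toy Python model below shows how the shape of the assumed scaling curve determines whether a large range above human level is within reach. Every quantity in it (the compute budget, the notion of discrete “levels” of intelligence, and the two growth rates) is a hypothetical assumption, chosen only to make the small-range/large-range distinction concrete.

```python
# A toy, purely illustrative model of the complexity/range distinction.
# All numbers are made up; they only show how the shape of the cost curve
# determines whether a "large range" above human level is reachable.

HUMAN_COST = 1.0    # hypothetical compute needed for human-level AI (arbitrary units)
BUDGET = 1_000.0    # hypothetical compute available in the near term

def cost_low_complexity(levels_above_human: int) -> float:
    """Low-complexity regime: each extra 'level' of intelligence only doubles the cost."""
    return HUMAN_COST * (2 ** levels_above_human)

def cost_high_complexity(levels_above_human: int) -> float:
    """High-complexity regime: each extra 'level' multiplies the cost a thousandfold."""
    return HUMAN_COST * (1_000 ** levels_above_human)

def reachable_levels(cost_fn, budget: float) -> int:
    """How many levels above human level fit within the given compute budget?"""
    levels = 0
    while cost_fn(levels + 1) <= budget:
        levels += 1
    return levels

print("low-complexity regime: ", reachable_levels(cost_low_complexity, BUDGET), "levels above human")
print("high-complexity regime:", reachable_levels(cost_high_complexity, BUDGET), "levels above human")
# With these made-up numbers, the first curve allows 9 levels (a large range),
# while the second allows only 1 (a small range), before any Moore's-law growth.
```

The particular numbers do not matter; the qualitative gap between the two regimes is driven entirely by the assumed shape of the scaling curve.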
Intelligence vs. Range — Cases
Low-Complexity Intelligence within Large Range — If humans are, as Eliezer Yudkowsky has argued, relatively clustered on the scale of intelligence, the difficulty of designing significantly more intelligent reasoning systems may be within, or not far beyond, human capability. A rapid increase in the intelligence of AI systems above human levels would then be a critical threshold, and an existential risk.
Low-Complexity Intelligence within Small Range — If human minds are near a peak of intelligence, near-human or human-level Hansonian Ems may still be possible to instantiate on relatively little hardware, and their relative lack of complexity makes them a potential existential risk.
High-Complexity Intelligence within Small Range — Relatively little existential risk from AI seems to exist, and instead a transition to an “Age of Em” scenario seems likely.
High-Complexity Intelligence within Large Range — A threshold or foom is unlikely, but incremental AI improvements may still pose existential risks. If a single superintelligent AI is developed, other groups are likely to follow. A singularity in which many systems with superhuman intelligence are built may be plausible, posing different types of existential and other risks.
Human Complexity and Manipulability
The second critical question is human psychology. If human minds can be manipulated more easily by moderately complex AIs than by other humans (which is already significant), AIs might not need to “foom” in the Yudkowskian sense at all. Instead, the exponential increase in AI power and resources could happen via manipulation at an individual level or at a group level. Humans, individually or en masse, might be convinced that the AI should be given this power.
Even if perfect manipulation is impossible, classic blackmail or other typical counterintelligence-style attacks may be possible, allowing a malevolent system to manipulate humans. Alternatively, if human-level cognition can be achieved with far fewer resources than a human mind requires, Hansonian Em-fooms are possible, but so is predictive modeling of individual human minds by a manipulative system.
Alternatively, highly predictive models that approximate aggregate human behavior may be possible, much like Asimov’s postulated psychohistory. This seems unlikely to be as rapid a threat, but AIs built for intelligence work, marketing, and other domains may specifically target this ability. If human psychology can be understood more easily than expected, these systems may succeed beyond current expectations, and the AI may be able to manipulate humans en masse, without controlling individuals. This parallels an unresolved debate in history about the relative importance of individuals (à la the “Great Man Theory”) versus societal trends.
Conclusion
We don’t know when human-level AI will arrive, or what form it will take. The appropriate focus for AI safety may depend on which type of AI-foom we are concerned with, and a better characterization of these uncertainties could be useful for addressing the existential risks of AI deployment.
All of this is speculation, and despite my certain-sounding claims above, I am interested in reactions or debate.