A Tentative Typology of AI-Foom Scenarios

“If a foom-like explosion can quickly make a once-small system more powerful than the rest of the world put together, the rest of the world might not be able to use law, competition, social norms, or politics to keep it in check.” — Robin Hanson

As Robin Hanson recently discussed, there is a lack of clarity about what an “AI Foom” looks like, or how likely it is. He says, “In a prototypical ‘foom,’ or local intelligence explosion, a single AI system…” and proceeds to describe one possibility. I’d like to explore a few more, and to say a bit more about what qualifies as a “foom.” This is not intended as a full exploration, or as a prediction; it merely captures my current thinking.

First, it’s appropriate to briefly mention the assumptions made here:

  • Near Term — Human-level AI is possible in the near term, say within 30 years.
  • No Competitive Apocalypse — A single system will be created first, and other groups will not have resources sufficient to quickly build another system with similar capabilities.
  • Unsafe AI — The AI launched will not have a well-bounded and safe utility function, and will find something to maximize other than what humanity would like.

These assumptions are not certainties, and they are not themselves under discussion here. I will condition the rest of the discussion on them; debating them is reasonable, but belongs elsewhere.

What’s (enough for) a “foom”?

Non-Foom AI X-Risk

a) Accidental Paperclipping — The goals specified allow the AI system to do something destructive that is irreversible or goes unnoticed. The AI is not sufficiently risk-aware or intelligent to avoid doing so.

b) Purposeful Paperclipping — The goals specified allow the AI system to achieve them, or attempt to do so, by doing something destructive that the AI can do directly and that is irreversible or not easily noticed in time.

c) Yudkowskian Simplicity-foom — There are relatively simple methods of vastly reducing the complexity of the systems the AI needs to deal with, allowing the system to better achieve its goals. At near-human or human intelligence levels, one or more of those methods becomes feasible. (These might include designing viruses, nano-assemblers, or other systems that could wipe out humanity.)

Fooms

a) Yudkowskian Intelligence-foom — The AI is sophisticated enough to make further improvements on itself, and quickly moves from human-level intelligence to super-Einstein levels, and beyond. It can now make advances in physics, chemistry, biology, etc. that make it capable of arbitrarily dangerous behaviors.

b) Hansonian-Em foom — The AI can make small, efficient copies of, or variations on, itself rapidly and cheaply, and is unboxed (or unboxes itself). These human-level AIs can run on little enough hardware, or run enough faster than humans, that they can rapidly hack, exploit, or buy resources that allow them to quickly gain direct control of financial and then physical resources.

c) Machiavellian Intelligence-foom — The AI can manipulate political systems surreptitiously, and amasses power directly or indirectly via manipulating individual humans. (Perhaps the AI gains resources and control via blackmail of specific individuals, who are unaware on whose behalf they operate.) The resulting control can prevent coordinated action against the AI, and allow it to gather resources to achieve its unstated nefarious true goal.

d) Asimovian Psychohistory-foom — The AI can build predictive models of human reactions well enough to manipulate humans over the medium and long term. (This differs from a Machiavellian-foom only in that it relies on models of humans and predictive power rather than humanlike manipulation.)

This is almost certainly not a complete or comprehensive list, and I would be grateful for additional suggestions. What it does allow is a discussion of what makes various types of fooms likely, and a consideration of which might be pursued.

AI Complexity and Intelligence Range

The related question is the range of intelligence. If beyond-human-level AI is not possible given the techniques used to achieve human-level intelligence, or requires an exponential or even a large polynomial increase in computing power, we will consider the range small: even if it is not bounded, there are near-term limits. Moore’s law (if it continues) implies that the speed of AI thought will increase, but not quickly. Alternatively, if the techniques used to achieve human-level AI can be extended easily to create even more intelligent systems by adding hardware, the range is large. This gives us a simplified set of possibilities.
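
To make the small-versus-large distinction concrete, here is a rough sketch of how reachable capability might scale under different hardware-cost assumptions. Everything in it is an illustrative assumption of mine rather than a claim about real systems: the two-year doubling time, the three cost models, the polynomial degree of 5, and the function names `compute_available` and `capability` are all hypothetical.

```python
import math


def compute_available(years, doubling_years=2.0):
    """Relative compute after `years`, assuming a fixed doubling time.

    The two-year doubling time is an assumed stand-in for Moore's law.
    """
    return 2.0 ** (years / doubling_years)


def capability(compute, cost_model, degree=5):
    """Capability reachable with `compute`; 1.0 is the human-level baseline.

    Each (hypothetical) cost model says how much compute a capability c needs:
      linear:      compute ~ c              (just add hardware: large range)
      polynomial:  compute ~ c ** degree    (large polynomial cost)
      exponential: compute ~ 2 ** (c - 1)   (exponential cost: small range)
    """
    if cost_model == "linear":
        return compute
    if cost_model == "polynomial":
        return compute ** (1.0 / degree)
    if cost_model == "exponential":
        return 1.0 + math.log2(compute)
    raise ValueError(f"unknown cost model: {cost_model}")


if __name__ == "__main__":
    for years in (0, 10, 20, 30):
        c = compute_available(years)
        row = {m: round(capability(c, m), 2)
               for m in ("linear", "polynomial", "exponential")}
        print(years, row)
```

Under these assumptions, hardware growth alone multiplies capability enormously in the linear case, while in the exponential-cost case each doubling of compute adds only a fixed increment. That is the sense in which the range can be unbounded and still small in the near term.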

Intelligence vs. Range — Cases

Low-Complexity Intelligence within Large Range — If humans are, as Eliezer Yudkowsky has argued, relatively clustered on the scale of intelligence, the difficulty of designing significantly more intelligent reasoning systems may be within, or not far beyond, human capability. A rapid increase in the intelligence of AI systems above human levels would cross a critical threshold, and would be an existential risk.

Low-Complexity Intelligence within Small Range — If human minds are near a peak of intelligence, near-human or human-level Hansonian Ems may still be possible to instantiate in relatively little hardware, and their relative lack of complexity makes them a potential existential risk.

High-Complexity Intelligence within Small Range — Relatively little existential risk from AI seems to exist, and instead a transition to an “Age of Em” scenario seems likely.

High-Complexity Intelligence within Large Range — A threshold or Foom is unlikely, but incremental AI improvements may still pose existential risks. When a single superintelligent AI is developed, other groups are likely to follow. A singularity may be plausible, where many systems are built with superhuman intelligence, posing different types of existential or other risks.

Human Complexity and Manipulability

Even if perfect manipulation is impossible, classical blackmail or other typical counterintelligence-type attacks may be possible, allowing a malevolent system to manipulate humans. Alternatively, if human-level cognition can be achieved with far fewer resources than a human mind requires, Hansonian-fooms are possible, but so is predictive modeling of individual human minds by a manipulative system.

Alternatively, highly predictive models that approximate human behavior may be possible, much like Asimov’s postulated psychohistory. This seems unlikely to be as rapid a threat, but AIs in intelligence, marketing, and other domains may specifically target this ability. If human psychology can be understood more easily than expected, these systems may succeed beyond current expectations, and the AI may be able to manipulate humans en masse, without controlling individuals. This is similar to an unresolved debate in history about the relative importance of individuals (à la “Great Man Theory”) versus societal trends.

Conclusion

All of this is speculation, and despite my certain-sounding claims above, I am interested in reactions or debate.

Why not try for Aumann Consensus instead?
