The case against the existential danger of an intelligence explosion

Wherein I attempt to assuage my terror at the possibility of a fast artificial general intelligence takeoff by grasping at what may be straws.
Posted on May 31, 2023
Tags: artificial intelligence, superintelligence, existential risk


In ‘Superintelligence’, Nick Bostrom presents compelling arguments that once a certain threshold of machine intelligence is reached, and given the capacity for self-improvement, such intelligences will escape the possibility of human control, and will likely converge on instrumental goals that are incompatible with human survival.

I think it is difficult to come away from this book without a strong sense that artificial general intelligence (AGI) is not only possible, but will likely come into existence before the end of the current century (assuming no serious interruption of the status quo, e.g. a global war); and that it might pose an existential risk to humanity. This sense has only been strengthened by the recent successes of DALL-E and ChatGPT. On the other hand, and contrary to the general theme of that excellent book, this may in fact not be the case. In this essay I expand on this possibility, motivated in part by my desire to assuage the dread which has afflicted me since reading ‘Superintelligence’ some years ago; and in part by my fear that the true danger of the current crop of ‘weak’ AI (increased capital concentration and the associated social upheaval) will be overlooked.

The fear is that a superintelligence will rapidly destroy us once it has emerged; the arguments in favour of this are compelling once examined (for these I direct the reader to a thorough reading of ‘Superintelligence’), even if the premise initially sounds like bad science fiction. I will not, however, address this issue directly, as I find Bostrom’s arguments uncomfortably compelling. I will instead focus my attention on earlier parts of the chain of reasoning: the difficulty of achieving AGI, and whether we truly have reason to believe that the creation of AGI will inevitably and quickly lead to a superintelligence in the first place (the so-called fast or exponential takeoff scenario, also known as an intelligence explosion). I will additionally cast some doubt on the coherence of the notions of ‘general intelligence’ and ‘superintelligence’ in the form they currently appear in debates regarding existential risk.

Despite my hopeful scepticism, it seems to me that even if the terminal conclusion is avoided, it is almost certain that the limited type of AI that remains will serve primarily to further the interests of the capitalist class; thereby cementing current social inequities; speeding the destruction of the biosphere; and making any fundamental social change increasingly unlikely, perhaps impossible. Since this is the logical continuation of the prevailing mode of capitalism, I deem this the most likely scenario.

What are artificial general intelligence and superintelligence anyway?

The term ‘intelligence’ is often used in a rather nebulous manner, and conflated with concepts like ‘consciousness’, ‘self-awareness’, and ‘cognition’. This conflation continues to hamper sober discussions around possible existential dangers posed by Artificial General Intelligence (AGI). Cognition, broadly construed as the set of mental faculties that allow us to understand and interact with the world, is arguably the term that most adequately captures the concerns of those who fear a fast AI takeoff; no one would be worried about an AI that gets perfect scores on all conceivable IQ tests, but could not figure out how to interface with the physical world.

The minimalist notion of AGI I assume is something like ‘an artificial intelligence that has broadly the same types of cognitive capabilities as a typical, healthy human in its prime’. In this sense AlphaGo, while demonstrating superhuman intelligence in playing the game of Go, is merely an artificial intelligence, and not a ‘general’ artificial intelligence.

As for a ‘superintelligence’, I will use Bostrom’s definition (chapter 2 of ‘Superintelligence’):

any intellect that greatly exceeds the cognitive performance of humans in virtually all domains of interest.

This conflates ‘cognition’ and ‘intelligence’. For the sake of consistency I might therefore prefer to speak of a ‘supercognitive agent’, but to maintain continuity with existing discussions I will stick with the term ‘superintelligence’.

The above definition is also very vague. What are ‘the domains of interest’, and how will we know when we have exhausted ‘virtually all’ of them? To crystallise the existential concerns around superintelligence, it is best to focus on pragmatic outcomes - in other words, we take ‘the domains of interest’ to refer to the types of actions that could be performed in the physical world by a putative superintelligent agent. This is closer to capturing the nature of the feared risk, but has only shifted the difficulty elsewhere.

One benefit we do obtain in framing the question in such terms is that metaphysical issues related to the status of qualia, the nature of consciousness, or any other such contentious topics become irrelevant; a system may have or exhibit some, all, or none of these things, and still qualify as an AGI or a superintelligence.

Strictly speaking a system doesn’t even need to have ‘understanding’ or ‘knowledge’ in the sense that humans do to be ‘superintelligent’ (this is relevant for framing recent public disagreements about whether ChatGPT exhibits ‘understanding’ or the ability to reason). We may imagine that things like consciousness or sentience could arise as emergent properties of a complex AI system; we may equally imagine that they do not. These are interesting questions, but not strictly relevant to the existential risks posed by superintelligence.

The weakness of arguments from analogy

All of the arguments in favour of the feasibility of a superintelligence or its potential as an existential risk are, of necessity, arguments by analogy; some of these arguments are sophisticated enough that the reader forgets this as they are carried along by the vision sketched by Bostrom and others.

I like to think of this phenomenon as a ‘plausibility cascade’ - the reader is led from one plausible argument to the next, all the while forgetting that each creative leap makes the conclusion less likely and more speculative. In this essay I will attempt the inverse (an ‘implausibility cascade’), by pointing out some gaps in the analogical arguments in favour of existential risk from superintelligence.

Analogical arguments are not necessarily false, but neither are they necessarily true. It was similarly plausible that the first atomic tests would ignite the entire atmosphere, killing us all; fortunately that fear turned out to be unfounded. In that case the relevant calculations were purely physical, involving the binding energies of the constituent particles of the atmosphere. In the case of a superintelligence we know of no fundamental calculations we can perform to assess the risk, because we do not know how much computing power, or what type of algorithm, is necessary, nor even what precisely a superintelligence would entail, beyond definitions such as Bostrom’s. It is true that the literature contains various scaling ‘laws’ that aim to capture the capabilities of various AI systems (largely of the machine learning type) as a function of their parameter count, training data size, etc., but these are empirical observations, not fundamental laws. Or, if they do point to fundamental laws, we still lack the intellectual framework with which to understand in what sense they are fundamental.
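To illustrate what these empirical ‘laws’ amount to, here is a minimal sketch (my own, with made-up numbers) of how a power-law scaling relation of the form loss ≈ a·N^(−b) is typically recovered from observations: a linear regression in log-log space.

```python
import numpy as np

def fit_power_law(n, loss):
    """Fit loss ≈ a * n**(-b) by linear regression in log-log space.

    Returns (a, b). Purely illustrative; real scaling-law fits involve
    noisy measurements and careful choices about the fitted range.
    """
    slope, intercept = np.polyfit(np.log(n), np.log(loss), 1)
    return np.exp(intercept), -slope

# Synthetic data drawn from a known power law (a = 5.0, b = 0.3);
# both constants are invented for this demonstration.
n = np.logspace(6, 12, 20)       # hypothetical parameter counts
loss = 5.0 * n ** (-0.3)
a, b = fit_power_law(n, loss)    # recovers a ≈ 5.0, b ≈ 0.3
```

The point is that such a fit only describes an observed trend; nothing in the procedure tells us whether, or why, the trend should continue.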

Even if we accept as a given that there is a certain level of artificial intelligence at which a system could quickly improve itself along metrics that are relevant to existential fears around superintelligence, it could be the case that this level is far beyond current human scientific knowledge or engineering capacity (and at the moment we don’t even know what these metrics would be). It could also quite plausibly be the case that such a system could only improve itself along a very narrow stretch of the space of cognitive abilities - claiming that such a system could exhibit ‘general’ intelligence (or cognition) is a form of circular reasoning that begs the question of what exactly ‘general intelligence’ is. As an example - perhaps an algorithm could be found that quickly allows DALL-E type models to improve at their ‘cognitive task’ of creating images far beyond their current capabilities, until they exhibit literally none of the shortcomings they currently do (e.g. when drawing hands or faces), and become so good at this that creating another model is essentially superfluous (except for reasons related to profit). It does not follow that such a system would then also be capable of the ‘cognitive tasks’ of driving a car, or indeed of developing agency and foresight, and escaping human control.

The notion of ‘general intelligence’ is singularly unhelpful for framing this difficulty, since it assumes that the notion is coherent in the first place. We are biased by our own understanding of what is ‘difficult’ versus ‘easy’ (see e.g. Moravec’s paradox), and have our own well documented cognitive biases. We should be aware of the possibility that, since our own cognitive hardware evolved under very specific historical conditions, there are vast areas of cognitive space that are invisible to us - why would we imagine that our intelligence is truly ‘general’? Insights from computational complexity theory may be helpful in examining such questions, but bounds on the difficulty of certain complexity classes yield little practical insight into specific problems or algorithms when you don’t know what those problems are - ‘you don’t know what you don’t know’.

The fundamental issue is that we actually have no idea how hard it is to go from building an AGI to a true superintelligence, nor to go from the types of single-application AIs that now exist to AGI. It could be that building an AGI in the first place (or even ‘just’ an ordinary AI that can bootstrap itself to something we would have to call an AGI) is a fundamentally different, and much more difficult, problem than building an AGI that can bootstrap itself to something we would have to classify as a superintelligence. These things are simply not known.

The necessary framework for mathematical proofs of relevant issues is missing

There is an approach to the philosophy of mathematics called constructivism, which requires the explicit construction of mathematical objects in proofs, and deems proofs lacking such constructions to be invalid (as a result existential proofs by contradiction are not permitted, which I have always found somehow hilarious).
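To make the constructivist requirement concrete, here is a minimal Lean sketch (the theorem and its name are my own illustration): proving an existential statement constructively means supplying an explicit witness.

```lean
-- Constructive proof of an existential: the witness (n = 2) is given explicitly.
theorem two_plus_two : ∃ n : Nat, n + 2 = 4 := ⟨2, rfl⟩

-- A classical proof could instead establish ∃ n, P n by deriving a
-- contradiction from ¬∃ n, P n (via Classical.byContradiction), without
-- ever exhibiting a witness; this is precisely the move constructivists reject.
```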

In the same spirit we may remain sceptical as to the existence of AGI, until such a system has either been shown to be theoretically possible, or been brought into existence. There is serious concern from a small but vocal and influential group of people that such an approach is insane and sure to doom us all to death (this essay is not for them).

Since an AGI is not a mathematical object, but a physical, constructed object, it is not entirely clear what a theoretical proof would look like. Taking humans as the reference, nature clearly could produce something like intelligence, given billions of years of natural selection (but beware anthropic biases). A theoretical exposition would presumably indicate the types of algorithms and/or data, and the associated computational requirements, that are sufficient to create an artificial intelligence. Currently I am not aware of even a theoretical treatment demonstrating that human-like intelligence can come into being; of course such a treatment would be superfluous, because we already know we exist, but its absence does highlight the type of fundamental insight that is lacking.

It is true that there have been various brave attempts to axiomatise ‘intelligent agents’ (in terms of e.g. Kolmogorov complexity, Bayesian inference, etc.; see AIXI for a particularly interesting example), and these are undoubtedly meaningful examples of progress in the field; but as far as I can see they remain simple mathematical models, and do not constitute the conceptual framework necessary to convince us that a ‘proof’ has been found demonstrating what is required to construct an AGI (or an AI that can construct an AGI), much less that we are now very near one. Similarly, many of the attempts I have seen to quantify the computational resources required to recapitulate billions of years of natural selection via computer simulation seem laughably optimistic (but not, I hasten to add, therefore valueless).
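For a flavour of what such an axiomatisation looks like: the AIXI agent is defined (sketching Hutter’s standard expectimax formulation from memory; see his work for the precise statement) as choosing, at each step $k$, the action

```latex
a_k \;=\; \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m}
\left( r_k + \cdots + r_m \right)
\sum_{q \,:\, U(q,\, a_{1:m}) = o_{1:m} r_{1:m}} 2^{-\ell(q)}
```

where $q$ ranges over programs for a universal Turing machine $U$ (candidate environments), $\ell(q)$ is the length of $q$, and $o$ and $r$ are observations and rewards. The $2^{-\ell(q)}$ weighting encodes a Kolmogorov-complexity prior over environments; the sum over all programs is what makes AIXI incomputable, and hence a theoretical ideal rather than a constructive recipe.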

Maybe we lack the appropriate type of hardware

It took some 4 billion years from the emergence of life to the evolution of the human brain. That brain is noteworthy for containing not a single transistor, having no DIMM slots, and running on less energy than an incandescent light bulb.

While there is currently a lot of optimism (or dread, depending on your position) regarding the imminence of AGI, we do not truly know how far away it is (nor, as I pointed out above, exactly what it would entail). The situation would change dramatically if we had some reason to believe that the physical hardware we currently use to build computers (i.e. silicon-based CPUs and GPUs containing large numbers of transistors) has a bottleneck that makes a fast AGI takeoff infeasible, and that we require a fundamentally different physical substrate (e.g. photonics, or some biological system) to overcome it. In such a scenario a fast AGI takeoff would be delayed until the manufacturing base shifted to producing a sufficient number of these units at the requisite quality. The unit economics would also have to scale to make it worthwhile: if a different hardware substrate is required, we don’t know that the same incentives that drove Moore’s law for transistors would apply to the new technology. Perhaps these components would remain incredibly expensive to build, and AGI might enter a fusion-type scenario, where the joke is that fusion has always (i.e. for the last 70 years) been only a decade away.

It does strain credulity somewhat to imagine this is the case, and none of the suggestions I know of (e.g. Penrose’s misguided suggestion that quantum effects are responsible for human ‘consciousness’) seems even remotely plausible. Given the universal nature of computation we expect that the physical basis of the hardware shouldn’t matter; but the speed at which computations take place, and the extent to which they can be parallelised, certainly do; and perhaps there is an unexpected bottleneck which must be surpassed for ‘general intelligence’ to be economically feasible, even for the tech giants of today.

Some arguments against the imminence of AGI

If we fear the emergence of an unaligned superintelligence from self-improving AI systems, the argument against the imminence of such an event is some combination of the following.

The gap may be larger than we think

The gap between the type of self-improving AI we can build and what it would take for such a system to reach AGI levels may be orders of magnitude larger than optimists currently expect. This could be for any number of reasons, including arguments related to complexity theory; the capacity of the global industrial base; some obscure scaling law of social systems that limits the complexity of what human societies can build; and maybe physics, if we’re really lucky - though I know of no reason to believe that this is the case, despite some valiant but kooky attempts in the literature.

‘General’ intelligence is a misleading notion

The notion of an AGI may be poorly grounded; perhaps even we humans do not have a ‘general’ intelligence in the sense that an extension of such an intelligence would pose an existential danger to us (although non-existential dangers are still possible, even likely). Differently phrased: the space of cognitive ‘motions’ available to us is constrained, and there is a vast sea of other possibilities that we are blind to. Any attempt to artificially create an intelligence would have to navigate a space of unknown size to find something cognitively comparable to what took biological evolution around 4 billion years to produce, and this will inevitably involve a large number of tradeoffs.

AGI to superintelligence may be hard

Even if humanity (or, let us be realistic, a large American or Chinese corporation) constructs something that everyone would have to agree is an AGI, it does not follow that such an AGI could easily bootstrap itself to a ‘superintelligence’ (a term even more in need of grounding than AGI). Assumptions that fast takeoffs are inevitable, or even very likely, rely on handwavy analogies and a large number of poorly examined assumptions. Maybe an AGI could indeed build a ‘more powerful’ AGI; but it may very well be difficult to find a path in the space of possible intelligences that leads to general superintelligence, rather than e.g. ‘an AGI that is slightly better at this one thing but inevitably worse in this other important way’. To the rejoinder that maybe a lot of ‘ordinary’ AGIs that incrementally become better at various tasks will eventually qualify as the feared superintelligence - yes, I agree; ‘maybe’, ‘eventually’. The sun will also ‘eventually’ burn out - I am not overly concerned by this, unless I am already in a dark mood.

Some more specific reasons to be sceptical

None of the impressive advances in AI in recent years have involved sensory integration or embodiment (aka robots). Although I am far from an expert in the field, I would not be surprised if the advances required to make progress in these areas change the apparent trends in scaling laws. If embodiment is a prerequisite for true AGI (in a pedantic sense it must be, since otherwise the system could not perform the types of cognitive behaviours we do when navigating the world), the current trend of performance versus resources could fall off a cliff.

We may still be in an era of low-hanging fruit, analogous to physics in the first half of the 20th century, and the incredible advances of deep learning may soon start to hit diminishing returns, requiring different approaches. Sam Altman of OpenAI has hinted as much in a recent suggestion that the time of ‘giant models’, i.e. models with large parameter counts, is over.


Having struggled with the concept in a personal and informal manner now for close to eight years, I have come around to the boring position that our greatest threat continues to be the ordinary human foibles that have always troubled us. I am more concerned with the emergence of increasingly totalitarian systems, enabled by the developments of capital-driven information processing systems, than I am that those systems will escape the control of the individuals and companies that built them, and literally kill all of us. The potential for propaganda and advertising (these are synonymous) is enormous. I expect that the web will become filled with even more garbage, that web search will degrade, and that some companies will use the new tools to extract immense profits from people while making society worse. There will also undoubtedly be positive developments, and I expect that we will not all be turned to grey goo by a rogue superintelligence within the next hundred years. I am willing to place a bet to this effect, but good luck collecting if I’m wrong.

This position may be false; but it is the best I have, and the only attitude that seems tenable to me, given the way I exist in the world at present. Given that either way I am not in a position where I can do much to change the outcome, this may be a comfortable illusion; but it is a more empowering one than that espoused by proponents who proclaim the inevitability of our deaths at the hand of superintelligent machines. Let us look for ways to engage with the reality we face, rather than descend to doom-mongering.