The Foundation Layer

A philanthropic strategy for the AGI transition

by TYLER JOHN

AI Consciousness

Today there is a small but emerging field of research on AI consciousness — whether AI systems are or might one day be conscious and have experiences, and how to respond to this possibility ethically. AI systems increasingly act like us and can do the things that we can do, and this raises the question of whether they have minds like ours, with similar experiences. This is an area of active philanthropic interest for some AI safety funders, and it merits explanation.

Understanding AI Consciousness

A leading — but by no means uncontroversial — theory of mind in philosophy and the cognitive sciences is called “computational functionalism.” According to computational functionalism, the mind is just computer software, but software implemented in brain matter instead of in computer chips. It takes inputs (in the form of sensory information), transforms those inputs with various kinds of calculations, and produces outputs (in the form of motor behaviour). If this theory of mind is correct, then sophisticated computers could in theory do everything that brains can do, including have conscious experiences.

There are alternative scientific theories of mind that stipulate that human consciousness requires certain kinds of biological processes that happen inside our brains or bodies, and religious conceptions that stipulate that it requires a soul or immaterial self. These theories would imply that today’s AI systems are certainly not conscious, though systems with very different architectures might one day be. But to many researchers in the cognitive sciences, the most successful, simple, and naturalistically plausible theory of mind is that the brain is just a different kind of computer.

If this is right, then software can be conscious. And what’s more, neural networks seem to do a lot of the things that our brains do, like perception, attention, learning, memory, self-awareness, problem solving, reasoning, and autonomous decision-making. So it is at least plausible that neural networks are performing roughly the same kinds of computation that our brains perform when we do the same tasks. If consciousness is just this kind of computation, then it would follow that neural networks are conscious.

Whether today’s AI systems are conscious — and even whether large language models could ever be conscious in the future — is extremely controversial, and most researchers think that today’s AI systems are probably not conscious. But if AI systems were conscious, this would be extremely important, and the likelihood that AI systems are conscious will steadily increase with time.

The best estimates indicate that the computation required to simulate the entire human brain is somewhere between 10^16 and 10^25 floating point operations, or FLOP (a standard measure of computational work). The best AI models today are trained using about 10^25 FLOP. So we’re using more computation to train these models than it would take to simulate the human brain (if only we knew the human brain’s algorithm) — possibly by more than one million times. If AI models, now or in the future, are similarly efficient to the human brain, then training one frontier AI model (to say nothing of deploying it to billions of users) could be morally similar to creating anywhere from one to millions of humans. Each new generation of AI models uses more processing power, so the moral importance of AI would continue to grow: from tens to tens of millions of humans, then hundreds to hundreds of millions, and so on. Eventually, most of the consciousness in the world could reside in AIs, not in humans or any other animal.
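As a rough illustration of the arithmetic above, here is a minimal back-of-the-envelope sketch in Python. The figures are simply the order-of-magnitude estimates quoted in this section, not measured quantities, and the comparison is only as good as those estimates.

```python
# Back-of-the-envelope comparison using the order-of-magnitude figures quoted above.
brain_sim_flop_low = 1e16      # low-end estimate of FLOP to simulate a human brain
brain_sim_flop_high = 1e25     # high-end estimate
frontier_training_flop = 1e25  # rough compute used to train a frontier model today

# How far training compute exceeds the brain-simulation estimate depends on
# which end of the estimate range turns out to be right.
ratio_if_high_estimate = frontier_training_flop / brain_sim_flop_high  # ~1x
ratio_if_low_estimate = frontier_training_flop / brain_sim_flop_low    # ~1e9x

print(f"Training compute vs. brain-simulation estimate: "
      f"{ratio_if_high_estimate:.0e}x to {ratio_if_low_estimate:.0e}x")
```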

There are a lot of important “ifs” and caveats in this argument. But it demonstrates that it is at least plausible that AI models, now and in the future, will be extremely morally important in their own right, and increasingly so as we train larger and more capable models. Moreover, it underscores the enormous philosophical, ethical, and scientific uncertainty we have about AI consciousness.

Given this substantial uncertainty, there are significant risks of both over- and underattributing consciousness to AI systems. Overattributing consciousness can lead to inappropriate social and emotional bonds and to the unnecessary allocation of rights, resources, and concern to AI systems. Underattributing it risks giving insufficient protections to ethically significant entities. As a result, I expect that questions of AI rights and consciousness will take up increasing political and social attention in the years ahead, leading to confusion and intractable disagreements.

Philanthropy and AI Consciousness

The most important work philanthropists can do on AI consciousness is twofold: (1) deconfuse the world about the issue by making scientific and conceptual progress on the question and communicating it clearly, and (2) build the governance and preparedness frameworks we will need to live with conscious AI, should we one day find ourselves in that world.

To date, philanthropists have had a substantial positive influence on the field of AI consciousness. I helped start NYU’s Center for Mind, Ethics, and Policy (CMEP), an academic institution run by NYU professor Jeff Sebo — with affiliates including David Chalmers, Ned Block, and Anthropic’s Sam Bowman — which supports foundational research on AI consciousness. The other major organization in this space is Eleos AI, a nonprofit run by consciousness researcher Robert Long and dedicated to understanding and addressing the potential wellbeing and moral patienthood of AI systems. Together, CMEP and Eleos have run consultations with Anthropic and OpenAI to help them plan for future AI welfare concerns. Following these consultations and the release of CMEP and Eleos’s flagship report, “Taking AI Welfare Seriously,” Anthropic hired its first full-time AI welfare researcher, Kyle Fish, an author of the report. CMEP has given four talks at Google on AI consciousness, and Eleos now collaborates with Anthropic to evaluate its AI models for welfare concerns.

Taken together, the work of these two small nonprofit organizations, with total spending on the order of a couple of million dollars, has decisively shaped the field of AI consciousness — by writing some of the defining papers in the early field, creating welfare evaluations for frontier models, and helping the top AI companies figure out how to approach the issue.

The field of AI consciousness is extremely nascent, and there is much work left to do. It is largely a research problem, and philanthropists who want to drive that research forward can help by supporting work in several key areas.

First, the field has yet to develop a rigorous methodology for studying AI consciousness. To lay this groundwork, philanthropists can fund rigorous, well-controlled studies of AI consciousness that draw on the lessons of a century of scientific study of animal consciousness. We have confidence that a fish is conscious in part because of studies like “place aversion tests”: injuring a fish and observing that it robustly learns to avoid the location associated with the injury. We should see similarly rigorous, well-controlled studies for AI.

Second, there is significant room for work extending and refining philosophical theories of consciousness so that they are clear enough for us to study whether AI systems have the properties those theories specify. Many theories of consciousness are vague, and we don’t know exactly when a system would satisfy their requirements. Work in this area can build on the excellent work of Patrick Butlin, Rob Long, et al. in “Consciousness in Artificial Intelligence: Insights from the Science of Consciousness.”

Third, philanthropists can support technical machine learning research that checks whether AI systems have functional characteristics that may be required for consciousness. For example, in human pain science, we study patients with “ablations,” damage to parts of the brain that prevents them from experiencing pain. We can check whether AI systems display the same behaviour as humans and other animals by trying to find AI neurons that, when ablated, lead to the same result, as in the sketch below. If AI systems do have exactly the same pain functions as humans, that is strong evidence (on the computational functionalist theory of mind) that they experience pain.
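What such an ablation probe might look like in code, as a minimal sketch: this assumes a PyTorch model whose candidate layer returns a single tensor, and the layer choice and unit indices are hypothetical placeholders rather than anything drawn from a real study. The scientific content lies in whether the behavioural change after ablation parallels what is seen in humans with analogous lesions, not in the mechanics of the hook itself.

```python
# Minimal sketch of an activation-ablation probe (illustrative assumptions:
# a PyTorch model whose candidate layer returns a single tensor; in practice
# the units to ablate would be identified with interpretability tools).
import torch

def ablate_units(layer, unit_indices):
    """Zero out selected units of a layer's output via a forward hook."""
    def hook(module, inputs, output):
        output = output.clone()
        output[..., unit_indices] = 0.0  # silence the candidate units
        return output                    # returned tensor replaces the layer's output
    return layer.register_forward_hook(hook)

def compare_behaviour(model, inputs, layer, unit_indices):
    """Run the model with and without the candidate units ablated."""
    with torch.no_grad():
        baseline = model(inputs)
        handle = ablate_units(layer, unit_indices)
        try:
            ablated = model(inputs)
        finally:
            handle.remove()  # always restore the unmodified model
    return baseline, ablated
```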

Finally, philanthropists can support preparedness for emerging consciousness in AI systems. This includes developing good corporate governance plans for AI consciousness. Should AI companies prevent people from unnecessarily creating pain responses in AI systems? Should they commit resources to studying AI consciousness? At what point in AI development should they start taking these questions seriously? It also includes supporting good thinking about law and policy: whether AI should be able to own property, have liberal rights, or even participate in democracy, and what other protections we might provide AI systems. While uncomfortable, these are questions that users and policymakers will increasingly be asking, and we need clear and compelling answers.

