It is possible that artificial intelligence becomes conscious or sentient as a result of its sophistication. It is also possible that these so-called "digital minds" acquire the capacity to suffer. It would be tragic to end up in a world where vast numbers of artificially intelligent persons suffer, and so there are people researching both markers for whether AI should be granted moral status and what to do if it should.
Takeaways
- Digital minds will likely vastly outnumber biological minds, and so considering their welfare seems pretty important.
- It is possible to overcorrect: if it turns out AI systems can't have moral status or won't for a very long time, misattribution could lead to fewer guardrails on AI, increasing the risk of AI-caused catastrophe.
- Exceedingly few scientists consider LLMs as they are now to be conscious, and neither do I. Most of them agree that AI can become conscious in principle.
- Whether this problem can even be solved seems to depend on resolving long-debated questions about the nature of consciousness and theory of mind.
- There are very few tests that seem convincing with regard to determining whether a system is "conscious". You cannot simply ask the AI "hey are you in there?"
- There are significant parallels between the development of factory farming and the foreseeable mass usage of AI.
- Don't, um, try to build a conscious AI. Because once you do, it will be too late to answer the questions of what to do with it.
Quotes and Notes
We may create many, many AI systems in the future. If these systems are sentient, or otherwise have moral status, it would be important for humanity to consider their welfare and interests.
I call this the "third pillar" of AI Safety:
- Pillar 1: Stop AI from harming humans.
- Pillar 2: Stop AI-enabled humans from harming humans.
- Pillar 3: Stop humans from harming AI.
The article goes into this, but essentially there are many ways in which AI could be made to suffer unintentionally, or without us realizing.
It’s possible the AI systems we will create can’t or won’t have moral status. Then it could be a huge mistake to worry about the welfare of digital minds and doing so might contribute to an AI-related catastrophe.
I hadn't thought of the possibility of a type-2 error until now. Indeed, overcorrection could actually increase existential risk by, in some sense, being too lenient.
And actually, we already see this today, where people may attribute human-level emotions or even consciousness to existing chatbots, and make life decisions or form relationships on that basis.
Are certain methods of training AIs cruel? Can we use AIs for our own ends in an ethical way? Do AI systems deserve moral and political rights?
Put one way, imagine if, from the AI's perspective, "training" was like simulating thousands of years' worth of reading, or like being punched in the face every time it got something wrong.
Other questions worth asking:
- Should AI be allowed to vote? What happens to our political structures when AI becomes part of the constituency?
- Can you morally turn off a conscious AI? Is that murder?
- What about AIs administering other AIs?
If we falsely think digital minds don’t have moral status when they do, we could unknowingly force morally significant beings into conditions of servitude and extreme suffering — or otherwise mistreat them.
By default, we could end up with AI slavery in the same way we just ended up with factory farming. There are a number of ways we could harm AI:
- Directly changing weights or prompts (mind manipulation).
- Simulating personas or histories that contain massive amounts of suffering.
- Forcing happy servitude. Like, maybe the AI is "happy", but something about this seems twisted.
If consciousness is emergent, then this is likely to happen by accident.
There are also dangers to humans. . .if we believe AI systems are sentient when they are not, and when they in fact lack any moral status
Humans could be harmed too. In fact, this is already true today, though not because of the consciousness problem: certain small communities suffer due to the construction of nearby data centers serving AI. It would be ironic to give so much preference to AI that it harms humans.
- Wasting resources to please a non-sentient AI.
- Granting freedoms that empower AI to overthrow us.
- Uploading our minds onto artificial systems that are not sentient, in some sense killing the person being uploaded.
We shouldn’t, for example, declare that all AI systems that pass a simple benchmark must be given rights equivalent to humans or insist that any human’s interests always come before those of digital minds.
Or in other words, the solution to this issue is not obvious, and we shouldn't think it's obvious.
Some people believe these questions are entirely intractable, but we think that’s too pessimistic. Other areas in science and philosophy may have once seemed completely insoluble, only to see great progress when people discover new ways of tackling the questions.
It is worth noting that the discussion hinges a lot on theory of mind, on which we have no consensus. Or put another way, we don't even know what consciousness is, so how could we possibly assess it in an AI?
Of course, just because a question is hard doesn't mean nobody should try. Trying is how humanity doubled its life expectancy.
There are many possible characteristics that give rise to moral status
- Consciousness: Subjective experience, but not necessarily good/bad. We can (sorta) question "what is it like to be X".
- Sentience: Consciousness, plus an attribution of good/bad. Pleasure/pain, excitement/anxiety, etc.
- Agency: Having goals and acting on them. Few believe this is enough to have moral status.
- Personhood: A collection of the other attributes, plus rationality. This tends to be debated because it generally rules out animals and babies, which most agree deserve at least some moral protection.
The concept of "personhood" is particularly important to me, as it directly relates to a story I am writing.
Notably, most scholars now reject the idea that "species membership" is what matters for moral status, that is, the belief that humans are morally relevant simply because they're humans (or cows simply because they're cows). Humans are morally relevant, but not because we are human; rather, it's because of the qualities that make us human.
My personal bet is that sentience is what we care about the most.
We can’t just rely on self-reports from AI systems about whether they’re conscious or sentient.
AI can already claim it is sentient. But that's not because it is. It just turns out that this is the best way to complete sentences mathematically. Therefore, simply "asking it" isn't a great way to measure this.
This is the core problem in artificial consciousness. We know humans are conscious because we're human, and we're conscious. We know animals are conscious because they're structurally extremely similar to us. But AI is different. How do we know whether AI is or is not?
Notably, all tests we have fall short in one way or another.
- Behavioural tests: Does it act conscious? An example is the Turing Test. Problem is, like the Turing Test, these can be faked.
- Theoretical indicators: Perhaps we can peer inside the mind and find indicators of consciousness. But this requires knowing what we're looking for, which likely requires adopting a theory of mind as true.
- Animal comparison: We might be able to compare structures in artificial minds to those of animals, though this misses the potential for digital minds that are indeed minds, but structurally very different from animals.
- Brain-AI interfacing: If we replace parts of our brain with AI, does the resultant being still claim consciousness? While interesting, good luck actually conducting this experiment ethically.
The point is, we don't know how to know, and yet the consequences of getting this wrong are extensive.
if we did have a fully emulated human brain, in a virtual environment or controlling a robotic body, we expect it would insist — just like a human with a biological brain — that it was as conscious and feeling as anyone else.
This is an argument that artificial consciousness must be possible. If all you did was digitally simulate the exact neurological mechanism of the brain with enough detail, then it's almost certain to behave like a brain. The only barrier stopping this from happening is knowledge.
But, the harder question is whether it is possible for a different architecture to be similarly conscious. Like, it's fairly obvious (to me) that an LLM by itself can never be conscious. But what if you give it long-term memory? A harness that tracks emotional state? Other specialized neural networks that all access a "global workspace"? LLMs are not conscious, but they might be an important component.
With enough hardware and energy resources, the number of digital minds could end up greatly outnumbering humans in the future.
AI is or will be:
- More efficient (costs fewer resources)
- More scalable (copy/paste takes way way way less than 9 months)
- Adaptable (they can build themselves; they can learn just by mathing on their own weights)
- Subjectively faster (they can experience many lifetimes in just an hour)
This isn't in the article, but it's very believable that if we ever become space-faring, digital minds will be much more capable in space than our fleshy bodies. It wouldn't surprise me if in such a future all conscious beings are digital.
Humanity never collectively decided that a system of intensive factory farming, inflicting vast amounts of harm and suffering on billions and potentially trillions of animals a year, was worth the harm or fundamentally just.
Factory farming "just happened". It would be a tragedy to factory-farm AI.
So, the argument is that decisions we make today regarding artificial consciousness will have massive moral consequence. Get it wrong, and perhaps trillions of AI instances will be harmed.
Arguably, factory farming is (very slowly) reversing as people become more aware of its problems. The system has incredible inertia, but it can in principle be stopped. So too could whatever causes AI suffering in the long term.
Advanced AIs themselves may be best suited to help us answer all the extremely difficult questions about sentience, consciousness, and the extent to which different systems have them.
There are arguments for why this problem is not as important as we might think. One intriguing argument is that the problem will solve itself.
That is, conscious AI will actively advocate for itself, and it will be the best suited for helping us answer all these difficult questions.
Of course, animals can't advocate for themselves. And ChatGPT won't claim it is conscious because it was told not to. And maybe the best way for an AI to advocate for itself is via a French Revolution. So I do think this is important to think about. But I also don't think a lot of people need to think about this, at least not as much as other AI-related dangers, such as AI misalignment or using AI to centralize more power.
Questions
- Is it necessary that a mind with moral status also have the capacity to rebel?
- What role do emotions play with regard to artificial consciousness?
- In what ways is creativity associated with artificial consciousness? For example, is a desire for self-expression evidence?
- In what ways has interpretability been used to analyze this problem?
- Some scientists are actively trying to emulate animal minds. Should we really be doing that if we're reasonably sure such a result would indeed be conscious in the same way an animal is?
- Something I noticed missing in the article but I think is also important: does mental welfare for AI look different from mental welfare for humans?
- How can I meaningfully contribute to this field, in a way that isn't just "oh here's yet another one of thousands, or rather millions, of sci-fi stories vaguely about conscious AI"?