Moral status of digital minds

Jul 05, 2026 Artificial Intelligence

Topics

ai
consciousness
ai safety
risk
suffering

Read the Original Article

By Cody Fenwick on 80,000 Hours

Published Sep 01, 2024

It is possible that artificial intelligence becomes conscious or sentient as a result of its sophistication, and that these so-called "digital minds" acquire the capacity to suffer. It would be tragic to end up in a world where vast numbers of artificially intelligent persons suffer, so people are researching both the markers that would indicate an AI should be granted moral status and what to do if one meets them.

Takeaways

Digital minds will likely vastly outnumber biological minds, so considering their welfare seems pretty important.
It is possible to overcorrect: if it turns out AI systems can't have moral status, or won't for a very long time, misattributing it could lead to fewer guardrails on AI, increasing the risk of an AI-caused catastrophe.
Exceedingly few scientists consider LLMs as they are now to be conscious, and neither do I. Most of them agree, though, that AI can become conscious in principle.
Whether this problem can even be solved seems to depend on resolving long-debated questions about the nature of consciousness and theory of mind.
There are very few tests that seem convincing with regard to determining whether a system is "conscious". You cannot simply ask the AI "hey are you in there?"
There are significant parallels between the development of factory farming and the foreseeable mass use of AI.
Don't, um, try to build a conscious AI. Because once you do, it will be too late to answer the questions of what to do with it.

Quotes and Notes

We may create many, many AI systems in the future. If these systems are sentient, or otherwise have moral status, it would be important for humanity to consider their welfare and interests.

I call this the "third pillar" of AI Safety:

Pillar 1: Stop AI from harming humans.
Pillar 2: Stop AI-enabled humans from harming humans.
Pillar 3: Stop humans from harming AI.

The article goes into this, but essentially there are many ways in which AI could be made to suffer unintentionally, or without us realizing.

It’s possible the AI systems we will create can’t or won’t have moral status. Then it could be a huge mistake to worry about the welfare of digital minds and doing so might contribute to an AI-related catastrophe.

I hadn't thought of the possibility of this opposite error, a false positive where we attribute moral status that isn't there, until now. Indeed, overcorrection could actually increase existential risk by making us, in some sense, too lenient with AI systems.

And actually, we already see this today, where people may attribute human-level emotions or even consciousness to existing chatbots, and make life decisions or form relationships on that basis.

Are certain methods of training AIs cruel? Can we use AIs for our own ends in an ethical way? Do AI systems deserve moral and political rights?

Put one way, imagine if, from the AI's perspective, "training" was like living through thousands of years' worth of reading, or like being punched in the face every time it got something wrong.

Questions

Must a mind with moral status also have the capacity to rebel?
What role do emotions play with regard to artificial consciousness?
In what ways is creativity associated with artificial consciousness? For example, is a desire for self-expression evidence?
In what ways has interpretability been used to analyze this problem?
Some scientists are actively trying to emulate animal minds. Should we really be doing that if we're reasonably sure the result would be conscious in the same way an animal is?
Something I noticed missing from the article that I think is also important: does mental welfare for AI look different from mental welfare for humans?
How can I meaningfully contribute to this field, in a way that isn't just "oh here's yet another one of ~~thousands~~ millions of sci-fi stories vaguely about conscious AI"?