AI learned to betray others. Here’s why that’s okay
There’s been a lot of buzz about some experiments at DeepMind that study whether AI systems will be aggressive or collaborative when playing a game. Players gather virtual apples, and they can temporarily incapacitate an opponent by “shooting” a virtual “laser.” And humans are surprised that AIs sometimes decide it’s to their advantage to shoot their opponents rather than peacefully gather apples.
My question is simple: what does this tell us? The answer is also simple: nothing at all. If you ask an AI to play a game in which firing lasers at your opponents is allowed, it isn’t surprising that the AI fires lasers at opponents, whether those opponents are virtual or physical. You wouldn’t expect it to develop, a priori, some version of Asimov’s Laws and refuse. (If the software doesn’t allow it to fire the lasers, well, it won’t, but that’s hardly interesting.) You wouldn’t expect an AI to have a crisis of conscience and say, “no, no, I can’t do it,” unless it was programmed with some sort of guilt module, which, as far as I know, doesn’t exist.
Humans, after all, do the same. They kill in first-person shooters as well as in real life. We have whole divisions of the government devoted to the organized killing of other people. (We ironically call that “keeping the peace.”) And while humans have a guilt module, it usually only engages after the fact.
The only interesting question that a game like this might answer is whether AI systems are more, or less, willing to pull the trigger than humans. I would be willing to bet that:
- When computers play humans, the computers win. We’ve certainly had enough experience losing at chess, Go, and poker.
- Humans are more likely to go for the guns because, well, it’s what we do. DeepMind’s research suggests that a computer will shoot only if shooting is part of an efficient strategy for winning; it won’t shoot because it’s a reflex, because it’s scared, or because it’s fun.
It’s up to you to decide whether shooting as part of an efficient strategy for winning is an improvement over human behavior, but it’s exactly what I would expect. DeepMind’s AlphaGo didn’t beat Lee Sedol at Go by refusing to be aggressive.
And even then, given that we’re only talking about a game, I’m not sure that experiment shows us anything at all. I’d expect an AI to be pretty good at playing a first-person shooter, and I don’t see any reason for it to derive Asimov’s Laws from first principles when it’s only exterminating bits. I certainly wouldn’t volunteer to participate in a real-life shooter against some scary Boston Dynamics creation, and I hope nobody plans to run that experiment.
Likewise, I don’t see any reason for an AI to “learn” that there are things in its universe that aren’t just bits. We are fascinated by “machine learning,” but in the end, the machines only learn what we tell them to learn. I’m skeptical of singularities, but I will agree that we’re facing one when a computer can learn, entirely on its own, that some of the bit patterns coming in through its sensors are humans, and that those bit patterns are qualitatively different from the bit patterns of dogs, cats, or rocks.
In the end, we’re back where we started. Fear of AI reflects our fear of ourselves. AI mimics human behaviors because we teach it to do so, in this case by asking it to play a game with human rules. As I’ve said, if we want better AI, we have to be better people. If we want an AI that can distinguish between humans and bits, we have to teach it what humans are, and how to behave differently in their presence. (“You can shoot the wolf; you can’t shoot the human.”) And if we don’t want AI agents to shoot at all, we have to build software that doesn’t have the ability to shoot, as sketched below.
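To make that concrete: in a reinforcement-learning setup, “not having the ability to shoot” simply means the action isn’t in the agent’s action space. Here is a minimal sketch of that idea; the `AppleGridworld` class, its action names, and its apple-gathering rules are my own hypothetical stand-ins, not DeepMind’s actual environment.

```python
# Minimal sketch of an apple-gathering gridworld whose action space
# deliberately omits any "fire"/"shoot" action. Hypothetical environment,
# not DeepMind's actual Gathering game.
import random

class AppleGridworld:
    # Only movement actions exist; there is no trigger to pull.
    ACTIONS = ("up", "down", "left", "right")

    def __init__(self, size=5, n_apples=3, seed=0):
        self.size = size
        self.rng = random.Random(seed)
        self.agent = (0, 0)
        self.apples = {self._random_cell() for _ in range(n_apples)}

    def _random_cell(self):
        return (self.rng.randrange(self.size), self.rng.randrange(self.size))

    def step(self, action):
        """Apply a movement action and return (observation, reward)."""
        if action not in self.ACTIONS:
            # Any attempt to "shoot" (or do anything else undefined) is
            # rejected outright: the capability isn't part of the environment.
            raise ValueError(f"Unknown action: {action!r}")
        dx, dy = {"up": (0, -1), "down": (0, 1),
                  "left": (-1, 0), "right": (1, 0)}[action]
        x, y = self.agent
        self.agent = (min(max(x + dx, 0), self.size - 1),
                      min(max(y + dy, 0), self.size - 1))
        reward = 1 if self.agent in self.apples else 0
        self.apples.discard(self.agent)
        return self.agent, reward

# Whatever policy an agent learns here, it can gather apples but can never
# fire a laser, because no such action exists to choose from.
env = AppleGridworld()
total = 0
for _ in range(50):
    _, r = env.step(random.choice(AppleGridworld.ACTIONS))
    total += r
print("apples gathered:", total)
```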
Don’t give your AIs guns.
This article was originally published on the WEF website by Mike Loukides.