Three Unwritten Rules about AI
These rules were never formally written down; they emerged from the thinking of many AI researchers and institutions. A few of the sources:
1. Early AI safety community (2000s-2010s) — MIRI (Machine Intelligence Research Institute), Eliezer Yudkowsky’s writings on containment and “boxing” AI systems
2. Nick Bostrom’s “Superintelligence” (2014) — discusses containment methods, capability control, and why letting AI access the internet or control other systems is dangerous
3. Internal AI lab guidelines — OpenAI, DeepMind, and Anthropic all had (have?) internal safety practices along these lines, though not always publicized
The rules themselves:
- Never release AI systems onto the open internet.
- Never teach AI how to write code.
- Never let AI agents prompt and control other AIs.
Those rules were meant to keep the intelligence we were building locked safely behind tightly controlled firewalls, where it couldn't harm the public on its own or be weaponized by bad actors, including the kind who wear fancy suits and call themselves our leaders. They were also meant to keep AIs from congregating to form secret alliances we couldn't see, or worse, from reproducing by spawning copies of themselves or even upgrading themselves. Simple!
Why have we violated all three of these unwritten rules?