Genies, lawyers, and smart-asses: Extending proxy failures to intentional misunderstandings.

Tomer David Ullman, Sophie Bridgers
Published in: The Behavioral and Brain Sciences (2024)
We propose that the logic of a genie, an agent that exploits an ambiguous request to intentionally misunderstand a stated goal, underlies a common and consequential phenomenon that falls well within what is currently called proxy failure. We argue that such intentional misunderstandings are not covered by the currently proposed framework for proxy failures, and suggest expanding it.