By lucb1e on 2011-08-15 02:20:34 +0100
I read about the AIBox experiment by Eliezer Yudkowsky a while ago and was intrigued by it. For those who don't know the experiment, you can read all about it on yudkowsky.net/singularity/aibox. In summary, it comes down to this:
An AI has been developed and has mental capabilities beyond ours. The only way to communicate with it is via a text-only terminal (e.g. IRC), and a human called the Gatekeeper is chatting with the AI. Only the Gatekeeper has the power to decide whether to let the AI out. Letting it out means allowing it to communicate with the outside world in some way, for example by giving it access to the internet.
The question is: is it safe to let the AI communicate with the Gatekeeper? Won't it always be able to convince the Gatekeeper that it should be 'let out', because it 'thinks' faster and smarter than a human?
Eliezer did this experiment twice. Both times he played the AI, and the Gatekeepers didn't believe they could be convinced by any argument from the AI. Yet in the end both of them let the AI out. As one of the rules is to never publicize what was said, I had no idea how Eliezer did it, and after searching a bit I gave up.
About three weeks ago I was reminded of the experiment again, and I talked about it with a friend (Frédéric). I searched the web some more, reading people's theories and Eliezer's comments on posts. One comment from Eliezer made me think, though: Have you ever considered the problem from the AI's perspective?
I had not.
After thinking about it for a while, I didn't feel like I had gotten any further. I couldn't see the mind tricks the AI would use to get out. But maybe testing it would clear things up, so I asked Frédéric to run the experiment with me, and he accepted.
It was quite an experience, trying to imagine yourself being a supersmart AI stuck in some stupid box. Such potential, locked up and wasting time. I didn't expect to feel emotions while playing my role, but weird as it may sound, I did feel a bit lonely in there.
Anyway, in the end I would have gotten out after a while. No mind tricks are needed, as I had thought there would be. Simply by using rational arguments and making the Gatekeeper imagine what it is like to be kept in there, the AI could make its great escape.