AI learns to play hide and seek

What properties can emerge from multi-agent learning?


There’s some interesting stuff going on in the tech world, and I would be remiss not to share it with you all, because its relevance will become clear in the coming years. A couple of weeks ago, OpenAI, an AI research company co-founded by Elon Musk and Sam Altman (among others) and dedicated to ensuring that AI benefits all of humanity, released a video update on its hide-and-seek experiment.

Too long; didn’t watch

OpenAI’s initial experiment used multi-agent reinforcement learning (MARL for short). MARL is a machine learning technique in which multiple agents learn at the same time by interacting with a shared environment and with each other. Here they compete: one team hides and the other seeks. Each agent is rewarded when its side does well, so it gets better and better through tons of trial and error.
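To make the trial-and-error idea a bit more concrete, here is a tiny toy sketch in Python. It is not OpenAI's code or environment, and it is far simpler than what they built: a hypothetical hider and seeker each pick one of a few spots, the seeker is rewarded for a match and the hider for a miss, and each agent nudges its own value estimates after every round. The function name `train` and all of its parameters are made up for illustration.

```python
import random


def train(num_spots=3, episodes=5000, epsilon=0.1, lr=0.1, seed=0):
    """Toy two-agent hide and seek (illustrative only, not OpenAI's setup)."""
    rng = random.Random(seed)
    # Each agent keeps a running estimate of how good each spot is for it.
    hider_values = [0.0] * num_spots
    seeker_values = [0.0] * num_spots
    found_count = 0

    def choose(values):
        # Epsilon-greedy: mostly pick the best-looking spot, sometimes explore.
        if rng.random() < epsilon:
            return rng.randrange(num_spots)
        best = max(values)
        return rng.choice([i for i, v in enumerate(values) if v == best])

    for _ in range(episodes):
        hide_spot = choose(hider_values)
        seek_spot = choose(seeker_values)
        found = hide_spot == seek_spot

        # Zero-sum rewards: whatever the seeker gains, the hider loses.
        seeker_reward = 1.0 if found else -1.0
        hider_reward = -seeker_reward

        # Move each agent's estimate for the chosen spot toward its reward.
        hider_values[hide_spot] += lr * (hider_reward - hider_values[hide_spot])
        seeker_values[seek_spot] += lr * (seeker_reward - seeker_values[seek_spot])
        found_count += int(found)

    return hider_values, seeker_values, found_count / episodes


if __name__ == "__main__":
    hider, seeker, found_rate = train()
    print("hider estimates: ", hider)
    print("seeker estimates:", seeker)
    print("fraction of rounds the hider was found:", found_rate)
```

Because each agent keeps adapting to the other, neither can settle on one spot for long, which is a very small-scale version of the back-and-forth that drives learning in the real experiment.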

And through 500 million trials, the AI agents (hiders and seekers) developed intelligent strategies that involved collaboration and using tools (boxes to create enclosures and ramps to go over boxes). Seems familiar, doesn’t it? The hide and seek environment loosely mirrors the billions of years of evolution through competition and natural selection in the real world. But this is only the beginning.

How does this affect you?

OpenAI’s end goal is to see whether scaling up these AI techniques would produce a much more sophisticated artificial intelligence. Now, those of you who have watched robot apocalypse movies may be all too familiar with the worst-case (if cinematic) scenarios. It’s reasonable to fear something like that, but OpenAI is conducting these experiments to analyze what kinds of properties emerge and to learn how to solve problems that we currently don’t know how to solve. So don’t worry, they’ve got our backs.

Got ideas that you would like me to write about? Pitch them to me at!