… demonstrations is our ability to abstract and generalize a single demonstration to new situations. In the case of Montezuma’s Revenge, rather than developing a general-purpose solution to game playing (as the titles of the two DeepMind papers suggest), what has really been developed is an intelligent method for exploiting the game’s key weakness as an experimental platform: its determinism.
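The determinism loophole is easy to see in miniature: in a fully deterministic environment, a single successful action sequence can simply be replayed verbatim and will reproduce the winning trajectory every time. Here is a toy sketch of that idea, using a made-up minimal environment (not the actual Atari/ALE interface):

```python
# Toy sketch of exploiting determinism: in a deterministic environment,
# one recorded demonstration reproduces the same trajectory on every replay.
# (Hypothetical minimal environment for illustration, not the real ALE API.)

class ToyDeterministicEnv:
    """A tiny deterministic 'game': reach position 3 to win."""
    def __init__(self):
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):          # action: +1 (right) or -1 (left)
        self.pos += action
        done = self.pos == 3
        reward = 1.0 if done else 0.0
        return self.pos, reward, done

def rollout(env, actions):
    """Replay an action sequence from reset and record the trajectory."""
    env.reset()
    return [env.step(a) for a in actions]

env = ToyDeterministicEnv()
demo = [1, 1, 1]                     # a single successful demonstration
# With no stochasticity, replaying the demo always yields the same result:
assert rollout(env, demo) == rollout(env, demo)
print(rollout(env, demo)[-1])        # the winning final step
```

The point of the sketch is that no generalization is needed: the "policy" is just the memorized action sequence. Any stochasticity (random start states, sticky actions) breaks this shortcut and forces the agent to actually generalize.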
Great article, Arthur!
The problem of determinism seems to apply to every game. With enough knowledge of the environment, doesn’t everything become deterministic?
Even incredibly complex games such as StarCraft II and team-based games such as Dota 2: over time, their seemingly stochastic environments become deterministic.
Every action happens because of some prior, much like outside the game world. This is where the argument about whether free will exists comes in.
Are our decisions our own, or are they based on a collection of priors? Can an agent learn at all without any priors whatsoever? Even a lack of priors could be considered a prior.
I’m out of my intellectual depth here, and I realise my comment has no real pragmatic use other than noting that every environment can be deterministic.
Perhaps that’s the key to replicating our intelligence. We keep exploiting our own weakness, our determinism.