03 Oct 2015 05:00

Pong is a rather simple yet fascinating game, iconic for being among the first video games ever produced. An astounding achievement for its time, Pong is nowadays used as an exercise for novice programmers. Everything about it is well suited to beginners, even a good artificial intelligence implementation: a young coder only needs to crunch through some physics calculations for the computer to play perfect Pong. But what if the computer had no idea how to play Pong? Is it possible to train it, so that it builds its own strategy?

That was the goal of this project: starting with nothing but the game's rules, have the computer develop its own strategy rather than a pre-programmed one. How can this be done, though?

Well, let's say that we had not one AI but many, many AIs that all wanted to play Pong. None of them knows how to play the game, though, so they will all have very bad strategies. Nonetheless, we'll let them play Pong against the one we shall call The Master, an AI that uses physics to predetermine where the ball will go. We'll watch all those poor little AIs suffer and crumble before the might of The Master.

What we will notice, though, is that some of these AIs do better than others. Yes, they all fail because they are moving randomly, but some random movements might actually hit the ball. If we could measure an AI's "goodness", which we will now call fitness, we could determine which of them is best. In fact, we could chart their performances, as in the graph below:
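As an illustration, a fitness measure could be as simple as rewarding each returned ball and penalizing each point conceded. The scoring rule and the match results below are made up for this sketch; they are not the actual numbers used in Genetic Pong:

```python
def fitness(hits, points_lost):
    """Hypothetical fitness: reward returned balls, penalize conceded points."""
    return hits - 2 * points_lost

# Made-up match results for three random strategies: (hits, points_lost)
population = {"A": (3, 10), "B": (7, 8), "C": (1, 11)}

# Rank the strategies from fittest to least fit
ranked = sorted(population, key=lambda name: fitness(*population[name]),
                reverse=True)
print(ranked)  # ['B', 'A', 'C']
```

Even though every score here is negative (they all lose badly), the ranking still tells us which strategies to breed from.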

Generation 1

[Chart unavailable: fitness of each AI in Generation 1]

They are pretty dismal, I have to admit.

Let's say that after about 20 years, the best of those AIs married and had children who were also capable of playing Pong. These children could inherit the best characteristics of both parents, and hence might be better at Pong than either parent. Again, we'll have both the best parents and the children play against The Master, and again they will be crushed. However, if we chart their performances again, we notice something:

Generation 2

[Chart unavailable: fitness of each AI in Generation 2]

Noticeably better than the first generation.

The overall performance of the group seems to have improved! If we keep doing this for many, many generations, then over time we should expect the children to get better and better, and perhaps one day they shall overcome The Master.

That was a fairly high-level description of what is called a genetic algorithm. Based on the principles of natural selection, the idea is to test a wide variety of candidate solutions and keep only the "fittest" of those for the next generation. By combining parents, we can obtain children that are possibly better than either parent, and so as the number of generations increases, we can eventually develop a solution that 1) works, and 2) may even rival a classically designed solution.
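To make the mechanism concrete, here is a minimal genetic algorithm in Python applied to a toy problem (maximizing the number of 1-bits in a string) rather than to Pong itself. The population size, mutation rate, and selection scheme are illustrative choices for this sketch, not the ones used in Genetic Pong:

```python
import random

random.seed(0)

GENOME_LEN = 20      # length of each "strategy"
POP_SIZE = 32        # the post evolves sets of 32 strategies
GENERATIONS = 30
MUTATION_RATE = 0.02

def fitness(genome):
    # Toy stand-in for "goodness at Pong": count the 1-bits.
    return sum(genome)

def crossover(a, b):
    # Single-point crossover: the child takes a prefix from one
    # parent and the rest from the other.
    point = random.randrange(1, GENOME_LEN)
    return a[:point] + b[point:]

def mutate(genome):
    # Occasionally flip a bit, so new traits can appear.
    return [bit ^ 1 if random.random() < MUTATION_RATE else bit
            for bit in genome]

# Generation 1: completely random strategies.
population = [[random.randint(0, 1) for _ in range(GENOME_LEN)]
              for _ in range(POP_SIZE)]

for gen in range(GENERATIONS):
    population.sort(key=fitness, reverse=True)
    parents = population[:POP_SIZE // 4]   # keep only the fittest quarter
    children = [mutate(crossover(random.choice(parents),
                                 random.choice(parents)))
                for _ in range(POP_SIZE - len(parents))]
    population = parents + children

best = max(population, key=fitness)
print(fitness(best))
```

After a few dozen generations the best genome is at or near the maximum score, even though no individual was ever told what a good genome looks like; selection and crossover found it.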

This is exactly what I did to train the AIs in Genetic Pong. The trainer tests multiple strategies, and by combining the best of each set of 32 strategies, it eventually homes in on and refines a particular strategy.

In the video below, I trained two different AIs with the same genetic algorithm for 10 generations each. Out of curiosity, I pitted them against one another to see what would happen. What I find fascinating is that the left AI developed an offensive strategy, while the right AI developed a defensive one.

Genetic algorithms are a long-established branch of machine learning research. The point is that if we want to develop stronger and better computers, ones that can actually think the way humans do, then we need to make computers that can learn to do things, not just do things. It is a different paradigm of programming: in traditional programming, we make solutions; in machine learning, we find solutions. That, in my opinion, is part of what makes the field so interesting.
