In March 2016, Lee Sedol, the Korean 18-time world Go champion, played and lost a five-game match against DeepMind’s AlphaGo, a Go-playing program that used deep learning networks to evaluate board positions and possible moves. Go is to chess in difficulty as chess is to checkers. If chess is a battle, Go is a war. A 19×19 Go board is much larger than an 8×8 chessboard, which makes it possible to have several battles raging in different parts of the board. There are long-range interactions between battles that are difficult to judge, even by experts. The total number of legal board positions for Go is 10^170, far more than the number of atoms in the universe.
It came as a shock to many when AlphaGo won the first three of five games, exhibiting an unexpectedly high level of play. In addition to the deep learning networks it used to evaluate the board and choose the best move, AlphaGo had a completely different learning system, one used to solve the temporal credit assignment problem: which of the many moves made during a game were responsible for a win, and which were responsible for a loss? On January 4, 2017, a Go player on an Internet Go server called “Master” was unmasked as AlphaGo 2.0 after winning sixty out of sixty games against some of the world’s best players, including the world’s reigning Go champion, the nineteen-year-old prodigy Ke Jie of China.
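The credit assignment idea can be sketched in a few lines. The discounted-return scheme below is a standard illustration from reinforcement learning, not AlphaGo’s actual algorithm; the discount factor `gamma` is an illustrative choice.

```python
def discounted_returns(rewards, gamma=0.9):
    """Propagate a final reward back to earlier moves.

    Each move is credited with the discounted sum of all rewards that
    follow it, so a win at the end assigns some credit to every move
    that led there -- more to recent moves, less to early ones.
    """
    returns = []
    running = 0.0
    for r in reversed(rewards):  # walk backward from the final outcome
        running = r + gamma * running
        returns.append(running)
    returns.reverse()  # restore move order
    return returns

# A five-move game where only the final outcome is rewarded (+1 for a win):
credits = discounted_returns([0, 0, 0, 0, 1])
# earlier moves receive exponentially less credit for the win
```

In a learning system, these per-move credits would then drive updates that make highly credited moves more likely in similar positions.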
The next chapter in the Go saga is even more remarkable, if that is possible. AlphaGo was jump-started by supervised learning from 160,000 human Go games before playing itself. Some thought this was cheating – an autonomous AI program should be able to learn how to play Go without human knowledge. In October 2017, a new version, called AlphaGo Zero, was revealed that learned to play Go starting with only the rules of the game; it trounced the version of AlphaGo that had defeated Lee Sedol, winning 100 games to none, and went on to surpass AlphaGo Master, the version that beat Ke Jie. Moreover, AlphaGo Zero learned 100 times faster and with 10 times less compute power than AlphaGo Master. By completely ignoring human knowledge, AlphaGo Zero became super-superhuman.
Although a pocket calculator can crush me in an arithmetic contest, it will never improve its speed or accuracy, no matter how much it practices. It doesn’t learn: for example, every time I press its square-root button, it computes exactly the same function in exactly the same way. The ability to learn is arguably the most fascinating aspect of general intelligence. Fluid intelligence follows a developmental trajectory, reaching a peak in early adulthood and decreasing with age, whereas crystallized intelligence increases slowly and asymptotically as you age until fairly late in life. AlphaGo displays both crystallized and fluid intelligence in a rather narrow domain, but within this domain, it has demonstrated surprising creativity. Professional expertise is also based on learning in narrow domains. We are all professionals in the domain of language and practice it every day.
Finding the answer to a difficult question corresponds to computing a function, and appropriately arranged matter can compute any computable function. For matter to learn, it must rearrange itself to get better and better at computing the desired function, all while simply obeying the laws of physics. There is no known limit to how much better AlphaGo might become as machine learning algorithms continue to improve. What is fuelling these advances is gushers of data. Data are the new oil. Learning algorithms are refineries that extract information from raw data; information can be used to create knowledge; knowledge leads to understanding; and understanding leads to wisdom. Welcome to the brave new field of deep learning.
Deep learning is a branch of machine learning that has its roots in mathematics, computer science, and neuroscience. Deep networks learn from data the way that babies learn from the world around them, starting with fresh eyes and gradually acquiring the skills needed to navigate novel environments.
The origin of deep learning goes back to the birth of artificial intelligence in the 1950s, when there were two competing visions for how to create an AI: one vision was based on logic and computer programs, which dominated AI for decades; the other was based on learning directly from data, which took much longer to mature.
Today computer power and big data are abundant and solving problems using learning algorithms is faster, more accurate, and more efficient. The same learning algorithm can be used to solve many difficult problems; its solutions are much less labor intensive than writing a different program for every problem.
The reinforcement learning algorithm used by AlphaGo can be applied to many problems. This form of learning depends only on the reward given to the winner at the end of a sequence of moves, which, paradoxically, can improve decisions made much earlier. When coupled with many powerful deep learning networks, this leads to many domain-dependent bits of intelligence. And, indeed, cases have been made for different domain-dependent kinds of intelligence: social, emotional, mechanical, and constructive, for example.

The age of cognitive computing is dawning. Soon we will have self-driving cars that drive better than we do. Our homes will recognize us, anticipate our habits, and alert us to visitors. With cognitive computing, doctors’ assistants will be able to diagnose even rare diseases and raise the level of medical care. There are thousands of applications like these, and many more have yet to be imagined.

To make the point that less is more even more dramatically, AlphaZero, a successor built on the same approach, learned how to play chess at superhuman levels without changing a single learning parameter, making alien moves that no human had ever made before. AlphaZero did not lose a game to Stockfish, the top chess program, which was already playing at superhuman levels. In one game, AlphaZero made a bold bishop sacrifice, sometimes used to gain positional advantage, followed by a queen sacrifice, which seemed like a colossal blunder until it led to a checkmate many moves later that neither Stockfish nor humans saw coming. The aliens have landed, and the earth will never be the same again.