In this Tech Talk, Tim, a Software Engineer here at Fog Creek and 2-dan amateur Go player, talks about Go (the board game), Artificial Intelligence and attempts to create computer programs that can beat human players. He gives an overview of Go, explains how to play it and why Go AI is hard. He finishes by describing the progress so far with Go AI programs and what the future is likely to hold.
About Fog Creek Tech Talks
At Fog Creek, we have weekly Tech Talks from our own staff and invited guests. These are short, informal presentations on something of interest to those involved in software development. We try to share these with you whenever we can.
Content and Timings
- Introduction (0:00)
- Overview of Go (0:32)
- How to Play Go (4:37)
- Go Artificial Intelligence (12:22)
- Progress with Go AI (20:04)
Alright, so I’m going to talk a little bit about the board game Go. I’m sorry to all of you who are hoping for Go Lang discussion here. And also some of the progress that has been made with Artificial Intelligence and trying to get computers to be as good as humans at Go.
So first I’m going to tell you a little about the game, then I’m going to teach you how to play, then we’ll talk about a few things that come up in the game that make Artificial Intelligence for Go difficult and then take a look at the state of the art and progress that has been made.
Overview of Go
So Go actually has three names, and you probably haven’t heard of the others. Wei-chi is the Chinese name and Baduk is the Korean name. Go came here from Japan, so most people in the US call it Go. It’s around 3 or 4 thousand years old, and it’s the oldest game that is still played in its original form. It was one of the ‘Four Accomplishments’ that were required of Chinese gentlemen back in the day. So, you needed to be able to do calligraphy and painting, you needed to be able to play the lute, and you needed to be able to play Go. So after this talk you’ll be a quarter of the way there at least.
In Japan, the Shogun Tokugawa established 4 Go schools around the 1600s. Each of those schools was supposed to work on the game, try to perfect the game, and then once a year there would be a large tournament called the Castle Games. It wasn’t large in terms of people, each school was allowed to send one representative, but it was a big deal. And the winner of the Castle Games got a cabinet-level position in the Government. So there was a lot of incentive to improve at Go, and this is where Go skill really started to blossom, I guess you could say. Then in the 1900s it became possible to make a living playing Go. In Japan, a professional system was created in the 20s and newspapers started offering large prizes for tournaments. Currently the best you can do at Go, if you win a large tournament, is win half a million dollars. So, not too shabby. Professional systems were established later on in Korea and China, and Go is more popular there now than any game is in the US.
This is a recent street fair in Korea, and you can see there are a few people who know how to play Go on the street there. I think there are about 500 players, and you can see the people in the middle wearing the yellow sleeves are the professionals that were there. So they are all playing something like 8 or 10 simultaneous games against passers-by who just stopped off to play.
So, what is Go like? It is a two-player strategy game, something like Chess. There’s no hidden information or randomness. Usually we play on a 19×19 board, but there are two smaller sizes that are used to teach beginners, or just for a shorter game. And the basic goal of the game is to surround territory. Also within Go there are rankings and a handicap system. So you start out as a beginner, you start out around 25 kyu, somewhere at the bottom and then you progress upwards. So you can go 24 kyu, 23 kyu, not all of the ranks are shown on this chart. And once you get up to 1 kyu as an amateur, you switch to 1 dan, which is basically like a Black belt, and then you work your way up to 6 or 7 dan. So you can see on the left here the EGF rankings, the European Go Federation uses numbers instead of the rankings. And those correspond roughly to Elo ratings, so if you’re familiar with Chess ratings they’re something like that.
Once you get to the strongest amateur levels, there are professional levels above that. And those are not the same distance apart, so you can see the third row from the top here is 2700 and that’s a 1 dan professional, and then the top professionals in the world are 9 dan and the difference in rating points is only 240 there. The difference in rankings is determined by the number of handicap stones that a player would need to take playing against someone stronger or weaker. So if you’re three ranks apart, then the weaker player would start with three stones on the board and then you’d be able to have a good game. That’s actually one of the really cool things about Go: you can play someone who is quite a bit stronger or weaker than you and still have a good game, and it doesn’t really change the feel of the game very much. In Chess, it’s kind of hard to do that, right? If you’re 400 points weaker than somebody in Chess, they’re going to beat you every time. And you can try something like spotting them a piece or the Queen or something like that, but it sort of changes the game. You’re not really playing the same game anymore. So in Go you can start with a handicap and it still feels the same way.
So this is what a finished game looks like. Your pieces are actually played on the intersections of the board. That’s a common beginner mistake: in almost every other game we know you play in the spaces, but in Go you play on the intersections. And the idea is to surround space. The way it works is that players put a stone on an intersection on their turn. So you add a stone to the board, and that’s it. The stones don’t move around. They can be captured, so if they get completely surrounded then you can remove them from the board. But they just stay in place once they are added.
How to Play Go
Alright, so now I’m going to teach you the game. There are only three rules. One is the rule of capture. Any stone on the board has breathing spaces next to it, called liberties, and you can see the liberties are marked on the diagram. So the stone in the corner has two, the stone in the middle has four, and stones that are connected horizontally or vertically share their liberties. So this group of two stones on the right here has six liberties and the one at the top has four. Those stones will live or die as a group.
If you fill the liberties around the stones, they are captured. So here are some stones that are almost captured. This is called being ‘in atari.’ It’s kind of like check in Chess. If it’s black’s move, black could capture any one of these four groups on the next move, and that actually is where the name of the company Atari came from. Nolan Bushnell is a big Go fan. So if it’s black’s move, black could play on the last liberty for any of these white groups, and then whichever one he played on would disappear. He obviously couldn’t play all four of those moves at once. So those stones would be captured and removed from the board, and each one of those stones is worth a point at the end of the game. If it was white’s turn when white got into atari, then white could play another stone connected to the ones that are in danger and gain more liberties for them. So it could extend out of atari and try to get away. You can see the stones on the top edge here each now have two liberties – one to the right and one below the last stone. And the ones in the middle that have extended each have three liberties now. So it’s going to be harder to capture those. Alright, so that’s rule one.
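To make the capture rule concrete, here’s a small Python sketch (my own illustration, not something from the talk) that flood-fills a group of connected stones and collects its liberties. The board encoding, a dict mapping intersections to colours, is an assumption for the example:

```python
def group_and_liberties(board, size, row, col):
    """Return (group, liberties) for the stone at (row, col).

    `board` maps (row, col) -> 'B' or 'W'; empty points are absent.
    Stones connected horizontally or vertically share liberties.
    """
    color = board[(row, col)]
    group, liberties = set(), set()
    frontier = [(row, col)]
    while frontier:
        r, c = frontier.pop()
        if (r, c) in group:
            continue
        group.add((r, c))
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if not (0 <= nr < size and 0 <= nc < size):
                continue  # off the edge of the board
            neighbour = board.get((nr, nc))
            if neighbour is None:
                liberties.add((nr, nc))   # empty point: a liberty
            elif neighbour == color:
                frontier.append((nr, nc))  # same colour: same group
    return group, liberties
```

This matches the counts in the diagram description: a lone corner stone has two liberties, and a pair of connected stones in the open has six.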
Rule two, very simple. You are not allowed to commit suicide. So white cannot play any of these four points because it would cause the stone just played to die.
The third rule is the rule of Ko. Ko in Japanese means eternity, and this rule is there to prevent infinite loops in the game. So if you have a shape like this, black can capture a white stone. And if he captures that white stone, now you’re in the symmetrical situation and white could capture and take us right back to where we started. That is not allowed; we can’t have any infinite loops. The simplest way to say this is that the whole board position cannot be repeated. So after black captures the white stone, white cannot recapture immediately. White has to go change something else on the board and then on his next turn can come back and capture the black stone. And that actually leads to some complexity that we will see later on. That’s it. Three rules: you can capture stones by surrounding them, you can’t commit suicide, and you can’t keep repeating the same position over and over. So you’re all ready to go play Go now.
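The “whole board position cannot be repeated” formulation (known as positional superko) suggests a very direct implementation sketch: remember a snapshot of every position that has occurred, and reject any move that would recreate one. This is my own illustration, and the board encoding is an assumption:

```python
def snapshot(board):
    """Hashable snapshot of the whole board (a dict of point -> colour)."""
    return frozenset(board.items())

def is_legal_under_ko(past_positions, candidate_position):
    """A move is illegal if the position it produces has occurred before."""
    return candidate_position not in past_positions
```

In a simple ko, white’s immediate recapture would recreate the position before black’s capture, so the check rejects it; after white plays elsewhere and the board changes, the recapture produces a new whole-board position and is allowed.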
The scoring is based on the territory that you have surrounded. The way the game ends is that both players pass. You’ll get to a position at some point where playing in your own territory would be bad, because you’d be filling up your own points that you have surrounded, and playing in the opponent’s territory would be bad, because the stones that you play on their territory would end up getting captured. So there’s nothing productive to do, and at that point you pass, and then if the other player passes, the game is over.
So you count the spaces that you have surrounded, and they’re each worth 1 point, so the empty points here with the white and black squares on them are a point each. The black group and white group that have the squares on them are dead. This is actually one of my games, and at the end of the game the players, we, agreed that the black group in the upper left there is dead, along with the white group in the lower right. So at the end of the game, here we’ve got black territory in the lower left, in the upper right and in the lower right. Then we have some white territories. We count the empty points, we also take the agreed dead stones off the board, and each of those is worth a point to the player that captured them. Then we add that all up and white gets 6.5 extra points, which are called komi, and that’s because black has the advantage of going first. So we’re compensating for that advantage. And then whoever has the most points after that wins the game.
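The counting at the end is just arithmetic, so here is a minimal sketch with made-up point totals, mainly to show where komi enters:

```python
def score_game(black_territory, black_prisoners,
               white_territory, white_prisoners, komi=6.5):
    """Territory scoring as described: each surrounded empty point and
    each captured or agreed-dead stone is one point; white adds komi
    to compensate for black moving first."""
    black = black_territory + black_prisoners
    white = white_territory + white_prisoners + komi
    if black > white:
        return 'Black', black - white
    return 'White', white - black
```

With a half-point komi like 6.5 there can never be a tied game, which is part of why it is a popular value.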
OK, so the rules are extremely simple, but there are some interesting things that come out of them. One of these is called a ladder. So here we’ve got a situation where white has some stones that are almost surrounded. He’s got one liberty left, and if it’s white’s move, white could try to run away. So white could extend the stones, and now he has two liberties. But in this kind of shape black can play atari again, and then white can try to run away again, and black can play atari again. And you can see where this is heading, right? White is going to run into the wall and then he’s not going to have anywhere else to run. So these white stones will eventually be captured. A few moves from now it will look like this. White has one liberty, at d19 up here, and if white plays on that point, he will still only have one liberty, so the stones will get captured on the next move by black. So if you have a ladder like this, the stones can run away for a while, but then they’ll run into the edge of the board and they will die.
Things get a little more interesting if there’s a white stone in the way. So in this case we have a white stone along the path of the ladder, and that’s called a ladder breaker. This ladder isn’t going to work for black. If white starts running away, black can make him zig-zag back and forth a few times, and then white will connect to the ladder breaker. So at this point it’s actually black’s move, white has just connected to the ladder breaker, and now black has some serious problems. The two points marked with circles are big weaknesses for black, and black only gets to play one stone. So black can only fix one problem or the other, and then white’s going to play the other point. Both of these circle-marked points threaten two black stones at once, and black will again only be able to save one of the two. So white is actually going to bust out of this ladder and then all of black’s stones are going to be in trouble. So you really only want to play a ladder if it works. It’s very, very bad to play a ladder that doesn’t. And we’ll see why that’s important a little bit later on.
Here we have a white group that is completely surrounded by black and has one liberty in the middle of the group. If it’s black’s turn, black is allowed to play in the center of this group. When we play the black stone, the black stone that was just played and the white stones will not have any liberties, but we first remove the opponent’s stones before we check to see if our move was suicide. So we’re allowed to go ahead and play here, because the white stones are going to die, and that will give the black stone breathing space. So this is possible, but if white has a group that looks like this, there’s nothing that black can do, because white has two liberties and either of those moves would be suicide for black. There’s no way to ever capture this white group. OK, so this is where life comes from. If you can get two eyes for your group then your group will be alive. And there are situations like this, where white has a group that is almost alive, and it depends whose turn it is. In this case, if white plays in the middle, white will have two eyes and white will live. If black plays in the middle, white can no longer make two separate eyes, and so this group is going to die.
And then with larger groups, you’ll have situations where you don’t actually play it out. So this white group actually has two points in the middle that white could play on to make the group alive, and if black takes one point, white will take the other. This is actually the most common situation. You’ll have a group that’s big enough that it can make two eyes no matter what black does. And so, experienced players won’t try to kill this group – they’ll just know that it is alive and leave it alone.
Go Artificial Intelligence
Alright, so let’s talk about Artificial Intelligence a little bit. Here are a few games and how we’re doing at creating AIs to beat people at them. Tic-Tac-Toe, obviously trivial. Checkers has 10^20 positions, and the AI strength is perfect: in the last few years Chinook, a Checkers program, has come to play perfect Checkers, and they did that by evaluating the entire game tree. So they know that Chinook can’t be beaten; the best you can hope for is a draw. Othello has more positions, and AI is super-human at Othello. And then we have Chess and two types of Go. 9×9 Go is Go on a really small board, which you would sort of start out on as a beginner, or just play to have a quick game that lasts maybe 10 minutes. It has 10^38 positions and computers are now competing with the best people. Chess has more positions than that, and as we all know, Chess computers are better than humans. And then 19×19 Go has an enormous number of positions, 10^172, and the best computers right now are at the strong amateur level.
The way that Chess AI was approached is through tree search, an evaluation function and Alpha-Beta pruning. And that’s how the first Go AIs worked as well. We can look at any Chess game as a tree of possible board states, right? We have the initial state at the top and then we have a branch down from that for each legal move. Then from each of those positions we have more branches for the legal moves from there, and we get some gigantic tree that contains every possible position. We can search within that tree to find good moves, moves that lead us to good situations and avoid bad situations. So what we can do then is look at all of the possible positions that moves will lead to, and try to evaluate them. We look at a position and we try to decide whether it’s good for white or good for black. Basically you get a positive number if it is good for white and a negative number if it is good for black. So this happens to be good for white, maybe. I may have that backwards. So anyway, Chess AIs could evaluate the board, and then you take your tree and you basically chop off branches that lead to bad positions for the computer. So you make moves that avoid those branches. You can prune the tree and then analyse the parts of the tree that you’re actually going to allow to happen by making good moves, and try to find a path to a winning position.
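For readers who want to see the tree search idea in code, here is a minimal, generic alpha-beta sketch. The `legal_moves`, `apply_move` and `evaluate` callables are placeholders a real engine would supply; this illustrates the technique the talk names, not any particular Chess or Go engine:

```python
def alphabeta(position, depth, alpha, beta, maximizing,
              legal_moves, apply_move, evaluate):
    """Minimax search with alpha-beta pruning.

    `evaluate` returns positive numbers for positions good for the
    maximizing player, negative for the other, as described above.
    """
    moves = legal_moves(position)
    if depth == 0 or not moves:
        return evaluate(position)
    if maximizing:
        best = float('-inf')
        for m in moves:
            best = max(best, alphabeta(apply_move(position, m), depth - 1,
                                       alpha, beta, False,
                                       legal_moves, apply_move, evaluate))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # prune: the opponent will never allow this branch
        return best
    else:
        best = float('inf')
        for m in moves:
            best = min(best, alphabeta(apply_move(position, m), depth - 1,
                                       alpha, beta, True,
                                       legal_moves, apply_move, evaluate))
            beta = min(beta, best)
            if alpha >= beta:
                break  # prune symmetrically for the minimizing player
        return best
```

On a toy two-level tree where positions are nested tuples and leaves are their own evaluations, the search returns the classic minimax value while skipping branches that cannot change the outcome.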
So Go AI started off doing this. There were basically two other ingredients at the beginning. There was an opening book, where standard openings were programmed into the computers. But in Go that doesn’t work very well, because the standardized openings might last 15 or 20 moves out of a 200 or 300 move game, and that’s just not a big advantage. And then there was some pattern recognition work done for small positions, but beyond that they basically just attacked the problem using tree search. One of the big problems with that is that the branching factor in Go is much larger than in Chess. In Chess on average there are something like 45 legal moves and a game lasts about 50 moves, so the tree isn’t that deep. In Go there are on average 250 legal moves and the tree is 300 levels deep, instead of 50. So there’s a much larger number of positions to search if you want to make tree search productive.
But that’s not the only problem. Actually in Go the board gets more complicated as the game goes on. So one of the things that’s nice about Chess is that when you get to an end-game position, depending on what pieces are left, it can just be a solved problem. So you can have an end-game database that just says, you know, this huge class of positions all result in a win for white and if you can reach one of those in your tree search, then you don’t have to do any more work. But in Go actually, as you get closer to the end of the game, the game gets more complicated. So that is not helpful. Another thing is that it’s hard to prune branches. You can’t even do things like, say if a move leads to an opponent capturing a large group of stones, then we’re going to ignore that branch because even sacrificing large groups of stones can be a good strategy. So throwing those options out is not really a good way to get a strong program.
And then the biggest problem is that evaluating a position is difficult. In Chess it turns out it’s not that hard: you have a simple rule that does a pretty good job of telling you who is winning the game. A big piece of that is just which pieces have been captured, right? Which player still has more pieces on the board. In Go it’s much, much harder to look at a board position and decide if it’s even good for white or black, and I’m going to show you a couple of the reasons for that. One of the big reasons is that Go has a large board. There are 361 points, and you would like to be able to break that down into pieces and analyze the pieces independently, because then the computer can sort of play out all of the possible moves in that area that make sense and decide who is going to win a fight in a small area. But it turns out you can’t do that, because of some non-local effects in the game. We’ve already seen one of these – ladders. And a ladder can result from a complicated fight. So here in the lower-right corner we have a very simple ladder. But you can imagine a fight where a bunch of different groups of stones are fighting for liberties and trying to capture each other, and then at some point that fight results in some group of stones running away and getting caught in a ladder. That ladder can actually go all the way across the board, so that fight depends on the results in a totally different part of the board, right? The computer may decide that this fight in the lower right depends on the ladder; well, now any move that white makes in the upper-left corner of the board affects that fight. So you really can’t take pieces of the board and analyze them independently when you have something like that happen.
There’s another problem that makes this even more difficult, and that’s the Ko rule. So here we have a black group which is on the edge of being alive. If it’s black’s turn, black can fill in this point, and then black has two separate eyes. So black is alive. If it’s white’s turn, white can capture that single black stone, and if white can get another move in this area, white will then capture the three black stones on the right, because they are now in atari. Because of the Ko rule, black cannot recapture white’s stone immediately, so black can’t take us back to that previous position. So what black has to do is find a threat somewhere else on the board that white will answer, and then after white answers it, black is allowed to come back here and recapture the stone. And so this fight, deciding the life of this group, can rely on threats throughout the board. So black needs to find a threat. This group living or dying is worth maybe 25 points, so black needs to find a threat elsewhere that’s worth 25 points, and then if white answers the threat, black can capture the Ko. So we’ll go back to this position. And now white will try to find a threat worth 25 points somewhere, and if black answers that, white will recapture, and so this can go back and forth for quite a long time, using threats from all over the board. Finally either black will win and connect and make the group alive, or if white wins, white will play another move here and capture the three stones, and then black’s group is dead because it’s only got one eye left. So Ko is another issue. Ko fights can arise, and then the computer, in order to figure out what’s going to happen with a single group in one particular area, has to actually look at the entire board and try to figure out what’s actually going to happen. I’m getting depressed, this is making me not want to write a Go AI…
Progress with Go AI
Alright, so, what is the state of the art? We started off with tree search and trying to make the evaluation functions better and prune the tree, basically taking the same approach as Chess. That worked OK up to a point; it got computers somewhere around the ten kyu level. Then there was sort of a breakthrough, and Go AI started to use Monte Carlo Tree Search. What we do with Monte Carlo Tree Search is we take a candidate move. So these moves near the top here are candidate moves, these leaf nodes that are marked. What you do is play games out from that candidate move, but you don’t use a smart AI to play the rest of the game, because that’s way too slow. You basically just play the moves out randomly from that point until the end of the game. Then you see who won the game, and you do that over and over, and you keep stats back in the candidate move node about how many of those games were wins for black and how many were wins for white. It sounds kind of counter-intuitive that this would work at all. It seems like it would just be playing lots of stupid moves, and the results wouldn’t really be affected much by the move you’re trying to evaluate. But it turns out this works really well. You can actually play a whole bunch of stupid moves and see who wins, and then do that thousands of times, and that will actually give you a pretty good idea of how good the move that you’re considering is. So that turned out to help quite a bit, and then there’s another modification that has been applied to this recently, called UCT. What they do there is, rather than evaluate each candidate move evenly, UCT tries to work out which of the moves are likely to be better and which are likely to be worse.
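The random-playout idea can be sketched in a few lines. The game interface here (`copy`, `play`, `legal_moves`, `over`, `winner`, `to_move`) is hypothetical, standing in for whatever engine you have; the point is just playing each candidate move out randomly many times and keeping win statistics:

```python
import random

def monte_carlo_evaluate(game, candidate_moves, playouts=1000):
    """Estimate each candidate move's strength by finishing the game
    with uniformly random moves `playouts` times and recording how
    often the side to move ends up winning."""
    stats = {}
    for move in candidate_moves:
        wins = 0
        for _ in range(playouts):
            g = game.copy()
            g.play(move)
            while not g.over():
                g.play(random.choice(g.legal_moves()))  # random playout
            if g.winner() == game.to_move():
                wins += 1
        stats[move] = wins / playouts
    best = max(stats, key=stats.get)
    return best, stats
```

Plugging in even a trivial game (the test below uses a tiny take-1-or-2 Nim pile) shows the counter-intuitive part: purely random playouts still separate good first moves from bad ones once you average over enough games.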
One of the problems with the original Monte Carlo Tree Search is that it would look at moves that it thought were good and spend a lot of time on them, but there might be a kind of surprising move that turns out to be really good, and it would not evaluate that move enough to find out it was good. So UCT tries to balance that out and give a decent amount of analysis even to moves that initially look bad. That was a further refinement, and then just a few weeks ago there was a paper published on a new Neural Network approach to Go, and this looks extremely promising. They took a Neural Network and trained it on 160,000 professional games, and the way that this Neural Network works is it just looks at the current board. It doesn’t look at the history of the game or anything like that. It just looks at the current state and predicts where the next move will be, and just doing that actually gave a workable AI. It beat a version of GnuGo that is about 7 kyu, and it beat it 90% of the time. That’s actually surprisingly good, because this approach is not using any of the other work that’s already been done. So once this is merged into the existing techniques, it seems like it will probably be another jump in strength.
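The balancing rule at the heart of UCT is usually the UCB1 formula: a move’s observed win rate (exploitation) plus a bonus that grows for rarely-tried moves (exploration). A sketch; the constant c ≈ √2 is a common default, not something from the talk:

```python
import math

def ucb1(wins, visits, total_visits, c=math.sqrt(2)):
    """UCB1 score: win rate plus an exploration bonus that shrinks
    as a move gets tried more often."""
    if visits == 0:
        return float('inf')  # always try an unvisited move first
    return wins / visits + c * math.sqrt(math.log(total_visits) / visits)

def select_move(stats):
    """`stats` maps move -> (wins, visits); pick the highest UCB1 score."""
    total = sum(visits for _, visits in stats.values())
    return max(stats, key=lambda m: ucb1(*stats[m], total))
```

Notice how a move with a slightly worse win rate but far fewer visits can outscore the current favourite, which is exactly the "give surprising moves a fair look" behaviour described above.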
Alright, so here’s the history of progress. The Chess programs definitely got off to a better start. Go was pretty bad until about 1990 or so, and then there’s sort of a hockey stick effect over here on the right. That is where Monte Carlo Tree Search started being used. So we have a much steeper slope after that; there’s been much more progress in the last 10 or 15 years. If you look at the list here, in 1998 pros could beat the best programs while giving them a 25 stone handicap. That is a truly absurd handicap. For any of you sitting there who just heard the rules for the first time, I don’t think I could beat you with a 25 stone handicap. And if I could, I’d do it one time, and then the second time you would annihilate me. Alright, 25 stones is a huge handicap, so there was a gigantic gap between the best programs and the pros. Then around 2008 a program beat a pro for the first time. MoGo, running on 256 cores, beat Catalin Taranu on a small board, that is, a 9×9 board, which is a much easier task than the large board. But still, impressive. And then that same year MoGo, running with 800 cores, beat a 9 dan professional with a 9 stone handicap, this time on a full-size board. In the years since then the handicaps have been getting smaller and smaller. You can see down here at the bottom, in 2013 Crazy Stone beat a 9 dan pro on a full board with only 4 stones. That’s an impressive accomplishment. And then this year, Crazy Stone beat another 9 dan pro with 4 stones but won by a wide margin. So it’s looking like maybe the difference is 3 ranks now between Crazy Stone and some of these top professionals. It’s actually getting closer, starting to breathe down our necks. It looks like we’ll probably see a computer that can compete with the best professionals in the next 10 or 15 years.