AlphaGo vs. Lee Sedol: time spent pattern comparison
In my latest blog posts on AlphaGo vs. Lee Sedol, I uploaded some graphs that
clearly showed how Lee Sedol and AlphaGo used their time differently in the
Google DeepMind Challenge. Recently, I came across some really interesting
articles and lecture videos on distance measures, including Levenshtein distance,
and thought it would be interesting to compare the time-spending patterns of
human and machine! (Even though we already know that AlphaGo is a man-made
machine.)
Thinking Time Remaining
The following custom function ‘preprocessing’ 1) reads the raw CSV file and 2)
calculates the time spent between consecutive turn indices. It’s basically the
same code as (yet more concise than) my previous IPython notebook uploaded here.
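A minimal sketch of what such a preprocessing step might look like is shown here. The column names ‘turn’, ‘player’ and ‘remaining’ and the 7200-second starting allowance are assumptions made for illustration, not the layout of the actual raw file, and the tiny inline CSV is made-up sample data.

```python
import io

import pandas as pd

# Made-up sample data; the real file's columns may be named differently.
SAMPLE_CSV = """turn,player,remaining
1,Lee Sedol,7190
2,AlphaGo,7140
3,Lee Sedol,7100
4,AlphaGo,7080
"""


def preprocessing(csv_source, initial_seconds=7200.0):
    """Read a time log and compute the seconds spent on each turn.

    Assumes hypothetical columns: 'turn' (move index), 'player', and
    'remaining' (thinking time left, in seconds, after the move).
    """
    df = pd.read_csv(csv_source).sort_values("turn").reset_index(drop=True)
    # Remaining time the same player had after their previous move;
    # for a player's first move, fall back to the assumed starting allowance.
    previous = df.groupby("player")["remaining"].shift(1).fillna(initial_seconds)
    df["spent"] = previous - df["remaining"]
    return df


log = preprocessing(io.StringIO(SAMPLE_CSV))
print(log[["turn", "player", "spent"]])
```

Grouping by player before shifting matters because the two clocks run independently: each move should be compared with the same player’s previous clock reading, not with the opponent’s.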
Stringify(?) the thinking time
To make it easier for my laptop to guess how different AlphaGo’s Game 1 is from
its Game 2, I’m going to transform the float data into strings according to how
long each move took, using the following custom function.
It really depends on how you want to design your stringified thinking time. I
designed it to cluster hasty moves, normal moves and moves with prolonged
thought. The pandas ‘map’ function makes it super easy to apply this to every
value in the DataFrames!
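As a rough illustration of the idea, one way to bucket the times is sketched here. The cut-offs (30 and 120 seconds) and the letters ‘h’, ‘n’ and ‘p’ are hypothetical choices, not the ones used for the results in this post.

```python
import pandas as pd


def stringify(seconds):
    """Map a thinking time in seconds to a one-letter class.

    The 30 s and 120 s cut-offs are hypothetical, chosen only for this sketch.
    """
    if seconds < 30:
        return "h"  # hasty move
    if seconds < 120:
        return "n"  # normal move
    return "p"  # prolonged thought


# Made-up thinking times in seconds.
spent = pd.Series([5.0, 45.0, 200.0, 61.5])

# Series.map applies the function to every value.
symbols = spent.map(stringify)

# Join the letters into a single string, one character per move.
time_string = "".join(symbols)
print(time_string)  # hnpn
```

Using one character per move keeps the string length equal to the number of moves, so an edit in the string corresponds to exactly one move being different.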
Alright, now that the data is good to go, let’s move on to the distance measure.
Levenshtein distance is one of the distance measures we can use to tell how
different two given strings are. Let’s say you are given ‘Dorian’ and
‘Durians’. Dorian is my pet cat, and Durians is the plural form of my least
favourite fruit. Anyway, the Levenshtein distance is the minimum number of
single-character deletions, insertions or replacements needed to make one
string exactly the same as the other. So in our example, to turn ‘Durians’ into
‘Dorian’ we need to change u->o (1 point) and delete the s at the end (1
point), so their Levenshtein distance is 2.
Levenshtein distance between Dorian and Durians is 2.
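For readers who want to check the number themselves, here is a small sketch of the standard dynamic-programming way to compute it in plain Python. It is not necessarily the implementation used for this post; a dedicated library would do the same job.

```python
def levenshtein(a, b):
    """Return the Levenshtein distance between strings a and b."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # delete char_a
                    current[j - 1] + 1,      # insert char_b
                    previous[j - 1] + cost,  # replace (or keep if equal)
                )
            )
        previous = current
    return previous[-1]


print(levenshtein("Dorian", "Durians"))  # 2
```

Only two rows of the table are kept at any time, so memory grows with the length of the shorter-lived row rather than with the product of both string lengths.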
Levenshtein Distance on AlphaGo and Lee Sedol
Okay! We already know from my previous visualisations that AlphaGo’s
time-spending habit was significantly different from Lee Sedol’s, so it’s not
worth coding up the distance between the two. What about each player compared
with himself across games? AlphaGo’s thinking time didn’t vary much, but how
about Lee Sedol? In the second game he took a highly defensive position,
whereas in the third game he started off with an offensive position. Did his
time-spending habit change as the Google DeepMind Challenge proceeded? (It’s
interesting that the same logic is widely used in bot-detection practices in
the gaming industry and in fraud-detection algorithms in financial trading.)
I’ve made the following custom function to 1) prepare the 5 time logs each of
AlphaGo and Lee Sedol, 2) calculate the Levenshtein distance between each pair
of their games and 3) present pretty heatmaps side by side. The third parameter
‘length’ is the cut-off at which each time log is sliced, which lets us make
more comparisons. It’s a common practice to slice time frames so that the
differences in distance stand out more: the longer the logs are, the more
heterogeneous they tend to become.
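The comparison step could be sketched roughly as follows. The function name ‘distance_matrix’, the letters in the sample strings and the strings themselves are made up for illustration; only the ‘length’ cut-off mirrors the parameter described above, and the plotting part is left out.

```python
def levenshtein(a, b):
    """Return the Levenshtein distance between strings a and b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost)
            )
        previous = current
    return previous[-1]


def distance_matrix(time_strings, length=None):
    """Pairwise Levenshtein distances between stringified time logs.

    Each log is cut to its first `length` characters before comparing;
    length=None compares the full logs.
    """
    sliced = [s[:length] for s in time_strings]
    return [[levenshtein(x, y) for y in sliced] for x in sliced]


# Made-up stringified logs for three games of one player.
games = ["hhnnp", "hhnnn", "pnnhh"]

matrix = distance_matrix(games, length=3)
print(matrix)  # [[0, 0, 2], [0, 0, 2], [2, 2, 0]]
```

The nested list is symmetric with zeros on the diagonal, and can be handed to a plotting library to draw one heatmap per player.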
It turns out (as expected) that Lee Sedol’s games were much more diverse than
AlphaGo’s games. The homogeneity of AlphaGo’s moves is not surprising, but its
Game 5 was quite different from all the others. Lee Sedol’s Levenshtein
distances are all higher than 55. As with AlphaGo, the odd one out among his
games was Game 5.
Well, to my surprise, AlphaGo’s Levenshtein distances were a little bit higher
than I expected. If so, how can we detect whether it’s AlphaGo playing under a
disguised identity in an online Go match? To bring justice, we need to slice
the time log. Let’s focus on the very first 10 characters of each stringified
log.
Now we’ve got it. Just by looking at the first 10 entries of the time logs, we
can successfully tell AlphaGo from Lee Sedol! There are more distance measures
like Hamming
distance, Jaro–Winkler distance and so on! I’ll cover more of these in my future