2016, Jul 07

AlphaGo vs. Lee Sedol: time spent pattern comparison

In my latest blogposts on AlphaGo vs. Lee Sedol, I uploaded some graphs that clearly showed how Lee Sedol and AlphaGo used their time differently in the Google DeepMind Challenge. Recently, I came across some really interesting articles and lecture videos on distance measures, including Levenshtein distance, and thought it would be interesting to compare the time-usage patterns of human and machine! (Even though we already know that AlphaGo is a man-made machine.)

Data Preprocessing

The following custom function ‘preprocessing’ 1) reads the raw CSV file and 2) calculates the time spent between consecutive turns. It’s basically the same code (only more concise) as in my previous IPython notebook, uploaded here.

In [115]:

In [117]:

In [119]:

In [120]:

In [121]:
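
As a rough illustration of the "time spent between each turn" step, here is a minimal sketch using only the standard library. The column names `turn` and `elapsed_seconds`, the helper name `time_spent_per_move` and the sample rows are all hypothetical, and the sketch assumes the log stores one player's cumulative elapsed time per turn; the real CSV may be laid out differently.

```python
import csv
import io


def time_spent_per_move(csv_text):
    """Return the seconds spent on each move of one player's log.

    Assumes (hypothetically) a 'turn' column and an 'elapsed_seconds'
    column holding the cumulative time used up to that turn.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = sorted(reader, key=lambda row: int(row["turn"]))
    spent = []
    previous = 0.0
    for row in rows:
        elapsed = float(row["elapsed_seconds"])
        # Time spent on this move = difference between consecutive readings.
        spent.append(elapsed - previous)
        previous = elapsed
    return spent


# Made-up sample data, only to show the shape of the result.
sample = "turn,elapsed_seconds\n1,12.5\n2,40.0\n3,41.5\n"
print(time_spent_per_move(sample))  # [12.5, 27.5, 1.5]
```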

Stringify(?) the thinking time

To make it easier for my laptop to guess how different AlphaGo is in Game 1 from Game 2, I’m going to convert each float (the thinking time of a move) into a string symbol according to how long it is, using the following custom function.

In [122]:

It really depends on how you want to design your stringified thinking time. I designed it to cluster hasty moves, normal moves and moves with prolonged thoughts. Pandas ‘map’ function makes it super easy to apply this to every value in the DataFrames!

In [125]:

In [126]:
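
A minimal sketch of this kind of bucketing, in plain Python. The cut-offs (30 and 120 seconds), the letters ‘a’, ‘b’, ‘c’ and the function names are illustrative placeholders, not necessarily the values used in the notebook.

```python
def stringify(seconds, hasty=30.0, prolonged=120.0):
    """Map one thinking time (in seconds) to a single-letter bucket.

    The thresholds are placeholder assumptions for illustration.
    """
    if seconds < hasty:
        return "a"  # hasty move
    if seconds < prolonged:
        return "b"  # normal move
    return "c"      # move with prolonged thought


def encode_game(times):
    """Turn a sequence of thinking times into one string, one letter per move."""
    return "".join(stringify(t) for t in times)


print(encode_game([5.0, 45.0, 300.0, 12.0]))  # abca
```

With pandas, a function like `stringify` can be passed to `Series.map` to convert a whole column in one go.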

Levenshtein Distance

Alright, now that the data is good to go, let’s move on to the distance measure. Levenshtein distance is one of the measures we can use to tell how different two given strings are. Let’s say you are given ‘Dorian’ and ‘Durians’. Dorian is my pet cat, and Durians is the plural form of my least favourite fruit. Anyway, the Levenshtein distance is the minimum number of single-character deletions, insertions or replacements needed to make one string exactly the same as the other. So in our example, to turn ‘Dorian’ into ‘Durians’ we need to change o->u (1 point) and insert s at the end (1 point), so their Levenshtein distance is 2.

Levenshtein distance algorithm from Wikibooks

In [156]:

In [157]:

Levenshtein distance between Dorian and Durians is 2.
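
For reference, here is a minimal sketch of the standard dynamic-programming approach to Levenshtein distance, keeping only two rows of the table. It is written from scratch for illustration and is not the Wikibooks listing itself.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions
    or replacements needed to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a  # the distance is symmetric; keep the shorter string as the row
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, char_b in enumerate(b, start=1):
            cost_left = current[j - 1] + 1           # extend with a character of b
            cost_up = previous[j] + 1                # drop a character of a
            cost_diag = previous[j - 1] + (char_a != char_b)  # match or replace
            current.append(min(cost_left, cost_up, cost_diag))
        previous = current
    return previous[-1]


print(levenshtein("Dorian", "Durians"))  # 2
```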


Levenshtein Distance on AlphaGo and Lee Sedol

Okay! We already know from my previous visualisations that AlphaGo’s time-spending habit was significantly different from Lee Sedol’s, so it’s not code-worthy to calculate the distance between the two. What about each player compared with their own other games? AlphaGo’s thinking time didn’t vary much, but how about Lee Sedol? In the second game he took a highly defensive position, whereas in the third he started off with an offensive position. Did his time-spending habit change as the Google DeepMind Challenge proceeded? (It’s interesting that the same logic is widely used in bot detection in the gaming industry and in fraud detection algorithms for financial trading.)

I’ve made the following custom function to 1) prepare the 5 time logs of AlphaGo and Lee Sedol, 2) calculate the Levenshtein distance between their games, and 3) present pretty heatmaps side by side. The third parameter ‘length’ is the threshold at which each time log is sliced, so that we can make more comparisons. It’s a widespread practice to slice time frames in order to make the difference in distance more dramatic: the longer the logs are, the more likely they are to become heterogeneous.

In [246]:

In [247]:
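
The pairwise comparison with a ‘length’ cut-off can be sketched as below, leaving the heatmap plotting out. The helper name `distance_matrix` and the three encoded strings are made up for illustration; they are not the real game logs or the notebook’s function.

```python
def levenshtein(a, b):
    """Standard two-row dynamic-programming Levenshtein distance."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(current[j - 1] + 1,
                               previous[j] + 1,
                               previous[j - 1] + (char_a != char_b)))
        previous = current
    return previous[-1]


def distance_matrix(games, length=None):
    """Pairwise Levenshtein distances between encoded games.

    If length is given, only the first `length` symbols of each game
    are compared; length=None compares the full strings.
    """
    sliced = [game[:length] for game in games]
    return [[levenshtein(x, y) for y in sliced] for x in sliced]


# Made-up encoded time logs, one string per game.
games = ["aabca", "aabba", "abcca"]
for row in distance_matrix(games, length=4):
    print(row)
```

The resulting nested list is symmetric with zeros on the diagonal, which is the shape a heatmap function would expect.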

Heatmaps!

It turns out (as expected) that Lee Sedol’s games were much more diverse than AlphaGo’s games. The homogeneity of AlphaGo’s moves is not surprising, but its Game 5 was quite different from all the others. Lee Sedol’s Levenshtein distances are all higher than 55. Like AlphaGo, the odd one out among his games was Game 5.

In [251]:

Well, to my surprise, AlphaGo’s Levenshtein distances were a little bit higher than I expected. If so, how can we detect whether it’s AlphaGo playing under a disguised identity in an online Go match? To bring justice, we need to slice the time log. Let’s focus on the very first 10 stringified moves.

In [256]:

In [253]:

In [254]:

In [255]:

Now we’ve got it. Just by looking at the first 10 entries of the time logs, we can successfully tell AlphaGo from Lee Sedol! There are more distance measures like Hamming distance, Jaro–Winkler distance and so on! I’ll cover more of these in my future blogposts.
