One more DeepSeek R1 thread (I promise this is the last from me for now). I got R1 to play a game of go against GnuGo, an old and, by today's standards, very weak go AI. I played until I started hitting the rate limit on R1 - 40 moves. This was enough to see that R1 is not very good at go. https://online-go.com/game/71821573
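For anyone who wants to try this themselves, here's a rough sketch of how such a harness can be wired up - this is not the exact code behind the game above. It assumes GnuGo's GTP mode (gnugo --mode gtp) and an OpenAI-compatible DeepSeek endpoint; the base URL, model name, and prompt format are assumptions here, so check the current API docs before running it.

```python
# Rough sketch: R1 plays black, GnuGo plays white, talking the
# Go Text Protocol (GTP). Assumptions: gnugo is on PATH, and
# DeepSeek exposes an OpenAI-compatible API at the base URL below
# with "deepseek-reasoner" as the model name (both may differ).
import subprocess
from openai import OpenAI

client = OpenAI(base_url="https://api.deepseek.com", api_key="sk-...")

gnugo = subprocess.Popen(
    ["gnugo", "--mode", "gtp"],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
)

def gtp(cmd):
    """Send one GTP command. Replies start with '=' (or '?' on error)
    and are terminated by a blank line."""
    gnugo.stdin.write(cmd + "\n")
    gnugo.stdin.flush()
    lines = []
    while True:
        line = gnugo.stdout.readline()
        if not line or (line.strip() == "" and lines):
            break
        lines.append(line.rstrip("\n"))
    if lines:
        lines[0] = lines[0].lstrip("=? ").strip()
    return "\n".join(lines).strip()

gtp("boardsize 19")
gtp("clear_board")

moves = []
for _ in range(40):  # roughly where the rate limit kicks in
    # Give the model an ASCII picture of the position, since it
    # can't otherwise "see" the board at all.
    board = gtp("showboard")
    prompt = ("You are playing black in a game of go.\n" + board +
              "\nMoves so far: " + ", ".join(moves) +
              "\nReply with only your next move, e.g. Q16.")
    resp = client.chat.completions.create(
        model="deepseek-reasoner",  # assumed model name for R1
        messages=[{"role": "user", "content": prompt}],
    )
    black = resp.choices[0].message.content.strip().split()[-1]
    gtp("play black " + black)  # no retry on illegal moves here
    moves.append("B " + black)
    white = gtp("genmove white")  # GnuGo answers as white
    moves.append("W " + white)
```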
It feels a lot like R1 is unable to visualise the board: for example, it seemed completely blind to the "hole" in its position at R6, thinking nothing of letting white push through there. Maybe this is something a multimodal model would be better at.
It gave the impression of someone who had read a lot about go but never played a game, and so didn't really understand what the words actually meant. It knew that corner enclosures are good, but not how to use them effectively. Looking again at the lower right, it consistently believed it was strong there because of the Q4-Q6 enclosure. When asked to sum up its view of the game at the end, it stated that the enclosure was part of a moyo and that it was a weakness for white to play in that area, which is very far from correct in this case.
I think reinforcement learning specifically on go would improve its strength. Even with this I wouldn't expect it to get particularly strong: it's a very inefficient way to make a go AI. It might be worthwhile anyway, because LLMs do have one big advantage over current go AIs: every move comes with a natural language explanation, with even more detail in the chain of thought.