Toot

Written by comex on 2025-01-29 at 18:47

I tried DeepSeek R1 on the same puzzle (as the post I’m replying to).

R1 got question #2 right and #3 mostly right after 436s. This is better than the best result I got out of all the ChatGPT models, which was actually o1-mini, getting #1 right and #3 mostly right after 290s. Same number of right answers, but I think #2 is harder than #1 since it requires realizing that the answer isn’t a number.

o1 got nothing right after 45s.

A second run of R1 got nothing right after 418s, so caveat promptor.

=> More information about this toot | View the thread | More toots from comex@mas.to
