I tried DeepSeek R1 on the same puzzle as the post I’m replying to.
R1 got question #2 right and #3 mostly right after 436s. This is better than the best result I got out of all the ChatGPT models, which was actually o1-mini, getting #1 right and #3 mostly right after 290s. Same # of right answers, but I think #2 is harder than #1 since it requires realizing that the answer isn’t a number.
o1 got nothing right after 45s.
A second run of R1 got nothing right after 418s, so caveat promptor.
=> More information about this toot | View the thread | More toots from comex@mas.to