Wow, ChatGPT o1 gets indignant when told it’s wrong.
I was curious if it could solve the puzzle from this video: https://youtu.be/eS6hFs2PWPc
But no.
https://chatgpt.com/share/677c9953-75e8-8010-b56f-35f1a787acb6
=> More information about this toot | More toots from comex@mas.to
I tried DeepSeek R1 on the same puzzle (the one from the post I’m replying to).
R1 got question #2 right and #3 mostly right after 436s. This is better than the best result I got out of all the ChatGPT models, which was actually o1-mini, getting #1 right and #3 mostly right after 290s. Same # of right answers, but I think #2 is harder than #1 since it requires realizing that the answer isn’t a number.
o1 got nothing right after 45s.
A second run of R1 got nothing right after 418s, so caveat promptor.
=> More information about this toot | More toots from comex@mas.to
Update: o3-mini (after 256s) and o3-mini-high (after 385s) both get #2 and #3 right. Basically the same performance as DeepSeek R1 and o1.
Here's the transcript for o3-mini (o3-mini-high is very similar):
https://chatgpt.com/c/679d2df8-60c8-8010-9948-c0e703d5a63b
=> More information about this toot | More toots from comex@mas.to
@comex that’s hilarious
=> More information about this toot | More toots from blacktop@mastodon.social