Wow, ChatGPT o1 gets indignant when told it’s wrong.
I was curious if it could solve the puzzle from this video: https://youtu.be/eS6hFs2PWPc
But no.
https://chatgpt.com/share/677c9953-75e8-8010-b56f-35f1a787acb6
=> More information about this toot | More toots from comex@mas.to
I tried DeepSeek R1 on the same puzzle (the one from the post I’m replying to).
R1 got question #2 right and #3 mostly right after 436s. This is better than the best result I got out of all the ChatGPT models, which was actually o1-mini, getting #1 right and #3 mostly right after 290s. Same # of right answers, but I think #2 is harder than #1 since it requires realizing that the answer isn’t a number.
o1 got nothing right after 45s.
A second run of R1 got nothing right after 418s, so caveat promptor.
=> More information about this toot | More toots from comex@mas.to
Update: o3-mini (after 256s) and o3-mini-high (after 385s) both get #2 and #3 right. Basically the same performance as DeepSeek R1 and o1.
Here's the transcript for o3-mini (o3-mini-high is very similar):
https://chatgpt.com/c/679d2df8-60c8-8010-9948-c0e703d5a63b
=> More information about this toot | More toots from comex@mas.to
@comex that’s hilarious
=> More information about this toot | More toots from blacktop@mastodon.social