Toot

Written by sc_griffith@awful.systems on 2024-12-24 at 21:14

so openai claims to be doing great on the FrontierMath dataset. I’ve already seen the usual sort of dipshits using this to pump ai on reddit. here’s a post that went to the frontpage on HN:

…wordpress.com/…/can-ai-do-maths-yet-thoughts-fro…

(tl;dr only a few problems from the dataset are public, but if they’re representative, about 25% of the problems are survivable by an undergrad; coincidentally this is the % openai says their models are completing.)

this post is by kevin buzzard. he has a let’s say not widely beloved personality, but I don’t think of him as credulous or grifty, and people in his area regard him as an excellent mathematician.

he points out, but I think does not focus enough on, how discrediting the secretive nature of the dataset is. keeping the dataset private is necessary to run such experiments in a scientifically reasonable way, but it also makes it totally impossible to run the experiment in a scientifically reasonable way. an experiment which cannot be examined, looked into or reproduced is actually the opposite of science. it’s pure grift fuel

