Toot

Written by sc_griffith@awful.systems on 2024-12-24 at 21:14

so openai claims to be doing great on the FrontierMath dataset. I’ve already seen the usual sort of dipshits using this to pump ai on reddit. here’s a post that went to the frontpage on HN:

…wordpress.com/…/can-ai-do-maths-yet-thoughts-fro…

(tl;dr only a few problems from the dataset are public but if representative the problems are about 25% survivable by an undergrad; coincidentally this is the % openai says their models are completing.)

this post is by kevin buzzard. he has a, let’s say, not widely beloved personality, but I don’t think of him as credulous or grifty, and people in his area regard him as an excellent mathematician.

he points out, but I think does not focus enough on, how discrediting the secretive nature of the dataset is. keeping the dataset private is necessary to run such experiments in a scientifically reasonable way, but it also makes it totally impossible to run the experiment in a scientifically reasonable way. an experiment which cannot be examined, looked into or reproduced is actually the opposite of science. it’s pure grift fuel


Mentions

=> View BlueMonday1984@awful.systems profile
