@a @gimulnautti They are reproducing the -R1 variant. But the training of the DeepSeek V3 Base Model is not reproducible.
It's a bit deceptive to say that "DeepSeek" is being reproduced.
=> More information about this toot | View the thread | More toots from fj@mastodon.social