If you're wondering why free software and open source remain at the forefront of innovation and the long-term source of creativity, it seems the DeepSeek CEO gets it...
🔗 https://stratechery.com/2025/deepseek-faq/
#opensource #freesoftware #ai #llm
=> More information about this toot | More toots from a@paperbay.org
@a I wish this was better understood by the managerial class 🥹
=> More information about this toot | More toots from joost@freesewing.social
@a
It's worth noting that the license on DeepSeek v3's model isn't open source; in particular it imposes a range of use restrictions. (But, sure, they're going further than any of the other LLM developers, which is excellent!)
=> More information about this toot | More toots from 9v1rt@f.rolandturner.com
@9v1rt R1 is MIT-licensed, and the distilled models follow the original Llama license.
https://huggingface.co/deepseek-ai/DeepSeek-R1#7-license
=> More information about this toot | More toots from a@paperbay.org
@a Not to downplay open-source, but actually this model probably won’t say anything critical of the Chinese Communist Party.
Open-sourcing #ai models is not the same as open-source software. They probably won’t let you inspect the training data.
Having this cheaper model picked for software projects instead of OpenAI's is, among other things, a way to tilt the global knowledge-generation machine in favour of their point of view.
=> More information about this toot | More toots from gimulnautti@mastodon.green
@gimulnautti
I don't agree with the argument about country-specific influence. We face the same (and worse) issues with the SaaS-only proprietary models (mostly US-based), where we lack the ability to reproduce the solutions ourselves.
We already have third parties reproducing deepseek. https://huggingface.co/blog/open-r1
=> More information about this toot | More toots from a@paperbay.org
@a @gimulnautti They are reproducing the -R1 variant. But the training of the DeepSeek V3 Base Model is not reproducible.
It's a bit deceptive to say that "deepseek" is being reproduced.
=> More information about this toot | More toots from fj@mastodon.social
@fj
I mean -R1, of course.
The papers indeed serve as a basis for -R1 reproducibility. V3 is partially documented but doesn’t include the complete training models needed for full reproducibility.
Still, it’s far better documented than the current SaaS proprietary models that many people rely on.
@gimulnautti
=> More information about this toot | More toots from a@paperbay.org