Ancestors

Toot

Written by Alexandre Dulaunoy on 2025-01-28 at 06:26

If you're wondering why free software and open source remain at the forefront of innovation and the long-term source of creativity, it seems the DeepSeek CEO gets it...

🔗 https://stratechery.com/2025/deepseek-faq/

#opensource #freesoftware #ai #llm

=> More information about this toot | More toots from a@paperbay.org

Descendants

Written by Joost De Cock on 2025-01-28 at 06:31

@a I wish this was better understood by the managerial class 🥹

=> More information about this toot | More toots from joost@freesewing.social

Written by Roland Turner on 2025-01-28 at 06:39

It's worth noting that the license on DeepSeek v3's model isn't open source; in particular it imposes a range of use restrictions. (But, sure, they're going further than any of the other LLM developers, which is excellent!)

=> More information about this toot | More toots from 9v1rt@f.rolandturner.com

Written by Alexandre Dulaunoy on 2025-01-28 at 06:54

@9v1rt R1 is MIT-licensed, and the distilled models follow the original Llama license.

https://huggingface.co/deepseek-ai/DeepSeek-R1#7-license

=> More information about this toot | More toots from a@paperbay.org

Written by Toni Aittoniemi on 2025-01-28 at 09:39

@a Not to downplay open-source, but actually this model probably won’t say anything critical of the Chinese Communist Party.

Open-sourcing #ai models is not the same as open-source software. They probably won’t let you inspect the training data.

Having this cheaper model picked for software projects instead of OpenAI's is, among other things, a way to tilt the global knowledge generation machine in favour of their point of view.

=> More information about this toot | More toots from gimulnautti@mastodon.green

Written by Alexandre Dulaunoy on 2025-01-28 at 10:22

@gimulnautti

I don't agree with the argument about country-specific influence. We face the same (and worse) issues with the SaaS-only proprietary models (mostly US-based), where we lack the ability to reproduce the solutions ourselves.

We already have third parties reproducing DeepSeek: https://huggingface.co/blog/open-r1

=> More information about this toot | More toots from a@paperbay.org

Written by Frederic Jacobs on 2025-01-28 at 10:31

@a @gimulnautti They are reproducing the -R1 variant. But the training of the DeepSeek V3 Base Model is not reproducible.

It's a bit deceptive to say that "deepseek" is being reproduced.

=> More information about this toot | More toots from fj@mastodon.social

Written by Alexandre Dulaunoy on 2025-01-28 at 10:39

@fj

I mean -R1, of course.

The papers indeed serve as a basis for -R1 reproducibility. V3 is partially documented but doesn’t include the complete training models needed for full reproducibility.

Still, it’s far better documented than the current SaaS proprietary models that many people rely on.

@gimulnautti

=> More information about this toot | More toots from a@paperbay.org

Proxy Information

Original URL: gemini://mastogem.picasoft.net/thread/113904554125213813