From what I've seen, DeepSeek is more efficient through engineering tricks rather than better ML? Doesn't this mean it's not as disruptive as thought? Big players can just use the same tricks now and still have an advantage from bigger compute?
=> More information about this toot | More toots from neuralreckoning@neuromatch.social
@neuralreckoning yes
=> More information about this toot | More toots from tschfflr@fediscience.org
@neuralreckoning my take (not my field, so take it with copious amounts of salt) is that the big players held a choke point, namely "vast amounts of computing power that is only available in a data center", so they didn't really care about the magic formula being in the open.
Those tricks make training models possible for everyone who would otherwise have been forced to buy access (as-a-service) from the big players.
They also make running queries a lot cheaper, which is good for consumers such as Meta.
=> More information about this toot | More toots from ehproque@paquita.masto.host
@neuralreckoning that last point you raise is the one that is correct (to my limited knowledge)
=> More information about this toot | More toots from ehproque@paquita.masto.host
@neuralreckoning
An internal Google memo from 1.5 years ago is relevant:
Google “We Have No Moat, And Neither Does OpenAI” Leaked Internal Google Document Claims Open Source AI Will Outcompete Google and OpenAI https://semianalysis.com/2023/05/04/google-we-have-no-moat-and-neither/
=> More information about this toot | More toots from maxpool@mathstodon.xyz