Toot

Written by Wim🧮 on 2025-01-07 at 14:41

@nicd There is no direct answer to this. If the model is a GPT-3-style (dense) model, then the relationship between the number of parameters and energy consumption is close to a square root. GPT-4, however, uses a different architecture: it is a "mixture of experts" model, effectively several GPT-3-style models combined, where only a subset of the parameters contributes to each output (a sparse model). But such a model has higher overhead, so for the same number of effective parameters, its energy consumption is higher. A rough sketch of how such routing works is below.
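
To make the "not all parameters are contributing" point concrete, here is a minimal sketch of top-k mixture-of-experts routing in Python with NumPy. All sizes, the number of experts, and the top-k value are toy assumptions for illustration, not GPT-4's actual (unpublished) configuration: a small router picks k of E expert feed-forward layers per token, so only a fraction of the total expert parameters is active for any one token.

```python
# Toy top-k mixture-of-experts routing (illustrative sizes, not GPT-4's).
import numpy as np

rng = np.random.default_rng(0)

D = 16           # hidden dimension (assumption)
NUM_EXPERTS = 8  # experts in the layer (assumption)
TOP_K = 2        # experts activated per token (assumption)

# Each expert is a tiny feed-forward block: D -> 4D -> D.
experts = [
    (rng.standard_normal((D, 4 * D)) * 0.02,
     rng.standard_normal((4 * D, D)) * 0.02)
    for _ in range(NUM_EXPERTS)
]
router = rng.standard_normal((D, NUM_EXPERTS)) * 0.02  # gating weights

def moe_forward(x: np.ndarray) -> np.ndarray:
    """Route one token vector x (shape (D,)) through its top-k experts."""
    logits = x @ router
    chosen = np.argsort(logits)[-TOP_K:]   # indices of the k highest-scoring experts
    gates = np.exp(logits[chosen])
    gates /= gates.sum()                   # softmax over the chosen k only
    out = np.zeros_like(x)
    for gate, idx in zip(gates, chosen):
        w_in, w_out = experts[idx]
        out += gate * (np.maximum(x @ w_in, 0.0) @ w_out)  # ReLU MLP
    return out

y = moe_forward(rng.standard_normal(D))

total = NUM_EXPERTS * 2 * (D * 4 * D)  # all expert weights held in memory
active = TOP_K * 2 * (D * 4 * D)       # weights actually used for this token
print(f"total expert params: {total}, active per token: {active} "
      f"({TOP_K}/{NUM_EXPERTS} experts)")
```

Note that the compute per token scales with the active parameters, while memory and the routing/dispatch machinery scale with the total; that dispatch cost is one source of the extra overhead mentioned above.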

=> More information about this toot | View the thread | More toots from wim_v12e@scholar.social

Mentions

=> View nicd@masto.ahlcode.fi profile
