@deutrino mhm, that could be good.
I kinda wish I knew more about the "tagging" portion of training LLMs. As far as I understand, some data (mainly images, video, and audio) needs to be tagged/processed by "data farms" in lower-income places around the globe before it becomes useful?
You can't easily spot poisoned images this way, but text... I don't know... then again, it depends on how thorough the whole thing is.
=> More information about this toot | View the thread | More toots from kln@mstdn.io