Comment by 👤 jdcard

=> Re: "An LLM for the Geminispace" | In: s/AI

I suppose there is nothing stopping anyone from doing that now, if they were so inclined and had the compute and bandwidth resources to dedicate to it.

=> 👤 jdcard

Jan 30 · 2 days ago

10 Later Comments ↓

=> 🦋 CarloMonte · Jan 30 at 06:00:

i think the amount of content in Geminispace is several orders of magnitude below what is required to train an LLM.

=> 🐐 drh3xx · Jan 30 at 09:25:

@softwarepagan haven't you heard we NEED AI in everything? Just waiting for the AI trainers with integrated cameras on the toe peg that can text me when my laces have come undone and record a log on a nice web dashboard.

=> 🎮 lucss21a · Jan 30 at 13:08:

no. just no. i don't want my little corner of the internet to be crawled by the tentacles of big ai. it's horrid to imagine that. please reconsider your life choices and go back to the world wide web again. please. touching grass is also a better choice. dumbass.

=> 🎮 lucss21a · Jan 30 at 13:09:

sorry for being rude but please don't. it's a net negative for us here.

=> 🐸 HanzBrix · Jan 30 at 13:22:

For the ones who don't want to be crawled, scraped and the like, aggressive rate limiting would solve the problem. 😁

=> 🎮 lucss21a · Jan 30 at 13:25:

i heard some old-web-oriented platforms such as nekoweb and poyoweb include scraper blockers

=> 🚀 stack · Jan 30 at 14:00:

@vi, totally with you!

We need to start thinking of how to build antitank weapons against these things. Pollution is not a bad idea.

=> 🐸 HanzBrix · Jan 30 at 15:04:

@lucss21a You can set up ufw or iptables to outright ban connection spam, basically killing all scrapers.
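A minimal sketch of what this could look like, assuming a Gemini server on its default port 1965 (the port and thresholds here are illustrative, not from the comment): ufw's `limit` rule denies an IP that opens six or more new connections within thirty seconds, and the iptables lines use the standard `recent`-match recipe to the same effect.

```shell
# ufw variant: rate-limit new connections to the Gemini port (1965).
# "limit" blocks an address after 6+ connection attempts in 30 seconds.
sudo ufw limit 1965/tcp
sudo ufw enable

# iptables variant using the "recent" match: track new connections to
# port 1965 and drop addresses exceeding 10 hits within 60 seconds.
sudo iptables -A INPUT -p tcp --dport 1965 -m conntrack --ctstate NEW \
    -m recent --set --name GEMINI
sudo iptables -A INPUT -p tcp --dport 1965 -m conntrack --ctstate NEW \
    -m recent --update --seconds 60 --hitcount 10 --name GEMINI -j DROP
```

Legitimate Gemini clients open one connection per page, so a modest threshold like this mostly affects bulk crawlers.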

=> 📡 byte · Jan 30 at 15:44:

why would anyone want this. the purpose of gemini is to stay minimal and clean, not to spread slop generated by a bullshitting machine trained on stolen data. there's no good gen-AI, the whole concept is rotten to the core. ew. yikes, even.

=> 🗿 argenkiwi [OP] · Jan 30 at 18:36:

Thanks @jdcard. That is exactly the thought experiment I had in mind. I wasn't thinking about filling the Geminispace with bots, but I guess it is a reasonable fear. It seems it will not be easy to prevent that. How do we know one of us is not a bot right now?

But as @CarloMonte says there is only enough content for an SLM at this stage. Although I'm sure eventually they will develop techniques to learn from a small knowledge base and make an expert SLM for a specific use case.

Now I am thinking of making a parody of the LLM I was imagining when I started this post. It would be a self-hating one. What should I call it? RATMGPT, DeepRATM...

Original Post

=> 🌒 s/AI

An LLM for the Geminispace — With the DeepSeek breakthrough reviving the hype around large language models I've just had a thought: what if we trained an LLM on the contents of the Geminispace to measure how much knowledge is added as time progresses? I think it would be an interesting experiment considering the Geminispace is in its early stages. It may give some interesting insights on how knowledge accumulates over time.

=> 💬 argenkiwi · 16 comments · Jan 30 · 2 days ago

Proxy Information
Original URL
gemini://bbs.geminispace.org/u/jdcard/24503