A Mastodon server probably shouldn't pre-fetch media in the toots it receives.
Instead, make the clients request the media from their local server; if it isn't present, the client requests it directly from the source, then sends the result back to its local server for other people to use.
There's an opportunity for shenanigans, of course, if this first client decides to hand back something incorrect. However, you could defend against this in a number of ways, such as random client cache misses to make sure your user population converges on truth, or, for smaller servers with a community-oriented base, just accept the risk.
This seems much more "distributed" than having a central server responsible for everything.
=> More informations about this toot | More toots from yojimbo@hackers.town
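A minimal sketch of the client-filled cache described in the toot above, including the "random client cache misses" defence. Everything here is hypothetical and for illustration only: `LocalServerCache`, `client_load`, `verify_rate` and the eviction-on-mismatch policy are assumptions, not anything Mastodon implements.

```python
import random


class LocalServerCache:
    """Hypothetical media cache on the local server, filled by clients."""

    def __init__(self, verify_rate=0.1, rng=None):
        self.store = {}                  # url -> media bytes
        self.verify_rate = verify_rate   # share of hits reported as misses
        self.rng = rng or random.Random()

    def get(self, url):
        """Return cached bytes, or None on a real or forced miss."""
        if url in self.store and self.rng.random() >= self.verify_rate:
            return self.store[url]
        return None

    def put(self, url, data):
        """Store a client upload; evict the entry if it contradicts the cache."""
        cached = self.store.get(url)
        if cached is not None and cached != data:
            # Two clients disagree: trust neither and let later clients refill.
            del self.store[url]
            return False
        self.store[url] = data
        return True


def client_load(cache, url, fetch_from_source):
    """Client side: ask the local server first, fall back to the source."""
    data = cache.get(url)
    if data is None:
        data = fetch_from_source(url)    # direct request to the origin
        cache.put(url, data)             # hand the result back for others
    return data
```

A forced miss makes an honest client re-fetch from the source, so a poisoned entry is eventually contradicted and dropped. Evicting on any mismatch is the simplest policy; a real server would probably want something like a vote across several uploads.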
@yojimbo why leave the client in charge of handling the cache miss? would it not be better to have the instance software fetch from source?
=> More informations about this toot | More toots from 0x57e11a@void.lgbt
@0x57e11a Having the instance fetch from source is the current situation, and right now upstream servers get hit with a huge number of almost-simultaneous requests from uncoordinated Mastodon servers when a toot with media in it goes out. Switching it to a client-driven activity means that this load would be spread out based on when humans are actually rendering the content, and hopefully this reduces the peak load problem.
=> More informations about this toot | More toots from yojimbo@hackers.town
@yojimbo oh ofc, having it only be loaded when the user needs to render it is far better! but to avoid the mess that is trusting the client’s upload, why not have the user ask the server for the cached media, and from there the server can handle making the request to the source url?
a cache hit:
client           server
|--cache please->|
|<-cached media--|
a cache miss (with client fetching media, your solution, 3 requests):
source            client           server
|                 |--cache please->|
|                 |<-no------------|
|<-media please---|                |
|----------media->|                |
|                 |----cache this->|
|                 |<-ok------------|
a cache miss (with server fetching media, this one's solution, 2 requests):
client           server            source
|--cache please->|                 |
|                |---media please->|
|                |<-media----------|
|<-media---------|                 |
=> More informations about this toot | More toots from 0x57e11a@void.lgbt
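The server-fetching variant in the diagrams above is an ordinary read-through cache. A minimal sketch, assuming a hypothetical `ReadThroughMediaCache` class and a caller-supplied `fetch_from_source` function (neither is a Mastodon API):

```python
class ReadThroughMediaCache:
    """Hypothetical server-side cache that fetches from the source on a miss."""

    def __init__(self, fetch_from_source):
        self._fetch = fetch_from_source  # callable: url -> bytes
        self._store = {}                 # url -> media bytes
        self.source_requests = 0         # how often the origin was contacted

    def get(self, url):
        """Serve from cache; on a miss, fetch once from the source and keep it."""
        if url not in self._store:
            self.source_requests += 1
            self._store[url] = self._fetch(url)
        return self._store[url]
```

Nothing is fetched until a client first asks for the media, so the origin still sees load spread by when people actually render the toot, and the server never has to trust a client's upload.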
@0x57e11a Actually, I guess there's no particular reason that you couldn't use a normal cache miss mechanism like that :-)
Perhaps my idea was unduly influenced by seeing upstream site operators talking about actively blocking requests coming from a Mastodon server user-agent ...
=> More informations about this toot | More toots from yojimbo@hackers.town
@yojimbo that is a valid reason to want a way around it, but ideally fixing the issue before that happens could reduce request amounts
plus good luck blocking the UA of every single fedi software (every misskey fork :neobot_woozy:)
=> More informations about this toot | More toots from 0x57e11a@void.lgbt
@0x57e11a I'm not affected by it ... but it seems like plenty of others are.
https://news.itsfoss.com/mastodon-link-problem/ is a few months old and perhaps not the best ... but it's a good indication of the pain.
JWZ's article uses regexes to look at the user agent ... https://www.jwz.org/blog/2022/11/mastodon-stampede/ and also notes that Mastodon recently changed their user-agent string https://github.com/mastodon/mastodon/pull/31192
=> More informations about this toot | More toots from yojimbo@hackers.town