This page permanently redirects to gemini://agnos.is/projects/open-webui-filters/gpu-scaling-filter/.
This is a simple filter that reduces the number of GPU layers Ollama
uses when it detects that Ollama has crashed (signalled by an empty
response arriving in OpenWebUI). For now the logic is very basic: it
lowers the GPU layer count by static amounts. It does not take into
account how many layers a model has, nor does it dynamically monitor
VRAM use.
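
The static reduction described above can be sketched roughly as follows. This is a minimal illustration, not the filter's actual code: the names ScalingState, looks_like_crash and on_response, along with the step size and floor, are all hypothetical and chosen only to show the idea of "empty response means drop a fixed number of layers".

```python
from dataclasses import dataclass


@dataclass
class ScalingState:
    """Hypothetical state for one model; not taken from the real filter."""
    gpu_layers: int      # layers currently requested on the GPU
    step: int = 4        # static amount to drop after each crash (assumed value)
    min_layers: int = 0  # floor for the reduction; 0 means CPU-only (assumed)


def looks_like_crash(response_text: str) -> bool:
    """Treat an empty or whitespace-only response as a crash signal."""
    return not response_text.strip()


def on_response(state: ScalingState, response_text: str) -> int:
    """Return the GPU layer count to request on the next call."""
    if looks_like_crash(response_text):
        # Static reduction: no knowledge of model size or VRAM use.
        state.gpu_layers = max(state.min_layers, state.gpu_layers - state.step)
    return state.gpu_layers
```

In a real filter, the returned count would presumably be passed to Ollama as a request option on the next message. Because the step is fixed, a model with many layers may need several crashes before it fits in VRAM, which is the limitation noted above.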
There are three settings: