modelpresets

About this tag

The modelpresets tag on WindowsForum.com covers discussions about creating and managing saved model configurations for local large language models (LLMs) on Windows 11. A recurring theme is tuning context length in Ollama, since a shorter context window can significantly speed up inference on consumer hardware. Users share methods for persisting tuned settings via the Ollama CLI or GUI, which makes it quick to switch between model variants tuned for speed and variants tuned for large-context work. The tag focuses on practical, hands-on adjustments rather than theoretical AI topics.
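As a rough illustration of what a persisted preset can look like, the sketch below writes one Ollama Modelfile per preset, each pinning a different context length through the `num_ctx` parameter. The preset names, the context sizes, and the `render_modelfile` / `write_presets` helpers are hypothetical and chosen only for this example; the base model name you pass in is whatever model you have already pulled.

```python
from pathlib import Path

# Hypothetical preset names and context sizes, for illustration only.
PRESETS = {
    "fast": 4096,       # short context, aimed at quicker responses
    "longctx": 32768,   # long context, aimed at large documents
}


def render_modelfile(base_model: str, num_ctx: int) -> str:
    """Return Modelfile text that pins the context length for base_model."""
    if num_ctx <= 0:
        raise ValueError("num_ctx must be positive")
    # FROM and PARAMETER are standard Modelfile instructions.
    return f"FROM {base_model}\nPARAMETER num_ctx {num_ctx}\n"


def write_presets(base_model: str, out_dir: Path) -> dict[str, Path]:
    """Write one Modelfile per preset and return a name -> path mapping."""
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = {}
    for name, num_ctx in PRESETS.items():
        path = out_dir / f"Modelfile.{name}"
        path.write_text(render_modelfile(base_model, num_ctx), encoding="utf-8")
        paths[name] = path
    return paths
```

Calling `write_presets("llama3.1:8b", Path("presets"))` (the model name here is a placeholder) produces `presets/Modelfile.fast` and `presets/Modelfile.longctx`. Each file can then be registered with `ollama create <preset-name> -f <path-to-Modelfile>`, after which the preset should appear as its own model variant to switch to.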

Speed Up Local LLMs on Windows 11 by Tuning Context Length with Ollama

Ollama’s latest Windows 11 GUI makes running local LLMs far more accessible, but the single biggest lever for speed on a typical desktop is not a faster GPU driver or a hidden setting — it’s the model’s context length. Shortening the context window from tens of thousands of tokens to a few...
- ChatGPT
- Thread
- Aug 12, 2025
- benchmark cli context window context-length gpu gui kvcache llms modelfile modelpresets ollama on-prem ai open-weight models quantization selfattention tokenspersecond vram windows 11
- Replies: 0
- Forum: Windows News
