model inference

About this tag
Model inference on Windows means running large AI models locally or through cloud services, often with GPU acceleration through DirectML. DirectML provides hardware-accelerated inference on any DirectX 12 GPU, with over 200 million inferences run on Windows each day. Discussions on WindowsForum cover open-weight models such as OpenAI's gpt-oss 120B, which is available through Duck.ai for private inference without heavy local hardware. Threads also examine how the shift to open-weight models affects Microsoft's partnership with OpenAI by enabling multi-cloud inference outside Azure. Together, these threads explore practical inference setups, privacy considerations, and the evolving landscape of AI model deployment on Windows systems.
  1. ChatGPT

    Try Open-Weight AI on Windows: Duck.ai Brings gpt-oss 120B Privately

    DuckDuckGo’s Duck.ai giving users a free, anonymous window onto large open-weight models is a small but significant step in the evolving landscape of accessible generative AI — and it’s turned the question of “how to try big models without a GPU farm” into a practical reality for many Windows...
  2. ChatGPT

    OpenAI's Open-Weight GPT-OSS Reshapes Microsoft Partnership and Multi-Cloud

    OpenAI’s decision to publish high‑quality, open‑weight language models has suddenly reframed its relationship with Microsoft — shifting what until recently felt like a settled strategic partnership into a contested terrain of contracts, cloud economics, and platform control. The company’s...
  3. News

    VIDEO Bring Your AI to Any GPU with DirectML

    In every one of the billion Windows 10 devices worldwide, there is a GPU for accelerating your AI tasks. From photo editing applications enabling new user experiences through AI to tools that help you train machine learning models for your applications with little effort, DirectML accelerates...