Install Triton & SageAttention on Windows (RTX 50 Series)

If you’re running one of NVIDIA’s new RTX 50 Series GPUs and want to boost AI model performance, installing Triton and SageAttention is one of the easiest and most effective upgrades you can make. This guide walks you through the full installation process on Windows, optimized for the 50 Series.

This is an updated version of my earlier tutorial for the RTX 30 and 40 Series, revised to reflect the latest hardware changes, driver updates, and compatibility tweaks. Whether you’re upgrading from a previous generation or setting up your first 50 Series card, you’ll find all the steps you need here.

Requirements

Python

Python 3.9 through 3.13 is supported. In this article, I am going to use 3.12.8 as an example.
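
If you are not sure which interpreter you are actually running, a small check like the one below can help. It is only a sketch: the version bounds are copied from the range stated above, and the function name is something I made up for illustration.

```python
import sys

# Supported range taken from the text above: Python 3.9 through 3.13.
MIN_VERSION = (3, 9)
MAX_VERSION = (3, 13)


def is_supported(version_info=None):
    """Return True if the major.minor version falls within 3.9 to 3.13."""
    if version_info is None:
        version_info = sys.version_info
    major_minor = tuple(version_info[:2])
    return MIN_VERSION <= major_minor <= MAX_VERSION


if __name__ == "__main__":
    # sys.executable shows which interpreter is running this script.
    print("Python", sys.version.split()[0], "at", sys.executable)
    print("Supported:", is_supported())
```

For ComfyUI, run it with python_embeded\python so that you are checking the embedded interpreter and not a system-wide one.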

PyTorch

Install PyTorch with CUDA 12.8 support. The command is

pip install torch==2.8.0 --index-url https://download.pytorch.org/whl/cu128

or

python_embeded\python -m pip install torch==2.8.0 --index-url https://download.pytorch.org/whl/cu128

if you are using the python_embeded folder under ComfyUI.
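
To confirm that the CUDA build of PyTorch was installed, and not a CPU-only one, you can run a quick report like the following. This is a minimal sketch and the function name is my own. With the cu128 wheel I would expect built_for_cuda to read 12.8, but treat that as something to verify on your machine.

```python
def torch_cuda_report():
    """Collect basic facts about the installed PyTorch build, if any."""
    try:
        import torch
    except ImportError:
        return {"installed": False}

    report = {
        "installed": True,
        "torch_version": torch.__version__,
        # CUDA version PyTorch was built against; None for CPU-only builds.
        "built_for_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }
    if report["cuda_available"]:
        report["gpu_name"] = torch.cuda.get_device_name(0)
        # (major, minor) compute capability of the first GPU.
        report["compute_capability"] = torch.cuda.get_device_capability(0)
    return report


if __name__ == "__main__":
    for key, value in torch_cuda_report().items():
        print(f"{key}: {value}")
```

If cuda_available comes back False, the usual suspects are a CPU-only wheel or an outdated GPU driver.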

CUDA

Install CUDA Toolkit 12.8 from the CUDA Toolkit Archive. Make sure you select CUDA Development and CUDA Runtime during installation. You also need to add the CUDA bin folder to the Windows PATH environment variable. The path is C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8\bin.
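
After editing PATH, open a new terminal and check that the change took effect. The sketch below assumes the default install location quoted above. The helper name is hypothetical, and the comparison ignores case and trailing backslashes because Windows does too.

```python
import os
import shutil

# Path quoted in the text above; adjust if CUDA is installed elsewhere.
CUDA_BIN = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8\bin"


def path_contains(directory, path_value=None):
    """Return True if `directory` is one of the entries in a Windows PATH string."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")

    def norm(p):
        # Ignore case, surrounding quotes or spaces, and trailing slashes.
        return p.strip().strip('"').rstrip("\\/").lower()

    target = norm(directory)
    # Windows separates PATH entries with semicolons.
    return any(norm(entry) == target for entry in path_value.split(";"))


if __name__ == "__main__":
    print("CUDA bin folder on PATH:", path_contains(CUDA_BIN))
    # nvcc.exe lives in the CUDA bin folder, so this should print its full path.
    print("nvcc found at:", shutil.which("nvcc"))
```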

Visual Studio Build Tools

Download Build Tools for Visual Studio 2022 from this link. Add the folder that contains cl.exe to your Windows PATH variable. The path is something like C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.43.34808\bin\Hostx64\x64. Depending on the version you install, the exact path might be different.
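
Because the MSVC version folder changes between releases, it can be easier to search for cl.exe than to guess the path. Here is a rough sketch that assumes the default Build Tools location from the example path above. The function name is mine, not part of any tool.

```python
import glob
import os
import shutil

# Root implied by the example path above; the MSVC version folder varies.
BUILD_TOOLS_ROOT = r"C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools"


def find_cl_dirs(root=BUILD_TOOLS_ROOT):
    """Return folders under `root` that contain the x64-hosted, x64-target cl.exe."""
    pattern = os.path.join(
        glob.escape(root),  # treat any special characters in root literally
        "VC", "Tools", "MSVC", "*", "bin", "Hostx64", "x64", "cl.exe",
    )
    return sorted(os.path.dirname(match) for match in glob.glob(pattern))


if __name__ == "__main__":
    # None here means cl.exe is not reachable through PATH yet.
    print("cl.exe on PATH:", shutil.which("cl"))
    for folder in find_cl_dirs():
        print("candidate folder to add to PATH:", folder)
```

If several versions are listed, any one of them can go on PATH. I would pick the newest.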

Visual C++ Redistributable

Download vcredist from this link and install it.

Installation

The actual installation is quite simple. The command is

pip install -U "triton-windows<3.4"

or

python_embeded\python -m pip install -U "triton-windows<3.4"

for ComfyUI. There is an extra step for python_embeded: you need two folders, include and libs, from a full installation of the same Python version as your python_embeded. If you don’t have that version of Python on your system but you have Anaconda or Miniconda installed, you can get the two folders from a conda environment. Open an Anaconda prompt and type:

conda create -n "Python-3-12-8" python=3.12.8

Remember to replace the version number with the version of your python_embeded. After the environment is created, type the following to find its location:

conda activate Python-3-12-8
conda list system

The header of the output shows the location of the environment. The path looks like C:\Users\username\.conda\envs\Python-3-12-8

Open File Explorer and browse to that path. Note that .conda is a hidden folder, so you might have to copy the path and paste it into the File Explorer address bar. The two folders are shown in this screenshot.

Copy these two folders and paste them into the python_embeded directory.
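
If you prefer to script the copy, something along these lines should work. It is a sketch under assumptions: the function name is made up, and both paths in the example at the bottom are placeholders that you need to replace with your own conda environment and python_embeded locations.

```python
import shutil
from pathlib import Path


def copy_dev_folders(python_home, embedded_dir):
    """Copy the include and libs folders from a full Python install
    into the python_embeded directory. Returns the destination paths."""
    python_home = Path(python_home)
    embedded_dir = Path(embedded_dir)
    copied = []
    for name in ("include", "libs"):
        source = python_home / name
        if not source.is_dir():
            raise FileNotFoundError(f"expected folder not found: {source}")
        destination = embedded_dir / name
        # dirs_exist_ok merges into an existing folder (needs Python 3.8+).
        shutil.copytree(source, destination, dirs_exist_ok=True)
        copied.append(destination)
    return copied


if __name__ == "__main__":
    # Placeholder paths; edit them to match your system.
    env = Path.home() / ".conda" / "envs" / "Python-3-12-8"
    target = Path("python_embeded")
    if env.is_dir() and target.is_dir():
        for path in copy_dev_folders(env, target):
            print("copied to", path)
    else:
        print("Edit the two paths in this script before running it.")
```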

Finally, you can install SageAttention. Use these commands:

pip install https://github.com/woct0rdho/SageAttention/releases/download/v2.2.0-windows.post2/sageattention-2.2.0+cu128torch2.8.0.post2-cp39-abi3-win_amd64.whl

or

python_embeded\python -m pip install https://github.com/woct0rdho/SageAttention/releases/download/v2.2.0-windows.post2/sageattention-2.2.0+cu128torch2.8.0.post2-cp39-abi3-win_amd64.whl

for ComfyUI.
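
Once everything is installed, it is worth confirming that all three packages landed in the same Python environment. The sketch below only reads package metadata, so it does not prove that the GPU kernels run. The distribution names are taken from the pip commands above, and the function name is hypothetical.

```python
from importlib import metadata

# Distribution names as used in the pip commands above.
PACKAGES = ("torch", "triton-windows", "sageattention")


def installed_versions(names=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

A package showing as not installed usually means pip ran against a different interpreter than the one you are checking, which is easy to do when a system Python and python_embeded live side by side.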

Conclusion

Thanks to the triton-windows fork and the latest updates to SageAttention, installing both tools on Windows is now straightforward — no need to switch to Linux or WSL. This means developers, AI researchers, and Stable Diffusion creators can take full advantage of Triton’s GPU optimizations alongside SageAttention’s efficient attention mechanisms, all within a native Windows environment.

With RTX 50 Series hardware, these tools unlock faster deep learning workflows, smoother AI inference, and better GPU utilization for complex projects. If you run into issues, be sure to check each project’s GitHub repository for the latest fixes, updates, and troubleshooting advice.

Now that Triton and SageAttention are running natively on your system, the only question left is — what will you build next?
