STABLE DIFFUSION

CUDA 11.8 with AUTOMATIC1111's Stable Diffusion (HUGE PERFORMANCE)


Published: Jan 26, 2023
Last Edit: Jan 27, 2023

What you’ll need

xformers already speeds things up a lot, but you can get even better performance with the Triton addon. This is ONLY available on Linux.

Windows users, don’t fear, you can still get this. Just install Ubuntu, or another Linux distribution, under WSL (Windows Subsystem for Linux). Once you have it up and running, head back to this tutorial!

A lot of this article is based on, and improves upon @vladmandic’s discussion on the AUTOMATIC1111 Discussions page.

Preparing your system for Automatic1111’s Stable Diffusion WebUI

Windows

Before we even get to installing A1’s SDUI, we need to prepare Windows. This guide only focuses on Nvidia GPU users.

Make sure you have Nvidia CUDA 11.8 installed, as well as the latest cuDNN.

To install the latest cuDNN, download the zip from Nvidia cuDNN (Note: you will need an Nvidia account to do so, as far as I can remember).

Open it with your favourite zip explorer, like 7-Zip, and extract all the folders inside (just open the archive, no need to navigate around) into C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8. Your path may be different. You will be copying the bin, include and lib folders. If asked about duplicates, just click Replace.
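If you want to double-check the copy went where it should, a quick sanity check might look like this (a hypothetical helper, not part of any Nvidia tooling; adjust the path to your CUDA version):

```python
from pathlib import Path

def cudnn_files_present(cuda_root: str) -> bool:
    """Check that the bin, include and lib folders copied from the
    cuDNN zip all exist under the CUDA toolkit directory."""
    root = Path(cuda_root)
    return all((root / sub).is_dir() for sub in ("bin", "include", "lib"))

# Example (path is an assumption; yours may differ):
# cudnn_files_present(r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8")
```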

I would also recommend using the latest Studio Drivers from Nvidia.

WSL (Ubuntu)

Setting up CUDA on WSL

CUDA is installed on Windows, but WSL needs a few steps as well. Following Nvidia’s Getting Started with CUDA on WSL guide, run the following commands.

This needs to match the CUDA installed on your computer. In my case, it’s 11.8.

To get updated commands assuming you’re running a different CUDA version, see Nvidia CUDA Toolkit Archive.

sudo apt-key del 7fa2af80

sudo apt-get update
wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-wsl-ubuntu.pin
sudo mv cuda-wsl-ubuntu.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda-repo-wsl-ubuntu-11-8-local_11.8.0-1_amd64.deb
sudo dpkg -i cuda-repo-wsl-ubuntu-11-8-local_11.8.0-1_amd64.deb
sudo cp /var/cuda-repo-wsl-ubuntu-11-8-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda

Now for cuDNN:

conda install -c conda-forge cudnn

Preparing the environment

Following modified installation steps, based on the original, run the following commands.

Install Anaconda (conda)

wget https://repo.anaconda.com/archive/Anaconda3-2022.05-Linux-x86_64.sh
chmod +x Anaconda3-2022.05-Linux-x86_64.sh
./Anaconda3-2022.05-Linux-x86_64.sh

Then download the Automatic1111 repo

git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git
cd stable-diffusion-webui

If you have previously installed triton, make sure to uninstall it with pip uninstall triton. If you don’t have Python, don’t worry.

Now, instead of running conda env create -f environment-wsl2.yaml as the guide suggests, edit that file first. Either head across there in Windows, assuming you’ve used cd /mnt/c/users/tcno/desktop, for example, to get to your desktop… Or just use nano: nano environment-wsl2.yaml.

Remove the cudatoolkit, pytorch and torchvision lines. We will install these later. We just need to create and activate the environment for now.

Mine appears as follows:

environment-wsl2.yaml
name: automatic
channels:
  - pytorch
  - defaults
dependencies:
  - python=3.10
  - pip=22.2.2
  - numpy=1.23.1

Now, install all the requirements using the command:

conda env create -f environment-wsl2.yaml

As suggested by the console, run:

conda activate automatic

Now that we’re in the right environment, we will install the latest PyTorch nightly (version 2.0+).

Run the following command to install it. This will download roughly 2 GB:

pip3 install --pre torch torchvision torchaudio torchtriton --extra-index-url https://download.pytorch.org/whl/nightly/cu118 --force

Running pip show torch should return something along the lines of Version: 2.0.0.dev20230125+cu118. This is perfect.
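If you want to verify this programmatically, a small check like the following works (a hypothetical helper; it just inspects the version string, so the exact nightly date doesn’t matter):

```python
def is_cu118_nightly(version_string: str) -> bool:
    """Return True if a torch version string looks like a 2.x build
    against CUDA 11.8, e.g. '2.0.0.dev20230125+cu118'."""
    return version_string.startswith("2.") and version_string.endswith("+cu118")

# In your environment you would pass it the real value:
# import torch
# is_cu118_nightly(torch.__version__)
```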

Accelerate

Run:

pip install -U accelerate==0.15.0

You will need to edit requirements_versions.txt as well to reflect accelerate==0.15.0. Do note that this local change can block a git pull when updating Automatic1111’s SDUI: either delete the file first, or run git stash and then git pull. Just remember to change the accelerate version back to match afterwards.
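The manual edit can be sketched as a one-line rewrite, if you prefer scripting it (a hypothetical helper, not part of the SDUI):

```python
def pin_accelerate(requirements_text: str, version: str = "0.15.0") -> str:
    """Rewrite the accelerate line in requirements_versions.txt
    to the pinned version, leaving every other line untouched."""
    out = []
    for line in requirements_text.splitlines():
        if line.split("==")[0].strip() == "accelerate":
            line = f"accelerate=={version}"
        out.append(line)
    return "\n".join(out)
```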

Xformers

This important package can do wonders for your image generation speeds. Make sure you install this, if you can and your computer is happy with it. You can always uninstall it later.

Because of how Facebook has this set up, you can download prebuilt older versions, but you will get much better speed by building and installing it yourself. As scary as that sounds, you’re already here, and it’s just a few more simple commands.

There are two ways of doing this. The first official way uses Conda, as suggested on the xformers GitHub page; the second is building it yourself. Here are both (Windows can only do the second, but WSL should be able to do either).

Pick one:

conda install xformers -c xformers/label/dev

or

pip install ninja setuptools
pip install -v -U git+https://github.com/facebookresearch/xformers.git@main#egg=xformers
python -m xformers.info

Also, if you’re on Linux or WSL, install Triton:

pip install triton

This will take some time to put itself together, so be patient. We’re talking many minutes.

Running AUTOMATIC1111’s Stable Diffusion Web UI in WSL

You will need to edit the webui-user.sh file to get the best results.

I have the following as the entire file (By default everything is commented out here).

webui-user.sh
#!/bin/bash
#########################################################
# Uncomment and change the variables below to your need:#
#########################################################
export MAX_JOBS=16 
export CUDA_PATH="/usr/local/cuda-11.8/lib64"
export PATH=/usr/local/cuda-11.8/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-11.8/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
export CUDA_HOME=/usr/local/cuda-11.8
export TF_ENABLE_ONEDNN_OPTS=0
export COMMANDLINE_ARGS="--xformers --opt-channelslast --opt-split-attention"
export venv_dir="venvtorch20-cu118"
export TORCH_COMMAND="pip3 install --pre torch torchvision torchaudio torchtriton --extra-index-url https://download.pytorch.org/whl/nightly/cu118 --force"
export ACCELERATE="pip install -U accelerate==0.15.0"
export xformer="pip install -v -U git+https://github.com/facebookresearch/xformers.git@main#egg=xformers"
python -m xformers.info
export ACCELERATE="True"

I have further adjusted the COMMANDLINE_ARGS option to get better performance on my PC (And my PC does some funky things when the VRAM fills up for some reason…)

export COMMANDLINE_ARGS="--listen --xformers --opt-channelslast --opt-split-attention --no-half-vae --medvram --always-batch-cond-uncond"

Speaking of, I also need to add torch.backends.cudnn.enabled = False just after import torch in webui.py for it to work on my wonky Windows install…
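That one-line workaround can be applied as a simple text edit; this hypothetical helper shows where the line lands relative to import torch (only add it if you hit the same cuDNN weirdness):

```python
def insert_cudnn_workaround(source: str) -> str:
    """Insert `torch.backends.cudnn.enabled = False` directly after
    the first `import torch` line in a file's source text."""
    lines = source.splitlines()
    for i, line in enumerate(lines):
        if line.strip() == "import torch":
            lines.insert(i + 1, "torch.backends.cudnn.enabled = False")
            break
    return "\n".join(lines)
```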

Starting A1’s SDUI

To start A1’s SDUI, run:

conda activate automatic
./webui.sh

And that’s it. All of your settings from webui-user.sh will be brought in. I added the activate line just to make sure if you remember any commands, it should be these to restart A1’s SDUI after a WSL restart, or PC restart.

Fixing issues with Dreambooth

Dreambooth works, at least mostly… As mentioned previously, I’m having strange things happen with my GPU under Windows and WSL, so I’m not able to test this fully. I assume a Windows reinstall is necessary…

To fix the TypeError: '<' not supported between instances of 'str' and 'Version' error for Dreambooth in A1’s SDUI, before it’s updated to work with Torch 2.0, open the import_utils.py file shown in the error below.

The full error is:


    
File "/root/anaconda3/envs/automatic/lib/python3.10/site-packages/diffusers/utils/import_utils.py", line 207, in <module>
    if torch.__version__ < version.Version("1.12"):
TypeError: '<' not supported between instances of 'str' and 'Version'

So, sudo nano /root/anaconda3/envs/automatic/lib/python3.10/site-packages/diffusers/utils/import_utils.py.

Skipping to line 207 (hit Ctrl+/ and enter 207), I’ll simply comment out the lines, as such:

#if _torch_available:
#    import torch
#
#    if torch.__version__ < version.Version("1.12"):
#        raise ValueError("PyTorch should be >= 1.12")

Ctrl+S and Ctrl+X to save and close.
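The root cause is that torch.__version__ is a plain string being compared against a version.Version object. If you’d rather patch the comparison than comment it out, a minimal string-safe compare might look like this (a hypothetical sketch under that assumption, not the actual diffusers code):

```python
def version_lt(a: str, b: str) -> bool:
    """Compare dotted version strings numerically, ignoring any
    dev/local suffix ('2.0.0.dev20230125+cu118' -> (2, 0, 0))."""
    def key(v: str):
        parts = []
        for p in v.split("+")[0].split("."):
            if not p.isdigit():
                break  # stop at 'dev...' or other non-numeric parts
            parts.append(int(p))
        return tuple(parts)
    return key(a) < key(b)

# e.g. the diffusers check could then be:
# if version_lt(torch.__version__, "1.12"): raise ValueError(...)
```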

Optimization

Now, let’s make it even better.

Head across to the discussion linked here and continue your journey.

Why leave it here? Well, things will change from here. This is still experimental, and following the steps provided in this discussion should take you further.

At this point you’re up to the Optimize section.

TroubleChute © Wesley Pyburn (TroubleChute)