
How to Run Stable Diffusion AI on Mac and Windows: An In-Depth Guide


As an AI enthusiast and engineer, I've been fascinated by the rapid advancements in generative image models like Stable Diffusion. After experimenting extensively with the platform, I wanted to provide this comprehensive guide to help others unlock the full potential of running Stable Diffusion locally.

Why Stable Diffusion Matters

Stable Diffusion represents a pioneering breakthrough in generative AI: it was one of the first high-quality text-to-image diffusion models to be released fully open source, with both the code and the trained weights freely available.

Diffusion models have proven superior to GANs for image generation by taking a radically different approach. Rather than trying to generate the final image in one shot, diffusion models start with noise and gradually refine the image over successive steps. This allows for more realistic and coherent image generation.
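To make that intuition concrete, here is a toy Python sketch of a reverse diffusion loop. It is not the real Stable Diffusion code: the actual model is a text-conditioned U-Net working in latent space with a carefully designed noise schedule, and toy_noise_predictor below is just a stand-in to show the iterative refinement idea.

import torch

def toy_noise_predictor(x, t):
    # Stand-in for the real U-Net; it simply treats the whole image as noise.
    return x

def sample(steps=50, shape=(1, 3, 64, 64)):
    x = torch.randn(shape)                      # start from pure Gaussian noise
    for t in reversed(range(steps)):            # walk the noise schedule backwards
        noise_pred = toy_noise_predictor(x, t)  # the real model predicts the noise at step t
        x = x - noise_pred / steps              # remove a small fraction of the predicted noise
    return x                                    # each pass leaves a slightly cleaner image

print(sample().abs().mean())                    # the toy "image" shrinks toward zero over the steps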

Stable Diffusion builds upon the latent diffusion research from the CompVis group: instead of denoising full-resolution pixels directly, it runs the diffusion process in the compressed latent space of an autoencoder, which keeps generation fast enough for consumer hardware. It was trained on an enormous dataset of text-image pairs (drawn from LAION) across a wide domain of topics.

This sheer breadth of training data is what gives Stable Diffusion its versatility to generate such diverse high-quality images using natural language prompts.

Personally, I think tools like this are democratizing creativity and lowering the barriers for anyone to turn their ideas into stunning visual content.

The applications span art, design, entertainment, education, marketing, and beyond. For creators and entrepreneurs, it unlocks new possibilities.

Running Stable Diffusion locally instead of relying solely on web services also enables more control, customization, and privacy over your generations.

Now let's dive into how to configure your own local Stable Diffusion system and start creating!

Hardware Prerequisites

The hardware requirements for smooth Stable Diffusion generation may surprise those new to machine learning. The reason is that generating high-resolution photorealistic images requires immense computational power.

Stable Diffusion leans heavily on the GPU for image generation. A CPU alone can work, but it will be unbearably slow for all but the simplest images.

Here are the recommended minimum specs:

GPU: Nvidia GTX 1060 or AMD Radeon RX 580; 6GB VRAM

CPU: Intel Core i3 or equivalent

RAM: 8GB

However, for good performance especially at higher resolutions, I suggest at least:

GPU: Nvidia RTX 3060 (12GB VRAM) or RTX 3060 Ti (8GB VRAM) or better

CPU: Intel Core i7 / AMD Ryzen 7 or better

RAM: 16GB+

The more powerful your GPU, the better performance will be. If you have multiple GPUs, you can also run separate instances in parallel to generate batches faster.

As a benchmark, here are sample image generation speeds on different GPU hardware configurations with default settings:

GPU                   512×512    1024×1024
Nvidia RTX 3090       0.8s       3.1s
Nvidia RTX 3060 Ti    1.3s       5.2s
Nvidia GTX 1080       2.1s       8.7s

Based on these numbers, a high-end modern GPU is roughly 2.5-3X faster than an older mid-range card, and the gap widens at higher resolutions and step counts.

So in summary, invest in the best GPU you can afford if you plan on really leveraging Stable Diffusion locally.

Software Installation Guide

With capable hardware in place, we can move on to installation and configuration of the necessary software.

I'll be providing steps for both Windows 10/11 and MacOS.

The high level process we will walk through is:

  1. Install Python dependencies
  2. Install Stable Diffusion UI
  3. Download model checkpoint
  4. Launch UI
  5. Generate images!

Install Python and Dependencies

Stable Diffusion is built on Python and leverages various packages for the deep learning and image processing capabilities.

You'll need Python 3.10, the version the Automatic1111 web UI recommends. I suggest installing via Anaconda for the simplest dependency management.

On Windows:

  1. Download and install Anaconda Individual Edition. Make sure to get the Python 3 version.

  2. Open the Anaconda Prompt terminal. This will automatically activate the Conda environment.

  3. Install the Python packages needed:

conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
conda install -c conda-forge pillow numpy lmdb tqdm pytest pandas scikit-learn pyyaml gdown ffmpeg-python

On MacOS:

  1. Download and install Anaconda Individual Edition. Choose Python 3.

  2. Open Terminal and type conda activate to activate the Conda env.

  3. Install the packages:

conda install pytorch torchvision torchaudio -c pytorch
conda install -c conda-forge pillow numpy lmdb tqdm pytest pandas scikit-learn pyyaml gdown ffmpeg-python

Note that there is no CUDA build of PyTorch for MacOS, so the pytorch-cuda package is omitted here; on Apple Silicon, PyTorch uses the Metal (MPS) backend instead.

This will get an isolated Python environment set up with all the necessary dependencies to run Stable Diffusion.
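Before moving on, it is worth confirming that PyTorch can actually see your GPU from this environment. A quick check that works on both platforms:

import torch

print("PyTorch version:", torch.__version__)
print("CUDA available: ", torch.cuda.is_available())          # Nvidia GPU on Windows
mps = getattr(torch.backends, "mps", None)                    # Metal backend, PyTorch 1.12+ only
print("MPS available:  ", bool(mps and mps.is_available()))   # Apple Silicon GPU on MacOS

If both come back False, Stable Diffusion will fall back to the CPU and generation will be painfully slow.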

Download Stable Diffusion User Interface

In order to interact with the Stable Diffusion model, we need a user interface.

The best option currently is Automatic1111's Stable Diffusion web UI.

It's an open source front end that allows you to generate images via a local web page. The repo also includes numerous scripts and extensions created by the community.

On Windows:

  1. Install Git for Windows.

  2. Create a folder, e.g. C:\stable-diffusion.

  3. Right click inside the folder, select Git Bash Here. A terminal will open.

  4. Clone the repo:

git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui

On MacOS:

  1. Install Git if you don't already have it:

brew install git

  2. Create a folder, e.g. /Users/yourname/stable-diffusion

  3. Open Terminal and navigate to the folder:

cd /Users/yourname/stable-diffusion

  4. Clone the repo:

git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui

This will download the UI files into the stable-diffusion-webui folder.

Download Model Checkpoint

The Stable Diffusion UI requires a pre-trained model checkpoint file to run.

You can use any of the available checkpoints from HuggingFace, but I recommend starting with the CompVis Stable Diffusion v1-4 model.

Download steps:

  1. Navigate to the CompVis Stable Diffusion v1-4 page on HuggingFace (you may need a free account to accept the model license).

  2. Open the Files and versions tab and download the sd-v1-4.ckpt file (roughly 4GB).

  3. Copy the .ckpt file into the stable-diffusion-webui/models/Stable-diffusion folder.

This provides the UI with the weights and parameters for the SD v1.4 model at launch.
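If you want to double-check that the file landed in the right place, here is a small sanity check; run it from the folder that contains stable-diffusion-webui (the path below simply assumes the folder layout from the earlier steps):

from pathlib import Path

ckpt = Path("stable-diffusion-webui/models/Stable-diffusion/sd-v1-4.ckpt")
if ckpt.exists():
    print(f"Found {ckpt.name}: {ckpt.stat().st_size / 1e9:.1f} GB")  # should be roughly 4GB
else:
    print("Checkpoint not found - check the folder name and file location")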

Launch Web UI

We now have everything installed and configured correctly. Time to launch the web interface!

On Windows:

Double click the webui-user.bat file in the stable-diffusion-webui folder. This opens a command prompt that sets up a Python virtual environment on the first run (which takes a while) and then launches the app.

On MacOS:

In Terminal, run:

python launch.py

This will install any remaining dependencies on the first run and then launch the web UI. (Adding the --share flag also generates a temporary public Gradio link if you want to reach the UI from another device.)

In both cases, look for the local URL (usually http://127.0.0.1:7860) printed in the terminal window.

Copy and paste that URL into your web browser to access the web interface!
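If you would rather script generations than click around in the browser, the web UI can also expose a local HTTP API when launched with the --api flag. Here is a minimal sketch based on the Automatic1111 API (endpoint /sdapi/v1/txt2img, base64-encoded images in the response); verify the field names against your version of the UI:

import base64
import requests

payload = {
    "prompt": "an astronaut riding a horse on Mars, digital art",
    "steps": 20,
    "width": 512,
    "height": 512,
}
resp = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload)
resp.raise_for_status()
with open("astronaut.png", "wb") as f:
    f.write(base64.b64decode(resp.json()["images"][0]))  # images are returned base64-encoded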

Generating Images

Now for the fun part – using the web UI to start generating images!

The interface includes two main tabs:

txt2img – generate image from text prompt

img2img – modify existing image via text

There are also advanced settings and options to tune, such as the sampling method, number of sampling steps, CFG scale, output size, and more.

Here's an overview of the process:

Text-to-Image Generation

  1. Under txt2img, enter your text prompt, for example:

An astronaut riding a horse on Mars, digital art

  2. Pick an image size like 512×512 to start.

  3. Click Generate. Watch it create the image before your eyes!

Image-to-Image Generation

  1. On the img2img tab, upload an existing image.

  2. Enter a text prompt describing how to modify the image:

Make the astronaut hold a flag, add another horse, Mars background

  3. Click Generate and see it apply those edits!

Pro Tips

Here are some tips I've gathered through extensive experimentation with Stable Diffusion:

  • Use detailed and unambiguous language in your prompts
  • Adjust the CFG scale to control creative liberty vs. prompt fidelity
  • Generate batches with different seeds for more variation (see the sketch after this list)
  • Start at low resolution and get the prompt dialed in before scaling up
  • Occasional bad outputs? Retry with a new seed or rephrase the prompt
  • Explore the inpainting and outpainting options on the img2img tab to fill in or extend images
  • Launch with the --medvram or --lowvram flag if you run into GPU memory issues
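To illustrate the seed and CFG tips above, here is a small sweep that reuses the local API sketch from the launch section (again assuming the Automatic1111 field names seed and cfg_scale, and that the UI was started with --api):

import base64
import requests

prompt = "an astronaut riding a horse on Mars, digital art"
for seed in (1, 42, 1234):           # different seeds give different compositions
    for cfg in (5, 7, 12):           # higher CFG scale sticks closer to the prompt
        payload = {"prompt": prompt, "seed": seed, "cfg_scale": cfg, "steps": 20}
        resp = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload)
        resp.raise_for_status()
        with open(f"astronaut_seed{seed}_cfg{cfg}.png", "wb") as f:
            f.write(base64.b64decode(resp.json()["images"][0]))

Comparing the resulting grid of images side by side makes it much easier to pick a seed and CFG value before scaling up the resolution.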

Take time to really familiarize yourself with all the settings and capabilities. This will allow you to maximize quality and control for your use case.

And now, go create something amazing!

Leveraging Stable Diffusion on MacOS

Up until now, I've focused on the manual Python and web UI setup. But Mac users have an even easier option with Diffusion Bee.

Diffusion Bee is a free Mac app created by Divam Gupta that serves as a wrapper for running Stable Diffusion and DreamBooth natively on MacOS.

The key advantages are:

  • Simple drag-and-drop style interface
  • Encapsulates all Python environment needs
  • Actively developed and maintained
  • Apple Silicon GPU acceleration support

Overall it provides a streamlined experience for Mac owners to start generating images without any coding required.

Installation

Installation is straightforward:

  1. Download the latest DMG file from diffusionbee.com

  2. Open the DMG and drag Diffusion Bee to the Applications folder

  3. The first launch will download the models (~5GB)

Once setup completes, you are ready to generate images!

Image Generation

The workflow is simple and intuitive:

  1. Launch Diffusion Bee

  2. Enter text prompt or upload image on desired tab

  3. Adjust settings like the number of steps, sampling method, etc.

  4. Click "Generate" and watch it create the image

The developer continues to add more advanced features and options with each update as well.

So if you want to hit the ground running with Stable Diffusion on MacOS with minimal fuss, Diffusion Bee is likely your best option.

Closing Thoughts

In closing, I hope this guide was helpful in getting you up and running with Stable Diffusion AI locally.

Deploying these models involves somewhat complex environment setup, but the results enable you to produce remarkable images limited only by your imagination.

My advice is to take it slow, start with basic prompts and settings, and learn what works best for your use case.

As with any tool, practice makes perfect. Mastering prompts and settings for Stable Diffusion takes experimentation over time.

I'm excited to see continued open source development and research on generative image models. They offer great potential for creators and I look forward to seeing what the community builds!

Let me know if you have any other questions. Happy creating!


Written by Alexis Kestler

A female web designer and programmer, now a 36-year-old IT professional with over 15 years of experience, living in NorCal. I enjoy keeping my feet wet in the world of technology through reading, working, and researching topics that pique my interest.