
Stable Diffusion Performance – NVIDIA GeForce VS AMD Radeon

Posted on July 31, 2023 (November 15, 2023) by Evan Lagergren

Table of Contents

  • Introduction
  • Test Setup
  • Automatic 1111
  • SHARK
  • PugetBench for Stable Diffusion
  • Is NVIDIA GeForce or AMD Radeon faster for Stable Diffusion?

Introduction

Stable Diffusion is a deep learning model that is seeing increasing use in content creation for its ability to generate and manipulate images from text prompts. It is unique among creative workflows in that, although it is used professionally, it lacks commercially developed software and is instead implemented in a variety of open-source applications. Additionally, unlike similar text-to-image models, Stable Diffusion is often run locally on your own system rather than accessed through a cloud service.

Stable Diffusion can run on a midrange graphics card with at least 8 GB of VRAM but benefits significantly from powerful, modern cards with lots of VRAM. We have published our benchmark testing methodology for Stable Diffusion, and in this article, we will be looking at the performance of a wide variety of consumer GPUs from AMD and NVIDIA released since 2017. If you are interested in the performance of professional GPUs, we have also published an article covering over a dozen of those.

We want to point out that Tom’s Hardware also published results in January for an even wider variety of consumer GPUs. However, we could not fully replicate their results, so the numbers we are showing are slightly different. We do not believe this is due to a flaw in either party's testing methodology, but rather to the fact that Stable Diffusion is a constantly evolving set of tools. How it works today is very different from how it worked even six months ago.


Below are the specifications for the cards tested:

GPU | MSRP | VRAM | CUDA / Stream Processors | Base / Boost Clock | Power | Launch Date
RTX 3090 Ti | $2,000 | 24 GB | 10,752 | 1.56 / 1.86 GHz | 450 W | Jan. 2022
RTX 4090 | $1,600 | 24 GB | 16,384 | 2.23 / 2.52 GHz | 450 W | Oct. 2022
RTX 3090 | $1,500 | 24 GB | 10,496 | 1.395 / 1.695 GHz | 350 W | Sept. 2020
RTX 4080 | $1,200 | 16 GB | 9,728 | 2.21 / 2.51 GHz | 320 W | Nov. 2022
RTX 3080 Ti | $1,200 | 12 GB | 10,240 | 1.365 / 1.665 GHz | 350 W | June 2021
RTX 2080 Ti | $1,000 | 11 GB | 4,352 | 1.35 / 1.545 GHz | 250 W | Sept. 2018
Radeon RX 7900 XTX | $1,000 | 24 GB | 6,144 | 2.3 / 2.5 GHz | 355 W | Dec. 2022
Radeon RX 6900 XT | $1,000 | 16 GB | 5,120 | 1.825 / 2.015 GHz | 300 W | Oct. 2020
RTX 4070 Ti | $800 | 12 GB | 7,680 | 2.31 / 2.61 GHz | 285 W | Jan. 2023
GTX 1080 Ti | $700 | 11 GB | 3,584 | 1.481 / 1.582 GHz | 250 W | March 2017
RTX 4070 | $600 | 12 GB | 5,888 | 1.92 / 2.48 GHz | 200 W | April 2023
RTX 4060 Ti (8GB) | $400 | 8 GB | 4,352 | 2.31 / 2.54 GHz | 160 W | May 2023
RTX 3060 Ti | $400 | 8 GB | 4,864 | 1.41 / 1.67 GHz | 200 W | Dec. 2020

Test Setup

Test Platform

CPU: AMD Threadripper PRO 5975WX 32-Core
CPU Cooler: Noctua NH-U14S TR4-SP3 (AMD TR4)
Motherboard: ASUS Pro WX WRX80E-SAGE SE WIFI
RAM: 8x Micron DDR4-3200 16GB ECC Reg. (128GB total)
GPUs:
NVIDIA GeForce RTX 3090 Ti 24GB
NVIDIA GeForce RTX 4090 24GB
NVIDIA GeForce RTX 3090 24GB
NVIDIA GeForce RTX 4080 16GB
ASUS GeForce RTX 4070 Ti STRIX 12GB
NVIDIA GeForce RTX 4070 12GB
ASUS GeForce RTX 4060 Ti TUF OC 8GB
NVIDIA GeForce RTX 3060 Ti 8GB
AMD Radeon RX 7900 XTX 24GB
AMD Radeon RX 6900 XT 16GB
PSU: Super Flower LEADEX Platinum 1600W
Storage: Samsung 980 Pro 2TB
OS: Windows 11 Pro 64-bit (22621)

Benchmark Software

Automatic 1111
  Version: 1.5.1, xformers: 0.0.17
  Checkpoint: v1-5-pruned-emaonly
Automatic 1111 (lshqqytiger AMD fork)
  Version: 1.3.1
  Checkpoint: v1-5-pruned-emaonly
SHARK
  Version: 20230701_796
  Checkpoint: stabilityai/stable-diffusion-2-1-base
PugetBench for Stable Diffusion 0.3.0 alpha

To test performance in Stable Diffusion, we used one of our fastest platforms, the AMD Threadripper PRO 5975WX, although the CPU should have minimal impact on results. All of our testing was done on the most recent drivers and BIOS versions, using the “Pro” or “Studio” versions of the drivers. However, we found compatibility issues between SHARK and the AMD PRO 22.Q4 driver, so that testing was redone using AMD Adrenalin drivers instead. Following our test methodology, we used three different implementations of Stable Diffusion (Automatic 1111, SHARK, and our custom in-development benchmark) along with the prompts given in our methodology article.

It is important to note that our primary goal is to test the latest public releases of the most popular Stable Diffusion implementations. While many would consider these cutting-edge already, there are even newer components that could be used, such as updated CUDA and PyTorch versions. However, these are not always stable and are not yet integrated into the public releases of Automatic 1111 and SHARK. We want to focus on what end users would most likely use for real-world, professional applications today, rather than testing right at the bleeding edge. That said, our “PugetBench” version of Stable Diffusion does use the latest versions of CUDA and PyTorch (among others) available at the time of this article.
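Because results depend on which CUDA and PyTorch versions an implementation ships with, it can help to record them alongside any benchmark run. The following is a minimal sketch of one way to do that, not part of our benchmark; the function name `describe_environment` is hypothetical, and it assumes PyTorch may or may not be installed in the Python environment being inspected.

```python
import importlib.util
import platform


def describe_environment():
    """Collect the software versions that most affect Stable Diffusion speed."""
    info = {
        "python": platform.python_version(),
        "torch": None,
        "cuda": None,
        "gpu": None,
    }
    # PyTorch is treated as optional so the script still runs without it.
    if importlib.util.find_spec("torch") is None:
        return info

    import torch

    info["torch"] = torch.__version__
    # torch.version.cuda is None on CPU-only and ROCm builds of PyTorch.
    info["cuda"] = torch.version.cuda
    if torch.cuda.is_available():
        info["gpu"] = torch.cuda.get_device_name(0)
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running this inside the virtual environment of each implementation makes it easier to tell whether a difference in it/s comes from the GPU or from the software stack.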

We decided to test nearly every card from NVIDIA’s 4000 series of GPUs alongside the top-tier cards from their last three generations. We also included the 3060 Ti, which is currently the lowest-spec consumer GPU we offer in our product line. Additionally, we tested the Radeon RX 7900 XTX and RX 6900 XT. We also attempted to benchmark the Radeon RX 580, but we experienced compatibility issues and were not able to get results. It may have been possible to adjust the implementations to improve compatibility, but we wanted to run these tests as close to “stock” as possible.


Automatic 1111

Chart: Stable Diffusion, Automatic 1111 w/ xformers. Geometric mean iterations per second (higher is better).

GPU | it/s
RTX 4090 | 21.04
RTX 4080 | 19.41
RTX 4070 Ti | 17.65
RTX 3090 Ti | 16.97
RTX 3090 | 16.66
RTX 3080 Ti | 16.55
RTX 4070 | 16.02
RTX 2080 Ti | 12.92
RTX 4060 Ti | 12.32
RTX 3060 Ti | 8.62
RX 7900 XTX | 4.67
GTX 1080 Ti | 2.63
RX 6900 XT | 0.77

The implementation we benchmarked first was Automatic 1111, the most commonly used Stable Diffusion implementation, which usually offers the best performance for NVIDIA cards. NVIDIA soundly outperforms AMD here: only the GTX 1080 Ti is slower than the RX 7900 XTX, and the RTX 3060 Ti delivers nearly twice the iterations per second of that Radeon card (8.62 vs. 4.67 it/s).
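The chart scores are reported as a geometric mean of iterations per second. As a rough illustration of how such a score can be combined from individual test results, here is a minimal sketch; the per-test values are hypothetical and are not taken from our results, and the exact set of tests that feeds each score is described in our methodology article rather than here.

```python
import math


def geometric_mean(values):
    """Geometric mean of positive numbers, computed in log space."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs at least one positive value")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical per-test results (it/s) for a single GPU.
per_test_its = [20.0, 22.0, 21.0]
score = geometric_mean(per_test_its)
print(f"Geometric mean: {score:.2f} it/s")
```

A geometric mean is less sensitive than an arithmetic mean to one unusually fast or slow test, which is why it is a common choice for summarizing benchmark results.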

While most cards perform about as you would expect based on their positioning in NVIDIA’s product stack, the newer 4000 series cards offer a clear advantage in image generation speed while also offering a relatively linear increase in performance with price. This is well illustrated by the RTX 4070 Ti, which is about 4% faster than the last-gen RTX 3090 Ti, and the RTX 4060 Ti, which is nearly 43% faster than the 3060 Ti. If you still have a card from the 2000 or 1000 series, even a mid-tier 4000 series card will be a noticeable upgrade in performance.
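The generational comparisons above can be recomputed directly from the chart values. The snippet below is a small sketch of that arithmetic; the dictionary holds the Automatic 1111 results from the chart, and the helper name `percent_faster` is ours for illustration.

```python
# Automatic 1111 results (it/s) as reported in the chart for this section.
A1111_ITS = {
    "RTX 4070 Ti": 17.65,
    "RTX 3090 Ti": 16.97,
    "RTX 4060 Ti": 12.32,
    "RTX 3060 Ti": 8.62,
}


def percent_faster(results, card, baseline):
    """Return how much faster `card` is than `baseline`, in percent."""
    return (results[card] / results[baseline] - 1.0) * 100.0


for card, baseline in [
    ("RTX 4070 Ti", "RTX 3090 Ti"),
    ("RTX 4060 Ti", "RTX 3060 Ti"),
]:
    gain = percent_faster(A1111_ITS, card, baseline)
    print(f"{card} vs {baseline}: {gain:+.1f}%")
```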

SHARK

Chart: Stable Diffusion, SHARK. Geometric mean iterations per second (higher is better).

GPU | it/s
RX 7900 XTX | 20.76
RTX 4090 | 15.24
RTX 4080 | 13.96
RTX 4070 Ti | 11.86
RTX 3090 Ti | 11.69
RTX 3090 | 11.25
RTX 3080 Ti | 11.01
RTX 4070 | 9.46
RTX 2080 Ti | 8.68
RX 6900 XT | 8.61
RTX 4060 Ti | 7.44
RTX 3060 Ti | 6.64
GTX 1080 Ti | 0 (did not run)

Although less commonly used than Automatic 1111, SHARK is the preferred Stable Diffusion implementation for many AMD users, and it is clear why. The RX 7900 XTX more than quadruples its performance over the Automatic 1111 implementation (20.76 vs. 4.67 it/s), which nearly matches the RTX 4090 under Automatic 1111. The RX 6900 XT sees an even larger gain, at roughly 11 times its Automatic 1111 result (8.61 vs. 0.77 it/s), although this only makes it competitive with the low-end NVIDIA GPUs we tested.

NVIDIA cards performed roughly 30% worse with SHARK than with Automatic 1111, although they maintained about the same relative ordering. The one exception is the GTX 1080 Ti, which could not run the SHARK implementation in our testing. It is important to use the proper implementation of Stable Diffusion, as it can greatly impact the number of iterations per second a graphics card can achieve: anywhere from a roughly 30% decrease to an 11-fold increase.
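To make the implementation-to-implementation swing concrete, the following sketch divides each card's SHARK score by its Automatic 1111 score, using values from the two charts. Only a handful of cards are included to keep it short, and the helper name `shark_vs_a1111` is hypothetical.

```python
# Geometric mean it/s from the Automatic 1111 and SHARK charts.
A1111 = {
    "RTX 4090": 21.04,
    "RTX 4080": 19.41,
    "RTX 3060 Ti": 8.62,
    "RX 7900 XTX": 4.67,
    "RX 6900 XT": 0.77,
}
SHARK = {
    "RTX 4090": 15.24,
    "RTX 4080": 13.96,
    "RTX 3060 Ti": 6.64,
    "RX 7900 XTX": 20.76,
    "RX 6900 XT": 8.61,
}


def shark_vs_a1111(card):
    """Ratio of SHARK to Automatic 1111 throughput for one card."""
    return SHARK[card] / A1111[card]


# List the cards that benefit most from SHARK first.
for card in sorted(A1111, key=shark_vs_a1111, reverse=True):
    ratio = shark_vs_a1111(card)
    print(f"{card:12s} {ratio:5.2f}x ({(ratio - 1.0) * 100.0:+.0f}%)")
```

A ratio above 1.0 means the card is faster under SHARK, and a ratio below 1.0 means it is faster under Automatic 1111.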

PugetBench for Stable Diffusion

Chart: PugetBench for Stable Diffusion 0.3.0 alpha. Geometric mean iterations per second (higher is better).

GPU | it/s
RTX 4090 | 22.8
RTX 4080 | 20.76
RTX 4070 Ti | 19.45
RTX 3090 Ti | 18.72
RTX 3090 | 17.63
RTX 3080 Ti | 17.45
RTX 4070 | 16.11
RTX 2080 Ti | 13.42
RTX 4060 Ti | 12.69
RTX 3060 Ti | 9.61
GTX 1080 Ti | 4.05

Alongside the two most common packages for Stable Diffusion, we also have our own in-development implementation. As we are focused on benchmarking, it is a more stripped-down version than Automatic 1111 or SHARK. It currently supports only NVIDIA cards, although we plan to add AMD support.

This implementation performs very similarly to the Automatic 1111 implementation with xFormers, although it uses an updated version of PyTorch rather than the xFormers library. The only notable exception is the GTX 1080 Ti, which is about 54% faster with our implementation (4.05 vs. 2.63 it/s), although it is still a relatively weak card for this workflow. Based on these results, we do not expect a significant performance shift when Automatic 1111 and SHARK update to the latest version of PyTorch.


Is NVIDIA GeForce or AMD Radeon faster for Stable Diffusion?

Although this is our first look at Stable Diffusion performance, what is most striking is the disparity in performance between various implementations of Stable Diffusion: up to 11 times the iterations per second for some GPUs. NVIDIA offered the highest performance on Automatic 1111, while AMD had the best results on SHARK, and the highest-end GPU from each vendor performed similarly on its preferred implementation.

If you are not already committed to using a particular implementation of Stable Diffusion, both NVIDIA and AMD offer great performance at the top end, with the NVIDIA GeForce RTX 4090 and the AMD Radeon RX 7900 XTX both giving around 21 it/s in their preferred implementation. Currently, this means that AMD has a slight price-to-performance advantage with the RX 7900 XTX, but as developers often favor NVIDIA GPUs, this could easily change in the future. AMD has been doing a lot of work to increase GPU support in the AI space, but they haven’t yet matched NVIDIA.
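The price-to-performance comparison can be sketched from figures already in this article: the MSRP from the specification table and each card's score in its preferred implementation. This is a simplified illustration that assumes MSRP rather than street price, and the helper name `its_per_1000_usd` is hypothetical.

```python
# (MSRP in USD, best it/s, implementation) taken from this article.
TOP_CARDS = {
    "RTX 4090": (1600, 21.04, "Automatic 1111"),
    "RX 7900 XTX": (1000, 20.76, "SHARK"),
}


def its_per_1000_usd(card):
    """Iterations per second delivered per $1,000 of MSRP."""
    msrp, its, _ = TOP_CARDS[card]
    return its / msrp * 1000.0


for card, (msrp, its, implementation) in TOP_CARDS.items():
    value = its_per_1000_usd(card)
    print(f"{card}: {its} it/s on {implementation}, "
          f"${msrp:,} MSRP, {value:.2f} it/s per $1,000")
```

Actual value will shift with street pricing and with which implementation you need to run, so treat this as a starting point rather than a verdict.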

Stable Diffusion is still somewhat in its infancy, and it is worth noting that performance is only going to improve in the coming months and years. While we don’t expect there to be many massive shifts in terms of relative performance, we want to be clear that the exact results in this article are likely to change over time.

If you are looking for a workstation for using Stable Diffusion, you can visit our solutions page to view our recommended workstations for various workflows. If you are a developer, we also have a range of workstations and servers for Machine Learning and AI. Feel free to contact one of our technology consultants for help configuring a workstation that meets the specific needs of your unique workflow.

© Copyright 2024 - Puget Systems, All Rights Reserved.