Close Menu

    Subscribe to Updates

    Get the latest creative news from FooBar about art, design and business.

    What's Hot

    Kilmar Abrego Garcia pleads not guilty to human smuggling charges

    June 13, 2025

    Appeals court rejects Trump’s bid to challenge $5 million E. Jean Carroll judgment

    June 13, 2025

    Appendix cancer rising among younger generation, new study, Health News, ET HealthWorld

    June 13, 2025
    Facebook X (Twitter) Instagram
    • Demos
    • Buy Now
    Facebook X (Twitter) Instagram YouTube
    14 Trends14 Trends
    Demo
    • Home
    • Features
      • View All On Demos
    • Buy Now
    14 Trends14 Trends
    Home » TensorRT Boosts Stable Diffusion 3.5 on RTX GPUs
    AI Trends

    TensorRT Boosts Stable Diffusion 3.5 on RTX GPUs

    adminBy adminJune 12, 2025No Comments5 Mins Read0 Views
    Facebook Twitter Pinterest LinkedIn Telegram Tumblr Email
    Share
    Facebook Twitter LinkedIn Pinterest Email


    Generative AI has reshaped how people create, imagine and interact with digital content.

    As AI models continue to grow in capability and complexity, they require more VRAM, or video random access memory. The base Stable Diffusion 3.5 Large model, for example, uses over 18GB of VRAM — limiting the number of systems that can run it well.

    By applying quantization to the model, noncritical layers can be removed or run with lower precision. NVIDIA GeForce RTX 40 Series and the Ada Lovelace generation of NVIDIA RTX PRO GPUs support FP8 quantization to help run these quantized models, and the latest-generation NVIDIA Blackwell GPUs also add support for FP4.

    NVIDIA collaborated with Stability AI to quantize its latest model, Stable Diffusion (SD) 3.5 Large, to FP8 — reducing VRAM consumption by 40%. Further optimizations to SD3.5 Large and Medium with the NVIDIA TensorRT software development kit (SDK) double performance.

    In addition, TensorRT has been reimagined for RTX AI PCs, combining its industry-leading performance with just-in-time (JIT), on-device engine building and an 8x smaller package size for seamless AI deployment to more than 100 million RTX AI PCs. TensorRT for RTX is now available as a standalone SDK for developers.

    RTX-Accelerated AI

    NVIDIA and Stability AI are boosting the performance and reducing the VRAM requirements of Stable Diffusion 3.5, one of the world’s most popular AI image models. With NVIDIA TensorRT acceleration and quantization, users can now generate and edit images faster and more efficiently on NVIDIA RTX GPUs.

    Stable Diffusion 3.5 quantized FP8 (right) generates images in half the time with similar quality as FP16 (left). Prompt: A serene mountain lake at sunrise, crystal clear water reflecting snow-capped peaks, lush pine trees along the shore, soft morning mist, photorealistic, vibrant colors, high resolution.

    To address the VRAM limitations of SD3.5 Large, the model was quantized with TensorRT to FP8, reducing the VRAM requirement by 40% to 11GB. This means five GeForce RTX 50 Series GPUs can run the model from memory instead of just one.

    SD3.5 Large and Medium models were also optimized with TensorRT, an AI backend for taking full advantage of Tensor Cores. TensorRT optimizes a model’s weights and graph — the instructions on how to run a model — specifically for RTX GPUs.

    FP8 TensorRT boosts SD3.5 Large performance by 2.3x vs. BF16 PyTorch, with 40% less memory use. For SD3.5 Medium, BF16 TensorRT delivers a 1.7x speedup.

    Combined, FP8 TensorRT delivers a 2.3x performance boost on SD3.5 Large compared with running the original models in BF16 PyTorch, while using 40% less memory. And in SD3.5 Medium, BF16 TensorRT provides a 1.7x performance increase compared with BF16 PyTorch.

    The optimized models are now available on Stability AI’s Hugging Face page.

    NVIDIA and Stability AI are also collaborating to release SD3.5 as an NVIDIA NIM microservice, making it easier for creators and developers to access and deploy the model for a wide range of applications. The NIM microservice is expected to be released in July.

    TensorRT for RTX SDK Released

    Announced at Microsoft Build — and already available as part of the new Windows ML framework in preview — TensorRT for RTX is now available as a standalone SDK for developers.

    Previously, developers needed to pre-generate and package TensorRT engines for each class of GPU — a process that would yield GPU-specific optimizations but required significant time.

    With the new version of TensorRT, developers can create a generic TensorRT engine that’s optimized on device in seconds. This JIT compilation approach can be done in the background during installation or when they first use the feature.

    The easy-to-integrate SDK is now 8x smaller and can be invoked through Windows ML — Microsoft’s new AI inference backend in Windows. Developers can download the new standalone SDK from the NVIDIA Developer page or test it in the Windows ML preview.

    For more details, read this NVIDIA technical blog and this Microsoft Build recap.

    Join NVIDIA at GTC Paris

    At NVIDIA GTC Paris at VivaTech — Europe’s biggest startup and tech event — NVIDIA founder and CEO Jensen Huang yesterday delivered a keynote address on the latest breakthroughs in cloud AI infrastructure, agentic AI and physical AI. Watch a replay.

    GTC Paris runs through Thursday, June 12, with hands-on demos and sessions led by industry leaders. Whether attending in person or joining online, there’s still plenty to explore at the event.

    Each week, the RTX AI Garage blog series features community-driven AI innovations and content for those looking to learn more about NVIDIA NIM microservices and AI Blueprints, as well as building AI agents, creative workflows, digital humans, productivity apps and more on AI PCs and workstations. 

    Plug in to NVIDIA AI PC on Facebook, Instagram, TikTok and X — and stay informed by subscribing to the RTX AI PC newsletter.

    Follow NVIDIA Workstation on LinkedIn and X. 

    See notice regarding software product information.





    Source link

    Share. Facebook Twitter Pinterest LinkedIn Tumblr Email
    admin
    • Website

    Related Posts

    NVIDIA and Deutsche Telekom Partner to Advance Germany’s Sovereign AI

    June 13, 2025

    Germany Builds Its AI Autobahn With NVIDIA

    June 12, 2025

    NVL72 GB200 Systems Accelerate Journey to Quantum Computing

    June 12, 2025

    European Financial Services Industry Goes All In on AI

    June 12, 2025

    European Financial Services Industry Goes All In on AI

    June 12, 2025

    NVIDIA Advances Quantum Algorithms for Fluid Dynamics

    June 12, 2025
    Leave A Reply Cancel Reply

    Demo
    Top Posts

    ChatGPT’s viral Studio Ghibli-style images highlight AI copyright concerns

    March 28, 20254 Views

    Best Cyber Forensics Software in 2025: Top Tools for Windows Forensics and Beyond

    February 28, 20253 Views

    An ex-politician faces at least 20 years in prison in killing of Las Vegas reporter

    October 16, 20243 Views

    Laws, norms, and ethics for AI in health

    May 1, 20252 Views
    Don't Miss

    Kilmar Abrego Garcia pleads not guilty to human smuggling charges

    June 13, 2025

    Kilmar Abrego Garcia pleaded not guilty Friday to human smuggling charges, one week after he…

    Appeals court rejects Trump’s bid to challenge $5 million E. Jean Carroll judgment

    June 13, 2025

    Appendix cancer rising among younger generation, new study, Health News, ET HealthWorld

    June 13, 2025

    ‘Significant’ expansion of nuclear waste compensation now in Trump’s megabill

    June 13, 2025
    Stay In Touch
    • Facebook
    • Twitter
    • Pinterest
    • Instagram
    • YouTube
    • Vimeo

    Subscribe to Updates

    Get the latest creative news from SmartMag about art & design.

    Demo
    Top Posts

    ChatGPT’s viral Studio Ghibli-style images highlight AI copyright concerns

    March 28, 20254 Views

    Best Cyber Forensics Software in 2025: Top Tools for Windows Forensics and Beyond

    February 28, 20253 Views

    An ex-politician faces at least 20 years in prison in killing of Las Vegas reporter

    October 16, 20243 Views
    Stay In Touch
    • Facebook
    • YouTube
    • TikTok
    • WhatsApp
    • Twitter
    • Instagram
    Latest Reviews
    Demo
    About Us
    About Us

    Your source for the lifestyle news. This demo is crafted specifically to exhibit the use of the theme as a lifestyle site. Visit our main page for more demos.

    We're accepting new partnerships right now.

    Email Us: info@example.com
    Contact: +1-320-0123-451

    Facebook X (Twitter) Pinterest YouTube WhatsApp
    Our Picks

    Kilmar Abrego Garcia pleads not guilty to human smuggling charges

    June 13, 2025

    Appeals court rejects Trump’s bid to challenge $5 million E. Jean Carroll judgment

    June 13, 2025

    Appendix cancer rising among younger generation, new study, Health News, ET HealthWorld

    June 13, 2025
    Most Popular

    ChatGPT’s viral Studio Ghibli-style images highlight AI copyright concerns

    March 28, 20254 Views

    Best Cyber Forensics Software in 2025: Top Tools for Windows Forensics and Beyond

    February 28, 20253 Views

    An ex-politician faces at least 20 years in prison in killing of Las Vegas reporter

    October 16, 20243 Views

    Subscribe to Updates

    Get the latest creative news from FooBar about art, design and business.

    14 Trends
    Facebook X (Twitter) Instagram Pinterest YouTube Dribbble
    • Home
    • Buy Now
    © 2025 ThemeSphere. Designed by ThemeSphere.

    Type above and press Enter to search. Press Esc to cancel.