NVIDIA and Microsoft Drive Innovation for Windows PCs in New Era of Generative AI #10684
-
There is some more info from Microsoft: https://github.com/microsoft/Olive/blob/main/examples/directml/stable_diffusion/README.md
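For reference, an Olive-optimized model is just an ONNX file, so it can be loaded with ONNX Runtime's DirectML execution provider. A minimal sketch (the model path here is a placeholder, and the onnxruntime-directml package is assumed):

```python
# Minimal sketch: open an Olive-optimized ONNX model with ONNX Runtime's
# DirectML execution provider and list the inputs the graph expects.
# "unet.onnx" is a placeholder path; requires the onnxruntime-directml package.
import onnxruntime as ort

session = ort.InferenceSession(
    "unet.onnx",
    providers=["DmlExecutionProvider"],  # DirectML backend on Windows
)

for inp in session.get_inputs():
    print(inp.name, inp.shape, inp.type)
```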
-
So we're going to have to wait for people to make optimized models, or maybe for an extension to convert them.
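Until then, one way to convert a checkpoint yourself is the Hugging Face Optimum library, which can export a diffusers-format model to ONNX on load. A sketch, assuming Optimum is installed (the model ID and paths are just examples, and this does not apply Olive's additional graph optimizations):

```python
# Sketch: export a Stable Diffusion checkpoint to ONNX with Hugging Face
# Optimum and run it via ONNX Runtime. Model ID and output path are examples.
from optimum.onnxruntime import ORTStableDiffusionPipeline

pipe = ORTStableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # any diffusers-format checkpoint
    export=True,                       # convert the PyTorch weights to ONNX
)
pipe.save_pretrained("./sd15-onnx")    # save the ONNX copy for reuse

image = pipe("a lighthouse at dusk").images[0]
image.save("out.png")
```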
-
I made an extension that can be used to test the effects of Olive (DirectML).
-
This UI runs ONNX models: https://github.com/ForserX/StableDiffusionUI. I haven't tested it with the Olive models, though. TensorRT and these optimizations have been out for so long, but so far no one has cared enough about the performance boost to integrate it properly, which is a great pity. But I'm sure the future holds many more surprises, and one of them might just be real-time image generation.
-
Okay, I'm obviously missing something. NVIDIA claims about a 2x performance gain with optimized models "with the popular Automatic1111 distribution", but in practice these models are not compatible with Auto1111, and using them requires installing some other obscure fork or UI. Why mention Auto1111, then, if it doesn't work with it?
-
My question exactly. Seems odd they mention it, and not a specific fork...
-
Anyway, after I updated the driver, there was practically no change in speed.
-
NVIDIA is working on releasing a webui modification with TensorRT and DirectML support built in. They say they can't release it yet because of approval issues. Meanwhile, I made an extension to build and use TensorRT engines for the Unet: https://github.com/AUTOMATIC1111/stable-diffusion-webui-tensorrt My performance gain for 512x512 pictures is about 50-100% (depending on the weather) compared to the sdp-no-mem optimization. At larger resolutions, the gains are smaller. After NVIDIA releases their version, I would probably integrate whatever differences improve performance (according to the doc they have shown me, TensorRT was three times as fast as xformers). Edit: the TensorRT support in the extension is unrelated to Microsoft Olive.
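For anyone curious what the extension does under the hood, compiling an exported Unet ONNX graph into a TensorRT engine looks roughly like this with the TensorRT Python API (the file paths are placeholders; a real build of the Unet also needs an optimization profile for its dynamic batch and latent dimensions, which the extension sets up itself):

```python
# Sketch: build a serialized FP16 TensorRT engine from an ONNX Unet.
# "unet.onnx" / "unet.trt" are placeholder paths.
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
)
parser = trt.OnnxParser(network, logger)

with open("unet.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise SystemExit("ONNX parse failed")

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)  # half precision targets the Tensor Cores

engine = builder.build_serialized_network(network, config)
with open("unet.trt", "wb") as f:
    f.write(engine)
```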
-
Can someone please summarize all this in English? Finally, is it possible for this to be implemented into Automatic1111? It's open source; can someone put it in? Or does it need a different fork? What is a fork? Is it basically Automatic1111 with changes that differ from the community-maintained version of this open-source project?
-
So, no speed-up for Pascal GPUs like the 1080 Ti.
-
Once deployed, generative AI models demand incredible inference performance. RTX Tensor Cores deliver up to 1,400 Tensor TFLOPS for AI inferencing. Over the last year, NVIDIA has worked to improve DirectML performance to take full advantage of RTX hardware.
On May 24, we’ll release our latest optimizations in Release 532.03 drivers that combine with Olive-optimized models to deliver big boosts in AI performance. Using an Olive-optimized version of the Stable Diffusion text-to-image generator with the popular Automatic1111 distribution, performance is improved over 2x with the new driver.
https://blogs.nvidia.com/blog/2023/05/23/microsoft-build-nvidia-ai-windows-rtx/
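If you want to check the 2x claim on your own machine, a crude timing loop around a plain diffusers pipeline, run before and after the driver update, is enough for a rough comparison (model ID, prompt, and step count here are arbitrary):

```python
# Sketch: crude throughput benchmark for comparing driver versions.
# Model ID, prompt, and step count are arbitrary choices.
import time
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

steps, runs = 20, 5
pipe("warmup", num_inference_steps=steps)  # exclude one-time setup cost

start = time.perf_counter()
for _ in range(runs):
    pipe("a 512x512 test image", num_inference_steps=steps)
elapsed = time.perf_counter() - start
print(f"{runs * steps / elapsed:.2f} steps/sec averaged over {runs} runs")
```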