-
There are a couple of threads about this already. It's true that the TensorRT kernels can speed up the whole process by some margin. There are a couple of implementations out there, but I'm not sure it has made it to autos yet.
-
TensorRT has nothing to do with tensor cores. If you enable fp16, you will already be using tensor cores, and this repo allows for that by default.
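To illustrate the point above: a minimal PyTorch sketch of what "enabling fp16" amounts to. The layer name here is just a stand-in for a diffusion model's weights, not the actual repo's code. On NVIDIA GPUs with tensor cores (Volta and newer), cuBLAS/cuDNN route fp16 matmuls and convolutions through tensor cores automatically; TensorRT is a separate graph-level optimizer on top of that.

```python
import torch

# Stand-in for one layer of a diffusion model's UNet (hypothetical example).
unet_block = torch.nn.Conv2d(4, 4, kernel_size=3)

# "Enabling fp16" = casting weights (and activations) to half precision.
# On a tensor-core GPU, half-precision ops then use tensor cores
# automatically -- no TensorRT required.
unet_block = unet_block.half()

print(unet_block.weight.dtype)  # torch.float16
```

So fp16 gets you tensor-core utilization on its own; TensorRT is a further, separate optimization (kernel fusion, engine building) layered on top.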
-
I’m still a noob in ML and AI stuff, but I’ve heard that Nvidia’s Tensor cores were designed specifically for machine learning and are currently used for DLSS. That got me thinking about the subject. So I searched the interwebz extensively and found this one article, which suggests that there is, indeed, a way:
Making stable diffusion 25% faster using TensorRT
What do you guys think?