Extension for Visualizing Cross-Attention #5043
This extension seems really interesting. Some of the results seem off, but in general it gives a decent hint of what the system "sees" in the image.
It would be neat if there were a way to batch-process keywords and get a result gallery like SD itself produces, so that it's quick and easy to compare things. That would have to use the "big white panel with some text on it" format, though.
In general, it could really do with a little more documentation (which selections affect what, etc.) and some small changes to avoid errors: non-1x1 images cause an error, and so does providing no image.
The masked grid is sometimes really hard to see, especially over darker content. Grayscale fixes that, but you need to switch modes and the original image is then no longer visible. Would it be hard to do something like size-changing circles, or to give the squares outlines of different thickness? Also, you could probably fuse all three bottom selections into one row.
Great work overall; it's always interesting to see what affects the results in what way. Keep it up!
Hi all,
I made a simple extension to visualize the cross-attention in the UNet and found it interesting.
Visualize Cross-Attention
It seems that the behaviour of some textual embeddings can be easily revealed by this approach.
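For anyone unfamiliar with the idea, here is a minimal sketch of what a cross-attention map is. It is not the extension's code, and every name in it (`cross_attention_map`, `queries`, `keys`, the toy sizes) is an assumption made for illustration. Each image position contributes a query vector, each prompt token contributes a key vector, and the softmax of their scaled dot products says how much that position attends to each token. Taking one token's weight at every position and reshaping it to the latent grid gives the heat map that gets overlaid on the image.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention_map(queries, keys, token_index, height, width):
    """Return a height x width heat map for one prompt token.

    queries: one vector per image position (height * width vectors)
    keys:    one vector per prompt token
    Hypothetical helper for illustration only.
    """
    assert len(queries) == height * width
    scale = math.sqrt(len(keys[0]))  # standard 1/sqrt(d) scaling
    column = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        column.append(softmax(scores)[token_index])
    # Reshape the flat list of per-position weights into grid rows.
    return [column[r * width:(r + 1) * width] for r in range(height)]


# Toy data standing in for real UNet activations (assumed sizes).
random.seed(0)
H, W, DIM, TOKENS = 4, 4, 8, 5
queries = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(H * W)]
keys = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(TOKENS)]
heat = cross_attention_map(queries, keys, token_index=2, height=H, width=W)
```

In a real model the queries and keys would come from a cross-attention layer inside the UNet, and the maps are usually averaged over heads, layers, or timesteps before being upscaled onto the image. Which of those choices the extension makes is not stated here.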