STT Extension #4847
-
Thanks for sharing.
The model type is hardcoded here: https://github.com/152334H/sd-webui-whisper/blob/master/main.py#L18-L19. It could be exposed as a UI dropdown option.
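For anyone who wants to try the dropdown idea, here is a rough sketch of what it might look like. Nothing here is taken from the extension's code: `WHISPER_MODEL_SIZES`, `DEFAULT_MODEL_SIZE`, `normalize_model_size`, `get_model`, and `build_model_dropdown` are hypothetical names, and the default size is an assumption. It only relies on `whisper.load_model(name)` from openai-whisper and `gr.Dropdown(choices=..., value=..., label=...)` from Gradio.

```python
# Hypothetical sketch: these names do not come from sd-webui-whisper.
WHISPER_MODEL_SIZES = ["tiny", "base", "small", "medium", "large"]
DEFAULT_MODEL_SIZE = "base"  # assumption: not necessarily the extension's default

_model_cache = {}  # loaded models keyed by size, so switching back is cheap


def normalize_model_size(choice):
    """Return the chosen size if it is known, otherwise the default."""
    if choice in WHISPER_MODEL_SIZES:
        return choice
    return DEFAULT_MODEL_SIZE


def get_model(choice):
    """Load (or reuse) the Whisper model matching the dropdown value."""
    import whisper  # imported lazily; requires the openai-whisper package

    size = normalize_model_size(choice)
    if size not in _model_cache:
        _model_cache[size] = whisper.load_model(size)
    return _model_cache[size]


def build_model_dropdown():
    """Create the dropdown component that would replace the hardcoded value."""
    import gradio as gr  # imported lazily; the web UI already ships Gradio

    return gr.Dropdown(
        choices=WHISPER_MODEL_SIZES,
        value=DEFAULT_MODEL_SIZE,
        label="Whisper model",
    )
```

The dropdown's value would then be passed to `get_model` wherever the extension currently loads its fixed model. Larger sizes need more VRAM, so caching every loaded model may not be what you want on a small GPU.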
It kind of does but also not really. Ticking the checkbox will make the extension listen for a voice instruction ( I encountered two main problems in attempting to implement always-listen:
Feel free to send a PR (or just fork the extension) if you have better code.
-
I appreciate the shoutout 😀. I always feel like I should improve that API documentation and demo script, but I'm not sure what to add.
-
I wanted to share something that lets you prompt by speaking out loud.
(It's not mine, it's by @152334H.)
https://github.com/152334H/sd-webui-whisper
It's working great, though I wasn't able to switch to a larger model (for better quality).
According to @152334H it has many bugs, though I haven't run into any. You may need more than 4 GB of VRAM if you want to run the Whisper model accelerated on your GPU.
I don't think it has an always-listen mode, and since I always wanted a feature like that, I used @mallorbc's https://github.com/mallorbc/whisper_mic together with the API demo script @Kilvoctu made to constantly listen and generate to output.png.
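In case it helps anyone reproduce the listen-and-generate loop, here is a minimal sketch of the idea. It is not the actual demo script or the whisper_mic API: `WEBUI_URL`, `build_payload`, `decode_first_image`, `generate`, and `run_loop` are hypothetical names, and `transcribe_once` stands in for whatever function returns the next spoken phrase as text. It assumes the web UI is running locally with `--api` and exposes `/sdapi/v1/txt2img`, which returns base64-encoded images.

```python
import base64
import json
import urllib.request

WEBUI_URL = "http://127.0.0.1:7860"  # assumption: local web UI started with --api


def build_payload(prompt, steps=20):
    """Build a minimal txt2img request body."""
    return {"prompt": prompt, "steps": steps}


def decode_first_image(response_json):
    """Decode the first base64 image, tolerating an optional data-URI prefix."""
    encoded = response_json["images"][0]
    return base64.b64decode(encoded.split(",", 1)[-1])


def generate(prompt, out_path="output.png"):
    """Send the prompt to the web UI and overwrite out_path with the result."""
    request = urllib.request.Request(
        f"{WEBUI_URL}/sdapi/v1/txt2img",
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    with open(out_path, "wb") as image_file:
        image_file.write(decode_first_image(body))
    return out_path


def run_loop(transcribe_once, generate_fn=generate, max_prompts=None):
    """Keep listening; generate an image for every non-empty transcription.

    transcribe_once is a hypothetical callable returning one phrase as text.
    max_prompts=None means run until interrupted.
    """
    handled = 0
    while max_prompts is None or handled < max_prompts:
        text = transcribe_once().strip()
        if not text:
            continue  # skip silence or empty transcriptions
        generate_fn(text)
        handled += 1
    return handled
```

To use it, wrap your speech-to-text call in a function that blocks until a phrase is heard and pass it as `transcribe_once`. Each new phrase then overwrites output.png.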