
added TensorRT #205

Merged
merged 1 commit into AUTOMATIC1111:extensions on Oct 19, 2023

Conversation

contentis
Contributor

Info

Adding TensorRT acceleration for UNet.

@w-e-w
Collaborator

w-e-w commented Oct 12, 2023

@contentis ahhh...... I think the repository is set to private

@w-e-w
Collaborator

w-e-w commented Oct 17, 2023

there are some issues

  1. One I have made a draft PR about, which can cause the TRT UI to not load

  2. The use of shared.opts in the installer: not only is shared a very heavy import which can slow down the load time of webui, but more importantly, it may not be able to get the correct config file, so it won't really work

    you should be able to add sd_unet to quick settings at before_ui_callback
    I will make a PR for this later

IMO, it can be good to auto-add sd_unet to quick settings, but the user should have the choice to remove it if they don't want it

  3. the repo name is stable-diffusion-webui-tensorrt, the same as AUTOMATIC1111's https://github.com/AUTOMATIC1111/stable-diffusion-webui-tensorrt. Currently it tries to install into a folder named after the repo by default, so if someone has already installed AUTOMATIC1111's stable-diffusion-webui-tensorrt, installation will fail
    the best solution is for us to implement something in webui to deal with this issue; I did have a proposal of prepending the owner's name to the repo name, but there may be some issues with that method, so it was not implemented

    the easy solution is for you to change the repo name, but I guess there are rules at NVIDIA so that's not an option

    for now I think the best solution is for AUTOMATIC1111 to rename his TRT repo to prevent future issues, but people who already have auto's TRT extension installed will still have issues

  4. system default encoding issue, fix PR
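The system default encoding issue mentioned above usually comes down to open() being called without an explicit encoding. A minimal standalone illustration (not the extension's actual code; the config contents and path here are made up):

```python
import json
import os
import tempfile

# On Windows, open() without an explicit encoding uses the locale default
# (often cp1252), which mangles or rejects non-ASCII text such as model names.
# Passing encoding="utf-8" on both write and read makes the round trip safe.
cfg = {"sd_model_checkpoint": "Anything-V3.0 ✓"}  # note the non-ASCII character

path = os.path.join(tempfile.mkdtemp(), "config.json")
with open(path, "w", encoding="utf-8") as f:
    json.dump(cfg, f, ensure_ascii=False)

with open(path, "r", encoding="utf-8") as f:
    # without encoding="utf-8", a cp1252 default would yield mojibake or a
    # UnicodeDecodeError depending on the bytes involved
    assert json.load(f) == cfg
```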

@contentis
Contributor Author

@w-e-w Just wanted to let you know that the repo is public now, but apparently, you were faster :D Thank you for your input and the PRs; this is awesome!

IMO, it can be good to auto-add sd_unet to quick settings, but the user should have the choice to remove it if they don't want it

Agreed, that's why I tried to set it in the installer, so it is a one-time call. If shared takes that long to load, I could try to read the config.json and just write it there, but that seemed a bit hacky.

@w-e-w
Collaborator

w-e-w commented Oct 17, 2023

Agreed, that's why I tried to set it in the installer, so it is a one-time call.

well, the installer is not called just one time, it is also called on every restart by launch.py
as I said, I believe before_ui_callback should work, I'll see what I can do later when I have time
got to go now

and by the way, I still had trouble getting TRT to work on my PC earlier today, not sure why yet.

@BurnZeZ

BurnZeZ commented Oct 17, 2023

@w-e-w Just wanted to let you know that the repo is public now, but apparently, you were faster :D Thank you for your input and the PRs; this is awesome!

IMO, it can be good to auto-add sd_unet to quick settings, but the user should have the choice to remove it if they don't want it

Agreed, that's why I tried to set it in the installer, so it is a one-time call. If shared takes that long to load, I could try to read the config.json and just write it there, but that seemed a bit hacky.

It is common for extensions to be made easier to use by adding relevant things to the quick settings, but modifying the user's custom settings is unnecessarily intrusive.
I'd hint that it's something they're probably going to want to do rather than do it for them.

@w-e-w
Collaborator

w-e-w commented Oct 17, 2023

but modifying the user's custom settings is unnecessarily intrusive.

I do agree in part. From my viewpoint, if we just put a line of text in the description saying "tip: recommended to add sd_unet to Quick Settings for easy access", that could be enough, but the issue is people don't seem to read, so I don't know
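A sketch of what the callback-based approach discussed above might look like (hypothetical: webui options are modelled as a plain dict here, and the marker option name is invented; the real extension would read and write shared.opts and register the function via modules.script_callbacks.on_before_ui):

```python
# Hypothetical sketch, modelling webui options as a plain dict for testability.
# In the real extension this would be registered with
# modules.script_callbacks.on_before_ui(...), which runs before the UI builds.

def add_sd_unet_to_quicksettings(opts: dict) -> None:
    """Add sd_unet to quicksettings exactly once, so a user who removes it
    later is not overridden on the next restart."""
    if opts.get("trt_quicksettings_added"):  # invented one-time marker option
        return
    qs = [x for x in opts.get("quicksettings_list", []) if x]
    if "sd_unet" not in qs:
        qs.append("sd_unet")
    opts["quicksettings_list"] = qs
    opts["trt_quicksettings_added"] = True
```

The marker option is what preserves the user's choice: after the first run, removing sd_unet from quick settings sticks across restarts.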

@w-e-w
Collaborator

w-e-w commented Oct 17, 2023

potential issues
the names given to the .trt files are very long; I fear they might approach the Windows path length limit
[screenshot: example of a long generated .trt filename]

@w-e-w
Collaborator

w-e-w commented Oct 17, 2023

one major issue that I've been experiencing randomly is this
it appears almost every time I try to export an engine
sometimes when this happens the process completely hangs; other times it continues. I managed to get one export to complete successfully and used it to run SD with TRT, and I can say that it is significantly faster
but most of the time my engine fails to export, making this almost unusable

Building TensorRT engine for B:\GitHub\stable-diffusion-webui\models\Unet-onnx\Anime_Anything-V3.0_Anything-V3.0-fp16_Anything-V3.0-pruned-fp16_38c1ebe3.onnx
: B:\GitHub\stable-diffusion-webui\models\Unet-trt\Anime_Anything-V3.0_Anything-V3.0-fp16_Anything-V3.0-pruned-fp16_38c1ebe3_cc86_sample=1x4x64x64+2x4x64x64+
8x4x96x96-timesteps=1+2+8-encoder_hidden_states=1x77x768+2x77x768+8x154x768.trt
ERROR:asyncio:Exception in callback H11Protocol.timeout_keep_alive_handler()
handle: <TimerHandle when=355099.875 H11Protocol.timeout_keep_alive_handler()>
Traceback (most recent call last):
  File "B:\GitHub\stable-diffusion-webui\venv\lib\site-packages\h11\_state.py", line 249, in _fire_event_triggered_transitions
    new_state = EVENT_TRIGGERED_TRANSITIONS[role][state][event_type]
KeyError: <class 'h11._events.ConnectionClosed'>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Programs\Python\3.10.6\lib\asyncio\events.py", line 80, in _run
    self._context.run(self._callback, *self._args)
  File "B:\GitHub\stable-diffusion-webui\venv\lib\site-packages\uvicorn\protocols\http\h11_impl.py", line 363, in timeout_keep_alive_handler
    self.conn.send(event)
  File "B:\GitHub\stable-diffusion-webui\venv\lib\site-packages\h11\_connection.py", line 468, in send
    data_list = self.send_with_data_passthrough(event)
  File "B:\GitHub\stable-diffusion-webui\venv\lib\site-packages\h11\_connection.py", line 493, in send_with_data_passthrough
    self._process_event(self.our_role, event)
  File "B:\GitHub\stable-diffusion-webui\venv\lib\site-packages\h11\_connection.py", line 242, in _process_event
    self._cstate.process_event(role, type(event), server_switch_event)
  File "B:\GitHub\stable-diffusion-webui\venv\lib\site-packages\h11\_state.py", line 238, in process_event
    self._fire_event_triggered_transitions(role, event_type)
  File "B:\GitHub\stable-diffusion-webui\venv\lib\site-packages\h11\_state.py", line 251, in _fire_event_triggered_transitions
    raise LocalProtocolError(
h11._util.LocalProtocolError: can't handle event type ConnectionClosed when role=SERVER and state=SEND_RESPONSE

the engines do seem to be generated (saved to disk) but the profiles are just not created, so webui doesn't see them

Cause and patch fix

as far as I can see, this is caused by yield logging_history
I've made an edit replacing all yield logging_history with print(logging_history) and it seems to work fine

@contentis
Contributor Author

the names given to the .trt files are very long; I fear they might approach the Windows path length limit

A hash could replace this, but that would make it even less readable and almost impossible for a user to delete an engine from disk.
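One possible middle ground (a hypothetical sketch, not part of the PR) keeps a readable prefix and hashes only when the name exceeds a budget, so users can still recognize which engine a file belongs to:

```python
import hashlib

# Assumed per-filename budget; Windows' classic MAX_PATH limit is 260
# characters for the whole path, so the usable filename budget depends on
# where the models folder lives.
MAX_NAME = 120

def shorten_engine_name(name: str, limit: int = MAX_NAME) -> str:
    """Keep the filename as-is when short enough; otherwise keep a readable
    prefix and append an 8-char hash of the full name for uniqueness."""
    if len(name) <= limit:
        return name
    digest = hashlib.sha1(name.encode("utf-8")).hexdigest()[:8]
    return f"{name[:limit - 9]}_{digest}"
```

The hash makes collisions between two truncated names vanishingly unlikely while the prefix stays human-readable.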

as far as I can see, this is caused by yield logging_history
I've made an edit replacing all yield logging_history with print(logging_history) and it seems to work fine

yield was used here because it let me "dynamically" output something to the UI (below the export button is an "Output" field). gr.Info() isn't a great replacement for that IMO... I'm a Gradio newbie, so if there is a better way of doing this, please let me know!
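For context, the pattern yield enables in Gradio is a generator function whose successive values stream into the output component. A standalone sketch (the Gradio wiring is shown only in comments so the snippet runs on its own; the step names are invented):

```python
# Sketch of the yield-based progress pattern. In Gradio, a generator passed to
# an event handler streams each yielded value into the output component:
#
#   btn.click(export_engine, outputs=output_textbox)
#
# The hotfix replaced each `yield logging_history` with print(), which keeps
# the export alive but moves all progress output to the terminal.

def export_engine():
    logging_history = ""
    for step in ("exporting ONNX", "building TRT engine", "writing profile"):
        logging_history += step + "\n"
        yield logging_history  # UI shows the accumulated log after each step
```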

@w-e-w
Collaborator

w-e-w commented Oct 17, 2023

to be honest I don't consider myself very adept at Gradio myself
currently I'm trying to make it work, as opposed to trying to make it look good
the print thing is more of a hotfix
I don't know the proper way of doing this myself; if there is a proper way, the method I tried before failed
I have a feeling that the browser has to initiate the request for the element to be updated, possibly using some JavaScript

btw
about the TODO "Dynamically update available profiles. Not possible with gradio?!"
I'm not sure if dynamically updating is possible, but what I am doing now is rewriting the entire block as a single markdown block
and we should be able to just update the contents of the markdown when the refresh button is pressed
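That rebuild-as-one-Markdown approach can be sketched without any Gradio dependency (a hypothetical helper with made-up names and data shapes; in the UI, the refresh button's handler would return this string as the Markdown component's new value):

```python
def profiles_markdown(profiles: dict) -> str:
    """Render the whole profiles block as a single Markdown string, so a
    refresh handler can replace the component's contents in one update."""
    lines = ["### Available TensorRT engines"]
    for engine, shapes in sorted(profiles.items()):
        lines.append(f"- **{engine}**")
        for dim, rng in shapes.items():
            lines.append(f"  - {dim}: {rng}")
    return "\n".join(lines)
```

The trade-off the thread notes still applies: a single Markdown string is trivially refreshable, but it cannot reproduce interactive widgets such as accordions.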

@contentis
Contributor Author

The issue you encountered with yield is nothing I have ever seen in our internal testing... Would you mind sharing your OS, Gradio and SD WebUI versions?


I'm not sure if dynamically updating is possible, but what I am doing now is rewriting the entire block as a single markdown block
and we should be able to just update the contents of the markdown when the refresh button is pressed

But this will eliminate the option to have the accordions.

Please note that when it comes to changes to the UI/UX, I won't be making the call.

@BurnZeZ

BurnZeZ commented Oct 17, 2023

What exactly are the preinstallation requirements?
Perhaps I ran into a bug, but if all necessary dependencies were supposed to be installed automatically, something was missing, as I got complaints about missing DLLs.
I messed with the Windows path to see if I could determine what was missing (I am tired), but that finally ended with this:
class TQDMProgressMonitor(trt.IProgressMonitor):
AttributeError: module 'tensorrt' has no attribute 'IProgressMonitor'

@contentis
Contributor Author

@BurnZeZ did you have tensorrt previously installed? Or have tensorrt on your path?

@w-e-w
Collaborator

w-e-w commented Oct 17, 2023

But this will eliminate the option to have the accordions.

@contentis no actually, see for example NVIDIA/Stable-Diffusion-WebUI-TensorRT#4

@BurnZeZ

BurnZeZ commented Oct 17, 2023

@BurnZeZ did you have tensorrt previously installed? Or have tensorrt on your path?

Yes. Before I corrected the path var I had ended up with:
FileNotFoundError: Could not find: nvinfer.dll.
I’m using TensorRT 8.6.1.6.
Where is IProgressMonitor defined?

@contentis
Contributor Author

Ideally this should be discussed as an issue in the repository, as it is not part of the PR. People with similar issues might get something out of it as well.

TensorRT 9 should be installed by the extension automatically. Can you check pip freeze? There shouldn't be a need to download TensorRT manually or have it on the path.

@w-e-w
Collaborator

w-e-w commented Oct 17, 2023

@contentis just to make you aware, I only add extensions to the index if they are in a working state

(whether or not it gets broken in the future is another matter)

so I don't intend to add this TRT extension to the index until at least the "breaking" issues that I found are fixed

unless the issues I found are due to errors on my part

@contentis
Contributor Author

I fully understand that. In the above case I wouldn't consider it a bug / not working, as it is a version mismatch that needs to be handled by the user. And suggesting to move this to an issue is fair in my opinion.

Generally, it is arguable what counts as working; due to webui's hackable nature it is hard to provide a working guarantee.
Setting this up on a clean install, with the recommended Python version in a clean environment, has been working on multiple machines.

The main issue I see at the moment is the naming collision with the existing plugin.

That being said, I'm really happy about your input and would love to make this an awesome extension.

@w-e-w
Collaborator

w-e-w commented Oct 17, 2023

@contentis actually I wasn't referring to the version mismatch issue from BurnZeZ
currently the last major bug I see is this one NVIDIA/Stable-Diffusion-WebUI-TensorRT#3, the output yield issue
I provided a very bad-looking but at least working fix; I doubt that you would want to merge it into the code base, because it makes the terminal output an entire mess

I need to go to bed and I may not have time to review in the next couple of days
ping me when you think the extension is ready and I'll look at it when I have time

@BurnZeZ

BurnZeZ commented Oct 17, 2023

Ideally this should be discussed as an issue in the repository, as it is not part of the PR. People with similar issues might get something out of it as well.

TensorRT 9 should be installed by the extension automatically. Can you check pip freeze? There shouldn't be a need to download TensorRT manually or have it on the path.

pip freeze showed that pip believed tensorrt was installed within the original AUTOMATIC1111 tensorrt extension directory (stable-diffusion-webui\extensions\stable-diffusion-webui-tensorrt\TensorRT-8.6.1.6). After "uninstalling" it (I removed that directory a while ago, but maybe the extension had registered it with pip), pip pulled in the correct version.

If other people hit similar problems after having previously used the other extension, they'll probably need to pip uninstall tensorrt to clear the stale/outdated reference.
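A quick way to check for such a stale registration (a hypothetical diagnostic using only the standard library, not part of the extension):

```python
# Ask pip's installed-package metadata where it thinks tensorrt lives. If the
# recorded location points into the old extension's folder (which may no
# longer exist on disk), `pip uninstall tensorrt` clears the stale record.
from importlib import metadata

def tensorrt_install_location():
    """Return the recorded install location of tensorrt, or None if pip has
    no record of it at all."""
    try:
        dist = metadata.distribution("tensorrt")
    except metadata.PackageNotFoundError:
        return None
    return str(dist.locate_file(""))
```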

@wywywywy
Contributor

wywywywy commented Oct 18, 2023

My tensorrt is v9 and I still see this yield issue.

Details
absl-py==1.4.0
accelerate==0.21.0
addict==2.4.0
aenum==3.1.15
aiofiles==23.1.0
aiohttp==3.8.5
aiosignal==1.3.1
altair==5.0.1
antlr4-python3-runtime==4.9.3
anyio==3.7.1
async-timeout==4.0.2
attrs==23.1.0
basicsr==1.4.2
beautifulsoup4==4.12.2
blendmodes==2022
boltons==23.0.0
cachetools==5.3.1
certifi==2023.7.22
cffi==1.15.1
charset-normalizer==3.2.0
clean-fid==0.1.35
click==8.1.6
clip @ https://github.com/openai/CLIP/archive/d50d76daa670286dd6cacf3bcd80b5e4823fc8e1.zip
cmake==3.27.0
contourpy==1.1.0
cssselect2==0.7.0
cycler==0.11.0
deprecation==2.1.0
dynamicprompts==0.29.0
einops==0.4.1
exceptiongroup==1.1.2
facexlib==0.3.0
fastapi==0.94.0
ffmpy==0.3.1
filelock==3.12.2
filterpy==1.4.5
flatbuffers==23.5.26
fonttools==4.41.1
frozenlist==1.4.0
fsspec==2023.6.0
ftfy==6.1.1
future==0.18.3
fvcore==0.1.5.post20221221
gdown==4.7.1
gfpgan==1.3.8
gitdb==4.0.10
GitPython==3.1.32
google-auth==2.22.0
google-auth-oauthlib==1.0.0
gradio==3.41.2
gradio_client==0.5.0
grpcio==1.56.2
h11==0.12.0
httpcore==0.15.0
httpx==0.24.1
huggingface-hub==0.16.4
idna==3.4
imageio==2.31.1
importlib-metadata==6.8.0
importlib-resources==6.0.1
inflection==0.5.1
iopath==0.1.9
Jinja2==3.1.2
jsonmerge==1.8.0
jsonschema==4.18.4
jsonschema-specifications==2023.7.1
kiwisolver==1.4.4
kornia==0.6.7
lark==1.1.2
lazy_loader==0.3
lightning-utilities==0.9.0
linkify-it-py==2.0.2
lit==16.0.6
llvmlite==0.40.1
lmdb==1.4.1
lpips==0.1.4
lxml==4.9.3
Markdown==3.4.4
markdown-it-py==2.2.0
MarkupSafe==2.1.3
matplotlib==3.7.2
mdit-py-plugins==0.3.3
mdurl==0.1.2
mediapipe==0.10.7
mpmath==1.3.0
multidict==6.0.4
networkx==3.1
numba==0.57.1
numpy==1.23.5
nvidia-cublas-cu11==11.11.3.6
nvidia-cuda-nvrtc-cu11==11.8.89
nvidia-cuda-runtime-cu11==11.8.89
oauthlib==3.2.2
omegaconf==2.2.3
onnx==1.14.1
onnx-graphsurgeon==0.3.27
open-clip-torch==2.20.0
opencv-contrib-python==4.8.0.74
opencv-python==4.8.0.74
orjson==3.9.2
packaging==23.1
pandas==2.0.3
piexif==1.1.3
Pillow==9.5.0
platformdirs==3.10.0
polygraphy==0.49.0
portalocker==2.7.0
protobuf==3.20.2
psutil==5.9.5
py-cpuinfo==9.0.0
pyasn1==0.5.0
pyasn1-modules==0.3.0
pycparser==2.21
pydantic==1.10.12
pydub==0.25.1
Pygments==2.15.1
pyparsing==3.0.9
PySocks==1.7.1
python-dateutil==2.8.2
python-multipart==0.0.6
pytorch-lightning==1.9.4
pytz==2023.3
PyWavelets==1.4.1
PyYAML==6.0.1
realesrgan==0.3.0
referencing==0.30.0
regex==2023.6.3
reportlab==4.0.4
requests==2.31.0
requests-oauthlib==1.3.1
resize-right==0.0.2
rich==13.5.2
rpds-py==0.9.2
rsa==4.9
safetensors==0.3.1
scikit-image==0.21.0
scipy==1.11.1
seaborn==0.12.2
semantic-version==2.10.0
Send2Trash==1.8.2
sentencepiece==0.1.99
six==1.16.0
smmap==5.0.0
sniffio==1.3.0
sounddevice==0.4.6
soupsieve==2.4.1
starlette==0.26.1
svglib==1.5.1
sympy==1.12
tabulate==0.9.0
tb-nightly==2.14.0a20230801
tensorboard-data-server==0.7.1
tensorrt==9.0.1.post11.dev4
tensorrt-bindings==9.0.1.post11.dev4
tensorrt-libs==9.0.1.post11.dev4
termcolor==2.3.0
thop==0.1.1.post2209072238
tifffile==2023.7.18
timm==0.9.2
tinycss2==1.2.1
tokenizers==0.13.3
tomesd==0.1.3
tomli==2.0.1
toolz==0.12.0
torch==2.0.1+cu118
torchdiffeq==0.2.3
torchmetrics==1.0.1
torchsde==0.2.5
torchvision==0.15.2+cu118
tqdm==4.65.0
trampoline==0.1.2
transformers==4.30.2
triton==2.0.0
typing_extensions==4.8.0
tzdata==2023.3
uc-micro-py==1.0.2
ultralytics==8.0.199
urllib3==1.26.16
uvicorn==0.23.2
wcwidth==0.2.6
webencodings==0.5.1
websockets==11.0.3
Werkzeug==2.3.6
yacs==0.1.8
yapf==0.40.1
yarl==1.9.2
zipp==3.16.2

@w-e-w
Collaborator

w-e-w commented Oct 18, 2023

@contentis
from my perspective I consider the hot_fix branch to be "working"
ping me when the changes are pushed to master and it's okay for me to add this to the index

@contentis
Contributor Author

Will do; trying to get some more fixes in that have been reported in the last 24h, especially addressing the installation process and documentation.

@contentis
Contributor Author

The discussed issues should be resolved now in the main branch.

@w-e-w
Collaborator

w-e-w commented Oct 19, 2023

did a test and didn't explode so hurray

I'm going to merge it into index

thanks for the great work

@w-e-w w-e-w merged commit daaabf6 into AUTOMATIC1111:extensions Oct 19, 2023
1 check passed
github-actions bot pushed a commit that referenced this pull request Oct 19, 2023
@contentis
Contributor Author

contentis commented Oct 19, 2023

did a test and didn't explode so hurray

I see this as an absolute win


Thank you for your support and input!
