Optimizing ComfyUI for Parallel Workflow Processing on a Single GPU #5941
menahem121 asked this question in Q&A
Hi,
My goal is to serve a large number of users by running multiple workflows in parallel on a single GPU. Currently, ComfyUI processes workflows sequentially within a single server, which works well for one user but becomes restrictive when handling multiple simultaneous requests.
The Challenge
In my tests, generating a single image takes about 1 second. With the current sequential queue, however, if 30 requests arrive at once, the 30th request only completes after roughly 30 seconds, which does not scale.
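Roughly how I reproduce this (a minimal sketch; the port, the file name, and the workflow exported via "Save (API Format)" are assumptions of the sketch, not fixed values):

```python
import json
import time
import urllib.request

COMFY = "http://127.0.0.1:8188"

def queue_prompt(workflow: dict) -> str:
    """POST a workflow to ComfyUI's /prompt endpoint and return its prompt id."""
    data = json.dumps({"prompt": workflow}).encode("utf-8")
    req = urllib.request.Request(
        f"{COMFY}/prompt", data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["prompt_id"]

with open("workflow_api.json") as f:  # hypothetical exported workflow
    workflow = json.load(f)

t0 = time.time()
prompt_ids = [queue_prompt(workflow) for _ in range(30)]
print(f"queued 30 prompts in {time.time() - t0:.2f}s")
# Queuing itself is fast, but the server executes the queue one prompt
# at a time, so the 30th image still finishes roughly 30 seconds later.
```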
What I’ve Tried
To address this, I set up three ComfyUI servers on a single AWS instance with a 48GB NVIDIA GPU. While this reduced generation times to 1–2 seconds per image, I encountered GPU overloading.
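For reference, I start the servers along these lines (a sketch; the install path is made up, and the --listen/--port flags should be checked against your ComfyUI version):

```python
import subprocess

COMFYUI_DIR = "/home/ubuntu/ComfyUI"  # hypothetical install path
PORTS = [8188, 8189, 8190]

# One ComfyUI server process per port, all sharing the single GPU.
procs = [
    subprocess.Popen(
        ["python", "main.py", "--listen", "127.0.0.1", "--port", str(port)],
        cwd=COMFYUI_DIR,
    )
    for port in PORTS
]
```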
The issue seems to stem from how each server loads models independently. For example:
- When Server 1 runs Workflow 1, it loads the required models into memory.
- When Server 2 starts the same workflow, it reloads the same models, duplicating them in VRAM.
- The same happens with Server 3.
This duplication wastes GPU memory and limits efficiency. I understand that this might be intentional if each server requires its own latent space to process workflows.
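The duplication is easy to confirm by polling each server's /system_stats endpoint (a sketch; the devices/vram_total/vram_free fields match recent ComfyUI builds, but verify locally):

```python
import json
import urllib.request

for port in (8188, 8189, 8190):
    with urllib.request.urlopen(f"http://127.0.0.1:{port}/system_stats") as resp:
        stats = json.loads(resp.read())
    for dev in stats["devices"]:
        used_gib = (dev["vram_total"] - dev["vram_free"]) / 2**30
        print(f"port {port}: {dev['name']}: {used_gib:.1f} GiB in use")
# All three servers report the same physical GPU, and its free VRAM
# drops by roughly one full copy of the models per extra server.
```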
Question
Is there a way to optimize this setup so that multiple ComfyUI servers can share the same models in VRAM? Or is there another recommended approach for serving parallel workflows more efficiently?
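For context, requests are currently spread across the three servers with naive client-side round-robin glue; a hypothetical sketch:

```python
import itertools
import json
import urllib.request

ports = itertools.cycle([8188, 8189, 8190])

def dispatch(workflow: dict) -> str:
    """Send the workflow to the next server in rotation; return its prompt id."""
    port = next(ports)
    data = json.dumps({"prompt": workflow}).encode("utf-8")
    req = urllib.request.Request(
        f"http://127.0.0.1:{port}/prompt",
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["prompt_id"]
```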
Thank you for your insights!
Replies: 1 comment, 1 reply

Unfortunately, utilizing multiple GPUs simultaneously within a single instance is not currently supported. Please refer to the discussions regarding this improvement.