
BGM removal unusable #424

Open
starloreh opened this issue Dec 15, 2024 · 7 comments
Assignees
Labels
bug Something isn't working

Comments

@starloreh

Which OS are you using?

  • OS: Colab
    It stays at 0%, then consumes all RAM, and finally it errors out with:
 /usr/local/lib/python3.10/dist-packages/onnx2pytorch/convert/layer.py:30: UserWarning: The given NumPy array is not writable, and PyTorch does not support non-writable tensors. This means writing to this tensor will result in undefined behavior. You may want to copy the array to protect its data or make it writable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:206.)
  layer.weight.data = torch.from_numpy(numpy_helper.to_array(weight))
/usr/local/lib/python3.10/dist-packages/uvr/utils/fastio.py:46: UserWarning: PySoundFile failed. Trying audioread instead.
  signal, sampling_rate = librosa.load(path, sr=None, mono=False)
/usr/local/lib/python3.10/dist-packages/librosa/core/audio.py:184: FutureWarning: librosa.core.audio.__audioread_load
	Deprecated as of librosa version 0.10.0.
	It will be removed in librosa version 1.0.
  y, sr_native = __audioread_load(path, offset, duration, dtype)
/usr/local/lib/python3.10/dist-packages/uvr/models_dir/mdx/mdx_interface.py:254: UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor. (Triggered internally at ../torch/csrc/utils/tensor_new.cpp:278.)
  mix_part = torch.tensor([mix_part_], dtype=torch.float32).to(device)
/usr/local/lib/python3.10/dist-packages/torch/functional.py:704: UserWarning: stft with return_complex=False is deprecated. In a future pytorch release, stft will return complex tensors for all inputs, and return_complex=False will raise an error.
Note: you can still call torch.view_as_real on the complex output to recover the old return format. (Triggered internally at ../aten/src/ATen/native/SpectralOps.cpp:873.)
  return _VF.stft(  # type: ignore[attr-defined]
/usr/local/lib/python3.10/dist-packages/uvr/models_dir/mdx/mdx_interface.py:189: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  return stft.inverse(torch.tensor(spec_pred).to(device)).cpu().detach().numpy()

On my CPU-only 5700G machine with 32 GB RAM, the same thing happened. What could be the cause? Just in case, I converted the Opus audio to MP3 and Ogg Vorbis on my CPU, and the same thing still happened.
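As an aside, the last performance warning in the log above (`mdx_interface.py:254`) points at a standard fix: build one NumPy array before handing the data to `torch.tensor`, rather than wrapping a Python list of arrays. A minimal NumPy-only sketch of that pattern (dummy shapes, PyTorch not required):

```python
import numpy as np

# The warning is triggered by torch.tensor([list_of_ndarrays]), which is slow.
# Stacking the chunks into a single ndarray first avoids it; the result can
# then be passed to torch.from_numpy() in one step.
chunks = [np.zeros((2, 4), dtype=np.float32) for _ in range(3)]  # dummy audio chunks

slow_style = np.array(chunks)          # one contiguous array, built in one call
fast_style = np.stack(chunks, axis=0)  # explicit equivalent: shape (3, 2, 4)

print(fast_style.shape)  # (3, 2, 4)
```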

@starloreh starloreh added the bug Something isn't working label Dec 15, 2024
@jhj0517
Owner

jhj0517 commented Dec 15, 2024

Hi @starloreh. The output doesn't show any errors, only warnings, which you can safely ignore.

The progress is stuck at 0% because the app doesn't actually track the job's progress; I just set it to 0% until the separation job is fully complete.

To avoid this confusion, I should have said so explicitly in the progress message.
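A hypothetical sketch of what a clearer message could look like: report an explicit "no percentage available" status instead of a misleading 0% until the job finishes (names here are illustrative, not the app's actual API):

```python
# Hypothetical progress reporter: yield an explicit indeterminate-progress
# message before the job, and a completion message after it.
def progress_messages(job):
    yield "Separating... (progress is not tracked; this may take a while)"
    job()  # run the whole separation job
    yield "Done (100%)"

msgs = list(progress_messages(lambda: None))
print(msgs[-1])  # Done (100%)
```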

Actually, the model was probably working on the separation, just very slowly on the CPU.

UVR models are recommended to run on a GPU, not a CPU; I believe the speed difference is more than 10x. If you test with just 2 or 3 seconds of audio, you will see the job finish within about 2 minutes on the CPU.

In short, UVR models are very slow on the CPU, so using a GPU is strongly recommended.

If you're using Colab, you can try T4 GPU runtime (free runtime).
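A small sketch of how the runtime can be checked before starting a job, falling back to CPU when CUDA (or PyTorch itself) is unavailable, e.g. on a CPU-only Colab runtime (`pick_device` is a hypothetical helper, not part of the app):

```python
# Hypothetical device-selection helper: prefer CUDA, fall back to CPU when
# CUDA is unavailable or torch is not installed at all.
def pick_device() -> str:
    try:
        import torch
        return "cuda" if torch.cuda.is_available() else "cpu"
    except ImportError:
        return "cpu"

print(pick_device())  # "cuda" on a T4 runtime, "cpu" otherwise
```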

@starloreh
Author

But I was using CUDA with a T4!
The Japanese video is 1 hour 40 minutes long, though. The system RAM fills up, but the GPU RAM doesn't seem to budge.
Great WebUI, btw! faster-whisper-large-v3-turbo-ct2 is pretty fast on my CPU, and good enough for transcribing Western European languages.
I wanted to use faster-whisper-large-v2 with Silero VAD and BGM removal to improve Japanese transcription, as well as translation to English.

@jhj0517
Owner

jhj0517 commented Dec 15, 2024

I wanted to use faster-whisper-large-v2 with silero and BGM removal

There's a suspicious bug that seems related to this: running really long audio (more than 1 hour, just like yours) with VAD causes an OOM error.

So I guess it's related to the VAD, not the BGM separation.

This is now fixed in faster-whisper, but the new version hasn't been released yet.

I'm considering installing faster-whisper directly from the repository.
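Until the upstream fix lands, one common way to bound peak memory on very long files is to process the waveform in fixed-size chunks instead of all at once. A hypothetical NumPy sketch of the chunking pattern (not the app's actual pipeline; `iter_chunks` is an illustrative helper):

```python
import numpy as np

# Hypothetical workaround sketch: iterate over fixed-duration windows of the
# signal so that only one chunk is resident in memory at a time.
def iter_chunks(signal: np.ndarray, sr: int, chunk_seconds: int = 600):
    step = sr * chunk_seconds
    for start in range(0, signal.shape[-1], step):
        yield signal[..., start:start + step]

sr = 16_000
audio = np.zeros(sr * 25)                      # 25 s of dummy audio
chunks = list(iter_chunks(audio, sr, chunk_seconds=10))
print([c.shape[-1] for c in chunks])           # [160000, 160000, 80000]
```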

@jhj0517
Owner

jhj0517 commented Dec 18, 2024

This should be fixed in #428.

Please feel free to reopen!

@jhj0517 jhj0517 closed this as completed Dec 18, 2024
@starloreh
Author

Actually, it still doesn't work. I just tried it in a new Colab, and it runs out of system RAM without touching the GPU, even though the CUDA option was chosen.

@jhj0517
Owner

jhj0517 commented Dec 21, 2024

@starloreh I just tried to reproduce this myself with 2 hours of video, but everything worked fine on my end.

I'm not sure what caused the problem.
Just to be sure, are you using the latest version of the notebook here?

@starloreh
Author

Yes. For comparison I tried https://github.com/Eddycrack864/UVR5-UI with UVR-MDX-NET-Inst_HQ_5, and it almost finished before it crashed, but that was a different audio file of 2 h 20 m.
I'll keep on trying and report back.
