Integer overflow in slat decoding #95

Open
RuiningLi opened this issue Dec 27, 2024 · 3 comments

Comments

@RuiningLi

Hi,

Thank you for the enormous efforts in open-sourcing this amazing project!

I'm testing the model on some in-the-wild images. Most images work fine, but on a small number of images there is the following error:

[Exception|implicit_gemm]feat=torch.Size([7342720, 192]),w=torch.Size([96, 3, 3, 3, 192]),pair=torch.Size([27, 7342720]),act=7342720,issubm=True,istrain=True

Upon further investigation, it looks like an int32 overflow check is failing in the underlying spconv library:

/home/ruiningli/spconv/spconv/build/core_cc/src/cumm/conv/main/ConvMainUnitTest/ConvMainUnitTest_matmul_split_Ampere_f16f16f16_0.cu(294)
int64_t(N) * int64_t(C) * tv::bit_size(algo_desp.dtype_a) / 8 < int_max assert faild. your data exceed int32 range. this will be fixed in cumm + nvrtc (spconv 2.2/2.3).
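
For reference, the arithmetic behind that assertion can be reproduced by hand. This is only a sketch: it assumes `N` is the number of rows in `feat` (7342720), `C` is the channel count (192), and the element type is 16-bit (guessed from the `f16f16f16` kernel name). The helper names are made up for illustration and are not part of spconv or cumm.

```python
INT32_MAX = 2**31 - 1  # assumed value of int_max in the assertion


def feature_bytes(n_rows: int, n_channels: int, bits_per_element: int) -> int:
    """Left-hand side of the check: N * C * bit_size(dtype) / 8."""
    return n_rows * n_channels * bits_per_element // 8


def max_rows(n_channels: int, bits_per_element: int) -> int:
    """Largest N that still satisfies feature_bytes(N, ...) < INT32_MAX.

    Assumes one row occupies a whole number of bytes.
    """
    bytes_per_row = n_channels * bits_per_element // 8
    return (INT32_MAX - 1) // bytes_per_row


# Shapes taken from the exception message above, with fp16 assumed.
total = feature_bytes(7342720, 192, 16)
print(total, total < INT32_MAX)  # 2819604480 False
print(max_rows(192, 16))         # 5592405
```

Under these assumptions the feature buffer is about 2.8 GB, which is past the int32 limit, and anything above roughly 5.59 million rows at 192 fp16 channels would trip the same check. Half that many rows (3671360) would come to 1409802240 bytes and pass.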

I'm wondering if anyone has encountered the same during training / inference, and how I might be able to get around this.

Many thanks!
Ruining

@FishWoWater

Could you please upload the test images you used?

@RuiningLi
Author

[Attached image]
I'm having issues with the above image, but the 3D asset seems to be generated properly on the HF demo. Maybe there is a version mismatch?

@RuiningLi
Author

OK, I dug a bit further into this. The issue seems to occur only during batched inference, i.e., when num_samples is greater than 1 at inference time.
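
If batching is indeed the trigger, one possible workaround is to generate the samples one at a time so that each decoding call sees only a single sample's voxels. The sketch below is generic and untested against the actual pipeline: `run_single` is a hypothetical callable that you would write yourself to wrap a single-sample call (num_samples set to 1) with a given seed.

```python
from typing import Any, Callable, List


def run_one_at_a_time(
    run_single: Callable[[int], Any],
    num_samples: int,
    base_seed: int = 0,
) -> List[Any]:
    """Call run_single once per sample instead of making one batched call.

    run_single is a hypothetical user-supplied wrapper that takes a seed
    and returns the result of a single-sample run.
    """
    results = []
    for i in range(num_samples):
        # Use a different seed per call so the samples are not identical.
        results.append(run_single(base_seed + i))
    return results


# Stand-in for a real single-sample call, used here only to show the flow.
outputs = run_one_at_a_time(lambda seed: {"seed": seed}, num_samples=3, base_seed=10)
print(outputs)  # [{'seed': 10}, {'seed': 11}, {'seed': 12}]
```

This trades throughput for a smaller per-call feature tensor. It would not help if a single sample alone already exceeds the int32 limit.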
