questions about the upcoming fp8 support #13735
-
Hi!
-
1.5 models are already supported for CUDA (that's why there is an -xl variation). For CPU I haven't checked. Basically, the idea is: parameters in fp8, but calculation in another precision.
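To make the "params in fp8, calculation in another precision" idea concrete, here is a minimal pure-Python sketch (not the webui code): weights are rounded onto the e4m3 grid for storage, but the arithmetic itself runs in full precision, standing in for the fp16/fp32 compute on GPU. The `quantize_e4m3` helper is a toy illustration that ignores subnormals and NaN.

```python
import math

def quantize_e4m3(x: float) -> float:
    """Round x to the nearest e4m3 value (toy sketch; subnormals/NaN omitted)."""
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    m, e = math.frexp(abs(x))          # abs(x) = m * 2**e, with m in [0.5, 1)
    # 3 mantissa bits plus the implicit leading 1 -> round m to steps of 2**-4
    m = round(m * 2**4) / 2**4
    y = sign * m * 2**e
    return max(-448.0, min(448.0, y))  # saturate at the e4m3 max magnitude (448)

# Storage in fp8, compute in higher precision: the weights live on the
# e4m3 grid, but the dot product below is done in ordinary Python floats.
weights = [quantize_e4m3(w) for w in [0.1234, -2.71828, 300.5]]
x = [1.0, 2.0, 3.0]
out = sum(w * xi for w, xi in zip(weights, x))  # full-precision accumulation
```

The point of the pattern is memory savings: the parameters take one byte each at rest, while the matmuls still see regular-precision values after the upcast.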
-
Thank you for the answer, and thank you so much for your hard work!
-
@Amin456789 I pushed a commit for CPU. Need you guys to help check whether it works.
Just `model.to(torch.float8_e4m3fn)`.
(e4m3 is normally enough for all use cases, but if you hit problems where some parameters overflow, use e5m2.)
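The overflow advice above comes down to dynamic range: e4m3 spends bits on mantissa precision, e5m2 on exponent range. A quick sketch of the largest finite magnitude each format can hold (values per the standard fp8 definitions; the bit arithmetic here is just for illustration):

```python
# e4m3fn: sign + 4 exponent bits (bias 7) + 3 mantissa bits. The all-ones
# exponent still encodes normal values; only S.1111.111 is NaN (no infinities),
# so the top mantissa pattern at the top exponent is 1.110 = 1.75.
E4M3_MAX = (2 - 2 * 2**-3) * 2 ** (0b1111 - 7)   # 1.75 * 2**8

# e5m2: sign + 5 exponent bits (bias 15) + 2 mantissa bits, IEEE-style:
# the all-ones exponent is reserved for inf/NaN, so the top normal
# exponent field is 0b11110 and the top mantissa is 1.11 = 1.75.
E5M2_MAX = (2 - 2**-2) * 2 ** (0b11110 - 15)     # 1.75 * 2**15

print(E4M3_MAX, E5M2_MAX)  # 448.0 57344.0
```

So a parameter above 448 in magnitude saturates (or becomes NaN) in e4m3 but still fits comfortably in e5m2, at the cost of one fewer mantissa bit of precision.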