Preview Release 5
New in this release:
Large performance improvements for calls to reoptimize!
- Previously, once the previous version's measurements were stable, we would immediately try to obtain a newly compiled version of the code, blocking if the compiled code wasn't ready. Now, we return the best-seen version if a concurrent compile job is active but has not yet completed.
- The tuner is now less likely to suggest fully unrolling every loop in the module, because the generation of random configs is biased towards smaller unrolling factors. High values often used to overload LLVM, leading to non-terminating compilation jobs.
- Fast Instruction Selection and -O1 (for both optimization and codegen) are now used when compiling the default/initial configuration. This allows the tuner to respond very quickly to first-time reoptimize requests.
- The IR optimization pipeline is no longer being run twice on each compile, and it is also now being tuned for both code size and aggressiveness.
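
To illustrate the non-blocking reoptimize behavior described in the first item, here is a minimal Python sketch. It is not the tuner's actual implementation: the class `VersionPicker`, its fields, and `start_compile` are hypothetical names chosen for this example, and the step that measures a new version and updates the best-seen one is left out.

```python
from concurrent.futures import Future


class VersionPicker:
    """Hypothetical helper; names are illustrative, not the tuner's API."""

    def __init__(self, initial_version):
        self.best_seen = initial_version  # best version measured so far
        self.pending = None               # Future of an in-flight compile job, if any

    def start_compile(self, future):
        # Remember a compile job that is running concurrently.
        self.pending = future

    def reoptimize(self):
        # The old behavior would correspond to `return self.pending.result()`,
        # which blocks until compilation has finished.
        if self.pending is not None and self.pending.done():
            new_version = self.pending.result()  # already finished, returns at once
            self.pending = None
            return new_version
        # A job is still compiling (or none was started): do not wait for it.
        return self.best_seen
```

A caller that invokes `reoptimize()` while the job is still running gets the best-seen version immediately, and picks up the new version on a later call once the job has completed.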
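
The bias towards smaller unrolling factors mentioned above can be sketched as weighted random sampling. The release notes do not say which distribution the tuner uses, so the candidate list `UNROLL_FACTORS`, the geometric weighting, and the `decay` parameter below are all assumptions made for illustration.

```python
import random

# Assumed candidate unrolling factors; the tuner's real range may differ.
UNROLL_FACTORS = [1, 2, 4, 8, 16, 32]


def sample_unroll_factor(rng, decay=0.5):
    # Weight the i-th factor by decay**i, so each larger factor is
    # half as likely as the previous one when decay is 0.5.
    weights = [decay ** i for i in range(len(UNROLL_FACTORS))]
    return rng.choices(UNROLL_FACTORS, weights=weights, k=1)[0]
```

With uniform sampling, a module with many loops would often receive at least one very large factor. Under this weighting the largest factors are still reachable, but rarely drawn.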