Releases: kavon/atJIT

Preview Release 6

30 Aug 19:02

New in this (large) release:

  • We are now using Polly for something! N-dimensional tiling (where N > 1) can be performed on loop nests, with the tile sizes chosen by the tuner (a loop-nest sketch follows this list). This requires POLLY_KNOBS to be ON and the use of Kruse's out-of-tree version of Polly; instructions are in the README.
  • ATDriver now uses an experimentation-rate limiter to better exploit the results of the tuner. [1]
  • The Bayes tuner is no longer (badly) over-fitting to its training examples, making it a lot smarter. [2]
  • Reduced the number of individual loop settings that are picked when randomly producing a new config. [3]
  • The Annealing tuner was slightly improved by choosing a better escape velocity factor.
  • Compilation jobs that appear to be non-terminating (>90 sec) are now detected; in that case, we crash the program.
  • The tuner can now play with enabling Interprocedural Register Allocation, which is not on by default in LLVM.
  • The benchmark suite has been greatly improved. [4]
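
To illustrate the transformation (in plain C++, not Polly's actual IR-level rewrite), here is what 2-D tiling does to a loop nest; with POLLY_KNOBS, the tuner picks the tile sizes TI and TJ rather than fixing them ahead of time:

    #include <algorithm>

    constexpr int N = 2048;

    // Untiled transpose: the column-wise writes to B have poor locality.
    void transpose(float B[N][N], const float A[N][N]) {
      for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j)
          B[j][i] = A[i][j];
    }

    // The same nest after 2-D tiling. TI and TJ are the tile sizes the
    // tuner searches over; Polly performs the equivalent rewrite on IR.
    void transpose_tiled(float B[N][N], const float A[N][N], int TI, int TJ) {
      for (int ii = 0; ii < N; ii += TI)
        for (int jj = 0; jj < N; jj += TJ)
          for (int i = ii; i < std::min(ii + TI, N); ++i)
            for (int j = jj; j < std::min(jj + TJ, N); ++j)
              B[j][i] = A[i][j];
    }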

Footnotes:

  1. Every Nth reoptimize request will actually obtain a new version of the function; all other requests return the best-known version (a sketch of this pattern appears after these footnotes).
  2. We hold out a small number of examples to use for validation during training, and we stop training when the prediction error is no longer decreasing (also sketched below).
  3. We use a biased coin-flip to filter these out.
  4. We build multiple variants with different AOT optimization settings. We can also now view performance with and without JIT overheads, so that the tuning algorithms themselves can be compared more directly.
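
Regarding [1], the idea behind the rate limiter can be sketched as follows; the class and method names here are illustrative, not atJIT's internals:

    #include <cstdint>

    // Only every Nth reoptimize request runs a fresh experiment; the
    // rest exploit the best-known version. `period_` is the "N".
    class ExperimentLimiter {
      std::uint64_t requests_ = 0;
      std::uint64_t period_;
    public:
      explicit ExperimentLimiter(std::uint64_t period) : period_(period) {}
      bool shouldExperiment() { return ++requests_ % period_ == 0; }
    };

    // Caller sketch:
    //   version = limiter.shouldExperiment() ? compileNewConfig()
    //                                        : bestKnownVersion();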
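
And regarding [2], a minimal sketch of hold-out early stopping, with the tuner's model fitting abstracted behind callables (again illustrative, not atJIT's actual code):

    #include <functional>
    #include <limits>

    // Run fitting iterations until the error on the held-out examples
    // has not improved for `patience` consecutive iterations.
    double fitWithEarlyStopping(const std::function<void()>& trainStep,
                                const std::function<double()>& validationError,
                                int patience, int maxIters) {
      double best = std::numeric_limits<double>::infinity();
      for (int it = 0, stale = 0; it < maxIters && stale < patience; ++it) {
        trainStep();                    // one model-fitting iteration
        double err = validationError(); // prediction error on the hold-out set
        if (err < best) { best = err; stale = 0; } else { ++stale; }
      }
      return best;
    }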

Preview Release 5

17 Aug 01:32

New in this release:

Large performance improvements for calls to reoptimize!

  • Previously, we would immediately try to obtain a newly compiled version of the code once the previous version's measurements were stable, blocking if the compiled code wasn't ready. Now, if a compile job is active but has not yet completed, we return the best-seen version instead (see the sketch after this list).
  • The tuner is now less likely to suggest fully unrolling every loop in the module, because the generation of random configs is biased towards smaller unrolling factors. High values often used to overload LLVM, leading to non-terminating compilation jobs.
  • Fast Instruction Selection and -O1 (for both optimization and codegen) are now used when compiling the default/initial configuration. This allows the tuner to respond very quickly to first-time reoptimize requests.
  • The IR optimization pipeline is no longer run twice on each compile, and it is now tuned for both code size and aggressiveness.
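
The non-blocking behavior in the first item can be sketched with a std::future; this is an illustration of the pattern, not atJIT's actual code:

    #include <chrono>
    #include <future>

    // If the in-flight compile job has finished, adopt its result;
    // otherwise keep returning the best version seen so far.
    template <typename FnPtr>
    FnPtr currentBest(std::future<FnPtr>& inFlight, FnPtr& best) {
      if (inFlight.valid() &&
          inFlight.wait_for(std::chrono::seconds(0)) == std::future_status::ready)
        best = inFlight.get();  // newly compiled version is ready
      return best;              // best-seen version; never blocks
    }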

Preview Release 4

09 Aug 18:52

New in this release:

  • Compilation jobs can now occur in parallel with the client's code. reoptimize requests can spark these optimistic additional jobs (currently at most 10) so that future reoptimize calls hopefully spend less time waiting on the compiler.
  • The queue of compilation jobs also makes use of pipeline parallelism, i.e., optimization of one module can occur in parallel with the codegen of another (a minimal sketch follows this list).
  • The codegen optimization level is now a tuned knob, defaulting to Aggressive.
  • FastISel is now a tuned knob, defaulting to off.
  • The test suite now includes a basic memory leak test.
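
A minimal sketch of the pipeline parallelism mentioned above; Module, optimizeIR, and codegen are illustrative stand-ins for the real stages:

    #include <condition_variable>
    #include <mutex>
    #include <queue>

    struct Module;  // stand-in for an LLVM module

    // Thread-safe hand-off between the optimization stage and the
    // codegen stage: while codegen works on one module, optimization
    // can already be producing the next.
    class StageQueue {
      std::queue<Module*> q_;
      std::mutex m_;
      std::condition_variable cv_;
    public:
      void push(Module* mod) {
        { std::lock_guard<std::mutex> g(m_); q_.push(mod); }
        cv_.notify_one();
      }
      Module* pop() {  // blocks until a module is available
        std::unique_lock<std::mutex> g(m_);
        cv_.wait(g, [this] { return !q_.empty(); });
        Module* mod = q_.front(); q_.pop();
        return mod;
      }
    };

    // Stage-1 thread: queue.push(optimizeIR(nextJob()));
    // Stage-2 thread: codegen(queue.pop());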

Preview Release 3

04 Aug 18:16

New in this release:

  • A new tuner, AT_Anneal, that uses Simulated Annealing (SA) to optimize the configuration. The particular SA algorithm used is relatively simple (a generic sketch of the core loop follows this list): https://www.jstor.org/stable/2246034
  • Now that we have SA, we also have the concept of perturbing a configuration, i.e., randomly picking a "nearby" configuration. This has been used to improve AT_Bayes so that there is actually an exploration-exploitation trade-off. Currently, we only exploit the best-seen configuration.
  • Better debugging output while tuning: now we properly show time spent optimizing & compiling.
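
For reference, the core of a simple SA loop looks roughly like this. It is a generic sketch, not the exact algorithm from the linked paper; perturb and cost stand in for configuration perturbation and the measured run time:

    #include <cmath>
    #include <random>
    #include <utility>

    // Generic simulated annealing: `perturb` yields a random "nearby"
    // configuration and `cost` measures its quality (lower is better).
    template <typename Config, typename Perturb, typename Cost>
    Config anneal(Config cur, Perturb perturb, Cost cost,
                  double temp, double cooling, int steps) {
      std::mt19937 rng{std::random_device{}()};
      std::uniform_real_distribution<double> unit(0.0, 1.0);
      double curCost = cost(cur);
      for (int i = 0; i < steps; ++i, temp *= cooling) {
        Config next = perturb(cur);
        double nextCost = cost(next);
        double delta = nextCost - curCost;
        // Always accept improvements; accept regressions with
        // probability exp(-delta/temp), which shrinks as temp cools.
        if (delta < 0 || unit(rng) < std::exp(-delta / temp)) {
          cur = std::move(next);
          curCost = nextCost;
        }
      }
      return cur;
    }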

Preview Release 2

30 Jul 15:46

New features:

  • A tuner that is somewhat intelligent: it uses techniques inspired by Bayesian Optimization and is called AT_Bayes.
  • The ability to specify tunable parameters via the tuned_parameter::IntRange class. This allows the user to tell the tuner to optimize the input value subject to an inclusive range constraint; algorithmic selection is thus now possible via this feature (a hypothetical usage sketch follows this list).
  • The pct_err option to reoptimize, which specifies how tolerant of noise you would like the tuner to be. This indirectly controls the rate at which the tuner navigates the search space, e.g., less tolerance for error on a noisy system will reduce the pace of experimentation.
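
Putting the last two items together, a hypothetical usage sketch. The names tuned_parameter::IntRange, reoptimize, and pct_err come from these notes, but the header path, the driver calls, and the exact signatures are assumptions; consult the README for the real API:

    #include <functional>       // std::placeholders
    #include <tuner/driver.h>   // ASSUMED header path; see the README

    // HYPOTHETICAL sketch; exact signatures may differ.
    int sort_variant(int* data, int len, int which);  // dispatches on `which`

    void tune(tuner::ATDriver& AT, int* data, int len) {
      using namespace std::placeholders;
      // Bind `len`, leave the data pointer as a call-time argument, and
      // let the tuner pick `which` from the inclusive range [0, 3],
      // i.e., select among four algorithm variants.
      auto const& best = AT.reoptimize(sort_variant, _1, len,
                                       tuned_parameter::IntRange(0, 3));
      // best(data);  // invoke the current best-known compiled version
      // The pct_err option to reoptimize would additionally set the
      // noise tolerance; its exact spelling is in the README.
    }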

Preview Release 1

24 Jul 15:32

This release includes:

  • a simplistic random tuner that always generates a randomly optimized function once the previously optimized version has received enough performance measurements (a small sketch follows this list).
  • an up-to-date README
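
The random tuner's trigger condition is simple enough to sketch; Config, randomConfig, and the sample threshold are illustrative names, not atJIT's internals:

    #include <random>

    struct Config { int unrollFactor; int optLevel; /* other knobs */ };

    // Draw a fresh random knob assignment.
    Config randomConfig(std::mt19937& rng) {
      std::uniform_int_distribution<int> unroll(0, 8), opt(0, 3);
      return Config{unroll(rng), opt(rng)};
    }

    // Once the current version has accumulated enough timing samples,
    // hand out a new random configuration; otherwise keep measuring.
    bool shouldRetune(int samplesSeen, int samplesNeeded) {
      return samplesSeen >= samplesNeeded;
    }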