This benchmark compares the execution time of the Horner scheme for univariate polynomial evaluation, implemented with popular C++ expression template libraries that have been made accessible from Python through pybind11.
The libraries that are currently available in the benchmark are
We use iterative and recursive Horner algorithms as well as dynamic and fixed size polynomials.
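To illustrate the difference between the two variants, here is a minimal sketch in plain C++ on scalars. It is not the benchmark's code: the function names and the lowest-power-first coefficient ordering are assumptions made for this example, and the benchmarked implementations work on the libraries' array types rather than on `double`.

```cpp
#include <cstddef>
#include <vector>

// Assumed coefficient ordering for this sketch (lowest power first):
// p(x) = c[0] + c[1]*x + ... + c[n]*x^n

// Iterative Horner scheme: start at the highest coefficient and
// repeatedly multiply by x and add the next lower coefficient.
inline double horner_iterative(const std::vector<double>& c, double x) {
    double result = 0.0;
    for (std::size_t i = c.size(); i-- > 0;) {
        result = result * x + c[i];
    }
    return result;
}

// Recursive Horner scheme: p_i(x) = c[i] + x * p_{i+1}(x),
// where the empty tail evaluates to zero.
inline double horner_recursive(const std::vector<double>& c, double x,
                               std::size_t i = 0) {
    if (i >= c.size()) {
        return 0.0;
    }
    return c[i] + x * horner_recursive(c, x, i + 1);
}
```

Both functions perform the same multiplications and additions in the same order. The iterative form updates one accumulator, while the recursive form builds the result as a nested expression.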
The fixed recursive versions are of special interest since they heavily exploit the expression template nature of the aforementioned libraries through template metaprogramming (TMP) and auto return type deduction. However, fixed size polynomials may not be applicable to most problems, so we also supply dynamic versions which accept an arbitrary number of coefficients (order).
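The idea behind a fixed recursive version can be sketched with a compile-time recursion and a deduced return type. This is a hypothetical example, not the benchmark's implementation: `horner_fixed` is an illustrative name, the coefficients are assumed to be stored lowest power first in a `std::array`, and C++17 is required for `if constexpr`.

```cpp
#include <array>
#include <cstddef>

// Compile-time recursion over the coefficient index I.
// The return type is deduced with `auto`, so if X were an
// expression-template type, every recursion level could return
// its own nested expression type instead of a materialized result.
template <std::size_t I = 0, class T, class X, std::size_t N>
constexpr auto horner_fixed(const std::array<T, N>& c, const X& x) {
    static_assert(N > 0, "at least one coefficient is required");
    static_assert(I < N, "coefficient index out of range");
    if constexpr (I + 1 == N) {
        // The highest coefficient terminates the recursion.
        return c[I];
    } else {
        // p_I(x) = c[I] + x * p_{I+1}(x)
        return c[I] + x * horner_fixed<I + 1>(c, x);
    }
}
```

With scalar arguments this simply computes a number. The point of the pattern is that the recursion depth is known at compile time, so the whole polynomial can be expressed as a single nested expression.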
In order to be able to run this benchmark on your PC you need to have the following libraries installed
C++:
You also need Python 3.x with the following libraries
Python:
With these libraries installed you can simply run
git clone https://github.com/BeneSim/polynomial_benchmark.git
cd polynomial_benchmark
mkdir build
cd build
cmake -DCMAKE_PREFIX_PATH="\
/path/to/xtl/installation/root;\
/path/to/xsimd/installation/root;\
/path/to/xtensor/installation/root;\
/path/to/xtensor-python/installation/root;\
/path/to/Eigen/installation/root;\
/path/to/pybind11/installation/root" \
-DCMAKE_MODULE_PATH="/path/to/xtensor-python/src/cmake" \
-DCMAKE_BUILD_TYPE=Release ..
cmake --build . --target benchmark
You can of course omit the -DCMAKE_PREFIX_PATH argument if you've installed the libraries in paths that CMake is already aware of. xtensor-python ships with a FindNumPy.cmake module in its src/cmake folder, which is what -DCMAKE_MODULE_PATH points to.
There are 3 additional options that you can set (using cmake -DOPTION_NAME=VALUE or other tools like ccmake):
USE_MARCH_NATIVE=[ON/OFF]: Compiles the Python bindings with -march=native
USE_XSIMD=[ON/OFF]: Enables xsimd support
USE_LAMBDA_XFUNCTION: Uses an alternative approach for the fixed recursive Horner scheme (suggested by @wolfv)
If you want to run every combination, e.g. if you want to contribute to the benchmark results, you may want to use a simple bash script like this one
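#!/bin/bash
# Run from inside the build directory created above.
for USE_MARCH_NATIVE in ON OFF
do
    for USE_XSIMD in ON OFF
    do
        for USE_LAMBDA_XFUNCTION in ON OFF
        do
            # Reconfigure with the current option combination, then run the benchmark target.
            cmake -DUSE_MARCH_NATIVE="$USE_MARCH_NATIVE" -DUSE_XSIMD="$USE_XSIMD" -DUSE_LAMBDA_XFUNCTION="$USE_LAMBDA_XFUNCTION" ..
            cmake --build . --target benchmark
        done
    done
done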
Warning: Running all combinations will take approximately 30 minutes!
Here are some benchmarks for various CPUs. Only the order = 20 benchmarks are presented here; the complete set of benchmarks can be found in benchmarks.
If you'd like to contribute benchmark results for other CPUs, feel free to open a PR. Please make sure to run all combinations of compile options as demonstrated in Run the Benchmark.