Releases: kimwalisch/primesieve
primesieve-12.6
This is a new maintenance release. It is fully backwards compatible with the previous release.
- PreSieve.cpp: Added AVX512 and ARM SVE pre-sieving, up to 3% faster.
- PreSieve.cpp: Increased pre-sieving to primes ≤ 163 (previously primes ≤ 100). Memory usage of pre-sieve lookup tables has been reduced from 210 kilobytes to 123 kilobytes and the pre-sieve lookup tables are now static (not generated at runtime anymore).
- CpuInfo.cpp: More robust CPU cache size detection.
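To make the pre-sieving idea concrete, here is a minimal sketch. It assumes a much simpler layout than primesieve's: one byte per number and a single pattern for the primes 2, 3, 5 and 7 (period 210) instead of bit-packed lookup tables for primes ≤ 163. The function name `count_primes_presieved` is hypothetical and not part of primesieve.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Count the primes <= limit with a sieve of Eratosthenes whose first
// step is pre-sieving: multiples of 2, 3, 5 and 7 are removed by tiling
// a precomputed pattern of period 2*3*5*7 = 210 over the sieve array.
// Hypothetical helper for illustration, not primesieve's implementation.
uint64_t count_primes_presieved(uint64_t limit)
{
    assert(limit <= 100000000); // byte-per-number demo, keep it small
    if (limit < 2)
        return 0;

    const uint64_t small_primes[] = {2, 3, 5, 7};
    const std::size_t period = 210;

    // pattern[i] == 1 if i is coprime to 2, 3, 5 and 7.
    std::vector<uint8_t> pattern(period, 1);
    for (uint64_t p : small_primes)
        for (std::size_t i = 0; i < period; i += p)
            pattern[i] = 0;

    // Pre-sieve: copy the pattern instead of crossing off multiples.
    std::vector<uint8_t> sieve(limit + 1);
    for (std::size_t i = 0; i <= limit; i++)
        sieve[i] = pattern[i % period];

    // The pattern also removed the small primes themselves and kept 1.
    sieve[1] = 0;
    for (uint64_t p : small_primes)
        if (p <= limit)
            sieve[p] = 1;

    // Regular sieving only has to start at the next prime, 11.
    for (uint64_t p = 11; p * p <= limit; p += 2)
        if (sieve[p])
            for (uint64_t m = p * p; m <= limit; m += p)
                sieve[m] = 0;

    uint64_t count = 0;
    for (uint8_t is_prime : sieve)
        count += is_prime;
    return count;
}
```

Calling `count_primes_presieved(100)` counts the 25 primes below 100. Covering more small primes with the pattern means fewer primes are left for the regular crossing-off loop, at the cost of larger lookup tables.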
primesieve-12.5
This release improves the thread load balancing on CPUs with a large number of CPU cores. The worker threads now process smaller sieve intervals which improves the performance of short computations ≤ 10 seconds. On a 4th Gen AMD EPYC 9R14 CPU with 192 threads counting the primes up to 10^12 now runs 10% faster (in 1.187 secs) and counting the primes up to 10^11 runs 70% faster (in 0.115 secs).
ChangeLog
- ParallelSieve.cpp: Tune thread load balancing.
primesieve-12.4
This is a maintenance release. The C/C++ API and ABI are fully backwards compatible with primesieve-12.*
ChangeLog
- Move x86 CPUID code from cpuid.hpp to src/x86/cpuid.cpp.
- multiarch_x86_popcnt.cmake: Detect x86 POPCNT support.
- CMakeLists.txt: Use CMake list for all compile time definitions.
- CMakeLists.txt: Use CMake list for all link libraries.
primesieve-12.3
This release adds runtime dispatching to AVX512 (for x64 CPUs that support it) for MinGW. For x64 CPUs, AVX512 runtime dispatching is now enabled by default when compiling using GCC and Clang on all operating systems.
- Improve Windows multiarch support (now works with MinGW64).
- Add runtime POPCNT detection using CPUID for x86 CPUs.
- Improve GCC/Clang multiarch preprocessor logic.
- CMakeLists.txt: Remove POPCNT/BMI check for x86 CPUs.
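The runtime POPCNT detection described above can be sketched as follows. This is not primesieve's code: instead of issuing CPUID directly, the sketch assumes GCC or Clang on x86 and uses the compiler builtin `__builtin_cpu_supports()`. The names `popcount_portable`, `popcount_hw`, `cpu_has_popcnt` and `popcount64` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Portable fallback: clear the lowest set bit until none are left.
uint64_t popcount_portable(uint64_t x)
{
    uint64_t count = 0;
    for (; x != 0; x &= x - 1)
        count++;
    return count;
}

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))

// POPCNT is enabled for this function only, so the rest of the
// program still runs on CPUs without the instruction.
__attribute__((target("popcnt")))
uint64_t popcount_hw(uint64_t x)
{
    return static_cast<uint64_t>(__builtin_popcountll(x));
}

// Query the CPU once at runtime and cache the answer.
bool cpu_has_popcnt()
{
    static const bool has_popcnt = __builtin_cpu_supports("popcnt") != 0;
    return has_popcnt;
}

#else

// Other compilers or architectures: always use the portable code.
uint64_t popcount_hw(uint64_t x) { return popcount_portable(x); }
bool cpu_has_popcnt() { return false; }

#endif

// Hypothetical dispatcher: use the hardware instruction if available.
uint64_t popcount64(uint64_t x)
{
    uint64_t count = cpu_has_popcnt() ? popcount_hw(x) : popcount_portable(x);
    assert(count <= 64); // sanity check, a 64-bit word has at most 64 set bits
    return count;
}
```

Because the check happens at runtime, a single binary can be shipped for all x86 CPUs, which is presumably why the compile-time POPCNT/BMI check in CMakeLists.txt could be removed.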
primesieve-12.1
This is a new maintenance release. It is fully backwards compatible with the previous release.
primesieve-12.0
The C/C++ API and ABI of primesieve-12.0 are fully backwards compatible with primesieve-11.*
The stress test functionality is the main new feature of primesieve-12.0. It can be launched using the --stress-test[=MODE] option of the primesieve command-line application. The stress test option supports two modes: CPU (default) or RAM. The CPU mode uses little memory (< 5 MiB per thread) and puts the highest load on the CPU. The RAM mode uses much more memory (each thread uses about 1.16 GiB) than the CPU mode, but the CPU usually won't get as hot. Due to primesieve's function multi-versioning support, on x64 CPUs the stress test will run an AVX512 algorithm if your CPU supports it.
- stressTest.cpp: New -S[=MODE] and --stress-test[=MODE] command-line options.
- RiemannR.cpp: Faster Riemann R function implementation #144.
- CmdOptions.cpp: New -R and --RiemannR command line options.
- CmdOptions.cpp: New --RiemannR-inverse command line option.
- CmdOptions.cpp: Add new --timeout option for stress testing.
- main.cpp: Improve command-line option handling.
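For readers unfamiliar with the Riemann R function mentioned in this changelog, the following sketch evaluates it with the Gram series. Which formula RiemannR.cpp actually uses is not stated here, so treat this as a textbook illustration in double precision only. The helpers `zeta` and `riemann_r` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>

// zeta(s) for real s >= 2: direct sum of the first N-1 terms plus an
// Euler-Maclaurin estimate of the remaining tail.
double zeta(double s)
{
    assert(s >= 2.0);
    const int N = 20;
    double sum = 0.0;
    for (int n = 1; n < N; n++)
        sum += std::pow(n, -s);
    sum += std::pow(N, 1.0 - s) / (s - 1.0);  // integral of the tail
    sum += 0.5 * std::pow(N, -s);             // first correction term
    sum += s * std::pow(N, -s - 1.0) / 12.0;  // second correction term
    return sum;
}

// Riemann R function via the Gram series:
//   R(x) = 1 + sum_{k>=1} (ln x)^k / (k * k! * zeta(k + 1))
// R(x) approximates the number of primes <= x.
double riemann_r(double x)
{
    assert(x >= 1.0);
    const double log_x = std::log(x);
    double sum = 1.0;
    double power_over_factorial = 1.0;  // holds (ln x)^k / k!

    for (int k = 1; k < 1000; k++)
    {
        power_over_factorial *= log_x / k;
        double term = power_over_factorial / (k * zeta(k + 1.0));
        sum += term;
        if (term < 1e-16 * sum)
            break;  // remaining terms no longer change the result
    }
    return sum;
}
```

As a plausibility check, `riemann_r(1e6)` should land within a few dozen of the true prime count 78498, which is far closer than the simple estimate x / ln(x).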
primesieve-11.2
This is a new maintenance release that is fully backwards compatible with the previous release. It contains one CMake bug fix and documentation improvements; in addition, the tests have been ported to GitHub Actions and the nth prime code has been cleaned up.
- nthPrime.cpp: Rewritten using more accurate nth prime approximation.
- nthPrimeApprox.cpp: Added logarithmic integral and Riemann R function implementations.
- cmake/libatomic.cmake: Fix failed to find libatomic #141.
- .github/workflows/ci.yml: Port AppVeyor CI tests to GitHub Actions.
- doc/C_API.md: Fix off by 1 error in OpenMP example #137.
- doc/CPP_API.md: Fix off by 1 error in OpenMP example #137.
- Vector.hpp: Rename pod_vector to Vector and pod_array to Array.
- iterator.h: Improve documentation.
- iterator.hpp: Improve documentation.
- C_API.md: Add SIMD (vectorization) section.
- CPP_API.md: Add SIMD (vectorization) section.
- README.md: Add C & C++ API badges.
Thanks to @sethtroisi and Sven S. for being primesieve sponsors in this release cycle!
primesieve-11.1
When primesieve is distributed via distro package managers, it is often not compiled using the highest optimization level -O3. Because of this, primesieve's pre-sieving algorithm was not auto-vectorized in many cases. As a workaround for this issue I have now manually vectorized the pre-sieving algorithm for x64 CPUs (using portable SSE2) and for ARM64 CPUs (using portable ARM NEON). This can improve performance by up to 40%.
- PreSieve.cpp: Vectorize loop using x64 SSE2 & ARM NEON.
- popcount.cpp: Add POPCNT algorithm for x64 & AArch64.
- primesieve.h: Fix -Wstrict-prototypes warning.
- examples/c/*.c: Fix -Wstrict-prototypes warning.
- test/*.c: Fix -Wstrict-prototypes warning.
- CMakeLists.txt: New WITH_AUTO_VECTORIZATION option (with default ON).
- cmake/auto_vectorize.cmake: Enable auto-vectorization if the compiler supports it.
- scripts/build_mingw64_x64.sh: Build primesieve x64 release binary.
- scripts/build_mingw64_arm64.sh: Build primesieve arm64 release binary.
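The manual SSE2 vectorization described for this release might look roughly like the sketch below. It assumes that the hot loop combines pre-sieve buffers with a bitwise AND, which is an assumption about PreSieve.cpp rather than something stated in these notes. The function `and_buffers` is a hypothetical name, and the ARM NEON variant is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

#if defined(__SSE2__)
  #include <emmintrin.h>
#endif

// Combine two pre-sieve buffers into the sieve array with bitwise AND:
// a bit stays set only if neither buffer has crossed it off.
// Hypothetical helper for illustration, not primesieve's actual code.
void and_buffers(const uint8_t* a,
                 const uint8_t* b,
                 uint8_t* sieve,
                 std::size_t bytes)
{
    assert(a != nullptr && b != nullptr && sieve != nullptr);
    std::size_t i = 0;

#if defined(__SSE2__)
    // Process 16 bytes per iteration using unaligned loads and stores.
    for (; i + 16 <= bytes; i += 16)
    {
        __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a + i));
        __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b + i));
        _mm_storeu_si128(reinterpret_cast<__m128i*>(sieve + i),
                         _mm_and_si128(va, vb));
    }
#endif

    // Scalar loop: handles the tail, or everything on non-SSE2 targets.
    for (; i < bytes; i++)
        sieve[i] = static_cast<uint8_t>(a[i] & b[i]);
}
```

Writing the intrinsics by hand means the loop is vectorized even when a distro builds the package at -O2, where the compiler's auto-vectorizer may not run.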
primesieve-11.0
This version fixes two annoying libprimesieve issues. Firstly, from now on the shared libprimesieve version (.so version) will match the primesieve version. This makes it easier to depend on libprimesieve and to update to the latest libprimesieve. Secondly, primesieve_jump_to() has been added to libprimesieve's API. The new primesieve_jump_to(iter, start, stop) includes the start number (generates primes ≥ start), whereas the old primesieve_skipto(iter, start, stop) excludes the start number (generates primes > start). In practice, the use of primesieve_jump_to() requires up to 2x fewer start number corrections (e.g. start-1) compared to primesieve_skipto().
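The difference between the two start conventions is easy to get wrong, so here is a self-contained toy model of it. `ToyPrimeIterator`, `is_prime` and `sum_primes` are hypothetical stand-ins that use trial division. They only mimic the inclusive and exclusive semantics described above and are not libprimesieve's implementation or API.

```cpp
#include <cassert>
#include <cstdint>

// Trial division primality test, only meant for small demo inputs.
bool is_prime(uint64_t n)
{
    if (n < 2)
        return false;
    for (uint64_t d = 2; d * d <= n; d++)
        if (n % d == 0)
            return false;
    return true;
}

// Toy stand-in for a prime iterator (NOT primesieve's implementation),
// it only models the two start conventions.
class ToyPrimeIterator
{
public:
    // Old convention (skipto): the next prime is > start.
    void skipto(uint64_t start) { next_ = start + 1; }

    // New convention (jump_to): the next prime is >= start.
    void jump_to(uint64_t start) { next_ = start; }

    uint64_t next_prime()
    {
        while (!is_prime(next_))
            next_++;
        return next_++;
    }

private:
    uint64_t next_ = 0;
};

// Sum the primes in [start, stop] using the inclusive convention,
// no start - 1 correction is needed.
uint64_t sum_primes(uint64_t start, uint64_t stop)
{
    assert(start <= stop);
    ToyPrimeIterator it;
    it.jump_to(start);
    uint64_t sum = 0;
    for (uint64_t p = it.next_prime(); p <= stop; p = it.next_prime())
        sum += p;
    return sum;
}
```

In this model, `jump_to(13)` followed by `next_prime()` yields 13, whereas `skipto(13)` followed by `next_prime()` yields 17. With the exclusive convention, `sum_primes` would have to call `skipto(start - 1)` and guard against `start == 0`.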
C API deprecations
The libprimesieve C API and ABI are backwards compatible with libprimesieve ≥ 10.0. However, the primesieve_skipto() function from the libprimesieve C API has been marked as deprecated; please use the new primesieve_jump_to() instead.
C++ API breaking changes
Unlike the C API, in the C++ API the primesieve::iterator::skipto() method has been replaced by primesieve::iterator::jump_to(). The new method includes the start number whereas the old method excluded the start number. The primesieve::iterator constructors now also include the start number while they previously excluded the start number. Please read the documentation for more information.
ChangeLog
- CMakeLists.txt: Improve Emscripten WebAssembly support.
- iterator.cpp: Add new primesieve::iterator::jump_to().
- iterator.cpp: Fix use after free in primesieve::iterator::clear().
- iterator-c.cpp: Add new primesieve_jump_to().
- iterator-c.cpp: Mark primesieve_skipto() as deprecated.
- iterator-c.cpp: Fix use after free in primesieve_iterator_clear().
- pod_vector.hpp: Added support for types with destructors.
- malloc_vector.hpp: Fix potential memory leak.
- api.cpp: Support non power of 2 sieve sizes.
- PrimeSieve.cpp: Support non power of 2 sieve sizes.
- PreSieve.cpp: Use std::initializer_list instead of std::vector.
- Erat.cpp: Improve documentation.
- C_API.md: Improve next_prime() and prev_prime() documentation.
- CPP_API.md: Improve next_prime() and prev_prime() documentation.
Acknowledgements
I would like to thank Philip Vetter for his detailed feedback on the libprimesieve API, which caused me to create the new primesieve_jump_to().
primesieve-8.0
This is a new major release. The API of libprimesieve is backwards compatible, but the ABI (Application Binary Interface) of libprimesieve is not. This means that if your program uses the C/C++ libprimesieve API, you can simply recompile it against the latest libprimesieve without modifying your code. If, on the other hand, you have written libprimesieve bindings for another programming language, you will have to migrate your code to the new libprimesieve ABI.
Highlights of primesieve-8.0
- libprimesieve now has multiarch support for x64 CPUs. At runtime libprimesieve now dispatches to the latest supported CPU instruction set like POPCNT, BMI2, AVX512 #116.
- libprimesieve now generates an array (or vector) of primes up to 20% faster #123.
ChangeLog
- primesieve::iterator's ABI has been modified in both the C & C++ API.
- primesieve::iterator's API remains backwards compatible.
- CPP_API.md: Renamed doc/CPP_Examples.md to doc/CPP_API.md.
- C_API.md: Renamed doc/C_Examples.md to doc/C_API.md.
- Fix undefined behavior (g++-12 issue) caused by resizeUninitialized.hpp, use new pod_vector<uint64_t> from pod_vector.hpp instead.
- iterator.cpp: Enable pre-sieving for primesieve::iterator.prev_prime().
- iterator-c.cpp: Enable pre-sieving for primesieve::iterator.prev_prime().
- PreSieve.cpp: Detect if the user sieves many consecutive intervals.
- PrimeGenerator.cpp: Improve AVX512 of fillNextPrimes().
- PrimeGenerator.cpp: Reduce memory usage for tiny stop numbers.
- PrimeGenerator.hpp: Add GCC/Clang's function multiversioning for AVX512.
- Erat.cpp: Dynamically grow the sieve size: use a small sieve size for small stop numbers and a large sieve size for large stop numbers.
- Erat.cpp: Reduce memory usage, allocate the minimum required memory to store all sieving primes.
- CpuInfo.cpp: Detect AVX512 using CPUID.
- pmath.hpp: Use compiler intrinsics for ilog2() & floorPow2().
- StorePrimes.hpp: Use vector::insert() instead of vector::push_back(), see: #123.
- CMakeLists.txt: Automatically enable expensive debug assertions in debug mode (if CMAKE_BUILD_TYPE=Debug).
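The StorePrimes.hpp entry above replaces per-element vector::push_back() calls with vector::insert(). A minimal sketch of that pattern follows. It assumes primes are first collected in a small fixed-size buffer, with an arbitrarily chosen batch size of 64, and uses a plain sieve of Eratosthenes as the prime source. The function `store_primes` is a hypothetical name, not primesieve's code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Append the primes <= limit to `primes`. Primes are buffered in a
// small array and each full batch is appended with a single
// vector::insert() call rather than one push_back() per prime.
// Hypothetical sketch; the batch size of 64 is an arbitrary assumption.
void store_primes(uint64_t limit, std::vector<uint64_t>& primes)
{
    assert(limit < 100000000); // simple demo sieve, keep it small
    std::vector<bool> is_composite(limit + 1, false);
    uint64_t batch[64];
    std::size_t size = 0;

    for (uint64_t n = 2; n <= limit; n++)
    {
        if (is_composite[n])
            continue;
        for (uint64_t m = n * n; m <= limit; m += n)
            is_composite[m] = true;

        batch[size++] = n;
        if (size == 64)
        {
            primes.insert(primes.end(), batch, batch + size);
            size = 0;
        }
    }

    // Flush the last, partially filled batch.
    primes.insert(primes.end(), batch, batch + size);
}
```

A range insert lets the vector check its capacity once per batch instead of once per element, which is one plausible source of the speedup reported in #123.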