Add a new architecture mode: 'avx512_spr'. #4025

mulugetam · 2024-11-12T17:18:16Z

This PR adds a new architecture mode to support the new extensions to AVX512, namely AVX512-FP16, which have been available since Intel® Sapphire Rapids.

This PR is a prerequisite for PR#4020 that speeds up hamming distance evaluations.

mengdilin · 2024-11-12T17:46:40Z

Hmm, weird, the CIs are not running on the PR. Do you mind pushing a new commit to see if the CIs start?

mulugetam · 2024-11-12T18:26:15Z

@mengdilin Pushed a new commit, but CI is not starting. Is this possibly because I updated .github/workflows/build.yml?

mengdilin · 2024-11-12T19:33:46Z

Yea there is a syntax error in the build file: see https://github.com/facebookresearch/faiss/actions/runs/11804384709

mengdilin · 2024-11-12T19:34:25Z

.github/workflows/build.yml

@@ -67,6 +67,17 @@ jobs:
        uses: ./.github/actions/build_cmake
        with:
          opt_level: avx512
+  linux-x86_64-AVX512-cmake:


linux-x86_64-AVX512-advanced-cmake

mulugetam · 2024-11-12T22:29:04Z

@mengdilin CI is picking g++ version 11. But AVX512-FP16 (-mavx512fp16) requires version 12+.

mengdilin · 2024-11-12T22:49:07Z

Ah yea the conda publication CI is using the older compiler version. We should investigate on our side; however, I don't think we want to publish this architecture mode to conda right now, can you omit it?

Looks like CI failure is coming from a unit test failure from

faiss/tests/test_contrib.py

Line 552 in 0fb56d9

def test_ivf_train_2level(self):

Try commenting out that test and see if anything else fails?
I'm going on PTO, handing this over to the next performance oncall @kuarora

mengdilin · 2024-11-12T22:48:02Z

conda/faiss/build-lib.sh

      -DFAISS_ENABLE_GPU=OFF \
      -DFAISS_ENABLE_PYTHON=OFF \
      -DBLA_VENDOR=Intel10_64lp \
      -DCMAKE_INSTALL_LIBDIR=lib \
      -DCMAKE_BUILD_TYPE=Release .

-make -C _build -j$(nproc) faiss faiss_avx2 faiss_avx512
+make -C _build -j$(nproc) faiss faiss_avx2 faiss_avx512 faiss_avx512_sr


This is used for faiss's conda packaging upload. I don't think we want to expose this build mode yet in conda officially. Can you omit this for now?

mulugetam · 2024-11-13T16:51:37Z

Thanks @mengdilin. @kuarora Could you please review?

alexanderguzhva · 2024-11-13T17:12:34Z

@mulugetam I would use -march=sapphirerapids -mtune=sapphirerapids for the compiler flags, because SR supports many other AVX512 instruction extensions that are not currently listed among compiler flags

mengdilin

Back from PTO. LGTM, can you resolve the conflict so I can import the PR? For the unit test, can you relax the threshold instead of commenting it out? If you cannot relax the threshold, you can skip this unit test when in SR mode, similar to

faiss/faiss/gpu/test/test_cagra.py

Line 13 in 697b6dd

@unittest.skipIf(

(ideally we don't have to do this)
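For reference, the skip-if fallback mentioned above could look roughly like the sketch below. It assumes the active mode is exposed through an environment variable; the variable name `FAISS_OPT_LEVEL`, the helper `running_in_spr_mode`, and the class name are illustrative assumptions, not something this PR defines.

```python
import os
import unittest


def running_in_spr_mode(env=os.environ):
    # Assumption: the selected mode is passed via an environment variable.
    # The variable name is illustrative and not taken from this PR.
    return env.get("FAISS_OPT_LEVEL", "").lower() == "avx512_spr"


class TestContribSketch(unittest.TestCase):
    # Skip only in the mode where the threshold is known to be too tight.
    @unittest.skipIf(running_in_spr_mode(), "threshold not tuned for avx512_spr")
    def test_ivf_train_2level(self):
        # Placeholder body; the real test compares two sets of search results.
        self.assertTrue(True)
```

Relaxing the threshold is still preferable, since a skipped test gives no coverage in that mode.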

mengdilin · 2024-12-03T18:43:36Z

faiss/CMakeLists.txt

+endif()
+if(NOT WIN32)
+  # Architecture mode to support AVX512 extensions available since Intel(R) Sapphire Rapids.
+  # Ref: https://networkbuilders.intel.com/solutionslibrary/intel-avx-512-fp16-instruction-set-for-intel-xeon-processor-based-products-technology-guide


thanks for the ref!

mengdilin · 2024-12-03T18:45:16Z

tests/test_contrib.py

@@ -568,7 +569,7 @@ def test_ivf_train_2level(self):
        # normally 47 / 200 differences
        ndiff = (Iref != Inew).sum()
        self.assertLess(ndiff, 51)


Is there a way to relax the threshold such that this test passes in SR mode as well?

@mengdilin The test case passes on my machine with an ndiff value of 50 (not sure why it fails here). I will do another push that resolves the conflict. I have also updated the name from avx512-sr to avx512_spr to be consistent with what numpy returns.
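To make the threshold discussion concrete: `ndiff` counts the positions where the reference and new result tables disagree, so a run at 50 sits just under the limit of 51 and a small numerical change can push it over. The sketch below reproduces that count with plain Python lists instead of the NumPy arrays used in the test; the helper name `count_differences` and the toy data are hypothetical.

```python
def count_differences(ref_ids, new_ids):
    """Count positions where two equally shaped result tables disagree."""
    if len(ref_ids) != len(new_ids):
        raise ValueError("tables must have the same number of rows")
    total = 0
    for ref_row, new_row in zip(ref_ids, new_ids):
        if len(ref_row) != len(new_row):
            raise ValueError("rows must have the same length")
        total += sum(1 for a, b in zip(ref_row, new_row) if a != b)
    return total


# Toy data, not taken from the real test.
Iref = [[1, 2, 3], [4, 5, 6]]
Inew = [[1, 9, 3], [4, 5, 7]]
ndiff = count_differences(Iref, Inew)

# Same shape of check as the test: the count must stay below a fixed bound.
assert ndiff < 51
```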

mulugetam · 2024-12-19T21:47:52Z

@mengdilin Only timeout and anaconda_telemetry errors.

mengdilin · 2024-12-19T22:06:10Z

@mulugetam there were some CI issues yesterday that we resolved. Can you rebase to latest commit and retry? The anaconda issues should definitely be resolved on the latest.

facebook-github-bot · 2024-12-20T16:58:36Z

@mengdilin has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

mengdilin · 2024-12-20T17:04:23Z

@mulugetam faiss/python/swigfaiss_avx512_spr.swig shouldn't be checked in. Looks like it's part of .gitignore but it got accidentally tracked. Can you revert this?

mulugetam · 2024-12-20T18:00:26Z

@mengdilin untracked.

facebook-github-bot · 2024-12-20T19:52:01Z

@mengdilin has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot · 2024-12-23T16:59:50Z

@mengdilin merged this pull request in 3beb07b.

Summary: The `_mm512_popcnt_epi64` intrinsic is used to accelerate Hamming distance calculations in `HammingComputerDefault` and `HammingComputer64`. Benchmarking with [bench_hamming_computer](https://github.com/facebookresearch/faiss/blob/main/benchs/bench_hamming_computer.cpp) on AWS [r7i](https://aws.amazon.com/ec2/instance-types/r7i/) instance shows a performance improvement of up to 30% compared to AVX-2. This PR depends on [PR#4025](#4025).

Pull Request resolved: #4020

Reviewed By: junjieqi

Differential Revision: D67650183

Pulled By: mengdilin

fbshipit-source-id: 17e5b68570dced1fea0b885dd4e67c17dfc7bece
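As a scalar reference for what the summary describes, Hamming distance over binary codes is an XOR followed by a population count. `_mm512_popcnt_epi64` performs the count on eight 64-bit lanes at once. The sketch below is a plain Python illustration of the same arithmetic, not the faiss implementation; the function name `hamming_distance_words` is hypothetical.

```python
def hamming_distance_words(a_words, b_words):
    """Hamming distance between two codes given as sequences of 64-bit words."""
    if len(a_words) != len(b_words):
        raise ValueError("codes must have the same number of words")
    mask = (1 << 64) - 1  # keep each word within 64 bits
    distance = 0
    for x, y in zip(a_words, b_words):
        diff = (x ^ y) & mask            # bits that differ between the words
        distance += bin(diff).count("1")  # population count of the difference
    return distance
```

A vectorized version does the XOR and the popcount per lane and then sums the lane counts, which is where the reported speedup comes from.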

facebook-github-bot added the CLA Signed label Nov 12, 2024

mulugetam mentioned this pull request Nov 12, 2024

Use _mm512_popcnt_epi64 to speedup hamming distance evaluation. #4020

Closed

mengdilin reviewed Nov 12, 2024

gtwang01 added the install label Nov 12, 2024

mengdilin reviewed Nov 12, 2024

mengdilin reviewed Dec 3, 2024

mulugetam changed the title ~~Add a new architecture mode: 'avx512-sr'.~~ Add a new architecture mode: 'avx512_spr'. Dec 17, 2024

mulugetam force-pushed the avx512-sr branch from 315d5c3 to 058d750 Compare December 18, 2024 20:09

mulugetam added 10 commits December 19, 2024 22:49

Add a new architecture mode: 'avx512_spr'

e026e56

avx512_spr is a mode that supports avx512 features available since Intel(R) Sapphire Rapids. Signed-off-by: Mulugeta Mammo <[email protected]>

Remove unnecessary space.

dc20374

Signed-off-by: Mulugeta Mammo <[email protected]>

Remove avx512-sr mode from conda.

b967931

Signed-off-by: Mulugeta Mammo <[email protected]>

Comment out test_ivf_train_2level in faiss/tests/test_contrib.py.

2507ace

Signed-off-by: Mulugeta Mammo <[email protected]>

Use sapphirerapids for -march and -mtune.

693fd40

Signed-off-by: Mulugeta Mammo <[email protected]>

Remove unnecessary spaces.

417f7ee

Signed-off-by: Mulugeta Mammo <[email protected]>

Rename avx512-sr to avx512_spr. Uncomment test_contrib.py/test_ivf_tr…

7abee3d

…ain_2level. Signed-off-by: Mulugeta Mammo <[email protected]>

Add avx512_spr to build-pull-request.

96e5a7f

Signed-off-by: Mulugeta Mammo <[email protected]>

Bump the number of node differences in tests/test_ivf_train_2level fr…

6a9545d

…om 51 to 53. Signed-off-by: Mulugeta Mammo <[email protected]>

Modify the description for avx512_spr.

f4804e6

Signed-off-by: Mulugeta Mammo <[email protected]>

mulugetam force-pushed the avx512-sr branch from 733e0c6 to f4804e6 Compare December 19, 2024 22:56

Stop tracking faiss/python/swigfaiss_avx512_spr.swig

9d75e9b

Signed-off-by: Mulugeta Mammo <[email protected]>

facebook-github-bot closed this in 3beb07b Dec 23, 2024

facebook-github-bot added the Merged label Dec 23, 2024
