-
Notifications
You must be signed in to change notification settings - Fork 3.7k
Home
Faiss is a library for efficient similarity search and clustering of dense vectors. It contains algorithms that search in sets of vectors of any size, up to ones that possibly do not fit in RAM. It also contains supporting code for evaluation and parameter tuning. Faiss is written in C++ with complete wrappers for Python (versions 2 and 3). Some of the most useful algorithms are implemented on the GPU. It is developed by Facebook AI Research.
Given a set of vectors x_i in dimension d, Faiss build a data structure in RAM from it. After the structure is constructed, when given a new vector x in dimension d it performs efficiently the operation:
i = argmin_i ||x - x_i||
where ||.|| is the Euclidean distance (L2).
If Faiss terms, the data structure is an index, an object that has an add method to add x_i vectors. Note that the x_i's are assumed to be fixed.
Computing the argmin is the search operation on the index.
This is all what Faiss is about. It can also:
-
return not just the nearest neighbor, but also the 2nd nearest, 3rd, ..., k-th nearest neighbor
-
search several vectors at a time rather than one (batch processing). For many index types, this is faster than searching one vector after another
-
trade precision for speed, ie. give an incorrect result 10% of the time with a method that's 10x faster or uses 10x less memory
-
perform maximum inner product search argmax_i <x, x_i> instead of minimum Euclidean search. There is also limited support for other distances (L1, Linf, etc.).
-
return all elements that are within a given radius of the query point (range search)
-
store the index on disk rather than in RAM.
Faiss is based on years of research. Most notably it implements:
-
The inverted file from “Video google: A text retrieval approach to object matching in videos.”, Sivic & Zisserman, ICCV 2003. This is the key to non-exhaustive search in large datasets. Otherwise all searches would need to scan all elements in the index, which is prohibitive even if the operation to apply for each element is fast
-
The product quantization (PQ) method from “Product quantization for nearest neighbor search”, Jégou & al., PAMI 2011. This can be seen as a lossy compression technique for high-dimensional vectors, that allows relatively accurate reconstructions and distance computations in the compressed domain.
-
The three-level quantization (IVFADC-R aka
IndexIVFPQR
) method from "Searching in one billion vectors: re-rank with source coding", Tavenard & al., ICASSP'11. -
The inverted multi-index from “The inverted multi-index”, Babenko & Lempitsky, CVPR 2012. This method greatly improves the speed of inverted indexing for fast/less accurate operating points.
-
The optimized PQ from “Optimized product quantization”, He & al, CVPR 2013. This method can be seen as a linear transformation of the vector space to make it more amenable for indexing with a product quantizer.
-
The pre-filtering of product quantizer distances from “Polysemous codes”, Douze & al., ECCV 2016. This technique performs a binary filtering stage before computing PQ distances.
-
The GPU implementation and fast k-selection is described in “Billion-scale similarity search with GPUs”, Johnson & al, ArXiv 1702.08734, 2017
-
The HNSW indexing method from "Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs", Malkov & al., ArXiv 1603.09320, 2016
A general paper about product quantization and related methods: "A Survey of Product Quantization", Yusuke Matsui, Yusuke Uchida, Hervé Jégou, Shin’ichi Satoh, ITE transactions on MTA, 2018. The overview image below is from that paper (boxes in red are the papers that are implemented in Faiss):
This wiki contains high-level information about Faiss and a tutorial. Navigate it using the sidebar.
Most examples are in Python for brievity, but the C++ API is exactly the same, so the translation for one to the other is trivial most of the times.
Faiss building blocks: clustering, PCA, quantization
Index IO, cloning and hyper parameter tuning
Threads and asynchronous calls
Inverted list objects and scanners
Indexes that do not fit in RAM
Brute force search without an index
Fast accumulation of PQ and AQ codes (FastScan)
Setting search parameters for one query
Binary hashing index benchmark