![『[RAG-GOOGLE] MUVERA: Multi-Vector Retrieval via Fixed Dimensional Encodings』のカバーアート](https://m.media-amazon.com/images/I/41inar-KhpL._SL500_.jpg)
[RAG-GOOGLE] MUVERA: Multi-Vector Retrieval via Fixed Dimensional Encodings
カートのアイテムが多すぎます
カートに追加できませんでした。
ウィッシュリストに追加できませんでした。
ほしい物リストの削除に失敗しました。
ポッドキャストのフォローに失敗しました
ポッドキャストのフォロー解除に失敗しました
-
ナレーター:
-
著者:
このコンテンツについて
Welcome to our podcast! Today, we're diving into MUVERA (Multi-Vector Retrieval Algorithm), a groundbreaking development from researchers at Google Research, UMD, and Google DeepMind. While neural embedding models are fundamental to modern information retrieval (IR), multi-vector models, though superior, are computationally expensive. MUVERA addresses this by ingeniously reducing complex multi-vector similarity search to efficient single-vector search, allowing the use of highly-optimised MIPS (Maximum Inner Product Search) solvers.
The core innovation is Fixed Dimensional Encodings (FDEs), single-vector proxies for multi-vector similarity that offer the first theoretical guarantees (ε-approximations). Empirically, MUVERA significantly outperforms prior state-of-the-art implementations like PLAID, achieving an average of 10% higher recall with 90% lower latency across diverse BEIR retrieval datasets. It also incorporates product quantization for 32x memory compression of FDEs with minimal quality loss.
A current limitation is that MUVERA did not outperform PLAID on the MS MARCO dataset, possibly due to PLAID's extensive tuning for that specific benchmark. Additionally, the effect of the average number of embeddings per document on FDE retrieval quality remains an area for future study. MUVERA's applications primarily lie in enhancing modern IR pipelines, potentially improving the efficiency of components within LLMs.
Learn more: https://arxiv.org/pdf/2405.19504