Dense Passage Retrieval (DPR)

Advanced Techniques
Last updated: 2025-01-15
Also known as: DPR

What is Dense Passage Retrieval (DPR)?


Dense Passage Retrieval (DPR) is a neural retrieval technique that uses trained bi-encoder models to encode both queries and passages into dense vector embeddings optimized specifically for retrieval tasks. Unlike generic embedding models, DPR models are trained with contrastive learning objectives that explicitly optimize for the task of matching questions to relevant passages, typically resulting in better retrieval performance for question-answering applications.
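The core retrieval step can be sketched with toy vectors. In a real DPR system the embeddings below would come from trained question and passage encoders; here they are hand-made stand-ins, used only to show how candidates are scored and ranked by inner product.

```python
import numpy as np

# Toy stand-ins for bi-encoder outputs. In an actual DPR setup these
# vectors would be produced by separate, trained question and passage
# encoders (typically BERT-based) with hundreds of dimensions.
passage_embs = np.array([
    [0.9, 0.1, 0.0],   # passage about topic A
    [0.1, 0.8, 0.2],   # passage about topic B
    [0.0, 0.2, 0.9],   # passage about topic C
])
query_emb = np.array([0.85, 0.15, 0.05])  # query closest to topic A

# DPR scores each candidate passage by the inner product between the
# query embedding and the passage embedding, then ranks by score.
scores = passage_embs @ query_emb
ranking = np.argsort(-scores)
print(ranking[0])  # index of the best-matching passage
```

In production the passage matrix is replaced by an approximate nearest-neighbor index (e.g. FAISS) so the top passages can be found without scoring every vector.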


The DPR approach uses separate encoder models for queries and passages (a bi-encoder architecture), allowing passages to be pre-encoded and indexed offline while queries are encoded at runtime. The training process uses datasets of question-passage pairs, teaching the model to produce similar embeddings for questions and their relevant passages while pushing apart embeddings for unrelated pairs. This specialized training makes DPR particularly effective for knowledge-intensive tasks where precise retrieval is critical.
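The contrastive objective described above is commonly implemented with in-batch negatives: each question's paired passage is the positive, and the other passages in the batch act as negatives. A minimal NumPy sketch of that loss (assuming pre-computed embedding batches; the function name is illustrative):

```python
import numpy as np

def in_batch_negative_loss(q_embs, p_embs):
    """Contrastive NLL loss with in-batch negatives.

    q_embs[i] is the embedding of question i; p_embs[i] is its relevant
    passage. Every other passage in the batch serves as a negative, so
    the model is pushed to score the true pair highest in each row.
    """
    sims = q_embs @ p_embs.T                         # (B, B) similarity matrix
    sims = sims - sims.max(axis=1, keepdims=True)    # numerical stability
    log_probs = sims - np.log(np.exp(sims).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))              # NLL of the true pairs
```

Minimizing this loss pulls question and relevant-passage embeddings together while pushing apart the off-diagonal (unrelated) pairs, which is what makes the resulting vectors retrieval-specific rather than generic.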


DPR represented a significant advance over sparse retrieval methods like BM25, demonstrating that neural approaches could substantially improve retrieval quality for question-answering tasks. While the term DPR specifically refers to the technique introduced by Facebook AI Research, the broader approach of using trained bi-encoders for retrieval has become widely adopted. Modern systems often combine DPR-style dense retrieval with sparse methods in hybrid architectures to leverage both semantic matching and keyword precision.
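One common way to combine dense and sparse rankings in such hybrid architectures is reciprocal rank fusion (RRF), which merges ranked lists without needing to calibrate the two scoring scales. A small sketch with hypothetical document IDs:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists from multiple retrievers via RRF.

    rankings: list of rankings, each a list of doc ids ordered best-first.
    k: smoothing constant (60 is the value suggested in the RRF paper).
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            # Each list contributes 1 / (k + rank); documents ranked
            # highly by several retrievers accumulate the most score.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

dense  = ["p1", "p3", "p2"]   # hypothetical DPR (semantic) ranking
sparse = ["p1", "p2", "p4"]   # hypothetical BM25 (keyword) ranking
print(reciprocal_rank_fusion([dense, sparse]))
```

Here "p1", ranked first by both retrievers, dominates the fused list, while documents favored by only one method fall behind those both agree on.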

