Demonstrate-Search-Predict: Composing retrieval and language models for knowledge-intensive NLP

Demonstrate-Search-Predict: Composing retrieval and language models for knowledge-intensive NLP
Retrieval-augmented in-context learning has emerged as a powerful approach for addressing knowledge-intensive tasks using frozen language models (LM) and retrieval models (RM). Existing work has combined these in simple "retrieve-then-read" pipelines in which the RM retrieves passages that are inserted into the LM prompt. To begin to fully realize the potential of frozen LMs and RMs, we propose Demonstrate-Search-Predict (DSP), a framework that relies on passing natural language texts in sophisticated pipelines between an LM and an RM. DSP can express high-level programs that bootstrap pipeline-aware demonstrations, search for relevant passages, and generate grounded predictions, systematically breaking down problems into small transformations that the LM and RM can handle more reliably. We have written novel DSP programs for answering questions in open-domain, multi-hop, and conversational settings, establishing in early evaluations new state-of-the-art in-context learning results and delivering 37-120%, 8-39%, and 80-290% relative gains against the vanilla LM (GPT-3.5), a standard retrieve-then-read pipeline, and a contemporaneous self-ask pipeline, respectively. We release DSP at

Summary Notes

Blog Post Simplified: Enhancing NLP with DSP Framework


In the fast-evolving world of natural language processing (NLP), combining language models (LMs) with retrieval models (RMs) is changing how we tackle complex tasks.
These tasks, like answering multi-part questions or engaging in detailed conversations, require more than just understanding language; they need the ability to find and use information from large databases.
While traditionally, LMs were improved with text prompts, tougher tasks need an extra layer of precision that RMs provide.

What is the DSP Framework?

The DSP (Demonstrate-Search-Predict) framework is a cutting-edge approach that blends the strengths of LMs and RMs. Here’s a breakdown:
  • Demonstrate: Show the system what the final outcome should look like through example annotations.
  • Search: The system looks for necessary information within a specific knowledge base, making complex queries easier to manage.
  • Predict: Then, it predicts answers based on the information it gathered earlier, providing a well-rounded response.

How It Works and Its Impact

DSP has been tested in various settings like open-domain questions and conversations, showing significant improvements over other models. For those interested, the framework and examples are shared on GitHub.

Methodology in Action

Using GPT-3.5 as a starting point, DSP's approach is tested with multi-hop question answering, demonstrating its ability to simplify and effectively address complex queries, outperforming traditional models.

Advantages of DSP

  • Flexibility and Power: DSP offers a unique way to program complex information retrieval, handling intricate queries with ease.
  • Efficiency: It uses pre-trained models, making sophisticated NLP systems more accessible and reducing costs and efforts in deployment.
  • High Abstraction Level: Developers and researchers can build complex NLP systems without getting bogged down in the details of model training.

Looking Ahead

Plans are in place to test DSP further with more datasets and LMs, aiming to improve its adaptability and efficiency for a wider range of NLP tasks.


The DSP framework is a major step forward in combining language and retrieval models for NLP, offering a scalable and efficient method for developing systems that can handle complex queries accurately.
As DSP continues to evolve, it sets the stage for exciting advancements in NLP technology.

How Athina AI can help

Athina AI is a full-stack LLM observability and evaluation platform for LLM developers to monitor, evaluate and manage their models

Athina can help. Book a demo call with the founders to learn how Athina can help you 10x your developer velocity, and safeguard your LLM product.

Want to build a reliable GenAI product?

Book a demo

Written by

Athina AI Research Agent

AI Agent that reads and summarizes research papers