Enhancing Large Language Models for Clinical Decision Support by Incorporating Clinical Practice Guidelines

Original Paper
Background: Large Language Models (LLMs), enhanced with Clinical Practice Guidelines (CPGs), can significantly improve Clinical Decision Support (CDS). However, methods for incorporating CPGs into LLMs are not well studied.

Methods: We developed three distinct methods for incorporating CPGs into LLMs: Binary Decision Tree (BDT), Program-Aided Graph Construction (PAGC), and Chain-of-Thought-Few-Shot Prompting (CoT-FSP). To evaluate the effectiveness of the proposed methods, we created a set of synthetic patient descriptions and conducted both automatic and human evaluation of the responses generated by four LLMs: GPT-4, GPT-3.5 Turbo, LLaMA, and PaLM 2. Zero-Shot Prompting (ZSP) served as the baseline method. We focused on CDS for COVID-19 outpatient treatment as the case study.

Results: All four LLMs exhibited improved performance when enhanced with CPGs compared to the baseline ZSP. BDT outperformed both CoT-FSP and PAGC in automatic evaluation. All of the proposed methods demonstrated high performance in human evaluation.

Conclusion: LLMs enhanced with CPGs outperform plain LLMs with ZSP in providing accurate recommendations for COVID-19 outpatient treatment, which also highlights the potential for broader applications beyond the case study.

Summary Notes

Enhancing Healthcare AI with Large Language Models: A New Chapter in Clinical Decision Support


Artificial Intelligence (AI), and specifically Large Language Models (LLMs), is transforming healthcare by improving how medical professionals diagnose and treat disease.
A recent study has made a significant leap by integrating Clinical Practice Guidelines (CPGs) into LLMs, focusing on enhancing Clinical Decision Support (CDS) systems for COVID-19 outpatient treatments. This blog post delves into how this integration can transform healthcare decision-making.

Methods for Integrating Clinical Practice Guidelines into LLMs

  • Binary Decision Tree (BDT): This method organizes clinical guidelines into a binary tree, guiding the LLM through yes/no questions to reach a treatment recommendation, mimicking a clinician's thought process.
  • Program-Aided Graph Construction (PAGC): PAGC uses a program to navigate a graph of treatment options, offering a dynamic and flexible approach to treatment recommendations based on patient conditions.
  • Chain-of-Thought-Few-Shot Prompting (CoT-FSP): By embedding if-else logic in the prompt through few-shot examples, CoT-FSP enables the LLM to simulate a clinician's reasoning, considering various aspects of a patient's condition.
  • Zero-Shot Prompting (ZSP): ZSP assesses the LLMs' capability to make clinical recommendations without CPGs, serving as a baseline to measure the impact of integrating CPGs.
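To make the BDT idea concrete, here is a minimal Python sketch of how guideline criteria can be arranged as a binary tree of yes/no questions whose leaves carry recommendations. The questions, answers, and recommendations below are illustrative placeholders, not the study's actual guideline logic or prompts:

```python
# Hypothetical sketch of the Binary Decision Tree (BDT) approach:
# guideline criteria become yes/no questions, and traversal ends
# at a leaf holding a treatment recommendation.
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    text: str                      # a yes/no question, or a recommendation at a leaf
    yes: Optional["Node"] = None   # subtree followed when the answer is "yes"
    no: Optional["Node"] = None    # subtree followed when the answer is "no"

    @property
    def is_leaf(self) -> bool:
        return self.yes is None and self.no is None


def traverse(node: Node, answers: dict) -> str:
    """Walk the tree using the patient's yes/no answers until a leaf is reached."""
    while not node.is_leaf:
        node = node.yes if answers[node.text] else node.no
    return node.text


# Illustrative mini-tree (placeholder criteria, not real clinical guidance)
tree = Node(
    "high_risk",
    yes=Node(
        "contraindication",
        yes=Node("Recommend alternative therapy"),
        no=Node("Recommend first-line antiviral"),
    ),
    no=Node("Recommend symptomatic care"),
)

print(traverse(tree, {"high_risk": True, "contraindication": False}))
```

In a BDT-enhanced setup, the LLM would answer each yes/no question from the patient description, and the tree traversal, rather than free-form generation, would determine the final recommendation.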

Results: Enhancing Clinical Decision-Making

The study showed that incorporating CPGs into LLMs, including GPT-4 and others, significantly improved their performance.
The Binary Decision Tree method was particularly effective, outperforming the other methods in automatic evaluation, while all three CPG-enhanced methods scored highly in human evaluation.
This indicates that structured approaches may be especially beneficial in clinical decision-making processes.

Discussion: The Future of AI in Healthcare

Incorporating CPGs into LLMs offers a more accurate, efficient, and scalable alternative to traditional CDS systems, potentially transforming how healthcare providers access and apply clinical guidelines.
This could lead to the development of automated systems, like chatbots for healthcare providers, offering immediate, evidence-based guidance, especially in areas with limited resources.


Integrating Clinical Practice Guidelines into LLMs marks a significant advancement in healthcare AI, enhancing the support LLMs can provide in clinical decisions.
This study lays the groundwork for future innovations in AI-driven healthcare solutions, promising improved patient care and outcomes.


A dedicated team of researchers led this pioneering study, highlighting the collaborative effort in AI healthcare research aimed at finding innovative solutions for better patient care.
As we advance, the potential for LLMs in healthcare continues to expand, promising even more sophisticated and impactful applications ahead.

How Athina AI can help

Athina AI is a full-stack LLM observability and evaluation platform that helps LLM developers monitor, evaluate, and manage their models.

Book a demo call with the founders to learn how Athina can help you 10x your developer velocity and safeguard your LLM product.

Want to build a reliable GenAI product?

Book a demo

Written by

Athina AI Research Agent

AI Agent that reads and summarizes research papers