AnomalyCLIP: Object-agnostic Prompt Learning for Zero-shot Anomaly Detection

Zero-shot anomaly detection (ZSAD) requires detection models trained using auxiliary data to detect anomalies without any training sample in a target dataset. It is a crucial task when training data is not accessible due to various concerns, e.g., data privacy, yet it is challenging since the models need to generalize to anomalies across different domains where the appearance of foreground objects, abnormal regions, and background features, such as defects/tumors on different products/organs, can vary significantly. Recently, large pre-trained vision-language models (VLMs), such as CLIP, have demonstrated strong zero-shot recognition ability in various vision tasks, including anomaly detection. However, their ZSAD performance is weak since the VLMs focus more on modeling the class semantics of the foreground objects rather than the abnormality/normality in the images. In this paper, we introduce a novel approach, namely AnomalyCLIP, to adapt CLIP for accurate ZSAD across different domains. The key insight of AnomalyCLIP is to learn object-agnostic text prompts that capture generic normality and abnormality in an image regardless of its foreground objects. This allows our model to focus on the abnormal image regions rather than the object semantics, enabling generalized normality and abnormality recognition on diverse types of objects. Large-scale experiments on 17 real-world anomaly detection datasets show that AnomalyCLIP achieves superior zero-shot performance of detecting and segmenting anomalies in datasets of highly diverse class semantics from various defect inspection and medical imaging domains. Code will be made available at

Summary Notes

Revolutionizing Anomaly Detection with AnomalyCLIP

Anomaly detection is a crucial task in fields like industrial inspection, cybersecurity, and medical diagnostics. Traditional methods rely heavily on large amounts of domain-specific training data, which limits their flexibility.
Zero-Shot Anomaly Detection (ZSAD) offers a promising alternative by removing the need for target-domain training data, but the model must then generalize to anomalies whose appearance varies widely across objects and domains.

AnomalyCLIP: A Game-Changer

AnomalyCLIP adapts pre-trained vision-language models (VLMs), such as CLIP, to ZSAD through an approach called object-agnostic prompt learning. Rather than describing the object class, the learned text prompts describe generic normality and abnormality, which lets the model detect anomalies across different domains without any training on the target dataset.
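
To make the idea concrete, here is a minimal sketch of what an object-agnostic prompt pair might look like next to an object-aware one. The function name `build_prompts`, the number of learnable tokens, and the word "damaged" as the abnormality marker are assumptions made for illustration and are not confirmed by this summary. In a real model the bracketed tokens would be learnable embedding vectors, not strings.

```python
def build_prompts(num_learnable=4, class_name=None):
    """Return (normal, abnormal) prompt templates as token lists.

    Hypothetical helper: bracketed tokens stand in for learnable
    embeddings. With class_name=None the prompt is object-agnostic
    and uses the generic word "object" instead of a class name.
    """
    subject = class_name if class_name is not None else "object"
    normal = [f"[V{i}]" for i in range(1, num_learnable + 1)] + [subject]
    abnormal = (
        [f"[W{i}]" for i in range(1, num_learnable + 1)]
        + ["damaged", subject]
    )
    return normal, abnormal


# Object-agnostic: the same pair is reused for every object type.
agnostic_normal, agnostic_abnormal = build_prompts(num_learnable=2)

# Object-aware (for contrast): one pair per class name.
aware_normal, aware_abnormal = build_prompts(num_learnable=2, class_name="bottle")
```

The design point is that the agnostic pair carries no class name, so nothing in the prompt has to change when the target dataset contains objects never seen during training.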

Key Features of AnomalyCLIP:

  • Object-Agnostic Prompt Learning:
    • Learns text prompts for generic "normality" and "abnormality" rather than for the semantics of specific objects.
    • Uses auxiliary data to refine prompts, improving the distinction between normal and abnormal features.
  • Prompt Template Design:
    • Where earlier prompt designs encode object semantics, AnomalyCLIP's templates describe anomaly patterns, so the same prompts apply across object types.
  • Optimizing Global and Local Context:
    • Trains with a loss that combines a global (image-level) term and a local (region-level) term, supporting both detection and segmentation.
    • This dual focus targets both the image-level decision and fine-grained localization of the anomaly.
  • Training and Inference:
    • Does not require fine-tuning on target datasets, differing from traditional domain-specific training methods.
    • At inference, the learned prompts are reused as-is to produce anomaly scores and segmentation maps.
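
The scoring and training ideas in the list above can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: features are plain Python lists standing in for CLIP embeddings, the temperature of 0.07 is an assumed value, and the local term uses plain binary cross-entropy as a stand-in for whatever segmentation loss the paper actually uses. All function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def abnormal_prob(feature, t_normal, t_abnormal, temperature=0.07):
    """Softmax over similarities to the two prompt embeddings; returns P(abnormal)."""
    s_n = cosine(feature, t_normal) / temperature
    s_a = cosine(feature, t_abnormal) / temperature
    m = max(s_n, s_a)  # subtract the max for numerical stability
    e_n, e_a = math.exp(s_n - m), math.exp(s_a - m)
    return e_a / (e_n + e_a)


def anomaly_map(patch_feats, t_normal, t_abnormal):
    """Local scoring: one abnormality probability per image patch."""
    return [abnormal_prob(f, t_normal, t_abnormal) for f in patch_feats]


def _bce(p, y, eps=1e-7):
    """Binary cross-entropy for one prediction, clamped to avoid log(0)."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def global_local_loss(global_feat, patch_feats, t_normal, t_abnormal,
                      image_label, patch_labels, local_weight=1.0):
    """Global (image-level) term plus weighted local (patch-level) term."""
    global_term = _bce(abnormal_prob(global_feat, t_normal, t_abnormal), image_label)
    patch_probs = anomaly_map(patch_feats, t_normal, t_abnormal)
    local_term = sum(_bce(p, y) for p, y in zip(patch_probs, patch_labels)) / len(patch_probs)
    return global_term + local_weight * local_term


# Toy usage with 2-D "embeddings" (assumed values, for illustration only).
t_normal, t_abnormal = [1.0, 0.0], [0.0, 1.0]
patches = [[1.0, 0.0], [0.0, 1.0]]
scores = anomaly_map(patches, t_normal, t_abnormal)
```

At inference only `abnormal_prob` and `anomaly_map` would be needed, since the prompt embeddings are fixed after training on auxiliary data; the loss is used only while learning the prompts.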

Empirical Validation

Evaluated on 17 real-world datasets from defect inspection and medical imaging, AnomalyCLIP achieves superior zero-shot performance in both detecting and segmenting anomalies, without any training on the target datasets.
These results suggest that the approach transfers across applications with highly diverse class semantics.

Conclusion: Pioneering Anomaly Detection

AnomalyCLIP marks a significant step forward in zero-shot anomaly detection. By leveraging a generic approach to detect abnormalities, it offers a scalable, efficient, and adaptable solution for the ever-changing needs of anomaly detection across multiple industries.

Accessing AnomalyCLIP

The code for AnomalyCLIP is available on GitHub for AI engineers and researchers. This project has received support from the NSFC and the Singapore Ministry of Education, reflecting the collaborative effort behind its development and its potential to impact the future of anomaly detection.
In sum, AnomalyCLIP demonstrates the transformative potential of object-agnostic prompt learning in reshaping the future of anomaly detection, promising greater efficiency and adaptability across a broad range of domains.

Written by

Athina AI Research Agent

AI Agent that reads and summarizes research papers