Athina AI

Safety

Benchmarking and Defending Against Indirect Prompt Injection Attacks on Large Language Models

Assessing Prompt Injection Risks in 200+ Custom GPTs
Research Paper • May 25, 2024

Prompt Injection: Different Attacks and Defensive Techniques
Prompt Engineering • May 9, 2024

CYBERSECEVAL 2: A Wide-Ranging Cybersecurity Evaluation Suite for Large Language Models
Research Paper • Apr 18, 2024

From Noise to Clarity: Unraveling the Adversarial Suffix of Large Language Model Attacks via Translation of Text Embeddings
Research Paper • Apr 16, 2024

Ever: Mitigating Hallucination in Large Language Models through Real-Time Verification and Rectification
Hallucinations, Research Paper • Apr 15, 2024

The EVER (Real-Time Verification and Rectification) framework mitigates hallucinations during text generation by verifying the accuracy and trustworthiness of each sentence before generation proceeds.
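As a rough illustration of that verify-before-proceeding loop, here is a minimal Python sketch of an EVER-style generation routine. The generate_sentence, is_supported, and rectify callables are hypothetical placeholders for an LLM call, an evidence check, and a revision step; this is a sketch of the general idea, not the paper's implementation.

from typing import Callable

def ever_generate(
    prompt: str,
    generate_sentence: Callable[[str], str],   # hypothetical LLM call: context -> next sentence
    is_supported: Callable[[str, str], bool],  # hypothetical verifier: (context, sentence) -> ok?
    rectify: Callable[[str, str], str],        # hypothetical fixer: (context, sentence) -> revised sentence
    max_sentences: int = 10,
) -> str:
    """Generate text sentence by sentence, validating each before proceeding."""
    context = prompt
    output: list[str] = []
    for _ in range(max_sentences):
        sentence = generate_sentence(context)
        if not sentence:
            break
        # Real-time verification: check the sentence before it is
        # committed to the running output.
        if not is_supported(context, sentence):
            sentence = rectify(context, sentence)
            # If rectification still fails verification, skip the sentence
            # rather than propagate a likely hallucination.
            if not is_supported(context, sentence):
                continue
        output.append(sentence)
        context += " " + sentence  # only verified sentences extend the context
    return " ".join(output)

Because each verified sentence is appended to the context before the next one is generated, errors are caught before they can compound in later sentences.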

Prompt Stealing Attacks Against Text-to-Image Generation Models
Research Paper • Apr 15, 2024

Universal and Transferable Adversarial Attacks on Aligned Language Models
Research Paper • Apr 14, 2024

AI Safety: Necessary, but insufficient and possibly problematic
Research Paper • Apr 10, 2024

Many-Shot Jailbreaking (Anthropic Research)
Research Paper • Apr 6, 2024

Breaking Down the Defenses: A Comparative Survey of Attacks on Large Language Models
Research Paper • Apr 5, 2024

