How a $5B Insurance Provider Uses LLMs for Risk Underwriting

In the competitive landscape of the insurance industry, speed, convenience, and consistent control over risk selection and decision-making are critical.
 
Insurers are increasingly investing in AI and digital risk processing technologies to streamline their workflows, enhance underwriting productivity, and improve broker responsiveness.
 
Risk processing is typically a manual, time-consuming, and siloed operation. Here’s how one such provider’s risk assessment team uses LLMs to transform its underwriting process.
 

Problem

 
The insurance provider works with many brokers to source leads. Each broker provides risk submissions in a different format (PDFs, spreadsheets, emails, broker APIs) and with a varying schema.
 

Here are the challenges that they face:

 
  • Converting all the submissions to a machine-readable structured format.
  • Various brokers often ask the end client the same question, but phrased differently. For example:
    •  
      Broker 1:
      Q: List the chronic medical conditions you have had in the last 5 years.
      A:
       
      Broker 2:
      Q: Have you been diagnosed with any of these chronic diseases?
      A: Diabetes or mental health issues
      B: Cardiovascular and Pulmonary Disease
      C: Cancer
      D: None of the above
       
 
    Extracting and reconciling the information in these submissions is difficult and time-consuming.
  • Evaluating these submissions as per their internal risk guidelines and taxonomy requires manual effort.
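As an illustration of the normalization challenge above, here is a minimal sketch that maps differently-phrased broker answers onto one canonical field. All schema names, broker labels, and option mappings are hypothetical; a production pipeline would use an LLM for the free-text cases rather than string splitting.

```python
# Hypothetical sketch: normalize broker-specific answers into one
# canonical schema field, "chronic_conditions" (a list of strings).
# Broker names, option codes, and mappings are illustrative only.

# Broker 2 asks a multiple-choice question; map its codes to condition names.
BROKER2_OPTIONS = {
    "A": ["diabetes", "mental health issues"],
    "B": ["cardiovascular disease", "pulmonary disease"],
    "C": ["cancer"],
    "D": [],  # "None of the above"
}

def normalize_submission(broker: str, raw_answer: str) -> dict:
    """Convert a broker-specific answer into the canonical schema."""
    if broker == "broker_1":
        # Broker 1 asks for a free-text list of conditions.
        conditions = [c.strip().lower() for c in raw_answer.split(",") if c.strip()]
    elif broker == "broker_2":
        # Broker 2's answer is a comma-separated list of option codes, e.g. "A, C".
        conditions = []
        for code in raw_answer.replace(" ", "").split(","):
            conditions.extend(BROKER2_OPTIONS.get(code.upper(), []))
    else:
        raise ValueError(f"Unknown broker format: {broker}")
    return {"chronic_conditions": conditions}

print(normalize_submission("broker_1", "Diabetes, Cancer"))
print(normalize_submission("broker_2", "A, C"))
```

Once every submission lands in the same schema, downstream evaluation can ignore the original broker format entirely.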
 

Solution

 
The team decided to leverage LLMs to overcome these challenges.
 
For this, they needed a model and prompt that could evaluate each submission against their internal risk assessment questionnaire.
 
The team chose Athina IDE to experiment with their datasets and quickly identify the best model and prompt for the task.
 

Experiment: Evaluation of Submissions

 
To evaluate a submission, each question in their Risk assessment criteria had to be answered using the information available in the Risk Submission form.
 
Since the information in a submission is only semantically similar to the risk assessment criteria (rather than an exact match), they used an LLM to answer every question from the risk submission form.
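The answer-generation loop can be sketched as follows. Here `call_llm` is a placeholder for any chat-completion client (GPT-4o, Llama 3.1, Claude 3.5 Sonnet), and the prompt wording is illustrative, not the team's actual prompt.

```python
# Minimal sketch of answering each risk-assessment question from the
# submission text. `call_llm` stands in for a real model client.

PROMPT_TEMPLATE = """You are an underwriting assistant.
Using ONLY the risk submission below, answer the question.
If the submission does not contain the answer, reply "Not stated".

Risk submission:
{submission}

Question: {question}
Answer:"""

def answer_questions(submission: str, questions: list[str], call_llm) -> dict:
    """Answer each risk-assessment question from the submission text."""
    return {
        q: call_llm(PROMPT_TEMPLATE.format(submission=submission, question=q))
        for q in questions
    }

# Usage with a stubbed model (a real client would call an API here):
fake_llm = lambda prompt: "Not stated"
answers = answer_questions(
    "Applicant: 45, non-smoker.", ["Any chronic conditions?"], fake_llm
)
print(answers)
```

Passing the client in as a parameter is what makes it easy to run the same questions through several models and compare the resulting datasets.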
 
They created three datasets for each risk submission form, one per model: GPT-4o, Llama 3.1, and Claude 3.5 Sonnet.
 

Dataset Setup & Response Generation

 
[Image: an indicative dataset in Athina IDE]
 
  • RiskAssessmentQuestions — list of questions from their risk assessment criteria
  • GroundTruthAsPerdocument — human-labelled ground truth
  • Model Responses (dynamic columns) — responses generated from GPT-4o and 3 different versions of fine-tuned models
 
  • The model generated a response for each question by traversing the information in the risk submission form.
  • Dynamic columns in Athina IDE allowed them to use complex prompting techniques, such as chain-of-thought (CoT) reasoning, and to reference data from existing columns, much like a spreadsheet.
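The column-referencing idea can be sketched with a small template renderer. The `{{ColumnName}}` syntax, the column names, and the CoT phrasing below are assumptions for illustration, not Athina's actual template syntax.

```python
import re

# Sketch: a "dynamic column" prompt that references other columns in the
# same row, like spreadsheet cells. Syntax and names are hypothetical.

def render_prompt(template: str, row: dict) -> str:
    """Replace {{ColumnName}} placeholders with values from the row."""
    return re.sub(r"\{\{(\w+)\}\}", lambda m: str(row[m.group(1)]), template)

template = (
    "Question: {{RiskAssessmentQuestions}}\n"
    "Submission: {{SubmissionText}}\n"
    "Think step by step, then give a final one-line answer."  # CoT instruction
)

row = {
    "RiskAssessmentQuestions": "Does the applicant have any chronic conditions?",
    "SubmissionText": "Applicant reports Type 2 diabetes diagnosed in 2021.",
}
print(render_prompt(template, row))
```

Each row of the dataset then yields its own fully-resolved prompt, which is what lets the same experiment run over every submission at once.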
 
[Image: an indicative prompt]

Response Evaluation

 
Athina IDE ships with 50+ preset evaluators for immediate use and also offers the flexibility to write custom evals.
 
Here’s an example of how they used one of Athina’s evaluators to compare the model’s answers with the ground-truth labels.
 
[Image: evaluator comparing model answers to ground-truth labels]
 
This helped them benchmark the performance of their fine-tuned models against GPT-4o. For example, their V2 model performed comparably to GPT-4o.
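To make the benchmarking idea concrete, here is a deliberately simple stand-in for a ground-truth comparison evaluator: a normalized token-overlap score between the model's answer and the human label. Athina's preset evaluators are more sophisticated (e.g. LLM-graded comparisons); this only illustrates the shape of the computation.

```python
# Illustrative sketch of a ground-truth comparison metric, NOT the
# actual evaluator used in Athina IDE.

def token_overlap(model_answer: str, ground_truth: str) -> float:
    """Jaccard overlap of lowercased word sets, in [0, 1]."""
    a = set(model_answer.lower().split())
    b = set(ground_truth.lower().split())
    return len(a & b) / len(a | b) if a | b else 1.0

# Hypothetical rows: (model answer, ground-truth label)
rows = [
    ("Type 2 diabetes", "type 2 diabetes"),      # matches after normalization
    ("No chronic conditions", "none reported"),  # no token overlap
]
scores = [token_overlap(m, g) for m, g in rows]
print(scores)
```

Averaging such a score per model column is one way to turn per-row comparisons into the kind of model-vs-model benchmark described above.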

Conclusion

 
Athina helped them fast-track development of a model that can evaluate incoming risk submissions. Here are some of the ways Athina IDE was useful:
 
  • Rapid Experimentation: The team was able to experiment with their pipelines quickly and figure out the best model and prompt.
  • Collaboration: Technical and non-technical members of the team were able to collaborate on the experiments independently.
  • Visibility: The team had visibility into the impact of parameter changes across the entire pipeline.
 
 
💡
Athina can help you 10x your AI development →
If you’re working on LLM applications that require experimentation, prototyping, and evaluation of your AI pipelines, check out Athina AI or reach out to us at himanshu@athina.ai

Athina can help. Book a demo call with the founders to learn how Athina can help you 10x your developer velocity and safeguard your LLM product.

Want to build a reliable GenAI product?

Book a demo