
AAVENUE: Detecting LLM Biases on NLU Tasks in AAVE via a Novel Benchmark

December 1, 2024

Accepted to High School Track @ NeurIPS 2024

Authors: Abhay Gupta, Philip Meng, Ece Yurtseven

Large language models (LLMs) often perform worse on African American Vernacular English (AAVE) than on Standard American English (SAE), yet standard natural language understanding (NLU) benchmarks rarely measure this gap. We introduce AAVENUE (AAVE Natural Language Understanding Evaluation), a benchmark for evaluating LLM performance on NLU tasks in both AAVE and SAE. Rather than relying on deterministic, rule-based dialect transformations as in prior benchmarks such as VALUE, AAVENUE uses a more flexible LLM-based translation pipeline with few-shot prompting to produce AAVE versions of established NLU tasks. Evaluating popular LLMs across the benchmark, we find that they consistently score higher on the SAE versions of tasks than on their AAVE translations, revealing inherent dialectal biases. We release our benchmark and evaluation code to support the development of more inclusive NLP systems.
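The evaluation described above has two moving parts: translating an SAE task instance into AAVE via a few-shot prompt, and comparing model accuracy across the two dialect versions. A minimal sketch of both parts is below; the prompt template, the demonstration pairs, and the `dialect_gap` metric name are illustrative assumptions, not the paper's exact implementation (the translation call to an LLM is omitted).

```python
# Illustrative sketch of an AAVENUE-style evaluation loop.
# The few-shot pairs and prompt wording are hypothetical examples.

FEW_SHOT_PAIRS = [  # assumed SAE -> AAVE demonstration pairs
    ("He is going to the store.", "He finna go to the store."),
    ("They are not here yet.", "They ain't here yet."),
]


def build_translation_prompt(sae_sentence: str) -> str:
    """Assemble a few-shot prompt asking an LLM to translate SAE into AAVE."""
    lines = ["Translate the following Standard American English into AAVE."]
    for sae, aave in FEW_SHOT_PAIRS:
        lines.append(f"SAE: {sae}\nAAVE: {aave}")
    lines.append(f"SAE: {sae_sentence}\nAAVE:")
    return "\n\n".join(lines)


def accuracy(preds, golds):
    """Fraction of predictions that match the gold labels."""
    return sum(p == g for p, g in zip(preds, golds)) / len(golds)


def dialect_gap(sae_preds, aave_preds, golds):
    """Positive gap means the model does better on SAE than on AAVE."""
    return accuracy(sae_preds, golds) - accuracy(aave_preds, golds)
```

In a full pipeline, `build_translation_prompt` would be sent to the translating LLM for each task instance, and `dialect_gap` would be computed per task (e.g. per GLUE-style subtask) from the evaluated model's predictions on each dialect version.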
