Accepted to Building Trust in LLMs @ ICLR 2025

EnDive: A Cross-Dialect Benchmark for Fairness and Performance in Large Language Models

Abhay Gupta, Jacob Cheung, Philip Meng, Shayan Sayyed

Abstract

We present EnDive (English Dialect Variability Evaluation), a benchmark designed to assess the fairness and robustness of large language models across English dialects. EnDive spans five major English dialects—African American Vernacular English (AAVE), Indian English, British English, Australian English, and Standard American English—covering tasks including sentiment analysis, natural language inference, and question answering. We find significant performance disparities across dialects, with models consistently underperforming on AAVE and Indian English inputs.
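As a rough illustration of the kind of per-dialect comparison the abstract describes, the following is a minimal Python sketch that aggregates task accuracy by dialect and reports each dialect's gap against a Standard American English (SAE) baseline. The records, labels, and scores below are invented for illustration and are not drawn from EnDive itself.

```python
from collections import defaultdict

# Hypothetical per-example evaluation records: (dialect, correct) pairs.
# In a real evaluation these would come from scoring model outputs on
# the benchmark's tasks (sentiment analysis, NLI, question answering).
results = [
    ("SAE", True), ("SAE", True), ("SAE", False),
    ("AAVE", True), ("AAVE", False), ("AAVE", False),
    ("Indian English", True), ("Indian English", False),
]

def per_dialect_accuracy(records):
    """Aggregate correctness counts per dialect and return accuracies."""
    counts = defaultdict(lambda: [0, 0])  # dialect -> [num_correct, num_total]
    for dialect, correct in records:
        counts[dialect][0] += int(correct)
        counts[dialect][1] += 1
    return {d: c / t for d, (c, t) in counts.items()}

acc = per_dialect_accuracy(results)
baseline = acc["SAE"]  # Standard American English as the reference dialect
for dialect, a in sorted(acc.items()):
    print(f"{dialect:16s} acc={a:.2f}  gap vs SAE={baseline - a:+.2f}")
```

A disparity metric like this gap-versus-baseline is one straightforward way to surface the underperformance on AAVE and Indian English inputs that the abstract reports.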

Citation

Abhay Gupta, Jacob Cheung, Philip Meng, Shayan Sayyed. "EnDive: A Cross-Dialect Benchmark for Fairness and Performance in Large Language Models". Accepted to Building Trust in LLMs @ ICLR 2025.

