Semantic Self-Consistency: Enhancing Language Model Reasoning via Semantic Weighting

December 1, 2024

Accepted to MathAI @ NeurIPS 2024

Authors: Tim Knappe, Ryan Li, Ayush Chauhan, Kaylee Chhua

We propose Semantic Self-Consistency (SSC), a novel framework for evaluating the reasoning capabilities of large language models. SSC measures whether a model produces semantically equivalent answers when presented with logically equivalent formulations of the same question. Unlike traditional consistency metrics that focus on exact string matching, SSC captures deeper semantic alignment through learned embeddings. Our experiments reveal significant inconsistencies in state-of-the-art models, with performance dropping by 15-30% on semantically rephrased questions. We release a benchmark of 10,000 question pairs for evaluating SSC.
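The SSC score rests on comparing learned embeddings of answers rather than raw strings. As a minimal illustration (not the authors' implementation, which uses learned embeddings from a trained encoder), the sketch below computes consistency as the mean pairwise cosine similarity among embedded answers to logically equivalent rephrasings of one question; the function names and toy vectors are hypothetical.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def semantic_consistency(answer_embeddings):
    """Mean pairwise cosine similarity across embeddings of a model's
    answers to semantically equivalent formulations of one question.
    Returns 1.0 for a single answer (trivially consistent)."""
    n = len(answer_embeddings)
    if n < 2:
        return 1.0
    sims = [cosine_similarity(answer_embeddings[i], answer_embeddings[j])
            for i in range(n) for j in range(i + 1, n)]
    return sum(sims) / len(sims)

# Toy example: identical answer embeddings are perfectly consistent.
e = np.array([0.2, 0.9, 0.1])
print(round(semantic_consistency([e, e, e]), 6))
```

In practice the embeddings would come from an encoder applied to each model answer, and a drop in this score across rephrasings of the same question is what signals the inconsistency the paper reports.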

Begin Your Journey

The application takes 10 minutes and is reviewed on a rolling basis. We look for strong technical signal—projects, coursework, or competition results—and a genuine curiosity to do real research.

If admitted, you will join a structured pipeline with direct mentorship, taking your work from ideation to submission at top conferences such as NeurIPS, ACL, and EMNLP.