Accepted to MathAI @ NeurIPS 2024
Authors: Tim Knappe, Ryan Li, Ayush Chauhan, Kaylee Chhua
We propose Semantic Self-Consistency (SSC), a novel framework for evaluating the reasoning capabilities of large language models. SSC measures whether a model produces semantically equivalent answers when presented with logically equivalent formulations of the same question. Unlike traditional consistency metrics that rely on exact string matching, SSC captures deeper semantic alignment through learned embeddings. Our experiments reveal significant inconsistencies in state-of-the-art models, with performance dropping by 15–30% on semantically rephrased questions. We release a benchmark of 10,000 question pairs for evaluating SSC.
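The core idea, checking whether answers to equivalent prompts agree in embedding space rather than as exact strings, can be sketched as follows. This is a minimal illustration, not the paper's method: the bag-of-words embedding is a stand-in for the learned embeddings the abstract mentions, and the 0.8 similarity threshold is an assumed value for demonstration.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words embedding; a real SSC evaluation would use
    # learned sentence embeddings (this stand-in is an assumption).
    return Counter(text.lower().split())

def cosine(u, v):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(u[w] * v[w] for w in u)
    nu = math.sqrt(sum(c * c for c in u.values()))
    nv = math.sqrt(sum(c * c for c in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def semantic_consistency(answers, threshold=0.8):
    # Answers to logically equivalent questions count as consistent
    # when every pair of answer embeddings exceeds the similarity
    # threshold (the threshold value here is illustrative).
    embs = [embed(a) for a in answers]
    return all(
        cosine(embs[i], embs[j]) >= threshold
        for i in range(len(embs))
        for j in range(i + 1, len(embs))
    )
```

Under this sketch, identical or near-identical answers to rephrased questions pass, while divergent answers fail, which is the kind of drop the 15–30% figure quantifies.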

