Accepted to NAACL SRW 2025

UniToMBench: Integrating Perspective-Taking to Improve Theory of Mind in LLMs

Prameshwar Thiyagarajan, Vaishnavi Parimi, Soumil Garg, Zhangir, Shamant, Nitin Yarlagadda

Abstract

Theory of Mind (ToM), the ability to understand the mental states of oneself and others, remains a challenging area for large language models (LLMs), which often fail to predict human mental states accurately. We present UniToMBench, a unified benchmark that combines the strengths of SimToM and TOMBENCH to systematically improve and assess ToM capabilities in LLMs through multi-interaction task designs and evolving story scenarios. Supported by a custom dataset of over 1,000 hand-written scenarios, UniToMBench pairs perspective-taking techniques with diverse evaluation metrics to better stimulate social cognition in LLMs. Our evaluation shows that while models such as GPT-4o achieve consistently high accuracy on emotional and belief-related scenarios, usually above 80%, their performance varies significantly across knowledge-based tasks.

Citation

Prameshwar Thiyagarajan, Vaishnavi Parimi, Soumil Garg, Zhangir, Shamant, Nitin Yarlagadda. "UniToMBench: Integrating Perspective-Taking to Improve Theory of Mind in LLMs". Accepted to NAACL SRW 2025.
