Mitigating Sycophancy in Language Models via Sparse Activation Fusion and Multi-Layer Activation Steering

December 1, 2025

Abstract coming soon. This paper has been accepted but the arXiv preprint is not yet available.

Accepted to Mech Interp @ NeurIPS 2025

Authors: Pyae Phoo Min, Avigya Paudel, Naufal Adityo, Arthur Zhu, Andrew Rufail
