
October 27, 2025 | 6 minute read

Algoverse at NeurIPS 2025: 60 Unique Papers by 230 Students

San Diego NeurIPS 2025

Hey, Kevin from Algoverse AI Research here!

I’m beyond thrilled to share our most significant milestone to date: so far, 60 unique papers authored by 230 Algoverse students have been accepted to NeurIPS 2025 in San Diego, including several Spotlights, with more expected to follow. If you’re new to the space, Neural Information Processing Systems (NeurIPS) is widely recognized as the most prestigious conference in artificial intelligence and machine learning. Publications at NeurIPS represent groundbreaking contributions and are commonly associated with leading universities and industry labs such as Google DeepMind.

For context: we went from 7 NeurIPS papers last year to 60 this year. That growth reflects deeper mentorship, tighter processes, and higher standards over the past year.

And we’ve seen what this momentum does for students. Last year’s authors went on to Stanford, Berkeley EECS, CMU, and other top programs, and many turned their projects into research roles and AI/ML internships.

The running list of this year’s NeurIPS acceptances appears below. If you want to do serious research with a team that will push you and stand behind your work, you’re in the right place, and we encourage you to join our upcoming 2025 Winter Cohort on November 2.

Accepted Papers

AI for Health

  • [Spotlight] Examining the Vulnerability of Multi-Agent Medical Systems to Human Interventions for Clinical Reasoning – Dillon Mehta, Rishi Malhotra, Adam Zobian, Yong Ying Tan, Samir Chopra, Daniella Rand, Natalie Pang, Abhiram Gudimella, Raghav Thallapragada, Derek Jiu, Prisha Shah
  • Multi-Turn LLM Systems for Diagnostic Decision-Making: Considerations, Biases, and Challenges – Sejong Kim, Drona Thoka, Varun Puttagunta, Kaylin Sheng, Mark Li, Adnan Ahmed, Thi Uyen Hanh Le, Sai Chidvilas Gudiboina, Ali Ugur

ARLET

  • Idea: Fairness Constraints as Reliability Guarantees for RLHF Reward Models – Advay Samnerkar, Doelle Bhattacharya

BioSafe GenAI

  • Prompting Toxicity: Analyzing Biosafety Risks in Genomic Language Models – Akshay Murthy, Mengmeng Zhang, Shanmukhi Kannamangalam
  • Weight- and Activation-Based Steering Methods Provide Complementary Control of Biochemical Traits in Protein Language Models – Armaity Katki, Nathan Choi, Son Sophak Otra

CCFM

  • When Less is More: 8-bit Quantization Improves Continual Learning in Large Language Models – Michael Shihong Zhang, Rishi Adi Ruia, Arnav Kewalram, Saathvik Dharmapuram

Cognitive Interpretability

  • A Few Bad Neurons: Isolating and Surgically Correcting Sycophancy – Claire O’Brien, Jessica Seto, Dristi Roy, Aditya Dwivedi
  • DecepBench: Benchmarking Multimodal Deception Detection – Vittesh Maganti, Nysa Lalye, Ethan Braverman

DL4C

  • DuoLens: A Framework for Robust Detection of Machine-Generated Multilingual Text and Code – Shriyansh Agrawal, Aidan Lau, Sanyam Shah

Entity Reasoning

  • Chopping Trees: Semantic Similarity Based Dynamic Pruning for Tree-of-Thought Reasoning – Xirui Huang
  • Active Inference Control: Steering, Not Just Scaling, Language Model Reasoning – Josh Karthikeyan, Kai Fu, Derek Jiu
  • SwiftSolve: A Self-Iterative, Complexity-Aware Multi-Agent Framework for Competitive Programming – Adhyayan Veer Singh, Aaron Shen, Brian Law, Ahmed Ismail
  • Extending AutoCompressors via Surprisal-Based Dynamic Segmentation – Srivishnu Ramamurthi, Richard Xu, Raine Ma, Dawson Park, David Guo
  • Inference-Time Chain-of-Thought Pruning with Latent Informativeness Signals – Sophie Li, Nicholas Huang, Nina Luo, Vincent Lin
  • LoRA-Guided PPO for Cost-Aware and Compute-Efficient Agent Orchestration – Aneesh Durai, Joshua Cong Hu, Kevaan Buch
  • Confidence-Coverage Gating for Early Exit – Aaroosh Rustagi, Hsien Xin Peng, Khushal Murthy, Attrey Koul

Fairness and Reliability in Language Models

  • FRIT: Using Causal Importance to Improve Chain-of-Thought Faithfulness – Anand Swaroop, Akshat Nallani, Saksham Uboweja, Adiliia Uzdenova, Michael Nguyen
  • Peek-a-Boo Reasoning: Contrastive Region Masking in MLLMs – Anjana Nair, Yushen Li, Adhitya Rajendra Kumar
  • Scratchpad Thinking: Alternation Between Storage and Computation in Latent Reasoning Models – Sayam Goyal, Brad Peters, María Emilia Granda, Akshath Vijayakumar Narmadha, Dharunish Yugeswardeenoo
  • Limits of Emergent Reasoning of Large Language Models in Agentic Frameworks for Deterministic Games – Chris Su, Harrison Li, Matheus Marques

FM4LS (Foundations and Mechanistic Interpretability for Life Sciences)

  • Mechanistic Interpretability of Semantic Abstraction in Biomedical Texts – Nikhil Gourisetty, Snata Mohanty, Vishnu Srinivas, Soumil Jain

Generalization and Process Control in Generation

  • Adaptive Originality Filtering: Rejection-Based Prompting and RiddleScore for Culturally Grounded Multilingual Riddle Generation – Duy Le, Kent Ziti, Evan Girard-Sun
  • Emotional Framing as a Control Channel: Effects of Prompt Valence on LLM Performance – Enmanuel Felix-Pena, Tiki Li, Wayne Chen, Ethan Hin

Imageomics

  • Novel Finetuning Strategies for Adapting Biomedical Vision Language Models to Organ-Centered Pathology Microscopy Tasks – Siddharth Venkatesh, Ayman Sheikh, Anne Essien Essien, Pratibh, Rayhan Roswendi, Jeremiah Zhang

Language Agents Workshop

  • AgentChangeBench: A Multi-Dimensional Evaluation Framework for Goal-Shift Robustness in Conversational AI – Manik Rana, Calissa Man, Jeffrey Paine, Anotida Expected Msiiwa

LLM Evaluation

  • Adversarial Behavior in Research Settings: Conducting Control Evaluations with RE-Bench – Harini Rajakumar, Vanessa Nwauwa
  • GASLIGHTBENCH: Quantifying LLM Susceptibility to Social Prompting – Lening Nick Cui, Sahil Ghosh, Gareth Lee, Xuanzhe Yao, William H. Logian, Michael Li, Ellie Podoshev
  • Extending AutoCompressors via Surprisal-Based Dynamic Segmentation – Srivishnu Ramamurthi, Richard Xu, Raine Ma, Dawson Park, David Guo
  • GUARD: Guiding Unbiased Alignment through Reward Debiasing – Advay Samnerkar, Doelle Bhattacharya
  • MISCHIEF: A Benchmark in Minimal-Pairs of Safety and Culture for Holistic Evaluation of Fine-Grained Image-Caption Alignment – Sagarika Banerjee, Tangatar Madi, Advait Swaminathan, Nguyen Dao Manh Anh
  • Predicting Emergent Software Engineering Capabilities by Fine-tuning – Terry Huang, Henry Velasquez
  • When Less is More: 8-bit Quantization Improves Continual Learning in Large Language Models – Michael Shihong Zhang, Rishi Adi Ruia, Arnav Kewalram, Saathvik Dharmapuram

LockLLM

  • AutoAdv: Automated Adversarial Prompting for Multi-Turn Jailbreaking of Large Language Models – Aashray Reddy, Andrew Zagula, Nicholas Saban
  • LSMAS (LLM Security Modeling via Activation Steering) – Anthony Kuang, Ahmed Ismail
  • User Confidence-Fueled Stereotypes: Investigating Sycophantic Amplification of Implicit Bias in Language Models – Hannah You, Daniel Wang, Victor Chan, Mirabel Wang

Mechanistic Interpretability

  • [Spotlight] Shared Parameter Subspaces and Cross-Task Linearity in Emergently Misaligned Behaviour – Daniel Aarao Reis Arturi, Eric Zhang, Andrew Adrian Ansah
  • [Spotlight] Scratchpad Thinking: Alternation Between Storage and Computation in Latent Reasoning Models – Brad Peters, Sayam Goyal, María Emilia Granda, Akshath Vijayakumar Narmadha, Dharunish Yugeswardeenoo
  • Death by a Thousand Directions: Exploring the Geometry of Harmfulness in LLMs through Subconcept Probing – Saleena Angeline Sartawita, McNair Shah, Adhitya Rajendra Kumar, Naitik Chheda
  • Discovering Transformer Circuits via a Hybrid Attribution and Pruning Framework – Hao Gu, Vibhas Nair, Amrithaa Ashok Kumar
  • Emergent World Beliefs: Exploring Transformers in Stochastic Games – Michael Ma, Adam Kamel, Tanish Rastogi
  • Mitigating Sycophancy in Language Models via Sparse Activation Fusion and Multi-Layer Activation Steering – Pyae Phoo Min, Avigya Paudel, Naufal Adityo, Arthur Zhu, Andrew Rufail
  • Universal Neurons in GPT-2: Emergence, Persistence, and Functional Impact – Cheng-Ting Chou, Amrit Kurakula
  • What Do Refusal Tokens Learn? Fine-Grained Representations and Evidence for Downstream Steering – Rishab Alagharu, Ishneet Sukhvinder Singh, Anjali Batta, Jaelyn S. Liang, Shaibi Shamsudeen, Arnav Sheth

Mathematics of AI

  • Amortized Latent Steering: Low-Cost Alternative to Test-Time Optimization – Nathan Egbuna, Saatvik Gaur

Multi-Turn and Interactive Language Model Systems

  • [Spotlight] WOLF: Werewolf-based Observations for LLM Deception and Falsehoods – Mrinal Agarwal, Saad Rana, Theo Sundoro, Hermela Berhe
  • SENTINEL: Sentiment Evolution and Narrative Tracking in Extended LLM Interactions – Pranav Anuraag, Ethan Xu, Asher Nerenberg, Alexander Arutchev
  • Modeling and Predicting Multi-Turn Answer Instability in Large Language Models – Jiahang He, Rishi Ramachandran, Neel Ramachandran, Aryan Katakam
  • SMAGDi: Socratic Multi Agent Interaction Graph Distillation for Efficient High Accuracy Reasoning – Aayush Aluru, Myra N. Malik, Samarth Patankar

Regulation and Responsible ML

  • StealthEval: A Probe-Rewrite-Evaluate Workflow for Reliable Benchmarks – Lang Xiong, Nishant Bhargava, Jeremy Chang, Jianhang Hong

Reliable ML

  • A Few Bad Neurons: Isolating and Surgically Correcting Sycophancy – Claire O’Brien, Jessica Seto, Dristi Roy, Aditya Dwivedi
  • ERGO: Entropy-guided Resetting for Generation Optimization in Multi-turn Language Models – Haziq Mohammad Khalid, Athikash Jeyaganthan, Timothy Do
  • StealthEval: A Probe-Rewrite-Evaluate Workflow for Reliable Benchmarks – Lang Xiong, Nishant Bhargava, Jeremy Chang, Jianhang Hong
  • Automated Generation of Multilingual Jailbreak Prompts – Jonathan Ding, Khanak Jain, Dhruv Nair
  • GUARD: Guiding Unbiased Alignment through Reward Debiasing – Advay Samnerkar, Doelle Bhattacharya
  • Cross-Lingual Multimodal Retrieval-Augmented Generation for Open Question Answering in Tamil and Yoruba – Mobareji Abejide, Arya Ram

Safe and Explainable Agents

  • [Spotlight] Examining the Vulnerability of Multi-Agent Medical Systems to Human Interventions for Clinical Reasoning – Dillon Mehta, Rishi Malhotra, Adam Zobian, Yong Ying Tan, Samir Chopra, Daniella Rand, Natalie Pang, Abhiram Gudimella, Raghav Thallapragada, Derek Jiu
  • Automated Specialization of Stateful Agent Systems – Myan Vu, Harrish Ayyanar, Pang Jiang, Anwiketh Reddy

Spatial Vision and Language Environments

  • Grounding Foundational Vision Models with 3D Human Poses for Robust Action Recognition – Nicholas Babey, Tiffany Gu, Yiheng Li
  • COREVQA: Spatial Reasoning and Multi-Step Visual Entailment in Crowded Environments – Kazuma Choji, Ishant Yunay Chintapatla, Naaisha Agarwal, Andrew Lwin

Speech, Image, and Multimodal Generation Models

  • Cross-Lingual Multimodal Retrieval-Augmented Generation for Open Question Answering in Tamil and Yoruba – Mobareji Abejide, Arya Ram

Universal Representations

  • [Oral] Shared Parameter Subspaces and Cross-Task Linearity in Emergently Misaligned Behaviour – Daniel Aarao Reis Arturi, Eric Zhang, Andrew Adrian Ansah
  • Task Matrices: Linear Maps for Cross-model Finetuning Transfer – Darrin O’Brien, Dhikshith Gajulapalli, Alexander Ramsey, Rishi Nalem

Urban AI Systems

  • Enhancing Rural Autonomous Driving Performance with Diffusion-Augmented Synthetic Datasets – Siddharth Arun, Trisha Panchangmath, Saanvi Celamkoti, Vayden Wong

Seeing Algoverse students present alongside graduate researchers and industry leaders at NeurIPS 2025 underscores what’s possible when talented students receive direct mentorship and the resources to pursue serious research. With 60 unique papers already accepted and more decisions still to come, this year stands as a testament to the potential of young researchers pushing the boundaries of modern AI.