
Accepted to Interplay @ COLM 2025

Universal Neurons in GPT-2: Emergence, Persistence, and Functional Impact

Advey Nandan, Tim Chou, Amrit Lalith

Abstract

We investigate neuron universality in independently trained GPT-2 Small models, examining how universal neurons—neurons whose activations are consistently correlated across models—emerge and evolve during training. Analyzing five GPT-2 models at three checkpoints (100k, 200k, and 300k steps), we identify universal neurons through pairwise correlation analysis of activations over a dataset of 5 million tokens. Universal neurons emerge early and increase steadily through training, most notably in deeper layers, and they remain highly stable over time, especially in later layers. Ablating universal neurons significantly increases loss and KL divergence, confirming their causal importance to model predictions; layer-wise ablation further reveals that ablating universal neurons in the first layer causes a disproportionately large increase in both metrics.
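The cross-model correlation step can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the use of the maximum absolute Pearson correlation as the matching score, and the 0.5 threshold are all assumptions for the sake of the example.

```python
import numpy as np

def max_pairwise_correlation(acts_a, acts_b):
    """For each neuron in model A, find its best-matching neuron in model B
    by Pearson correlation of activations over a shared token set.

    acts_a: (n_tokens, n_neurons_a) activation matrix from model A
    acts_b: (n_tokens, n_neurons_b) activation matrix from model B
    Returns an (n_neurons_a,) array of max absolute correlations.
    """
    # Standardize each neuron's activations (zero mean, unit variance).
    a = (acts_a - acts_a.mean(0)) / (acts_a.std(0) + 1e-8)
    b = (acts_b - acts_b.mean(0)) / (acts_b.std(0) + 1e-8)
    # Full correlation matrix, shape (n_neurons_a, n_neurons_b).
    corr = (a.T @ b) / acts_a.shape[0]
    return np.abs(corr).max(axis=1)

def universal_neurons(corrs, threshold=0.5):
    """Flag neurons whose best cross-model correlation exceeds a threshold."""
    return np.where(corrs > threshold)[0]

# Toy demo: two "models" that share one underlying feature direction.
rng = np.random.default_rng(0)
shared = rng.normal(size=1000)
acts_a = np.column_stack([shared + 0.1 * rng.normal(size=1000),
                          rng.normal(size=1000)])
acts_b = np.column_stack([rng.normal(size=1000),
                          shared + 0.1 * rng.normal(size=1000)])
corrs = max_pairwise_correlation(acts_a, acts_b)
print(universal_neurons(corrs))  # neuron 0 (the shared feature) is flagged
```

In the paper's setting this comparison is run over all pairs of the five independently trained models, with activations collected over the 5-million-token dataset rather than synthetic data.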

Citation

Advey Nandan, Tim Chou, Amrit Lalith. "Universal Neurons in GPT-2: Emergence, Persistence, and Functional Impact". Accepted to Interplay @ COLM 2025.

