Causal Language Control in Multilingual Transformers via Sparse Feature Steering

December 1, 2025

Accepted to ACL SRW 2025

Authors: Tim Chou, George Liu

Deterministically controlling the target generation language of large multilingual language models (LLMs) remains a fundamental challenge, particularly in zero-shot settings where neither explicit language prompts nor fine-tuning is available. We investigate whether sparse autoencoder (SAE) features can be leveraged to steer the generated language of LLMs during inference. Using pretrained SAEs on the residual streams of Gemma-2B and Gemma-9B, we identify features whose activations differ most significantly between English and four target languages: Chinese, Japanese, Spanish, and French. By modifying just a single SAE feature at one transformer layer, we achieve controlled language shifts with up to 90% success, as measured by FastText language classification, while preserving semantic fidelity according to LaBSE similarity.
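
The two steps named in the abstract, picking a language-discriminating SAE feature and modifying it at a single layer, can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the selection criterion (largest mean activation difference between target-language and English inputs), the steering rule (adding a scaled copy of the feature's decoder direction to a residual vector), the function names `select_feature` and `steer`, the scale `alpha`, and the tiny hand-written arrays standing in for real SAE activations and decoder weights.

```python
from statistics import mean


def select_feature(acts_en, acts_tgt):
    """Return the index of the feature whose mean activation rises most
    from English inputs to target-language inputs.

    Each argument is a list of rows, one row of feature activations per input.
    Mean difference is an assumed criterion, not necessarily the paper's.
    """
    n_features = len(acts_en[0])
    diffs = [
        mean(row[j] for row in acts_tgt) - mean(row[j] for row in acts_en)
        for j in range(n_features)
    ]
    return max(range(n_features), key=lambda j: diffs[j])


def steer(residual, direction, alpha):
    """Add alpha times one feature's decoder direction to a residual vector.

    This additive rule is an assumed stand-in for "modifying a single feature".
    """
    return [r + alpha * d for r, d in zip(residual, direction)]


# Hypothetical toy data: 2 inputs per language, 3 SAE features, 2-dim residual.
acts_en = [[0.0, 1.0, 0.2], [0.1, 0.8, 0.0]]
acts_tgt = [[0.0, 0.9, 2.0], [0.2, 1.0, 3.0]]
decoder = [[1.0, 0.0], [0.0, 1.0], [0.5, -0.5]]  # one direction per feature

feature = select_feature(acts_en, acts_tgt)
steered = steer([1.0, 1.0], decoder[feature], alpha=4.0)
print(feature, steered)
```

In a real setting the activations would come from a pretrained SAE applied to the model's residual stream, and the steered vector would be written back at the chosen layer during generation; neither step is shown here.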
