Accepted to Sparsity in LLMs @ ICLR 2025

Efficient Transformers via MPO-Based Low-Rank Factorization and Pruning

Sam Mikhak, Venkata Sai Gummidi

Abstract

This paper explores the use of matrix product operators (MPOs) to compress transformer-based architectures. By factorizing full-rank weight matrices into tensor-train products, MPOs reduce both memory footprint and computational cost, which is critical for deployment on resource-constrained devices. Experiments on speaker identification using the LibriSpeech train-clean-360 subset show that MPO-based models, and even their pruned variants, maintain high performance with far fewer parameters than full-rank transformers.
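To make the factorization concrete, the sketch below shows a minimal two-core MPO (tensor-train) decomposition of a dense weight matrix built from a single truncated SVD, plus a routine that applies the compressed layer without rebuilding the full matrix. The matrix size, index splits, and bond rank are illustrative assumptions, not values from the paper; practical MPO compression typically uses more cores and tunes the rank per layer.

```python
# Minimal sketch: two-core MPO (tensor-train) factorization of a weight
# matrix via truncated SVD. All shapes and the rank are illustrative
# assumptions, not settings reported in the paper.
import numpy as np

def mpo_factorize_2core(W, in_dims, out_dims, rank):
    """Split W of shape (prod(in_dims), prod(out_dims)) into two MPO cores."""
    i1, i2 = in_dims
    j1, j2 = out_dims
    # View W as a 4-way tensor and group the (i1, j1) indices against (i2, j2).
    T = W.reshape(i1, i2, j1, j2).transpose(0, 2, 1, 3)   # (i1, j1, i2, j2)
    M = T.reshape(i1 * j1, i2 * j2)
    # A truncated SVD supplies the low-rank "bond" linking the two cores.
    U, S, Vt = np.linalg.svd(M, full_matrices=False)
    r = min(rank, len(S))
    core1 = (U[:, :r] * S[:r]).reshape(i1, j1, r)         # (i1, j1, r)
    core2 = Vt[:r].reshape(r, i2, j2)                     # (r, i2, j2)
    return core1, core2

def mpo_apply(core1, core2, x):
    """Compute y ~= x @ W by contracting x against the cores directly."""
    i1, j1, r = core1.shape
    _, i2, j2 = core2.shape
    X = x.reshape(i1, i2)
    # Contract both input indices; the output indices (j1, j2) remain.
    y = np.einsum('ab,acr,rbd->cd', X, core1, core2)
    return y.reshape(j1 * j2)

# Example: compress a 512x512 weight into two cores with bond rank 64.
rng = np.random.default_rng(0)
W = rng.standard_normal((512, 512))
c1, c2 = mpo_factorize_2core(W, (16, 32), (16, 32), rank=64)
print(c1.size + c2.size, "params vs", W.size)   # 81920 vs 262144

x = rng.standard_normal(512)
y = mpo_apply(c1, c2, x)                        # y.shape == (512,)
```

The savings come from the parameter count: the two cores store i1*j1*r + r*i2*j2 values instead of the full i1*i2*j1*j2, and shrinking (or pruning) the bond rank r reduces both memory and the cost of the contraction.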

Citation

Sam Mikhak, Venkata Sai Gummidi. "Efficient Transformers via MPO-Based Low-Rank Factorization and Pruning". Accepted to Sparsity in LLMs @ ICLR 2025.
