QIANets for Reduced Latency and Improved Inference Times in CNN Models

December 1, 2024

Accepted to Compression @ NeurIPS 2024

Authors: Zhumazhan Balapanov, Edward Magongo, Vanessa Matvei, Olivia Holmberg

We introduce QIANets (Quantum-Inspired Attention Networks), a novel architecture that leverages quantum-inspired computational principles to achieve efficient attention computation. By reformulating the attention mechanism using tensor network decompositions inspired by quantum many-body physics, we achieve sub-quadratic complexity in sequence length while maintaining model expressiveness. Our approach demonstrates significant speedups on long-context tasks, with experiments showing 3-5x inference acceleration compared to standard transformers on sequences of 8K+ tokens.
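The post does not include code, and the exact tensor-network decomposition is not specified here, but the sub-quadratic claim can be illustrated with a generic kernelized (low-rank) attention sketch: replacing the softmax with a positive feature map `phi` lets the key-value summary `phi(K)ᵀV` be computed once, so the cost scales linearly in sequence length rather than quadratically. The function name `linear_attention` and the choice of feature map are illustrative assumptions, not the paper's actual method.

```python
import numpy as np

def linear_attention(Q, K, V):
    """Sub-quadratic attention sketch (assumption, not the QIANets decomposition).

    Q, K: (n, d) queries/keys; V: (n, d_v) values.
    Standard attention forms an (n, n) score matrix, costing O(n^2 d).
    Here we instead contract phi(K)^T V first, costing O(n d d_v).
    """
    phi = lambda x: np.maximum(x, 0.0) + 1e-6  # simple positive feature map (assumption)
    Qp, Kp = phi(Q), phi(K)
    KV = Kp.T @ V             # (d, d_v) summary of keys and values
    Z = Qp @ Kp.sum(axis=0)   # per-query normalizer, shape (n,)
    return (Qp @ KV) / Z[:, None]
```

Because the `(n, n)` score matrix is never materialized, memory and compute grow linearly with context length, which is the regime where the reported 3-5x speedups on 8K+ token sequences would apply.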

Begin Your Journey

The application takes 10 minutes and is reviewed on a rolling basis. We look for a strong technical signal (projects, coursework, or competition results) and a genuine curiosity to do real research.

If admitted, you will join a structured pipeline with direct mentorship to take your work from ideation to top conference submission at venues like NeurIPS, ACL, and EMNLP.