CCA - Compressed Convolutional Attention: Efficient Attention in a Compressed Latent Space
Press Release
October 6, 2025
San Francisco, California

Zyphra shares its research into a novel attention variant: Compressed Convolutional Attention (CCA). CCA dramatically reduces the compute, memory, and parameter costs of self-attention while matching or exceeding the performance of existing methods. Details of CCA and its grouped-query variant, CCGQA, are given in the accompanying technical report published on arXiv. CCGQA was subsequently used to train the ZAYA suite of language models.
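The technical report contains the full specification; as a rough illustration of the general idea named in the title, the sketch below computes attention inside a down-projected latent space, with a depthwise convolution mixing the compressed activations. Everything here is an illustrative assumption: the module name, the dimensions d_model and d_latent, the kernel size, and the placement of the convolution are hypothetical, and this is not Zyphra's actual CCA or CCGQA implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CompressedAttentionSketch(nn.Module):
    """Illustrative sketch only: attention computed in a compressed
    latent space, with a causal depthwise convolution over the
    compressed sequence. Shapes, names, and the conv placement are
    assumptions, not Zyphra's actual CCA design."""

    def __init__(self, d_model=1024, d_latent=256, n_heads=8, kernel_size=3):
        super().__init__()
        assert d_latent % n_heads == 0
        self.n_heads = n_heads
        self.d_head = d_latent // n_heads
        # Down-project queries, keys, and values into the smaller latent
        # space, so attention FLOPs and the KV cache scale with d_latent.
        self.down_q = nn.Linear(d_model, d_latent, bias=False)
        self.down_kv = nn.Linear(d_model, 2 * d_latent, bias=False)
        # Depthwise convolution over the compressed sequence (the
        # "convolutional" ingredient; kernel size is an assumption).
        self.conv = nn.Conv1d(d_latent, d_latent, kernel_size,
                              groups=d_latent, padding=kernel_size - 1)
        # Project the compressed attention output back to model width.
        self.up = nn.Linear(d_latent, d_model, bias=False)

    def forward(self, x):                      # x: (batch, seq, d_model)
        b, s, _ = x.shape
        q = self.down_q(x)
        k, v = self.down_kv(x).chunk(2, dim=-1)
        # Mix local context in the latent space; trimming the right-hand
        # padding keeps the convolution causal.
        q = self.conv(q.transpose(1, 2))[..., :s].transpose(1, 2)
        split = lambda t: t.view(b, s, self.n_heads, self.d_head).transpose(1, 2)
        out = F.scaled_dot_product_attention(split(q), split(k), split(v),
                                             is_causal=True)
        out = out.transpose(1, 2).reshape(b, s, -1)
        return self.up(out)                    # back to (batch, seq, d_model)

# Minimal usage check on random input.
y = CompressedAttentionSketch()(torch.randn(2, 16, 1024))
```

The intended payoff of this shape of design is that the attention matmuls and the cached keys and values live in the smaller d_latent rather than d_model, which is consistent with the compute, memory, and parameter savings described above.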

Authors
Zyphra Team
Collaborators
Daniel A. Roberts (Sequoia Capital & MIT), Andrey Gromov (Meta FAIR), Kushal Tirumala (Meta FAIR), and Hassan Shapourian (Cisco)