ZAYA
Frontier reasoning language models, built for production speed.
ZAYA1 Reasoning
Built for long context and low latency.
Modalities
Large Language Model (LLM)
Architecture
8.3B total / 760M active parameter Mixture-of-Experts (MoE)
Features
State-of-the-Art Benchmarks
Competitive with Qwen3-4B and Gemma3-12B. Surpasses Granite-4-H-Tiny.
Fast Time-to-First-Token
800M-class latency with 12B-class quality.
Trained on the Full AMD Stack
First model trained end-to-end on AMD’s hardware, software, and networking stack.
Model Components
Compressed Convolutional Attention (CCA)
Compresses inputs before attention without sacrificing quality.
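The idea can be sketched as: downsample the key/value sequence before attention so queries attend over far fewer positions. This is illustrative only; the function name and the strided mean-pool stand-in for the learned convolution are assumptions, not the actual CCA implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def compressed_attention(q, k, v, stride=4):
    """Attend full-length queries over compressed keys/values.

    q, k, v: (T, d). Compression here is a strided mean over windows
    of `stride` tokens -- a stand-in for the learned convolution, so
    attention cost drops from O(T*T) to O(T*(T/stride)).
    """
    T, d = k.shape
    n = T // stride
    k_c = k[: n * stride].reshape(n, stride, d).mean(axis=1)  # (n, d)
    v_c = v[: n * stride].reshape(n, stride, d).mean(axis=1)  # (n, d)
    scores = q @ k_c.T / np.sqrt(d)        # (T, n), with n << T
    return softmax(scores, axis=-1) @ v_c  # (T, d)
```

With stride 4, a 16-token sequence attends over only 4 compressed positions while every query still produces a full-width output.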
ZAYA Router
Ensures true expert specialization while maintaining balance across hardware.
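A generic MoE router of this kind picks the top-k experts per token and adds an auxiliary loss that penalizes uneven expert load. The sketch below shows that generic pattern only; function names are assumptions and the actual ZAYA router design is not described here.

```python
import numpy as np

def route_topk(logits, k=2):
    """Pick top-k experts per token; renormalize their gate weights.

    logits: (T, E) router scores for T tokens over E experts.
    Returns expert ids (T, k) and gate weights (T, k) summing to 1.
    """
    idx = np.argsort(logits, axis=-1)[:, -k:]            # top-k expert ids
    top = np.take_along_axis(logits, idx, axis=-1)
    gates = np.exp(top - top.max(axis=-1, keepdims=True))
    return idx, gates / gates.sum(axis=-1, keepdims=True)

def load_balance_loss(logits, idx, n_experts):
    """Auxiliary loss: fraction of tokens routed to each expert times
    that expert's mean gate probability; minimized when load is even."""
    probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)
    frac = np.bincount(idx[:, -1], minlength=n_experts) / len(idx)
    return n_experts * float(frac @ probs.mean(axis=0))
```

Balancing matters for hardware because experts are sharded across devices: if the router collapses onto a few experts, their devices bottleneck while the rest idle.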
Previous Models
ZAYA1 Base