ZAYA
Frontier reasoning language models, built for production speed.
ZAYA1 Reasoning
Built for long context and low latency.
Modalities
Large Language Model (LLM)
Architecture
Mixture-of-Experts (MoE) with 8.3B total parameters and 760M active parameters
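As a rough illustration of what the two numbers imply, the sketch below computes the share of parameters used per token. It assumes the "8.3B-760M" label means 8.3B total parameters with 760M active per token; the variable names are illustrative only and do not come from any ZAYA code.

```python
# Assumption: "8.3B-760M" = 8.3B total parameters, 760M active per token.
TOTAL_PARAMS = 8.3e9
ACTIVE_PARAMS = 760e6

# Share of the model's weights that take part in processing a single token.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

# How many times larger the full model is than the per-token active slice.
sparsity_ratio = TOTAL_PARAMS / ACTIVE_PARAMS

print(f"active fraction: {active_fraction:.1%}")
print(f"total / active:  {sparsity_ratio:.1f}x")
```

Under that reading, roughly one in eleven parameters is used for any given token, which is consistent with the page pairing small-model latency with larger-model quality.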
Features
State-of-the-Art Benchmarks
Competitive with Qwen3-4B and Gemma3-12B. Surpasses Granite-4-H-Tiny.
Fast Time-to-first-token
800M-class latency with 12B-class quality.
Trained on the Full AMD Stack
First model trained end-to-end on AMD’s hardware, software, and networking stack.
Model Components
Compressed Convolutional Attention (CCA)
Compresses inputs before attention without sacrificing quality.
ZAYA Router
Ensures true expert specialization while maintaining balance across hardware.
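To make "routing" and "balance" concrete, here is a minimal sketch of generic softmax top-k expert routing with a simple load measurement. This is a textbook MoE router, not the ZAYA Router, whose design is not described on this page; the function names `route_top_k` and `expert_load`, the expert count, and the random logits are all hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k=1):
    """Pick the k most probable experts and renormalize their weights to sum to 1."""
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def expert_load(assignments, num_experts):
    """Fraction of all token-to-expert assignments that each expert receives."""
    counts = [0] * num_experts
    for chosen in assignments:
        for expert_index, _ in chosen:
            counts[expert_index] += 1
    total = sum(counts)
    return [c / total for c in counts]


# Hypothetical setup: 4 experts, 1000 tokens with random router logits.
random.seed(0)
NUM_EXPERTS = 4
token_logits = [
    [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)] for _ in range(1000)
]
assignments = [route_top_k(logits, k=1) for logits in token_logits]
load = expert_load(assignments, NUM_EXPERTS)
print("per-expert load:", [round(x, 3) for x in load])
```

A perfectly balanced router would give each of the four experts a load of 0.25. When experts sit on different devices, a skewed load means some hardware idles while other hardware is overloaded, which is the balance problem a production router has to address.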
Coming soon
Previous Models
ZAYA1 Base