DeepSeek V3.2 Benchmarks Show 90% Cost Reduction
The Chinese AI lab's latest release activates only 37B of its 685B total parameters per token during inference, achieving GPT-4 class performance at a fraction of the cost. The paper details a novel sparse attention mechanism.
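To make the sparse-attention idea concrete, here is a minimal sketch of top-k sparse attention in PyTorch, the general family such mechanisms belong to: each query attends only to its highest-scoring keys instead of the full sequence. This is an illustration under those assumptions, not the paper's actual algorithm; the function name, the `k_top` parameter, and the toy shapes are all hypothetical.

```python
import torch
import torch.nn.functional as F

def topk_sparse_attention(q, k, v, k_top=64):
    """Illustrative top-k sparse attention: each query keeps only its
    k_top highest-scoring keys. Not DeepSeek's actual mechanism."""
    d = q.size(-1)
    # Full score matrix shown for clarity; a real implementation would
    # avoid materializing it, which is where the compute savings come from.
    scores = q @ k.transpose(-2, -1) / d ** 0.5           # (..., L_q, L_k)
    k_top = min(k_top, scores.size(-1))
    top_vals, top_idx = scores.topk(k_top, dim=-1)        # best keys per query
    mask = torch.full_like(scores, float('-inf'))
    mask.scatter_(-1, top_idx, top_vals)                  # -inf everywhere else
    weights = F.softmax(mask, dim=-1)                     # zero weight off the top-k
    return weights @ v

# Toy usage: one head, sequence length 128, head dim 32.
q, k, v = (torch.randn(128, 32) for _ in range(3))
print(topk_sparse_attention(q, k, v, k_top=16).shape)     # torch.Size([128, 32])
```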
- 37B active parameters out of 685B total via Mixture-of-Experts routing (see the sketch after this list)
- Achieves 95% of GPT-4 performance on key benchmarks
- Training cost estimated at $5.6M, roughly 94% less than the $100M+ estimated for GPT-4
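To illustrate how a Mixture-of-Experts model touches only a slice of its weights per token, here is a toy top-k routing layer in PyTorch. All names and sizes (`TinyMoE`, `n_experts`, `k`) are invented for illustration; DeepSeek's actual router, expert layout, and load-balancing scheme are more sophisticated.

```python
import torch
import torch.nn as nn

class TinyMoE(nn.Module):
    """Illustrative MoE layer: a router picks top-k experts per token,
    so only those experts' weights participate in the forward pass."""
    def __init__(self, d_model=64, d_ff=128, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                         # x: (tokens, d_model)
        gate = self.router(x).softmax(dim=-1)     # routing probabilities
        weights, idx = gate.topk(self.k, dim=-1)  # top-k experts per token
        weights = weights / weights.sum(dim=-1, keepdim=True)
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                hit = idx[:, slot] == e           # tokens routed to expert e
                if hit.any():
                    out[hit] += weights[hit, slot, None] * expert(x[hit])
        return out

moe = TinyMoE()
print(moe(torch.randn(10, 64)).shape)             # torch.Size([10, 64])
# With k=2 of 8 experts, each token uses ~1/4 of the expert weights:
# the same principle behind 37B active parameters out of 685B total.
```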
If the cost figures hold, this kind of democratization could reshape the AI landscape, enabling smaller players to compete with tech giants on model capability.
That said, benchmark gaming is rampant in AI, real-world performance on complex reasoning tasks remains to be seen, and the training data composition is undisclosed.