Introduction
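DeepSeek's R1 model, released in January 2025, is a highly competitive language model that achieves strong performance relative to the compute it reportedly required. It builds on an architecture that uses multi-head latent attention, a technique introduced in the earlier DeepSeek-V2 model. Multi-head latent attention compresses keys and values into a smaller latent representation, which significantly reduces the key-value cache size and improves inference efficiency.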
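To give a rough sense of why caching a compressed latent shrinks the key-value cache, the sketch below compares per-token, per-layer cache sizes. It is a simplified illustration, not R1's implementation: the function names and the dimensions (32 heads, head size 128, latent size 512) are assumptions chosen for the example, and it ignores any extra positional components a real model may also cache.

```python
def mha_cache_per_token(n_heads: int, head_dim: int) -> int:
    """Elements cached per token per layer by standard multi-head attention."""
    # One key vector and one value vector are stored for every head.
    return 2 * n_heads * head_dim


def latent_cache_per_token(latent_dim: int) -> int:
    """Elements cached per token per layer when only a shared latent is stored."""
    # Keys and values are reconstructed from this single latent vector.
    return latent_dim


# Illustrative values only (assumptions, not R1's actual configuration).
N_HEADS, HEAD_DIM, LATENT_DIM = 32, 128, 512

mha = mha_cache_per_token(N_HEADS, HEAD_DIM)
latent = latent_cache_per_token(LATENT_DIM)
print(f"MHA: {mha} elements, latent: {latent} elements, ratio: {mha / latent:.0f}x")
# prints "MHA: 8192 elements, latent: 512 elements, ratio: 16x"
```

Under these assumed dimensions the cache is 16 times smaller. The actual saving depends on the chosen latent size and on any additional per-token state the model keeps.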