
Metadata

  • Author: AlphaSignal
  • Full Title: 🔥 Google Unveils 27B Gemma 3: Quantized for Consumer GPUs

Highlights

  • Google released Quantization-Aware Trained (QAT) versions of its Gemma 3 models, including the 27B. These models maintain performance while cutting memory requirements enough to run on consumer GPUs; the 27B QAT model loads in 14.1 GB of VRAM.
  • Memory reductions with int4 quantization: QAT lowers memory use for model weights across all Gemma 3 sizes (a quick arithmetic check follows below).
    • 27B: from 54 GB (BF16) to 14.1 GB (int4)
    • 12B: from 24 GB to 6.6 GB
    • 4B: from 8 GB to 2.6 GB
    • 1B: from 2 GB to 0.5 GB
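
As a sanity check on those figures, here is a minimal Python sketch (not from the newsletter) that estimates weight memory from the nominal parameter counts at 2 bytes per weight for BF16 and 0.5 bytes for int4. The parameter counts and the use of decimal gigabytes are assumptions to match the published BF16 numbers; the published int4 figures sit slightly above the raw 4-bit estimate, plausibly because quantized checkpoints also store per-group scales and other metadata, and running the model adds KV cache and activation memory on top.

```python
# Back-of-the-envelope estimate of Gemma 3 weight memory at BF16 vs int4.
# Assumptions: nominal parameter counts taken from the model names, and
# decimal gigabytes (1 GB = 1e9 bytes) to match the newsletter's 54 GB figure.

MODEL_PARAMS = {"27B": 27e9, "12B": 12e9, "4B": 4e9, "1B": 1e9}

BYTES_PER_PARAM = {
    "BF16": 2.0,   # 16 bits per weight
    "int4": 0.5,   # 4 bits per weight, excluding per-group scales/zero-points
}

for name, n_params in MODEL_PARAMS.items():
    bf16_gb = n_params * BYTES_PER_PARAM["BF16"] / 1e9
    int4_gb = n_params * BYTES_PER_PARAM["int4"] / 1e9
    print(f"{name}: ~{bf16_gb:.1f} GB (BF16) -> ~{int4_gb:.1f} GB (int4), weights only")
```

For the 27B model this gives roughly 54 GB at BF16 and 13.5 GB at int4, consistent with the reported 54 GB and 14.1 GB once quantization metadata and runtime overhead are accounted for.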