LLM weight compression

Delta-Aware Quantization: Preserving Fine-Tuned AI Knowledge for Efficient LLM Deployment

Discover Delta-Aware Quantization (DAQ), a data-free framework that efficiently compresses post-trained LLMs while preserving the fine-tuning knowledge critical for enterprise AI deployment.