Nf4.rar -
In the context of computer science and machine learning, refers to 4-bit NormalFloat , a specialized quantization data type introduced in the seminal paper QLoRA: Efficient Finetuning of Quantized LLMs by Tim Dettmers et al. (2023). 📄 Core Concept: The QLoRA Paper
💡 : If you are looking for the software/machine learning paper, search for "QLoRA" or "4-bit NormalFloat" on arXiv . NF4.rar
The term "NF4" is central to this "long paper" which revolutionized how large language models (LLMs) are fine-tuned on consumer hardware. In the context of computer science and machine
: Neural network weights typically follow a normal distribution. NF4 concentrates its 16 "bins" where most weights exist (near zero), minimizing rounding errors. The term "NF4" is central to this "long
: Compresses 16-bit weights to 4 bits, effectively reducing VRAM usage by ~75% (e.g., a 65B parameter model can be loaded with ~35GB instead of ~130GB).
: A process that quantizes the quantization constants themselves to save additional memory.