Answer
Low-Rank Adaptation: freeze base model weights, inject trainable low-rank matrices (A×B) into attention layers. Full fine-tune: millions of params. LoRA: thousands. Rank r=16-64 typical. Merges into base model at inference — no latency overhead.