01 Retrieval-augmented generation (RAG)
A user query is embedded and matched against a vector index (or a hybrid of vector and keyword search), and the top-ranked passages are prepended to the prompt. The LLM conditions on this evidence, which reduces hallucination when the corpus is trustworthy and well-chunked.
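The embed, match, and prepend steps can be sketched in a few lines. This is a minimal illustration, not a production pipeline: `embed` is a toy bag-of-words stand-in for a real embedding model, the in-memory list stands in for a vector index, and the corpus, prompt wording, and function names are all made up for the example.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy embedding: term counts (assumption: stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, passages, k=2):
    """Rank passages by similarity to the query and keep the top k."""
    q = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]

def build_prompt(query, passages, k=2):
    """Prepend the retrieved passages to the question as numbered evidence."""
    top = retrieve(query, passages, k)
    evidence = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(top))
    return (
        "Answer using only the evidence below and cite passage numbers.\n\n"
        f"{evidence}\n\nQuestion: {query}"
    )

# Hypothetical corpus, for illustration only.
corpus = [
    "The refund window is 30 days from delivery.",
    "Support is available on weekdays from 9 to 5.",
    "Refund requests need an order number.",
]
prompt = build_prompt("How long is the refund window?", corpus, k=2)
print(prompt)
```

Numbering the passages in the prompt is one simple way to make citations possible: the model can refer to `[1]` or `[2]` in its answer.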
Strengths: fresh facts without retraining, citations possible, good for proprietary docs. Weaknesses: answer quality is capped by retrieval quality, so chunking strategy and hybrid search tuning have a large effect on results.
02 Fine-tuning (supervised / preference)
You continue training the base model (either its full weights or LoRA/QLoRA adapters) on curated examples: instruction-output pairs, preference rankings (for RLHF/DPO), or domain text. The model internalizes style, format, and domain vocabulary. Because the weights change, versioning and evaluation are essential.
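To see why adapters are called parameter-efficient, it helps to count trainable parameters. LoRA freezes a weight matrix W of shape d_out x d_in and trains two low-rank factors, B (d_out x r) and A (r x d_in), whose product is added to W. The sketch below compares the two counts for one layer. The layer size of 4096 and the rank of 8 are illustrative assumptions, not recommendations.

```python
def full_params(d_in, d_out):
    """Trainable parameters when the whole weight matrix W is updated."""
    return d_in * d_out

def lora_params(d_in, d_out, rank):
    """Trainable parameters for LoRA factors A (rank x d_in) and B (d_out x rank)."""
    return rank * d_in + d_out * rank

# Assumed sizes for a single square projection layer.
d_in = d_out = 4096
rank = 8

full = full_params(d_in, d_out)        # 16,777,216
lora = lora_params(d_in, d_out, rank)  # 65,536
ratio = lora / full                    # 0.00390625, i.e. about 0.4% of the layer

print(f"full: {full:,}  lora: {lora:,}  ratio: {ratio:.4%}")
```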
Strengths: stable behavior, specific output formats, can reduce prompt length. Weaknesses: training data must be high quality; risk of catastrophic forgetting of general capabilities; the update cycle is slower than editing a doc store.
03 Side-by-side
| Dimension | RAG | Fine-tuning |
|---|---|---|
| Factual freshness | Strong: update the corpus/index | Weak: new facts require retraining |
| Style / tone / format | Prompt-dependent | Strong: if reflected in the training data |
| Latency & cost | Extra retrieval step + longer context | Inference cost similar to the base model (after tuning) |
| Privacy / compliance | Control access at index layer | Data in training pipeline; audit training sets |
| When data is scarce | Works if docs exist | Risk of overfitting; prefer parameter-efficient methods |
04 Decision sketch
Many products use RAG for grounding and fine-tuning (or adapters) for tone and tool-use format. Evaluate end-to-end task success, not single-component accuracy alone.
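The side-by-side table can be read as a rough heuristic. The function below is one hypothetical way to encode it, assuming only three yes/no inputs. The function name and its parameters are invented for this sketch, and a real decision would also weigh latency, cost, and compliance.

```python
def recommend(needs_fresh_facts, needs_fixed_style_or_format, training_data_is_scarce=False):
    """Map task needs to the approaches suggested by the comparison table.

    Hypothetical heuristic: returns a list because the approaches can be combined.
    An empty list means neither dimension applies.
    """
    choices = []
    if needs_fresh_facts:
        # Freshness comes from updating the corpus/index, not the weights.
        choices.append("RAG")
    if needs_fixed_style_or_format:
        # With scarce data, prefer parameter-efficient methods to limit overfitting.
        choices.append("adapters (LoRA/QLoRA)" if training_data_is_scarce else "fine-tuning")
    return choices

print(recommend(True, True))         # grounding plus tone/format
print(recommend(True, False))        # grounding only
print(recommend(False, True, True))  # format only, little training data
```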