← All articles
Techniques6 min read

Fine-Tuning

Retraining an AI model on your own data. When it's worth it, when it's not, and how it works.

What fine-tuning is

Fine-tuning is taking a pre-trained model (like GPT-4o or Claude) and training it further on your own dataset. The result is a model that has the original model's general capabilities plus specialized behavior on your data.

If you fine-tune GPT-4o on 1,000 examples of your brand's copywriting, the resulting model will naturally produce copy in your brand's voice — without you having to explain it in every prompt.

Fine-tuning vs prompting — the real comparison

Most people reach for fine-tuning too early. Prompting is faster, cheaper, and easier to iterate.

Prompting is better when:

You're still figuring out what "good output" looks like

You need flexibility (different tasks, different tones)

Your dataset is small

Fine-tuning is better when:

You have hundreds or thousands of high-quality examples

You need consistent, specific behavior at scale

You want to reduce prompt length (fine-tuned behavior is baked in)

You're calling the API thousands of times and cost matters

How fine-tuning works

1. Prepare training data — pairs of inputs and ideal outputs. Typically 100–10,000 examples for useful results.

2. Upload to the provider's fine-tuning API (OpenAI, Anthropic, or open-source alternatives)

3. Training runs (hours to days depending on dataset size)

4. Test and evaluate the fine-tuned model

5. Deploy — use it via API like the base model

The training process doesn't change what the model knows — it changes how it behaves.

Practical costs

OpenAI fine-tuning (GPT-4o mini): approximately $0.008 per 1,000 training tokens. A dataset of 1,000 examples at 500 tokens each costs ~$4 to train. Inference is slightly more expensive than the base model.

Anthropic fine-tuning: available through their API, pricing varies by model.

The cost of fine-tuning is rarely the limiting factor. Preparing high-quality training data is.

When to actually use it

Start with prompting. If you find yourself using the same long system prompt repeatedly, have a consistent task, and have examples of ideal outputs — that's when fine-tuning makes sense.

The clearest signal: you've spent weeks prompting and the output quality has plateaued. You've done everything prompting can do. Now fine-tuning can take you further.

Takeaway

Start with prompting. Fine-tune when you've exhausted what prompts can do and have the data to justify it.