Prompt Templates
Have you ever run a model and received a response that was nonsensical, repetitive, or completely ignored your instructions? The culprit is often a missing or incorrect Prompt Template.
🧠 Key Concepts for Beginners
Why do templates matter? (The Actor Analogy 🎭)
Imagine you hire an actor to play a character. If you just walk up to them and say “Help me!”, they might just stare at you because they don’t know if they are in a play, a movie, or a real-life conversation.
However, if you hand them a script that says:
[SCENE: A medieval castle] [CHARACTER: A wise wizard] [USER: Help me!], they immediately know how to act.
A Prompt Template is that script for the AI. It provides the structure (the “scaffolding”) that tells the model:
- This is where the instructions start.
- This is the user’s question.
- This is the part where the model should begin generating its response.
If you use the wrong template, the model sees markers that do not match the ones it was fine-tuned on. It gets “confused” and may ignore your instructions, ramble, or repeat itself.
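The “script” idea can be sketched in a few lines of Python. This is only an illustration: the `[INSTRUCTIONS]`, `[USER]`, and `[ASSISTANT]` markers and the `render` helper are hypothetical and do not belong to any real model. Real models expect their own markers.

```python
from string import Template

# Hypothetical markers, used only to illustrate what a template does.
SCAFFOLD = Template(
    "[INSTRUCTIONS]\n$system\n"   # where the instructions start
    "[USER]\n$user\n"             # the user's question
    "[ASSISTANT]\n"               # where the model begins generating
)


def render(system: str, user: str) -> str:
    """Fill the scaffold with the instructions and the user's question."""
    return SCAFFOLD.substitute(system=system, user=user)


prompt = render("You are a wise wizard in a medieval castle.", "Help me!")
print(prompt)
```

The prompt always ends with the assistant marker, so the model's next tokens are its reply rather than a continuation of the user's text.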
📋 Common Template Formats
Different model families use different structures. Here are the most common ones you will encounter:
1. Llama-3 (Instruct)
Used by Meta’s Llama-3 models. It uses special “header” tags to separate parts of the conversation.
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a helpful assistant.<|eot_id|><|start_header_id|>user<|end_header_id|>

{Your Prompt}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
2. Alpaca
A very common, older format used by many fine-tuned models.
Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Instruction:
{Your Prompt}
### Response:
3. ChatML
A highly versatile format used by many models (like Hermes or Qwen).
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
{Your Prompt}<|im_end|>
<|im_start|>assistant
🛠️ How to Use Templates in llama.cpp
There are two ways to handle templates:
Method 1: The Easy Way (Automatic)
Recent versions of llama.cpp can read the chat template stored in the model’s GGUF metadata (the tokenizer.chat_template field) and apply it for you in chat mode. If you are using a modern GGUF file, you might not need to do anything!
Method 2: The Manual Way (Command Line)
If your model isn’t detecting the template correctly, you can provide it yourself using the --chat-template flag (if your version supports it; check the output of --help) or by manually formatting your prompt string.
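Manual formatting can be sketched as small Python helpers that build the prompt string for each format described earlier. The function names are hypothetical. The blank lines between Alpaca sections and the double newline after each Llama-3 header follow the formats as commonly published, so treat them as assumptions and compare against your model’s documentation.

```python
DEFAULT_SYSTEM = "You are a helpful assistant."


def alpaca_prompt(instruction: str) -> str:
    """Single-turn Alpaca prompt; the model continues after '### Response:'."""
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )


def chatml_prompt(user: str, system: str = DEFAULT_SYSTEM) -> str:
    """Single-turn ChatML prompt, ending where the assistant turn opens."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


def llama3_prompt(user: str, system: str = DEFAULT_SYSTEM) -> str:
    """Single-turn Llama-3 Instruct prompt.

    Some tools add <|begin_of_text|> on their own; if yours does,
    drop it here to avoid a duplicate.
    """
    return (
        "<|begin_of_text|>"
        f"<|start_header_id|>system<|end_header_id|>\n\n{system}<|eot_id|>"
        f"<|start_header_id|>user<|end_header_id|>\n\n{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )


print(alpaca_prompt("Write a song about cats."))
```

Each helper stops right after the marker that opens the model’s turn, which is the point where generation should begin.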
Example of manual formatting for Alpaca:
./bin/Release/llama-cli.exe -m "my-alpaca-model.gguf" -p "### Instruction: Write a song about cats. ### Response:"
💡 Pro-Tip: Use llama-server for easier Chatting
If you find manual prompting too tedious, use llama-server.exe (explained in CLI Usage).
The server provides a built-in Web UI that handles prompt templates automatically, allowing you to chat with your model just like you would with ChatGPT.
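Besides the Web UI, llama-server also accepts chat requests over HTTP in an OpenAI-style format, and the server applies the prompt template to the messages you send. The sketch below assumes the server is already running on your machine at its default address (http://localhost:8080); the helper names `build_request` and `chat` are hypothetical.

```python
import json
import urllib.request

# Assumption: llama-server is running locally on its default port.
SERVER_URL = "http://localhost:8080/v1/chat/completions"


def build_request(user_message: str,
                  system_message: str = "You are a helpful assistant.",
                  url: str = SERVER_URL) -> urllib.request.Request:
    """Build a chat request; no template markers are needed in the text."""
    payload = {
        "messages": [
            {"role": "system", "content": system_message},
            {"role": "user", "content": user_message},
        ]
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(user_message: str) -> str:
    """Send one message to the server and return the model's reply text."""
    with urllib.request.urlopen(build_request(user_message)) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(chat("Write a song about cats."))` would send the plain message and print the reply. Notice that the request contains only roles and text; the template markers are added on the server side.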
Last Updated: 2026-05-03