Prompt Templates

Have you ever run a model and received a response that was nonsensical, repetitive, or completely ignored your instructions? The culprit is often a missing or incorrect Prompt Template.

🧠 Key Concepts for Beginners

Why do templates matter? (The Actor Analogy 🎭)

Imagine you hire an actor to play a character. If you just walk up to them and say “Help me!”, they might just stare at you because they don’t know if they are in a play, a movie, or a real-life conversation.

However, if you hand them a script that says: [SCENE: A medieval castle] [CHARACTER: A wise wizard] [USER: Help me!], they immediately know how to act.

A Prompt Template is that script for the AI. It provides the structure (the “scaffolding”) that tells the model:

  1. This is where the instructions start.
  2. This is the user’s question.
  3. This is the part where the model should begin generating its response.

If you use the wrong template, the model sees text laid out differently from what it was trained on. It may then continue your text instead of answering it, repeat itself, or ignore your instructions.


📋 Common Template Formats

Different model families use different structures. Here are the most common ones you will encounter:

1. Llama-3 (Instruct)

Used by Meta’s Llama-3 models. It uses special “header” tags to separate parts of the conversation.

<|begin_of_text|><|start_header_id|>system<|end_header_id|>
 
You are a helpful assistant.<|eot_id|><|start_header_id|>user<|end_header_id|>
 
{Your Prompt}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
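
If you build prompts in a script rather than by hand, the Llama-3 layout above can be filled in with a small helper. This is only a minimal sketch: the function name `build_llama3_prompt` and its parameters are made up for illustration, and it covers a single user turn. Some tools add the `<|begin_of_text|>` token on their own, in which case you may need to leave it out.

```python
def build_llama3_prompt(user_prompt: str,
                        system_prompt: str = "You are a helpful assistant.") -> str:
    """Fill the single-turn Llama-3 Instruct template (hypothetical helper)."""
    return (
        "<|begin_of_text|>"
        # Each header is followed by a blank line, then the message text.
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system_prompt}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_prompt}<|eot_id|>"
        # The prompt ends with an open assistant header,
        # so the next thing the model writes is its reply.
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )


prompt = build_llama3_prompt("Write a song about cats.")
print(prompt)
```

Notice that the prompt stops right after the assistant header. That open header is the "begin generating here" signal from the actor analogy.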

2. Alpaca

A very common, older format used by many fine-tuned models.

Below is an instruction that describes a task. Write a response that appropriately completes the request.
 
### Instruction:
{Your Prompt}
 
### Response:
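
Because Alpaca is plain text with no special tokens, it is easy to reproduce in code. The sketch below stores the template as a string with one placeholder. The names `ALPACA_TEMPLATE` and `build_alpaca_prompt` are illustrative only and do not come from any library.

```python
# The Alpaca layout shown above, with one slot for the user's instruction.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n"
    "{instruction}\n\n"
    "### Response:\n"
)


def build_alpaca_prompt(instruction: str) -> str:
    """Insert the instruction into the Alpaca template (hypothetical helper)."""
    return ALPACA_TEMPLATE.format(instruction=instruction)


print(build_alpaca_prompt("Write a song about cats."))
```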

3. ChatML

A highly versatile format used by many models (like Hermes or Qwen).

<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
{Your Prompt}<|im_end|>
<|im_start|>assistant
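
In ChatML every turn, including the system message, opens with `<|im_start|>` plus a role name and closes with `<|im_end|>`. That regularity makes longer conversations easy to assemble in a loop. The following is a rough sketch with a hypothetical function name (`build_chatml_prompt`); it assumes each message is a dictionary with `role` and `content` keys.

```python
def build_chatml_prompt(messages: list[dict[str, str]]) -> str:
    """Render a list of chat messages as a ChatML prompt (hypothetical helper)."""
    parts = []
    for message in messages:
        # Every turn is wrapped the same way: opener + role, text, closer.
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    # Leave an open assistant turn so the model writes the reply next.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


conversation = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Write a song about cats."},
]
print(build_chatml_prompt(conversation))
```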

🛠️ How to Use Templates in llama.cpp

There are two ways to handle templates:

Method 1: The Easy Way (Automatic)

Recent versions of llama.cpp can usually read the chat template straight from the model’s metadata (GGUF files can store it in a field named tokenizer.chat_template). If you are using a modern GGUF file, you might not need to do anything!

Method 2: The Manual Way (Command Line)

If the template isn’t detected correctly for your model, you can provide it yourself using the --chat-template flag (if your version supports it; run the program with --help to check) or by manually formatting your prompt string.

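Example of manual formatting for Alpaca (shortened to a single line, so the introductory sentence and the line breaks of the full template are left out):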

./bin/Release/llama-cli.exe -m "my-alpaca-model.gguf" -p "### Instruction: Write a song about cats. ### Response:"
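
Typing line breaks inside a quoted command-line string is awkward, so one option is to launch llama-cli from a short script instead. The sketch below only uses the -m and -p flags from the command above. The helper names are hypothetical, and the executable and model paths are the same placeholders used in this guide; adjust them to your setup.

```python
import subprocess


def build_llama_cli_args(executable: str, model_path: str, instruction: str) -> list[str]:
    """Build the argument list for llama-cli with an Alpaca-style prompt."""
    # Real newlines, as in the Alpaca template, instead of one flattened line.
    prompt = f"### Instruction:\n{instruction}\n\n### Response:\n"
    return [executable, "-m", model_path, "-p", prompt]


def run_llama_cli(args: list[str]) -> None:
    """Start llama-cli. Passing a list avoids shell quoting problems."""
    subprocess.run(args, check=True)


args = build_llama_cli_args(
    "./bin/Release/llama-cli.exe",   # placeholder path from the example above
    "my-alpaca-model.gguf",          # placeholder model file
    "Write a song about cats.",
)
print(args)
# To actually start the model, call: run_llama_cli(args)
```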

💡 Pro-Tip: Use llama-server for easier Chatting

If you find manual prompting too tedious, use llama-server.exe (explained in CLI Usage).

The server provides a built-in Web UI that handles prompt templates automatically, allowing you to chat with your model just like you would with ChatGPT.
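
The same convenience is available from your own scripts. Besides the Web UI, llama-server accepts OpenAI-style chat requests, where you send plain role/content messages and the server applies the prompt template for you. The sketch below assumes the server is running on your machine at http://localhost:8080 (change `base_url` if yours differs); the function names are hypothetical.

```python
import json
import urllib.request


def build_chat_request(user_prompt: str,
                       base_url: str = "http://localhost:8080") -> urllib.request.Request:
    """Prepare a chat request; no template tokens are needed in the text."""
    payload = {
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_prompt},
        ]
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request: urllib.request.Request) -> str:
    """Send the request to a running server and return the reply text."""
    with urllib.request.urlopen(request) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["choices"][0]["message"]["content"]


request = build_chat_request("Write a song about cats.")
print(request.full_url)
# With the server running, call: print(send_chat_request(request))
```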


Last Updated: 2026-05-03