ShareGPT vs Alpaca: TL;DR
ShareGPT vs Alpaca dataset formatting is the first real decision you make when fine-tuning an LLM — and the wrong choice silently degrades your model.
| | ShareGPT | Alpaca |
|---|---|---|
| Best for | Multi-turn chat, tool use, system prompts | Single-turn instruction following |
| Structure | conversations list of {from, value} | instruction, input, output keys |
| Framework support | Unsloth, Axolotl, TRL, LLaMA-Factory | Axolotl, TRL, older scripts |
| System prompt | Native (from: system) | Bolted on via instruction field |
| Multi-turn | ✅ Native | ❌ Workaround only |
| Hugging Face Datasets | Generic `json` loader (+ `standardize_sharegpt` in Unsloth) | Generic `json` loader |
| Self-hosted fine-tune cost | Free (GPU time only) | Free (GPU time only) |
Choose ShareGPT if: You're fine-tuning a chat model, need multi-turn data, or use Unsloth/Axolotl in 2026.
Choose Alpaca if: You have a legacy dataset already in Alpaca format and a single-turn task like summarization or classification.
Why Format Matters More Than You Think
Most fine-tuning failures aren't caused by bad hyperparameters. They're caused by mismatched data formatting — the model sees malformed tokens, the chat template gets applied twice, or turn boundaries collapse.
Symptoms of wrong format:
- Loss drops normally but the model ignores instructions at inference
- `apply_chat_template` throws a `KeyError` on `conversations`
- Multi-turn evals show the model repeating the human turn verbatim
- Axolotl warns: `dataset_type: sharegpt but found alpaca keys`
Both formats are JSON (or JSONL). The difference is in key names and nesting depth: ShareGPT wraps turns in a `conversations` list, while Alpaca flattens everything into three top-level keys.
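To make that depth difference concrete, here are two toy records (hypothetical content) with a structural check in plain Python:

```python
# Toy records illustrating the structural difference. Content is made up;
# only the key layout matters here.
alpaca = {
    "instruction": "Translate to French.",
    "input": "Hello",
    "output": "Bonjour",
}

sharegpt = {
    "conversations": [
        {"from": "human", "value": "Translate to French.\n\nHello"},
        {"from": "gpt", "value": "Bonjour"},
    ]
}

# Alpaca is flat: every value is a string at the top level.
assert all(isinstance(v, str) for v in alpaca.values())
# ShareGPT nests one level deeper: a list of {from, value} dicts.
assert all(set(turn) == {"from", "value"} for turn in sharegpt["conversations"])
```

That one extra level of nesting is exactly what carries turn boundaries and roles.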
Alpaca Format: Structure and Limits
Alpaca was introduced with Stanford's Alpaca project in 2023. The format is flat and simple:
```json
{
  "instruction": "Summarize the following customer complaint in one sentence.",
  "input": "I ordered a laptop on March 1st and it still hasn't arrived...",
  "output": "Customer placed an order on March 1st and has not received it after 10 days."
}
```
When `input` is empty, most loaders use the `instruction` alone as the prompt:
```json
{
  "instruction": "Write a Python function that returns the Fibonacci sequence up to n.",
  "input": "",
  "output": "def fibonacci(n):\n    a, b = 0, 1\n    result = []\n    while a < n:\n        result.append(a)\n        a, b = b, a + b\n    return result"
}
```
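How a record becomes a training string depends on the loader. The sketch below mirrors the classic Stanford Alpaca prompt template; the exact wording varies by framework, so treat it as an illustration rather than a canonical implementation:

```python
# Sketch of how a typical Alpaca-style loader renders a record into a
# single prompt string. Wording mirrors the classic Stanford Alpaca
# template; exact text varies per framework.
def render_alpaca(record: dict) -> str:
    if record.get("input", "").strip():
        return (
            "Below is an instruction that describes a task, paired with an "
            "input that provides further context. Write a response that "
            "appropriately completes the request.\n\n"
            f"### Instruction:\n{record['instruction']}\n\n"
            f"### Input:\n{record['input']}\n\n"
            "### Response:\n"
        )
    # Empty input: the instruction alone becomes the prompt
    return (
        "Below is an instruction that describes a task. Write a response "
        "that appropriately completes the request.\n\n"
        f"### Instruction:\n{record['instruction']}\n\n"
        "### Response:\n"
    )

prompt = render_alpaca({"instruction": "Say hi.", "input": ""})
```

Note there is no role structure in the rendered string: the `### Instruction:` / `### Response:` markers are plain text, which is why loss masking per role is not possible.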
What Alpaca can't do natively:
- Multi-turn dialogue (no concept of a "conversation")
- Per-example system prompts (instruction doubles as system context)
- Tool call / function call turns
- Role-aware masking (you can't mask the human turn loss separately)
For 2026 chat models — Llama 3.3, Qwen 2.5, Mistral Small 3 — Alpaca is the wrong default.
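The "workaround only" verdict on multi-turn deserves one concrete illustration. The usual trick is to flatten prior turns into the `input` field; the hypothetical helper below shows why that is lossy:

```python
# The usual multi-turn "workaround" in Alpaca: flatten prior turns into
# the input field. Hypothetical helper for illustration; role boundaries
# survive only as plain text, so the trainer cannot mask user turns or
# apply a chat template correctly.
def flatten_history(history: list[tuple[str, str]], question: str) -> dict:
    context = "\n".join(f"{role}: {text}" for role, text in history)
    return {
        "instruction": question,
        "input": context,  # roles degraded to "user: ..." prose
        "output": "",      # to be filled with the target response
    }

record = flatten_history(
    [("user", "What does __slots__ do?"),
     ("assistant", "It restricts attributes.")],
    "Give me an example.",
)
```

The model now has to learn the `user:` / `assistant:` convention from raw text, and every framework-level feature that depends on explicit turns (masking, templates, tool roles) is unavailable.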
ShareGPT Format: Structure and Strengths
ShareGPT wraps all turns in a `conversations` array. Each turn has a `from` role and a `value` string:
```json
{
  "conversations": [
    {
      "from": "system",
      "value": "You are a senior Python developer. Be concise and correct."
    },
    {
      "from": "human",
      "value": "Write a Python function that returns the Fibonacci sequence up to n."
    },
    {
      "from": "gpt",
      "value": "def fibonacci(n):\n    a, b = 0, 1\n    result = []\n    while a < n:\n        result.append(a)\n        a, b = b, a + b\n    return result"
    }
  ]
}
```
Multi-turn adds more objects to the same array:
```json
{
  "conversations": [
    {"from": "system", "value": "You are a helpful coding assistant."},
    {"from": "human", "value": "What does `__slots__` do in Python?"},
    {"from": "gpt", "value": "`__slots__` restricts instance attributes to a fixed set, reducing memory overhead per object."},
    {"from": "human", "value": "Give me an example with a dataclass comparison."},
    {"from": "gpt", "value": "class Point:\n    __slots__ = ('x', 'y')\n    def __init__(self, x, y):\n        self.x = x\n        self.y = y\n\n# vs @dataclass which uses __dict__ by default"}
  ]
}
```
Role name variants — different loaders accept different spellings. Unsloth and Axolotl both accept these by default:
| Canonical | Alternatives accepted |
|---|---|
| human | user |
| gpt | assistant, model |
| system | system |
Stick to human / gpt / system unless your framework specifies otherwise.
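If your loader is strict about role names, a small normalizer covers the aliases in the table above. This is a hypothetical helper for pre-processing your own files; frameworks like Unsloth ship their own normalization:

```python
# Map the role-name variants from the table above onto the canonical
# human / gpt / system spellings. Hypothetical pre-processing helper.
ROLE_ALIASES = {
    "user": "human",
    "assistant": "gpt",
    "model": "gpt",
    "human": "human",
    "gpt": "gpt",
    "system": "system",
}

def normalize_roles(record: dict) -> dict:
    return {
        "conversations": [
            {"from": ROLE_ALIASES[turn["from"]], "value": turn["value"]}
            for turn in record["conversations"]
        ]
    }

fixed = normalize_roles(
    {"conversations": [{"from": "user", "value": "hi"},
                       {"from": "assistant", "value": "hello"}]}
)
```

A `KeyError` from `ROLE_ALIASES` doubles as an early warning that a record contains a role your training stack may not understand.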
Converting Alpaca → ShareGPT in Python
When your dataset is in Alpaca format but your framework needs ShareGPT, use this converter. It handles both the input-present and input-empty cases:
```python
# convert_alpaca_to_sharegpt.py
# Converts an Alpaca JSONL dataset to ShareGPT format.
# Tested on Python 3.12, datasets==2.19, transformers==4.41
import json
from pathlib import Path


def alpaca_to_sharegpt(record: dict) -> dict:
    instruction = record.get("instruction", "").strip()
    input_text = record.get("input", "").strip()
    output = record.get("output", "").strip()
    # Merge instruction + input when both are present
    human_turn = f"{instruction}\n\n{input_text}" if input_text else instruction
    return {
        "conversations": [
            {"from": "human", "value": human_turn},
            {"from": "gpt", "value": output},
        ]
    }


def convert_file(src: str, dst: str) -> None:
    src_path = Path(src)
    dst_path = Path(dst)
    converted = []
    with src_path.open() as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            record = json.loads(line)
            converted.append(alpaca_to_sharegpt(record))
    with dst_path.open("w") as f:
        for item in converted:
            f.write(json.dumps(item, ensure_ascii=False) + "\n")
    print(f"Converted {len(converted)} records → {dst_path}")


if __name__ == "__main__":
    convert_file("train_alpaca.jsonl", "train_sharegpt.jsonl")
```
Expected output:

```
Converted 52002 records → train_sharegpt.jsonl
```
If it fails:
- `KeyError: 'instruction'` → your source file uses a different schema; inspect with `python -c "import json; print(json.loads(open('file.jsonl').readline()).keys())"`
- `UnicodeDecodeError` → add `encoding="utf-8"` to both `open()` calls
Loading Each Format with Hugging Face Datasets
```python
# load_datasets.py
# Shows how to load both formats for inspection before training.
from datasets import load_dataset

# Alpaca — flat JSON keys
alpaca_ds = load_dataset("json", data_files="train_alpaca.jsonl", split="train")
print(alpaca_ds[0].keys())  # dict_keys(['instruction', 'input', 'output'])

# ShareGPT — nested conversations list
sharegpt_ds = load_dataset("json", data_files="train_sharegpt.jsonl", split="train")
print(sharegpt_ds[0]["conversations"][0])  # {'from': 'human', 'value': '...'}
```
Using ShareGPT with Unsloth (Llama 3.1, Qwen 2.5)
Unsloth's `standardize_sharegpt` converts ShareGPT's `from`/`value` turns (accepting the role-name variants above) into the `role`/`content` schema that `apply_chat_template` expects. This is the full pipeline as of Unsloth 2025.11:

```python
# unsloth_sharegpt_train.py
# Fine-tune Llama 3.1 8B Instruct on a ShareGPT dataset with Unsloth.
# Requires: unsloth[colab-new], datasets, trl
from unsloth import FastLanguageModel
from unsloth.chat_templates import get_chat_template, standardize_sharegpt
from datasets import load_dataset
from trl import SFTTrainer, SFTConfig

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Meta-Llama-3.1-8B-Instruct",
    max_seq_length=4096,
    load_in_4bit=True,
)
tokenizer = get_chat_template(tokenizer, chat_template="llama-3.1")

dataset = load_dataset("json", data_files="train_sharegpt.jsonl", split="train")
# Convert 'from'/'value' turns to the 'role'/'content' schema
dataset = standardize_sharegpt(dataset)

def apply_template(examples):
    # apply_chat_template expects a list of conversation dicts
    texts = [
        tokenizer.apply_chat_template(
            convo,
            tokenize=False,               # Return strings; SFTTrainer tokenizes internally
            add_generation_prompt=False,  # Don't append assistant prefix during training
        )
        for convo in examples["conversations"]
    ]
    return {"text": texts}

dataset = dataset.map(apply_template, batched=True)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    args=SFTConfig(
        dataset_text_field="text",  # Must match the key set in apply_template above
        max_seq_length=4096,
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        num_train_epochs=3,
        learning_rate=2e-4,
        output_dir="./outputs",
    ),
)
trainer.train()
```
Key parameters explained:
- `tokenize=False` — returns a formatted string; SFTTrainer handles tokenization so you don't apply it twice
- `add_generation_prompt=False` — during training you include the full assistant turn; only set `True` at inference
- `dataset_text_field="text"` — must match the key name you write in `apply_template`
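To build intuition for what the template pass produces, here is a rough hand-rolled renderer for a Llama 3-style template. This is a sketch only; the authoritative string always comes from `tokenizer.apply_chat_template`, and the helper below is hypothetical:

```python
# Approximate rendering of one conversation under a Llama 3-style chat
# template, for intuition only. Real output comes from the tokenizer's
# apply_chat_template; token spellings here follow Llama 3's documented
# special tokens.
ROLE_TO_TEMPLATE = {"human": "user", "gpt": "assistant", "system": "system"}

def render_llama3_style(conversations: list[dict]) -> str:
    out = "<|begin_of_text|>"
    for turn in conversations:
        role = ROLE_TO_TEMPLATE.get(turn["from"], turn["from"])
        out += (
            f"<|start_header_id|>{role}<|end_header_id|>\n\n"
            f"{turn['value']}<|eot_id|>"
        )
    return out

text = render_llama3_style([
    {"from": "human", "value": "hi"},
    {"from": "gpt", "value": "hello"},
])
```

Each turn is wrapped in explicit header/end tokens, which is what makes role-aware loss masking possible downstream.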
Using Alpaca with Axolotl
Axolotl still has strong Alpaca support via its `alpaca` dataset type. In your `config.yaml`:
```yaml
# axolotl_alpaca_config.yaml
base_model: mistralai/Mistral-Small-3.1-24B-Instruct
model_type: MistralForCausalLM

datasets:
  - path: train_alpaca.jsonl
    ds_type: json
    type: alpaca  # Axolotl's built-in Alpaca formatter

sequence_len: 2048
val_set_size: 0.02
output_dir: ./axolotl-output
num_epochs: 3
learning_rate: 2e-5
micro_batch_size: 2
gradient_accumulation_steps: 4
```
For ShareGPT in Axolotl, swap `type: alpaca` → `type: sharegpt`:
```yaml
datasets:
  - path: train_sharegpt.jsonl
    ds_type: json
    type: sharegpt
    conversation: llama-3  # Maps roles to Llama 3's chat template tokens
```
If Axolotl warns that `conversation` is not set, explicitly add `conversation: chatml` or the model-specific template name.
Validation: Catch Format Errors Before Training
Running a 3-hour fine-tune only to find malformed data at epoch 2 is painful. Validate first:
```python
# validate_sharegpt.py
# Checks every record in a ShareGPT JSONL for required keys and role names.
import json
import sys
from pathlib import Path

VALID_ROLES = {"human", "gpt", "system", "user", "assistant", "model"}


def validate_record(record: dict, idx: int) -> list[str]:
    errors = []
    if "conversations" not in record:
        errors.append(f"[{idx}] Missing 'conversations' key")
        return errors
    convos = record["conversations"]
    if not isinstance(convos, list) or len(convos) == 0:
        errors.append(f"[{idx}] 'conversations' must be a non-empty list")
        return errors
    for turn_idx, turn in enumerate(convos):
        if "from" not in turn:
            errors.append(f"[{idx}] Turn {turn_idx} missing 'from'")
        elif turn["from"] not in VALID_ROLES:
            errors.append(f"[{idx}] Turn {turn_idx} unknown role: {turn['from']}")
        if "value" not in turn or not isinstance(turn["value"], str):
            errors.append(f"[{idx}] Turn {turn_idx} missing or non-string 'value'")
    return errors


def validate_file(path: str) -> None:
    total = 0
    all_errors = []
    with Path(path).open() as f:
        for idx, line in enumerate(f):
            line = line.strip()
            if not line:
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError as e:
                all_errors.append(f"[{idx}] JSON parse error: {e}")
                continue
            all_errors.extend(validate_record(record, idx))
            total += 1
    if all_errors:
        print(f"Found {len(all_errors)} errors in {total} records:")
        for err in all_errors[:20]:  # Show first 20 to avoid wall of text
            print("  ", err)
        sys.exit(1)
    else:
        print(f"✅ {total} records valid")


if __name__ == "__main__":
    validate_file(sys.argv[1])
```
```bash
python validate_sharegpt.py train_sharegpt.jsonl
# ✅ 52002 records valid
```
Head-to-Head: When Each Format Wins
| Scenario | Winner | Reason |
|---|---|---|
| Chat assistant fine-tune | ShareGPT | Multi-turn is native; role masking works correctly |
| Single-turn summarization | Alpaca | Simpler structure, less conversion overhead |
| Tool/function calling data | ShareGPT | Tool turns map to from: tool naturally |
| Legacy dataset from 2023 | Alpaca | Already formatted; conversion adds risk with no benefit |
| Unsloth + Llama 3.3 | ShareGPT | standardize_sharegpt + apply_chat_template pipeline is battle-tested |
| Axolotl + Mistral | Either | Axolotl handles both natively |
| Mixing system prompts per example | ShareGPT | from: system per conversation; Alpaca has one global instruction field |
| Filtering by turn count | ShareGPT | len(example["conversations"]) is trivial; Alpaca has no concept of turns |
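The last table row is worth making concrete: because ShareGPT turns are an explicit list, filtering is a one-line predicate. Toy records below; with Hugging Face Datasets the same lambda goes straight into `dataset.filter`:

```python
# Filtering ShareGPT records by turn count. Toy records for illustration;
# the same predicate works as dataset.filter(lambda ex: ...) on a
# Hugging Face Dataset.
records = [
    {"conversations": [{"from": "human", "value": "hi"},
                       {"from": "gpt", "value": "hello"}]},
    {"conversations": [{"from": "human", "value": "a"},
                       {"from": "gpt", "value": "b"},
                       {"from": "human", "value": "c"},
                       {"from": "gpt", "value": "d"}]},
]

# Keep only conversations with at least two full exchanges (4 turns)
multi_turn = [r for r in records if len(r["conversations"]) >= 4]
```

An equivalent filter on Alpaca data would require parsing free text, since the format has no notion of a turn.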
What You Learned
- Alpaca is flat and fast to set up, but it has no real multi-turn support — don't use it for chat models in 2026.
- ShareGPT's `conversations` list maps cleanly to `apply_chat_template`, which every major framework now expects.
- Always run a validation script before training — malformed records cause late-epoch crashes, not early ones.
- `tokenize=False` + `add_generation_prompt=False` is the correct pairing during training; flip `add_generation_prompt=True` only at inference.
- Axolotl accepts both formats via `type: alpaca` or `type: sharegpt` — no code required, just config.
Tested on Unsloth 2025.11, Axolotl 0.7, TRL 0.9, Python 3.12, CUDA 12.4, RTX 4090 (24GB VRAM)
FAQ
Q: Can I mix Alpaca and ShareGPT records in one training run?
A: No. Each framework expects one format per dataset entry. Convert everything to ShareGPT first using the script above, then concatenate the JSONL files.
Q: What's the minimum number of ShareGPT records needed for fine-tuning?
A: Quality beats quantity. 500–1,000 high-quality, diverse records typically outperform 50,000 noisy ones. Start with 1,000 and evaluate before scaling.
Q: Does ShareGPT format work with OpenAI's fine-tuning API?
A: OpenAI uses a similar but distinct format — `messages` with `role`/`content` keys, not `conversations` with `from`/`value`. Convert with: `{"messages": [{"role": t["from"].replace("gpt","assistant").replace("human","user"), "content": t["value"]} for t in record["conversations"]]}`.
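That one-liner can be expanded into a safer helper using a dict lookup instead of chained `str.replace` (which could mangle any unexpected role string containing "gpt" or "human" as a substring). A hedged sketch:

```python
# ShareGPT → OpenAI messages format. Dict lookup is safer than chained
# str.replace, which would also rewrite substrings inside unexpected
# role names. Unknown roles pass through unchanged.
ROLE_MAP = {"human": "user", "gpt": "assistant", "system": "system"}

def sharegpt_to_openai(record: dict) -> dict:
    return {
        "messages": [
            {"role": ROLE_MAP.get(t["from"], t["from"]), "content": t["value"]}
            for t in record["conversations"]
        ]
    }

msgs = sharegpt_to_openai(
    {"conversations": [{"from": "human", "value": "hi"},
                       {"from": "gpt", "value": "hello"}]}
)
```

Run this per record when exporting a ShareGPT JSONL for OpenAI's fine-tuning endpoint.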
Q: How do I handle tool call turns in ShareGPT?
A: Add turns with "from": "tool" and put the tool result in "value". Unsloth and LLaMA-Factory both support this. Axolotl requires conversation: tool_use in the dataset config.
Q: Does the input field in Alpaca have a cost at inference?
A: It adds tokens to the prompt, so yes — longer prompts cost more on hosted APIs (OpenAI, Anthropic, etc., priced in USD per million tokens). In ShareGPT you can include the same context inside a human turn and control token usage more precisely.