CrewAI Output Pydantic: Structured Agent Results in 2026

Use Pydantic models in CrewAI to get typed, validated agent output. Stop parsing raw strings — get structured JSON from every task.

Problem: CrewAI Returns Raw Strings You Can't Reliably Parse

Your agent runs, produces a result, and then you write three lines of brittle string manipulation to extract the data you actually wanted. Then the model rephrases its output and your parser breaks.

You'll learn:

  • How to attach a Pydantic model to a CrewAI Task using output_pydantic
  • How to access the validated result directly from task.output.pydantic
  • How to handle validation errors without crashing your crew

Time: 15 min | Difficulty: Intermediate


Why Raw String Output Breaks in Production

By default, task.output.raw is a string. The LLM decides its own format. On one run you get:

Name: Acme Corp
Revenue: $4.2M
Founded: 2018

On the next:

{"company": "Acme Corp", "revenue": "4.2M", "founded": 2018}

Both are "correct" from the model's perspective. Neither is safe to parse without a schema. CrewAI's output_pydantic field solves this by instructing the task to validate its output against a Pydantic model before returning.

Symptoms of the problem:

  • KeyError or AttributeError when accessing agent results downstream
  • Inconsistent field names across runs (revenue vs annual_revenue)
  • Having to prompt-engineer JSON format into every task description
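The brittle-parser failure mode is easy to reproduce with nothing but the stdlib. A minimal sketch, using the two sample outputs above: a regex tuned to the first format silently returns nothing on the second.

```python
import re

# Run 1: the model emits labeled lines -- the regex works
raw = "Name: Acme Corp\nRevenue: $4.2M\nFounded: 2018"
match = re.search(r"Revenue:\s*\$?([\d.]+)M", raw)
revenue = float(match.group(1)) if match else None
print(revenue)  # 4.2

# Run 2: the model emits JSON with a lowercase key -- the same
# regex finds nothing, and the parser fails silently
raw = '{"company": "Acme Corp", "revenue": "4.2M", "founded": 2018}'
match = re.search(r"Revenue:\s*\$?([\d.]+)M", raw)
print(match)  # None
```

No exception is raised on the second run; the bad data just flows downstream, which is exactly why a schema beats string surgery.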

Solution

Step 1: Install Dependencies

# Requires crewai >= 0.28.0 and pydantic v2
pip install "crewai[tools]>=0.28.0" pydantic

Verify:

python -c "import crewai; print(crewai.__version__)"
# Expected: 0.28.0 or higher

Step 2: Define Your Pydantic Output Model

from pydantic import BaseModel, Field
from typing import Optional

class CompanyResearch(BaseModel):
    name: str = Field(description="Legal company name")
    founded: int = Field(description="Year founded as integer")
    revenue_usd_millions: float = Field(description="Annual revenue in USD millions")
    headquarters: str = Field(description="City, Country")
    summary: str = Field(description="2-sentence company overview")
    competitors: list[str] = Field(
        default_factory=list,
        description="Top 3 direct competitors by name"
    )

Keep field descriptions precise — CrewAI passes them to the LLM as formatting instructions.
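You can inspect what the model will actually be told: Pydantic exposes the generated JSON schema, descriptions included, via model_json_schema(). A quick sanity check, with the model trimmed to two fields for brevity:

```python
from pydantic import BaseModel, Field

class CompanyResearch(BaseModel):
    name: str = Field(description="Legal company name")
    founded: int = Field(description="Year founded as integer")

# The field descriptions become part of the schema the LLM sees
schema = CompanyResearch.model_json_schema()
print(schema["properties"]["founded"]["description"])  # Year founded as integer
print(schema["required"])  # ['name', 'founded']
```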


Step 3: Attach the Model to a Task

from crewai import Agent, Task, Crew, LLM

llm = LLM(model="openai/gpt-4o-mini")

researcher = Agent(
    role="Company Research Analyst",
    goal="Extract accurate company data from public sources",
    backstory="You specialize in structured business intelligence.",
    llm=llm,
    verbose=True,
)

research_task = Task(
    description=(
        "Research {company_name} and return structured data. "
        "Use publicly available information only."
    ),
    expected_output="Structured company profile with all required fields populated.",
    output_pydantic=CompanyResearch,  # <-- this is the key line
    agent=researcher,
)

output_pydantic tells CrewAI to:

  1. Append JSON formatting instructions to the task prompt automatically
  2. Parse the raw output string as JSON after the task completes
  3. Validate it against CompanyResearch using Pydantic
  4. Raise a ValidationError if required fields are missing or wrong type
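Steps 2 through 4 are plain Pydantic v2 under the hood, so you can reproduce the parse-and-validate behavior yourself. A sketch with a trimmed two-field model:

```python
from pydantic import BaseModel, ValidationError

class CompanyResearch(BaseModel):
    name: str
    founded: int

# Well-formed output: parsed and validated in one call
company = CompanyResearch.model_validate_json('{"name": "Acme Corp", "founded": 2018}')
print(company.founded)  # 2018

# Missing required field: validation fails with a precise error
try:
    CompanyResearch.model_validate_json('{"name": "Acme Corp"}')
except ValidationError as e:
    print(e.error_count())  # 1
```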

Step 4: Run the Crew and Access Typed Output

crew = Crew(
    agents=[researcher],
    tasks=[research_task],
    verbose=True,
)

result = crew.kickoff(inputs={"company_name": "Stripe"})

# Access the validated Pydantic object directly
company: CompanyResearch = research_task.output.pydantic

print(company.name)                    # "Stripe"
print(company.founded)                 # 2010  (int, not "2010")
print(company.revenue_usd_millions)    # 4200.0
print(company.competitors)             # ["Braintree", "Adyen", "Square"]

# Or serialize for downstream use
print(company.model_dump())
print(company.model_dump_json(indent=2))

Step 5: Handle Validation Failures Gracefully

The LLM occasionally returns malformed JSON or omits a required field. Wrap crew execution:

from pydantic import ValidationError

try:
    result = crew.kickoff(inputs={"company_name": "Stripe"})
    company = research_task.output.pydantic

    if company is None:
        # Output parsed but Pydantic validation failed — raw string is still available
        print("Structured parse failed. Raw output:")
        print(research_task.output.raw)
    else:
        process(company)

except ValidationError as e:
    # Pydantic schema mismatch — log fields that failed
    print(f"Validation errors: {e.error_count()}")
    for err in e.errors():
        print(f"  Field '{err['loc']}': {err['msg']}")

If pydantic is None after a successful run, the model returned text it believed was valid JSON but wasn't. Note that Pydantic v2 runs in lax mode by default, which already coerces compatible values (e.g., "2010" → 2010); only a model configured with model_config = ConfigDict(strict=True) rejects them, so make sure strict mode isn't enabled rather than adding strict=False.
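To see the lax-versus-strict difference concretely (pure Pydantic, no CrewAI involved):

```python
from pydantic import BaseModel, ConfigDict, ValidationError

class LaxModel(BaseModel):
    founded: int  # lax mode is Pydantic v2's default

class StrictModel(BaseModel):
    model_config = ConfigDict(strict=True)
    founded: int

# Default lax mode coerces the numeric string
print(LaxModel.model_validate({"founded": "2010"}).founded)  # 2010

# Strict mode rejects it
try:
    StrictModel.model_validate({"founded": "2010"})
except ValidationError:
    print("strict mode rejects the string")
```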


Step 6: Use Structured Output Across Multi-Agent Crews

Each task in a crew can have its own output model. Pass structured data between tasks using context (the enricher below is an Agent defined the same way as researcher):

from typing import Optional

from pydantic import BaseModel

class EnrichmentResult(BaseModel):
    company: str
    tech_stack: list[str]
    hiring: bool
    latest_funding_round: Optional[str] = None

enrich_task = Task(
    description=(
        "Using the company data provided in context, "
        "identify the tech stack and hiring status of {company_name}."
    ),
    expected_output="Enriched company profile with tech and hiring data.",
    output_pydantic=EnrichmentResult,
    agent=enricher,
    context=[research_task],  # pulls research_task.output into this task's prompt
)

Downstream tasks receive research_task.output.raw as context, and because output_pydantic forced valid JSON on the upstream task, that context is clean, parseable JSON rather than free-form prose.


Verification

crew = Crew(agents=[researcher, enricher], tasks=[research_task, enrich_task])
crew.kickoff(inputs={"company_name": "Linear"})

# Both should be non-None
assert research_task.output.pydantic is not None
assert enrich_task.output.pydantic is not None

# Type checks pass
assert isinstance(research_task.output.pydantic, CompanyResearch)
assert isinstance(enrich_task.output.pydantic, EnrichmentResult)

print("All structured outputs validated ✓")

You should see no assertion errors, and both Pydantic objects will be accessible with full type hints in your IDE.


What You Learned

  • output_pydantic on a Task enforces a schema without any manual parsing
  • Access the result via task.output.pydantic — it's a real Pydantic object, not a dict
  • task.output.raw is always available as fallback if validation fails
  • Multi-task crews benefit most: structured upstream output becomes clean context for downstream tasks

Limitation: output_pydantic adds roughly 50–100 tokens of JSON formatting instructions to each task prompt. For token-sensitive deployments, output_json is a lighter alternative: it accepts the same Pydantic model but returns a plain dict (via task.output.json_dict) instead of instantiating a validated model object.

Tested on CrewAI 0.80.0, Pydantic 2.7, Python 3.12, gpt-4o-mini and claude-3-5-haiku