Introduction
Pydantic is the most widely used data validation library in the Python ecosystem. If you have worked with FastAPI, you have used Pydantic — it is what makes request body parsing and response serialization type-safe. Pydantic v2 (released June 2023) rewrote the core in Rust, making validation 5–50x faster than v1 while adding a cleaner, more explicit API.
This guide covers Pydantic v2 from the ground up: how the type system works, how to configure fields, how to validate and parse JSON, how to serialize models back to dicts and JSON, and how to write custom validators. It also covers the key differences from v1 for teams migrating existing code.
Tool tip: If you have a JSON response and want Pydantic model code generated automatically, use the JSON to Python Dataclass / Pydantic Generator. It handles camelCase → snake_case conversion and Field(alias=...) generation.
Installation
Pydantic v2 requires Python 3.8+. Install it with pip:
pip install pydantic          # installs v2 by default since June 2023
pip install pydantic==2.7.0   # pin a specific v2 version
Pydantic v2 ships with pydantic-core (the Rust extension) bundled automatically. You do not need to install anything separately. Verify the version with:
import pydantic
print(pydantic.VERSION)  # e.g., "2.7.0"
BaseModel Basics
The core of Pydantic is BaseModel. Define a class that inherits from it and annotate fields with Python type hints. Pydantic reads those annotations at class-creation time and builds a validation schema from them.
from datetime import datetime
from pydantic import BaseModel

class User(BaseModel):
    id: int
    name: str
    email: str
    is_active: bool = True
    created_at: datetime

# Construction: Pydantic validates and coerces values
user = User(
    id=1,
    name="Alice",
    email="[email protected]",
    created_at="2024-01-15T10:30:00Z",  # string → datetime automatically
)
print(user.id)          # 1 (int, not str "1")
print(user.created_at)  # 2024-01-15 10:30:00+00:00 (a timezone-aware datetime)
print(user.is_active)   # True (default applied)

A critical point: Pydantic v2 validates on construction and raises ValidationError if values cannot be coerced to the annotated types. In the example above, the string "2024-01-15T10:30:00Z" is automatically parsed into a datetime object, so you do not need to call datetime.fromisoformat() yourself. This is called lax mode (the default): Pydantic tries reasonable coercions before failing.
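To make the lax-mode coercions concrete, here is a minimal sketch. The Event model and its field names are hypothetical, chosen only for illustration; the coercions shown (numeric strings to numbers, "yes" to a boolean, an ISO 8601 string to a datetime) are ones Pydantic v2 performs by default.

```python
from datetime import datetime, timezone
from pydantic import BaseModel

class Event(BaseModel):  # hypothetical model, for illustration only
    attendees: int
    ratio: float
    public: bool
    starts_at: datetime

# Every value is passed as a string; lax mode converts each one
event = Event(
    attendees="12",
    ratio="0.5",
    public="yes",
    starts_at="2024-01-15T10:30:00Z",
)

print(type(event.attendees).__name__)  # int
print(event.public)                    # True
print(event.starts_at == datetime(2024, 1, 15, 10, 30, tzinfo=timezone.utc))  # True
```

The stored values are real Python types, not the original strings, so downstream code can rely on the annotations.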
Validation errors
When validation fails, Pydantic raises ValidationError with a structured list of errors. Each error has a location, type, and human-readable message:
from pydantic import ValidationError

try:
    User(id="not-an-int", name="Bob", email="[email protected]", created_at="not-a-date")
except ValidationError as e:
    print(e.error_count())  # 2
    for err in e.errors():
        print(err["loc"], err["type"], err["msg"])
    # ('id',) int_parsing Input should be a valid integer...
    # ('created_at',) datetime_from_date_parsing Input should be a valid datetime or date...

Configuring Fields with Field()
The Field() function adds constraints and metadata to individual fields. It replaces what required an entire custom validator in earlier libraries. Common uses: setting default values, adding JSON aliases, and constraining numeric or string ranges.
from pydantic import BaseModel, Field
from typing import Annotated

class Product(BaseModel):
    # Alias maps JSON "productName" → Python "product_name"
    product_name: str = Field(alias="productName", min_length=1, max_length=200)
    # Constrained numeric fields
    price: float = Field(gt=0, le=1_000_000)          # 0 < price <= 1,000,000
    quantity: int = Field(default=0, ge=0)            # non-negative, defaults to 0
    discount: float = Field(default=0.0, ge=0, lt=1)  # 0 <= discount < 1
    # Description and example for OpenAPI docs (FastAPI reads these)
    sku: str = Field(
        pattern=r"^[A-Z]{2}-[0-9]{4}$",
        description="Product SKU in format AB-1234",
        examples=["XY-0001"],
    )

# Construction uses the alias
p = Product(**{"productName": "Laptop", "price": 999.99, "sku": "XY-0001"})
print(p.product_name)  # "Laptop"

# Or use populate_by_name to allow both alias and field name:
from pydantic import ConfigDict

class Product2(BaseModel):
    model_config = ConfigDict(populate_by_name=True)
    product_name: str = Field(alias="productName")

The Annotated pattern is the v2-preferred way to attach constraints. It keeps the type and its constraints together in one annotation, so a constrained type can be defined once and reused across models:
from typing import Annotated
from pydantic import BaseModel, Field

# Define a reusable constrained type
PositiveFloat = Annotated[float, Field(gt=0)]
ShortString = Annotated[str, Field(min_length=1, max_length=100)]

class Item(BaseModel):
    name: ShortString
    price: PositiveFloat

Nested Models
Pydantic automatically validates nested models. A field annotated as another BaseModel subclass is validated recursively:
from pydantic import BaseModel
from typing import List

class Address(BaseModel):
    street: str
    city: str
    zip_code: str
    country: str = "US"

class OrderItem(BaseModel):
    product_id: int
    quantity: int
    unit_price: float

class Order(BaseModel):
    order_id: str
    customer_email: str
    shipping_address: Address
    items: List[OrderItem]
    total: float

# Nested dicts are coerced into the appropriate models
order = Order(**{
    "order_id": "ORD-001",
    "customer_email": "[email protected]",
    "shipping_address": {
        "street": "123 Main St",
        "city": "Austin",
        "zip_code": "78701"
    },
    "items": [
        {"product_id": 42, "quantity": 2, "unit_price": 29.99},
        {"product_id": 77, "quantity": 1, "unit_price": 49.99}
    ],
    "total": 109.97
})
print(order.shipping_address.city)  # "Austin"
print(order.items[0].product_id)    # 42

Parsing JSON: model_validate_json()
In v2, the correct way to parse a JSON string directly into a model is model_validate_json(). This is faster than json.loads() followed by model_validate() because pydantic-core handles the JSON parsing and validation in a single pass in Rust.
from datetime import datetime
from pydantic import BaseModel

class ApiResponse(BaseModel):
    user_id: int
    username: str
    last_login: datetime
    roles: list[str]

json_string = """
{
    "user_id": 42,
    "username": "alice",
    "last_login": "2024-09-15T14:22:00Z",
    "roles": ["admin", "editor"]
}
"""

# Fastest path: parse JSON + validate in one step
response = ApiResponse.model_validate_json(json_string)
print(response.last_login)  # 2024-09-15 14:22:00+00:00 (a timezone-aware datetime)
print(response.roles)       # ['admin', 'editor']

# From a dict (already-parsed JSON):
data = {"user_id": 42, "username": "alice", "last_login": "2024-09-15T14:22:00Z", "roles": ["admin"]}
response2 = ApiResponse.model_validate(data)

v1 migration note: In Pydantic v1, you used User.parse_raw(json_string) and User.parse_obj(dict). In v2 these are deprecated. Use model_validate_json() and model_validate() instead.
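Malformed JSON and wrong-typed values both surface as ValidationError from model_validate_json(), so they can be handled in one place. The sketch below assumes hypothetical Cart and LineItem models and a hypothetical error_summary helper; none of these names come from Pydantic itself.

```python
import json
from pydantic import BaseModel, ValidationError

class LineItem(BaseModel):  # hypothetical model
    sku: str
    qty: int

class Cart(BaseModel):  # hypothetical model
    cart_id: int
    items: list[LineItem]

def error_summary(raw: str) -> list[tuple]:
    """Return (location, error type) pairs, or an empty list if raw is valid."""
    try:
        Cart.model_validate_json(raw)
    except ValidationError as exc:
        return [(err["loc"], err["type"]) for err in exc.errors()]
    return []

good = '{"cart_id": 1, "items": [{"sku": "XY-0001", "qty": 2}]}'
truncated = '{"cart_id": 1, "items": ['
wrong_type = '{"cart_id": 1, "items": [{"sku": "XY-0001", "qty": "two"}]}'

print(error_summary(good))        # []
print(error_summary(truncated))   # a single error of type "json_invalid"
print(error_summary(wrong_type))  # [(('items', 0, 'qty'), 'int_parsing')]

# Both parsing paths produce an equal model for valid input
assert Cart.model_validate_json(good) == Cart.model_validate(json.loads(good))
```

Note how the location tuple for the nested failure walks through the field name, the list index, and the inner field, which is usually enough to point an API client at the offending value.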
Serialization: model_dump() and model_dump_json()
To convert a model back to a plain dict or JSON string, use model_dump() and model_dump_json(). These replace v1's .dict() and .json() methods.
from datetime import datetime
from pydantic import BaseModel, Field

class User(BaseModel):
    user_id: int
    name: str = Field(alias="userName")
    email: str
    created_at: datetime

user = User(user_id=1, userName="Alice", email="[email protected]",
            created_at=datetime(2024, 1, 15, 10, 30))

# model_dump(): returns a Python dict
d = user.model_dump()
# {'user_id': 1, 'name': 'Alice', 'email': '[email protected]', 'created_at': datetime(...)}

# Use aliases in output (useful for REST API responses with camelCase keys)
d_aliased = user.model_dump(by_alias=True)
# {'user_id': 1, 'userName': 'Alice', 'email': '[email protected]', 'created_at': datetime(...)}

# Exclude fields
d_partial = user.model_dump(exclude={"email"})

# model_dump_json(): returns a JSON string directly (faster than json.dumps(model_dump()))
json_str = user.model_dump_json()
# '{"user_id":1,"name":"Alice","email":"[email protected]","created_at":"2024-01-15T10:30:00"}'

# Serialize to JSON with aliases as keys
json_str2 = user.model_dump_json(by_alias=True)
# '{"user_id":1,"userName":"Alice",...}'

Custom Validators
When built-in constraints (gt, min_length, pattern) are not enough, use @field_validator for single-field logic and @model_validator for cross-field logic.
@field_validator
from pydantic import BaseModel, field_validator

class PasswordForm(BaseModel):
    username: str
    password: str
    confirm_password: str

    @field_validator("password")
    @classmethod
    def password_strength(cls, v: str) -> str:
        if len(v) < 8:
            raise ValueError("Password must be at least 8 characters")
        if not any(c.isupper() for c in v):
            raise ValueError("Password must contain at least one uppercase letter")
        return v  # return the (possibly transformed) value

    # mode="before" runs BEFORE Pydantic's own type coercion,
    # so v may not be a str yet; only normalize actual strings
    @field_validator("username", mode="before")
    @classmethod
    def strip_username(cls, v: object) -> object:
        if isinstance(v, str):
            return v.strip().lower()
        return v

@model_validator for cross-field rules
from datetime import date
from typing import Self  # Python 3.11+; on older versions, import Self from typing_extensions
from pydantic import BaseModel, model_validator

class PasswordForm(BaseModel):
    password: str
    confirm_password: str

    @model_validator(mode="after")
    def passwords_must_match(self) -> Self:
        if self.password != self.confirm_password:
            raise ValueError("Passwords do not match")
        return self

class DateRange(BaseModel):
    # date fields compare chronologically; plain strings would compare lexicographically
    start_date: date
    end_date: date

    @model_validator(mode="after")
    def end_after_start(self) -> Self:
        if self.end_date <= self.start_date:
            raise ValueError("end_date must be after start_date")
        return self

Strict Mode vs Lax Mode
By default, Pydantic uses lax mode: it tries reasonable type coercions before raising an error. Passing the string "42" where an int is expected will succeed. This is convenient for input that arrives as text, such as query strings, form data, or environment variables, but it can mask bugs when constructing models programmatically.
Strict mode disables coercions: the value must already be exactly the right type. This is useful for internal code where you control the input and want to catch type errors early.
from typing import Annotated
from pydantic import BaseModel, ConfigDict, Field, ValidationError

# Strict at the model level
class StrictUser(BaseModel):
    model_config = ConfigDict(strict=True)
    age: int

StrictUser(age=25)  # OK
try:
    StrictUser(age="25")  # "25" is a str, not an int
except ValidationError as exc:
    print(exc.errors()[0]["type"])  # int_type

# Strict per-field only
class MixedModel(BaseModel):
    strict_age: Annotated[int, Field(strict=True)]
    lenient_score: float  # still allows "3.14" → 3.14

# Strict for a single call on an otherwise lax model
class LaxUser(BaseModel):
    age: int

LaxUser.model_validate({"age": "25"})             # OK in lax mode: "25" → 25
LaxUser.model_validate({"age": 25}, strict=True)  # OK; {"age": "25"} would raise here

FastAPI Integration
FastAPI is built on Pydantic. Request bodies and response models are typically declared as Pydantic models, and FastAPI also uses Pydantic to validate query and path parameters. When you use v2 with FastAPI 0.100+, you get automatic request validation, detailed error responses, and OpenAPI documentation generation without extra code.
from typing import Optional
from fastapi import FastAPI
from pydantic import BaseModel, EmailStr, Field

app = FastAPI()

# Request body model
class CreateUserRequest(BaseModel):
    name: str = Field(min_length=1, max_length=100)
    email: EmailStr  # requires "pydantic[email]"
    age: Optional[int] = Field(default=None, ge=0, le=150)

# Response model: controls which fields are returned
class UserResponse(BaseModel):
    id: int
    name: str
    email: str

@app.post("/users", response_model=UserResponse, status_code=201)
async def create_user(body: CreateUserRequest):
    # body has already been validated by FastAPI/Pydantic.
    # `db` is a placeholder for your own async data-access layer; it is not defined here.
    user = await db.create_user(body.name, body.email, body.age)
    # The returned model is serialized to JSON automatically
    return UserResponse(id=user.id, name=user.name, email=user.email)

The response_model parameter ensures that only the fields defined in UserResponse are included in the response, even if the database model has additional fields like password_hash. This is a security feature, not just a convenience.
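The filtering idea can be seen with Pydantic alone. This is a sketch of the principle, not of FastAPI's internals: it assumes a hypothetical stored_user dict standing in for a database row, and relies on Pydantic's default behavior of ignoring input fields that the model does not declare.

```python
from pydantic import BaseModel

class UserResponse(BaseModel):
    id: int
    name: str
    email: str

# Hypothetical database row carrying a sensitive extra column
stored_user = {
    "id": 7,
    "name": "Alice",
    "email": "alice@example.com",
    "password_hash": "not-a-real-hash",
}

# Validating through the response model keeps only the declared fields
public = UserResponse.model_validate(stored_user).model_dump()
print(public)  # {'id': 7, 'name': 'Alice', 'email': 'alice@example.com'}
```

Because the response model acts as an allowlist, a column added to the database later does not leak into responses unless someone adds it to UserResponse on purpose.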
Key Differences from Pydantic v1
If you are migrating from v1, these are the most common changes:
| Pydantic v1 | Pydantic v2 |
|---|---|
| .dict() | .model_dump() |
| .json() | .model_dump_json() |
| .parse_obj(d) | .model_validate(d) |
| .parse_raw(s) | .model_validate_json(s) |
| @validator | @field_validator |
| @root_validator | @model_validator |
| class Config: orm_mode = True | model_config = ConfigDict(from_attributes=True) |
| class Config: allow_population_by_field_name = True | model_config = ConfigDict(populate_by_name=True) |
Pydantic ships a compatibility layer: from pydantic.v1 import BaseModel lets you import the v1 API from within a v2 installation. This is useful if you have a dependency that still uses v1 — you can avoid conflicts while you migrate your own code incrementally.
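As a rough illustration of the table, the sketch below is written entirely against the v2 API, with the v1 spelling noted in a comment beside each call. The Account model and its validator are hypothetical and exist only to exercise the renamed methods.

```python
from pydantic import BaseModel, field_validator

class Account(BaseModel):  # hypothetical model
    owner: str
    balance: float

    @field_validator("owner")  # v1: @validator("owner")
    @classmethod
    def owner_not_blank(cls, v: str) -> str:
        if not v.strip():
            raise ValueError("owner must not be blank")
        return v.strip()

raw = {"owner": " Alice ", "balance": "10.5"}

account = Account.model_validate(raw)            # v1: Account.parse_obj(raw)
as_dict = account.model_dump()                   # v1: account.dict()
as_json = account.model_dump_json()              # v1: account.json()
restored = Account.model_validate_json(as_json)  # v1: Account.parse_raw(as_json)

print(as_dict)              # {'owner': 'Alice', 'balance': 10.5}
print(restored == account)  # True
```

Note that v2 validators are declared with an explicit @classmethod, which is one of the changes that tends to be missed in a mechanical search-and-replace migration.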
Reading ORM Objects (from_attributes)
When you use an ORM like SQLAlchemy or Django ORM, you often need to convert ORM model instances into Pydantic models for API responses. Enable from_attributes=True in the model config to let Pydantic read attribute access (obj.field) instead of key access (obj["field"]):
from pydantic import BaseModel, ConfigDict

class UserSchema(BaseModel):
    model_config = ConfigDict(from_attributes=True)
    id: int
    name: str
    email: str

# Works with SQLAlchemy ORM instances:
# db_user = session.get(User, 1)               # SQLAlchemy ORM object
# schema = UserSchema.model_validate(db_user)  # reads db_user.id, db_user.name, etc.

Pydantic, Dataclasses, or TypedDict?
Pydantic is not always the right tool. Choose based on what you need:
- Pydantic BaseModel: when you need runtime validation, JSON parsing, serialization, or FastAPI integration. The validation overhead (microseconds per model in v2) is negligible for API workloads.
- Python dataclasses: when you need no external dependencies, the data is internal-only, and you never serialize to JSON. Standard library only; no validation.
- TypedDict: when you are typing existing code that passes plain dicts around and cannot change to class instances. Provides mypy/pyright checking; zero runtime overhead.
- Pydantic dataclasses: when you want dataclass syntax (@dataclass) but with Pydantic validation. Use from pydantic.dataclasses import dataclass; it is a drop-in replacement for the standard dataclass decorator.
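The difference between the standard and the Pydantic dataclass decorators shows up at construction time. The following is a small sketch with hypothetical PlainPoint and CheckedPoint classes; the two decorators are imported under aliases only so that both can appear in one file.

```python
from dataclasses import dataclass as std_dataclass
from pydantic import ValidationError
from pydantic.dataclasses import dataclass as pydantic_dataclass

@std_dataclass
class PlainPoint:  # hypothetical: standard library dataclass
    x: int
    y: int

@pydantic_dataclass
class CheckedPoint:  # hypothetical: same fields, validated by Pydantic
    x: int
    y: int

# The standard dataclass stores whatever it is given
plain = PlainPoint(x="1", y="oops")

# The Pydantic dataclass coerces "1" to 1 ...
checked = CheckedPoint(x="1", y=2)

# ... and rejects values that cannot be coerced
try:
    CheckedPoint(x="1", y="oops")
    rejected = False
except ValidationError:
    rejected = True

print(plain.x, checked.x, rejected)  # 1 1 True
```

The printed values look alike, but plain.x is still the string "1" while checked.x is the integer 1.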