Reasons to use dataclasses over Pydantic BaseModel
While Pydantic is the industry standard for external data (APIs, JSON parsing), Python's built-in dataclasses are often the better choice for internal data.
To answer your specific questions:
- Is it speed? YES. Dataclasses are significantly faster at creating objects (instantiation).
- Is it strict data type? NO. Pydantic is the stricter of the two because it validates types at runtime. Dataclasses do not; they accept whatever you give them, regardless of the type hints.
Here are the 4 most convincing reasons to use Dataclasses over Pydantic in a modern app, with examples for each.
1. Speed: The "Tight Loop" Argument
Pydantic runs a complex validation engine every time you create an object. Dataclasses just assign values to memory. In a "tight loop" (creating millions of objects), Pydantic can be a bottleneck.
The Benchmark Logic (rough orders of magnitude, not measured values):
- Pydantic: checks types, converts values, validates constraints. (~1000ns per object)
- Dataclass: `self.x = x`. (~100ns per object)
Example:
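```python
from dataclasses import dataclass
from pydantic import BaseModel
import timeit

# 1. The Dataclass
@dataclass(slots=True)  # slots=True (Python 3.10+) makes it even faster
class PointDC:
    x: int
    y: int

# 2. The Pydantic Model
class PointPM(BaseModel):
    x: int
    y: int

# Benchmarking 1 million creations
def loop_dc():
    return [PointDC(i, i) for i in range(1_000_000)]

def loop_pm():
    return [PointPM(x=i, y=i) for i in range(1_000_000)]

# Time one pass of each loop
print(f"dataclass: {timeit.timeit(loop_dc, number=1):.2f}s")
print(f"pydantic:  {timeit.timeit(loop_pm, number=1):.2f}s")

# Result: dataclasses typically win by a wide margin here, on the order of 10x,
# though the exact ratio depends on the Pydantic version and hardware.
# Use Dataclasses when processing large lists of data internally.
```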
2. Predictability: No "Magic" Coercion
Pydantic tries to be helpful by "coercing" data. In its default (lax) mode, if you pass the string "42" to an int field, Pydantic converts it to 42. A strict mode exists that disables this, but coercion is the default.
Dataclasses are "dumb": they keep exactly what you passed. This is often safer for internal logic where silent type conversion could hide a bug.
Example:
```python
from dataclasses import dataclass
from pydantic import BaseModel

@dataclass
class InventoryItemDC:
    name: str
    quantity: int

class InventoryItemPM(BaseModel):
    name: str
    quantity: int

# --- The "Magic" Difference ---
# Pydantic (default lax mode): SILENTLY converts the string "5" to integer 5
item_pm = InventoryItemPM(name="Apple", quantity="5")
print(type(item_pm.quantity))  # <class 'int'> (Magic happened)

# Dataclass: Preserves the string "5" (even though type hint says int)
item_dc = InventoryItemDC(name="Apple", quantity="5")
print(type(item_dc.quantity))  # <class 'str'> (No magic)

# Why use Dataclasses?
# If 'quantity' coming in as a string is a BUG in your code,
# Pydantic hides the bug. Dataclasses keep the value untouched,
# so the bug can surface and you can fix the root cause.
```
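If a wrong type should fail at construction time rather than at some later operation, a dataclass can check its own fields without converting them. This is a minimal sketch, not a general validator: the class name `StrictInventoryItem` is made up for illustration, and the check is written by hand for one field.

```python
from dataclasses import dataclass


@dataclass
class StrictInventoryItem:  # hypothetical name, for illustration only
    name: str
    quantity: int

    def __post_init__(self) -> None:
        # Reject the wrong type outright instead of coercing it.
        if not isinstance(self.quantity, int):
            raise TypeError(
                f"quantity must be int, got {type(self.quantity).__name__}"
            )
```

With this in place, `StrictInventoryItem(name="Apple", quantity="5")` raises `TypeError` at the call site that passed the string, which is usually where the root cause lives.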
3. Zero Dependencies (Library Authors)
If you are writing a library (like a client SDK or a utility tool) that other people will install, you want to keep your "dependency weight" low.
- Dataclasses: Built into Python. Zero extra install size.
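- Pydantic: Pydantic v2 depends on pydantic-core, a compiled binary extension written in Rust. It adds weight to the installation.
Example: If you are building a simple CLI tool, Pydantic and its compiled core can add several megabytes to your Docker image or virtual environment (the exact size depends on version and platform). Using Dataclasses adds 0MB.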
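For a small library, the standard library alone usually covers defining a record and serializing it. The sketch below assumes a hypothetical `ClientConfig` type for an SDK; the class, field, and helper names are illustrative only.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class ClientConfig:  # hypothetical SDK config type
    base_url: str
    timeout_seconds: float = 10.0
    headers: dict[str, str] = field(default_factory=dict)


def to_json(config: ClientConfig) -> str:
    """Serialize using only the standard library."""
    return json.dumps(asdict(config), sort_keys=True)


def from_json(raw: str) -> ClientConfig:
    """Rebuild the config; no validation or coercion is performed."""
    return ClientConfig(**json.loads(raw))
```

The trade-off is visible in `from_json`: nothing checks the incoming values, so this fits data the library produced itself, not arbitrary user input.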
4. Application Startup Time
Pydantic models are "expensive" to define. When Python first imports a file containing a BaseModel, Pydantic has to inspect fields, build validators, and compile schemas.
Dataclasses are cheap by comparison: the decorator only generates a few methods when the class is defined.
Example: For a standard web server (Django/FastAPI), this doesn't matter. But for AWS Lambda or CLI tools (where the script runs for 1 second and dies), Pydantic's import overhead can noticeably slow down the "cold start" time.
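Import overhead is easy to measure rather than guess. Here is a rough sketch that times a fresh interpreter importing a module. The helper name `import_seconds` is made up for this example, and the result includes interpreter startup, so it is only meaningful relative to a baseline module.

```python
import subprocess
import sys
import time


def import_seconds(module: str) -> float:
    """Wall-clock time for a fresh interpreter to import `module`."""
    start = time.perf_counter()
    subprocess.run([sys.executable, "-c", f"import {module}"], check=True)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Baseline: a standard-library module. Add "pydantic" if it is installed.
    for name in ("dataclasses",):
        print(f"{name}: {import_seconds(name):.3f}s")
```

For a per-module breakdown, CPython also has a built-in option: `python -X importtime -c "import pydantic"`.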
Summary: The Decision Matrix
| Feature | Use Dataclasses When... | Use Pydantic When... |
|---|---|---|
| Data Source | Trusted (Internal functions, DB results). | Untrusted (User JSON, API payloads). |
| Speed | You are creating 100k+ objects in a loop. | You are processing single requests. |
| Types | You want exact values (no auto-conversion). | You want smart conversion (str → int). |
| Environment | Library code, Lambdas, Scripts. | Web APIs (FastAPI), Config management. |
The Modern Hybrid Approach: Many production Python apps use both.
- Use Pydantic at the "Edge" (API endpoints) to validate/clean incoming data.
- Convert that data into Dataclasses for the "Core" (business logic) to pass it around quickly and cheaply.
Relevant Video
Pydantic vs Dataclasses - ArjanCodes
This video provides a practical code walkthrough comparing the syntax and use-cases of both, reinforcing the "trusted vs untrusted" distinction.
