Django Under the Hood 06: Django’s Form Pipeline - From POST Data to Validated Python Objects
Part 6 of the “Django Under the Hood” series: deep dives into Django’s internals, edge cases, and the mechanics that separate production-grade applications from tutorial code.
if form.is_valid():
    form.save()
Two lines. Fourteen steps.
Between request.POST and a saved database record, Django executes a precise sequence of operations: binding data, coercing types, running validators, checking constraints, cleaning fields, and validating across fields.
Most developers learn “call is_valid(), check errors, call save().” That's enough for basic forms. But when validation behaves unexpectedly (when clean_fieldname doesn't run, when errors appear on the wrong field, when cleaned_data is missing keys), understanding the pipeline is the only reliable way to debug it.
Let’s trace a form submission from raw POST bytes to validated Python objects.
The Form Class Hierarchy
class ContactForm(forms.Form):
    name = forms.CharField(max_length=100)
    email = forms.EmailField()
    message = forms.CharField(widget=forms.Textarea)
This inherits from:
ContactForm
└── Form
    └── BaseForm
        └── RenderableFormMixin
The real logic lives in BaseForm:
# django/forms/forms.py (simplified)
class BaseForm(RenderableFormMixin):
    def __init__(self, data=None, files=None, ...):
        self.is_bound = data is not None or files is not None
        self.data = {} if data is None else data  # Django uses MultiValueDict() here
        self.files = {} if files is None else files
        self._errors = None  # Cached validation errors; filled by full_clean()
        # cleaned_data is not created here; full_clean() sets it during validation
Key concept: A form is bound when it receives data. Unbound forms render empty fields. Bound forms validate and show errors.
# Unbound - for displaying empty form
form = ContactForm()
# Bound - for processing submission
form = ContactForm(data=request.POST)
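One consequence of the `data is not None` test is worth spelling out: an empty submission still binds the form. The sketch below is Django-free and copies only that rule; `TinyForm` is a hypothetical stand-in, not Django's class.

```python
class TinyForm:
    """Hypothetical stand-in that copies only BaseForm's binding rule."""

    def __init__(self, data=None, files=None):
        # Bound means "data or files were passed", even if they are empty.
        self.is_bound = data is not None or files is not None
        self.data = {} if data is None else data
        self.files = {} if files is None else files


unbound = TinyForm()
empty_post = TinyForm(data={})  # e.g. a POST with nothing filled in

print(unbound.is_bound)     # False
print(empty_post.is_bound)  # True
```

This is also why the common `ContactForm(request.POST or None)` idiom leaves the form unbound on a GET request: the empty POST dict is falsy and becomes None.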
Field Declaration: Metaclass Magic
How does Django know what fields exist?
class ContactForm(forms.Form):
    name = forms.CharField()
    email = forms.EmailField()
The Form class uses a metaclass that collects field declarations:
# django/forms/forms.py (simplified)
class DeclarativeFieldsMetaclass(MediaDefiningClass):
    def __new__(mcs, name, bases, attrs):
        # Collect fields from class attributes
        declared_fields = {}
        for key, value in list(attrs.items()):
            if isinstance(value, Field):
                declared_fields[key] = value
                attrs.pop(key)  # Remove from class
        # Inherit fields from parent classes
        for base in bases:
            if hasattr(base, 'declared_fields'):
                declared_fields = {**base.declared_fields, **declared_fields}
        new_class = super().__new__(mcs, name, bases, attrs)
        new_class.declared_fields = declared_fields
        return new_class
At class creation time, fields move from class attributes to declared_fields:
ContactForm.declared_fields
# {'name': <CharField>, 'email': <EmailField>, 'message': <CharField>}
Each form instance gets its own copy (Django copies from base_fields, which the metaclass sets to the same dict as declared_fields):
def __init__(self, ...):
    self.fields = copy.deepcopy(self.base_fields)
This allows per-instance field modification:
form = ContactForm()
form.fields['email'].required = False # Only affects this instance
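To see the collection and the per-instance copy working together, here is a minimal Django-free sketch. `Field`, `FieldCollector`, and `MiniForm` are hypothetical names that imitate the mechanism described above; Django's real metaclass handles more cases (for example, shadowing a field with None).

```python
import copy


class Field:
    """Hypothetical minimal field; real Django fields carry much more state."""

    def __init__(self, required=True):
        self.required = required


class FieldCollector(type):
    """Moves Field attributes into a class-level declared_fields dict."""

    def __new__(mcs, name, bases, attrs):
        declared = {
            key: attrs.pop(key)
            for key, value in list(attrs.items())
            if isinstance(value, Field)
        }
        new_class = super().__new__(mcs, name, bases, attrs)
        # Parent fields come first, then this class's own declarations.
        merged = {}
        for base in reversed(new_class.__mro__[1:]):
            merged.update(getattr(base, "declared_fields", {}))
        merged.update(declared)
        new_class.declared_fields = merged
        return new_class


class MiniForm(metaclass=FieldCollector):
    def __init__(self):
        # Deep copy so per-instance changes never leak back to the class.
        self.fields = copy.deepcopy(self.declared_fields)


class ContactForm(MiniForm):
    name = Field()
    email = Field()


form = ContactForm()
form.fields["email"].required = False  # only this instance changes

print(list(ContactForm.declared_fields))               # ['name', 'email']
print(hasattr(ContactForm, "name"))                    # False
print(ContactForm.declared_fields["email"].required)   # True
```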
The Validation Pipeline: 14 Steps
When you call form.is_valid():
# django/forms/forms.py
def is_valid(self):
    return self.is_bound and not self.errors

@property
def errors(self):
    if self._errors is None:
        self.full_clean()  # The validation pipeline
    return self._errors
full_clean() orchestrates everything:
def full_clean(self):
    self._errors = ErrorDict()
    if not self.is_bound:
        return  # Nothing to validate
    self.cleaned_data = {}  # Created here, not in __init__
    self._clean_fields()  # Steps 1-8 (per field)
    self._clean_form()  # Steps 9-12 (cross-field)
    self._post_clean()  # Steps 13-14 (ModelForm hook)
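Because errors caches its result in _errors, the pipeline runs at most once per form instance, no matter how often is_valid() or errors is touched. A small Django-free model of that caching, where `CachedErrors` and its call counter are hypothetical additions for illustration:

```python
class CachedErrors:
    """Hypothetical model of BaseForm's lazy, cached validation."""

    def __init__(self, data=None):
        self.is_bound = data is not None
        self._errors = None
        self.full_clean_calls = 0  # counter added only for this demo

    def full_clean(self):
        self.full_clean_calls += 1
        self._errors = {}

    @property
    def errors(self):
        if self._errors is None:
            self.full_clean()
        return self._errors

    def is_valid(self):
        return self.is_bound and not self.errors


form = CachedErrors(data={})
form.is_valid()
form.is_valid()
form.errors
print(form.full_clean_calls)  # 1

unbound = CachedErrors()
print(unbound.is_valid(), unbound.full_clean_calls)  # False 0
```

The practical upshot: changing form.data or form.fields after the first is_valid() call will not trigger a second validation pass.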
Steps 1–8: Field Cleaning (_clean_fields)
def _clean_fields(self):
    for name, field in self.fields.items():
        # Step 1: Get raw value from data
        value = self._field_data_value(field, self.add_prefix(name))
        try:
            # Steps 2-5 live inside field.clean(value); inlined here for clarity
            # Step 2: Required check (in Django this sits inside field.validate())
            if value in field.empty_values and field.required:
                raise ValidationError(field.error_messages['required'], code='required')
            # An empty optional value is NOT skipped: it continues through the
            # pipeline and lands in cleaned_data as '' or None
            # Step 3: Call field.to_python() - type coercion
            value = field.to_python(value)
            # Step 4: Call field.validate() - field-specific validation
            field.validate(value)
            # Step 5: Call field.run_validators() - run validators list (skipped for empty values)
            field.run_validators(value)
            # Step 6: Store in cleaned_data
            self.cleaned_data[name] = value
            # Step 7: Call clean_<fieldname>() if it exists
            if hasattr(self, f'clean_{name}'):
                # Step 8: Replace value with method's return
                value = getattr(self, f'clean_{name}')()
                self.cleaned_data[name] = value
        except ValidationError as e:
            self.add_error(name, e)
Let’s trace a single field through this:
class MyForm(forms.Form):
    age = forms.IntegerField(min_value=0, max_value=150)

    def clean_age(self):
        age = self.cleaned_data['age']
        if age < 18:
            raise ValidationError("Must be 18 or older")
        return age

# POST: {'age': '25'}
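The trace for that field can be approximated without Django. The sketch below is a simplified model, not Django's code: the function names mirror the pipeline stages, the error messages are invented, and the required check in validate() is left out for brevity.

```python
class ValidationError(Exception):
    """Stand-in for django.core.exceptions.ValidationError."""


def to_python(raw):
    # Coercion: text from the request becomes an int.
    try:
        return int(str(raw).strip())
    except ValueError:
        raise ValidationError("Enter a whole number.")


def run_validators(value, min_value=0, max_value=150):
    # Mirrors the min_value / max_value validators on the field.
    if not min_value <= value <= max_value:
        raise ValidationError("Value out of range.")


def clean_age(value):
    # Mirrors MyForm.clean_age().
    if value < 18:
        raise ValidationError("Must be 18 or older")
    return value


def trace(raw):
    """Return (stages reached, cleaned value or None, error or None)."""
    steps = []
    try:
        steps.append("to_python")
        value = to_python(raw)
        steps.append("run_validators")
        run_validators(value)
        steps.append("clean_age")
        value = clean_age(value)
    except ValidationError as exc:
        return steps, None, str(exc)
    return steps, value, None


print(trace("25"))   # all three stages run; the value is the int 25
print(trace("abc"))  # stops at to_python; clean_age never runs
print(trace("12"))   # passes the range check, rejected by clean_age
```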
Steps 9–12: Form Cleaning (_clean_form)
def _clean_form(self):
    try:
        # Step 9: Call self.clean()
        cleaned_data = self.clean()
    except ValidationError as e:
        # Step 10: Handle ValidationError
        self.add_error(None, e)  # None = non-field error
    else:
        # Step 11: Update cleaned_data if clean() returned something
        if cleaned_data is not None:
            self.cleaned_data = cleaned_data
    # Step 12: Validation complete
The clean() method is for cross-field validation:
class RegistrationForm(forms.Form):
    password = forms.CharField()
    confirm_password = forms.CharField()

    def clean(self):
        cleaned_data = super().clean()
        password = cleaned_data.get('password')
        confirm = cleaned_data.get('confirm_password')
        if password and confirm and password != confirm:
            raise ValidationError("Passwords don't match")
        return cleaned_data
Steps 13–14: Post-Clean (_post_clean)
def _post_clean(self):
    # Step 13: Hook for subclasses (does nothing in Form)
    # Step 14: ModelForm uses this for model validation
    pass
For ModelForm, this is where model validation runs:
# django/forms/models.py (simplified)
class BaseModelForm(BaseForm):
    def _post_clean(self):
        opts = self._meta
        exclude = self._get_validation_exclusions()
        # Step 13: Apply cleaned_data to model instance
        try:
            self.instance = construct_instance(
                self, self.instance, opts.fields, opts.exclude
            )
        except ValidationError as e:
            self._update_errors(e)
        # Step 14: Run model's full_clean()
        try:
            self.instance.full_clean(exclude=exclude, validate_unique=False)
        except ValidationError as e:
            self._update_errors(e)
        # Uniqueness is checked separately, after full_clean()
        if self._validate_unique:
            self.validate_unique()
Why clean_fieldname Didn’t Run
This is the most common confusion. clean_fieldname() only runs if:
- The field passed all previous validation steps
- No ValidationError was raised during steps 2–5
class MyForm(forms.Form):
    email = forms.EmailField()

    def clean_email(self):
        print("This runs!")  # Only if email is valid
        return self.cleaned_data['email']

# If email is "not-an-email", clean_email() never runs
# The field fails at step 5 (run_validators), where EmailField's validate_email validator rejects it, before reaching step 7
The pipeline short-circuits on error:
try:
    value = field.to_python(value)  # Might fail
    field.validate(value)  # Might fail
    field.run_validators(value)  # Might fail
    self.cleaned_data[name] = value
    if hasattr(self, f'clean_{name}'):
        value = getattr(self, f'clean_{name}')()  # Only reached if above succeeded
except ValidationError as e:
    self.add_error(name, e)
    # Skip clean_fieldname() for this field
Field Internals: to_python, validate, validators
to_python: Type Coercion
Converts raw string input to Python type:
# django/forms/fields.py (simplified)
class IntegerField(Field):
    def to_python(self, value):
        if value in self.empty_values:
            return None
        try:
            value = int(str(value))
        except (ValueError, TypeError):
            raise ValidationError(
                self.error_messages['invalid'],
                code='invalid',
            )
        return value
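Each field type has its own coercion. For example, CharField returns a str (with surrounding whitespace stripped by default), IntegerField an int, DateField a datetime.date, and BooleanField a bool.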
validate: Field-Specific Rules
Built-in validation that runs after coercion:
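# django/forms/fields.py (simplified)
class Field:
    def validate(self, value):
        # The "required" check lives here
        if value in self.empty_values and self.required:
            raise ValidationError(self.error_messages['required'], code='required')

class ChoiceField(Field):
    def validate(self, value):
        super().validate(value)  # Required check
        if value and not self.valid_value(value):
            raise ValidationError(
                self.error_messages['invalid_choice'],
                code='invalid_choice',
                params={'value': value},
            )
# EmailField does not override validate(); its format check is the validate_email validator, which runs in the next step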
validators: The Validators List
from django.core.validators import MinValueValidator, MaxValueValidator

class IntegerField(Field):
    def __init__(self, *, min_value=None, max_value=None, **kwargs):
        super().__init__(**kwargs)
        if min_value is not None:
            self.validators.append(MinValueValidator(min_value))
        if max_value is not None:
            self.validators.append(MaxValueValidator(max_value))
Validators are callable objects:
# django/core/validators.py (simplified; the real class extends BaseValidator)
class MinValueValidator:
    def __init__(self, limit_value, message=None):
        self.limit_value = limit_value

    def __call__(self, value):
        if value < self.limit_value:
            raise ValidationError(
                f'Ensure this value is greater than or equal to {self.limit_value}.'
            )
Executed by run_validators():
def run_validators(self, value):
    if value in self.empty_values:
        return  # Empty values skip the validators list entirely
    errors = []
    for validator in self.validators:
        try:
            validator(value)
        except ValidationError as e:
            errors.extend(e.error_list)
    if errors:
        raise ValidationError(errors)
Note: All validators run, even if earlier ones fail. Errors accumulate.
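The accumulation is easy to check in isolation. This is a Django-free sketch: the `ValidationError` stand-in and the two validators (`min_length_8`, `has_digit`) are made up for the example.

```python
class ValidationError(Exception):
    """Stand-in that carries a list of messages."""

    def __init__(self, messages):
        if isinstance(messages, str):
            messages = [messages]
        super().__init__(messages)
        self.messages = list(messages)


def min_length_8(value):
    if len(value) < 8:
        raise ValidationError("Too short.")


def has_digit(value):
    if not any(ch.isdigit() for ch in value):
        raise ValidationError("Needs a digit.")


def run_validators(value, validators):
    errors = []
    for validator in validators:
        try:
            validator(value)
        except ValidationError as exc:
            errors.extend(exc.messages)  # keep going, collect everything
    if errors:
        raise ValidationError(errors)


try:
    run_validators("abc", [min_length_8, has_digit])
except ValidationError as exc:
    collected = exc.messages

print(collected)  # ['Too short.', 'Needs a digit.']
```

The user sees both problems at once instead of fixing one, resubmitting, and hitting the next.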
Error Handling: Where Errors Go
# Simplified; the real method first normalizes error into a ValidationError
def add_error(self, field, error):
    if field is None:
        # Non-field error (form-level)
        self._errors.setdefault('__all__', ErrorList()).extend(error.error_list)
    else:
        # Field-specific error
        self._errors.setdefault(field, ErrorList()).extend(error.error_list)
        # Remove from cleaned_data if present
        if field in self.cleaned_data:
            del self.cleaned_data[field]
Three places errors can live:
form.errors['email'] # Field-specific errors
form.errors['__all__'] # Non-field errors (form.clean())
form.non_field_errors() # Same as form.errors.get('__all__', [])
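A compact model of that bookkeeping, written without Django. `ErrorBook` is a hypothetical class that imitates only the routing rule: None goes under '__all__', a field name goes under that name and evicts the field from cleaned_data.

```python
NON_FIELD_ERRORS = "__all__"  # the key Django uses for form-level errors


class ErrorBook:
    """Hypothetical model of how add_error() files messages."""

    def __init__(self, cleaned_data):
        self.errors = {}
        self.cleaned_data = dict(cleaned_data)

    def add_error(self, field, message):
        key = NON_FIELD_ERRORS if field is None else field
        self.errors.setdefault(key, []).append(message)
        # A field with an error no longer counts as clean.
        self.cleaned_data.pop(field, None)

    def non_field_errors(self):
        return self.errors.get(NON_FIELD_ERRORS, [])


book = ErrorBook({"password": "s3cret", "confirm_password": "s3cret!"})
book.add_error(None, "Passwords don't match")           # like raising in clean()
book.add_error("confirm_password", "Check this field")  # targeted at one field

print(book.non_field_errors())    # ["Passwords don't match"]
print(sorted(book.cleaned_data))  # ['password']
```

In a real form, calling self.add_error('confirm_password', ...) from inside clean() is how a cross-field check attaches its message to a specific field instead of the form as a whole.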
ErrorDict and ErrorList
# django/forms/utils.py (simplified)
class ErrorDict(dict):
    """A dict that renders as HTML <ul>."""
    def as_ul(self):
        return format_html_join(
            '', '<li>{}{}</li>',
            ((k, v.as_ul()) for k, v in self.items())
        )

class ErrorList(list):
    """A list that renders as HTML."""
    def as_ul(self):
        return format_html(
            '<ul class="errorlist">{}</ul>',
            format_html_join('', '<li>{}</li>', ((e,) for e in self))
        )
In templates:
{{ form.errors }} <!-- ErrorDict.as_ul() -->
{{ form.email.errors }} <!-- ErrorList.as_ul() -->
{{ form.non_field_errors }} <!-- ErrorList.as_ul() -->
ModelForm: The Model Connection
class ArticleForm(forms.ModelForm):
    class Meta:
        model = Article
        fields = ['title', 'content', 'published']
ModelForm generates fields from model fields:
# django/forms/models.py (simplified)
def fields_for_model(model, fields=None, exclude=None, ...):
    field_dict = {}
    for f in model._meta.get_fields():
        if not isinstance(f, ModelField):
            continue
        if fields is not None and f.name not in fields:
            continue
        if exclude and f.name in exclude:
            continue
        # Convert model field to form field
        form_field = f.formfield()  # Each model field knows its form equivalent
        if form_field:
            field_dict[f.name] = form_field
    return field_dict
Model Field to Form Field Mapping
# django/db/models/fields/__init__.py (simplified)
class CharField(Field):
    def formfield(self, **kwargs):
        defaults = {'max_length': self.max_length}
        defaults.update(kwargs)
        return forms.CharField(**defaults)

class IntegerField(Field):
    def formfield(self, **kwargs):
        return forms.IntegerField(**kwargs)

class ForeignKey(RelatedField):
    def formfield(self, **kwargs):
        return forms.ModelChoiceField(
            queryset=self.related_model._default_manager.all(),
            **kwargs
        )
save(): Creating or Updating
# django/forms/models.py (simplified)
class BaseModelForm(BaseForm):
    def __init__(self, data=None, instance=None, ...):
        # No instance passed: start from a fresh, unsaved one
        self.instance = self._meta.model() if instance is None else instance

    def save(self, commit=True):
        # cleaned_data was already copied onto self.instance in _post_clean()
        if self.errors:
            raise ValueError("The form could not be saved because the data didn't validate.")
        if commit:
            self.instance.save()
            self._save_m2m()  # Handle many-to-many
        else:
            # Caller must save the instance and call save_m2m() later
            self.save_m2m = self._save_m2m
        return self.instance
The commit=False pattern:
form = ArticleForm(request.POST)
if form.is_valid():
    article = form.save(commit=False)
    article.author = request.user  # Set additional fields
    article.save()
    form.save_m2m()  # Don't forget M2M!
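Why save_m2m() only appears after commit=False, and why it must come after the instance is saved, can be modeled in a few lines. This is a Django-free sketch; `FakeInstance`, `MiniModelForm`, and the `tags` attribute are hypothetical stand-ins.

```python
class FakeInstance:
    """Hypothetical stand-in for a model instance."""

    def __init__(self):
        self.saved = False
        self.tags = []

    def save(self):
        self.saved = True


class MiniModelForm:
    def __init__(self, instance, m2m_data):
        self.instance = instance
        self._m2m_data = m2m_data

    def _save_m2m(self):
        # Many-to-many rows need the instance's primary key, so this
        # can only happen after the instance itself has been saved.
        if not self.instance.saved:
            raise RuntimeError("save the instance before save_m2m()")
        self.instance.tags = list(self._m2m_data)

    def save(self, commit=True):
        if commit:
            self.instance.save()
            self._save_m2m()
        else:
            # Hand the deferred step to the caller.
            self.save_m2m = self._save_m2m
        return self.instance


form = MiniModelForm(FakeInstance(), ["django", "forms"])
article = form.save(commit=False)
state_before = (article.saved, article.tags)
article.save()
form.save_m2m()
state_after = (article.saved, article.tags)

print(state_before)  # (False, [])
print(state_after)   # (True, ['django', 'forms'])
```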
Widget System: Rendering Fields
Each field has a widget for HTML rendering:
class CharField(Field):
    widget = TextInput  # Default widget

class EmailField(CharField):
    widget = EmailInput

class BooleanField(Field):
    widget = CheckboxInput
Widgets handle rendering and value extraction:
# django/forms/widgets.py (conceptual sketch; current Django renders widgets
# from templates and defines value_from_datadict on the Widget base class)
class TextInput(Input):
    input_type = 'text'

    def render(self, name, value, attrs=None, renderer=None):
        return format_html(
            '<input type="{}" name="{}" value="{}"{}>',
            self.input_type,
            name,
            value or '',
            flatatt(attrs),
        )

    def value_from_datadict(self, data, files, name):
        return data.get(name)
Custom Widgets
class DatePickerWidget(forms.DateInput):
    template_name = 'widgets/datepicker.html'

    def __init__(self, attrs=None):
        default_attrs = {'class': 'datepicker', 'autocomplete': 'off'}
        if attrs:
            default_attrs.update(attrs)
        super().__init__(attrs=default_attrs)

    class Media:
        css = {'all': ('css/datepicker.css',)}
        js = ('js/datepicker.js',)
FormSet: Multiple Forms
ArticleFormSet = forms.formset_factory(ArticleForm, extra=3)
formset = ArticleFormSet(request.POST)
FormSet manages multiple form instances:
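# django/forms/formsets.py (simplified)
class BaseFormSet:
    def __init__(self, data=None, files=None, ...):
        self.is_bound = data is not None or files is not None
        self.data = data or {}

    @cached_property
    def management_form(self):
        # Tracks TOTAL_FORMS / INITIAL_FORMS / MIN_NUM_FORMS / MAX_NUM_FORMS
        ...

    @cached_property
    def forms(self):
        # Lazy: forms are built on first access, then cached
        return [self._construct_form(i) for i in range(self.total_form_count())]

    def is_valid(self):
        if not self.is_bound:
            return False
        # Every form must be valid (the real method also skips forms marked
        # for deletion and checks formset-level errors)
        return all(form.is_valid() for form in self.forms)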
The management form:
<input type="hidden" name="form-TOTAL_FORMS" value="3">
<input type="hidden" name="form-INITIAL_FORMS" value="0">
<input type="hidden" name="form-MIN_NUM_FORMS" value="0">
<input type="hidden" name="form-MAX_NUM_FORMS" value="1000">
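Those hidden inputs are what let the formset carve one flat POST dict into per-form chunks. The sketch below imitates that split without Django; `split_formset_data` is a hypothetical helper, and the `form-<index>-<field>` key layout follows the default "form" prefix.

```python
def split_formset_data(data, prefix="form"):
    """Group flat POST keys like 'form-0-title' into one dict per form."""
    total = int(data[f"{prefix}-TOTAL_FORMS"])
    forms = []
    for index in range(total):
        form_prefix = f"{prefix}-{index}-"
        forms.append({
            key[len(form_prefix):]: value
            for key, value in data.items()
            if key.startswith(form_prefix)
        })
    return forms


post = {
    "form-TOTAL_FORMS": "2",
    "form-INITIAL_FORMS": "0",
    "form-0-title": "First",
    "form-1-title": "Second",
}
print(split_formset_data(post))  # [{'title': 'First'}, {'title': 'Second'}]
```

If the management fields are missing from the submission, the count cannot be read at all, which is why a formset rendered without its management form fails validation.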
Common Gotchas
Gotcha 1: cleaned_data Missing Keys
def clean(self):
    password = self.cleaned_data['password']  # KeyError!
If password field failed validation, it's not in cleaned_data.
Fix: Use .get()
def clean(self):
    password = self.cleaned_data.get('password')
    if password:
        ...  # Process
    return self.cleaned_data
Gotcha 2: clean_fieldname Must Return Value
def clean_email(self):
    email = self.cleaned_data['email']
    if User.objects.filter(email=email).exists():
        raise ValidationError("Email taken")
    # Forgot to return! cleaned_data['email'] becomes None
Fix: Always return
def clean_email(self):
    email = self.cleaned_data['email']
    if User.objects.filter(email=email).exists():
        raise ValidationError("Email taken")
    return email  # Don't forget!
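The mechanism behind this gotcha is plain Python: a function without a return statement returns None, and the pipeline stores whatever the hook returns. A Django-free sketch, with hypothetical hook functions and a made-up "@" check standing in for the database lookup:

```python
def clean_email_missing_return(cleaned_data):
    email = cleaned_data["email"]
    if "@" not in email:
        raise ValueError("invalid")
    # no return statement: Python returns None implicitly


def clean_email_fixed(cleaned_data):
    email = cleaned_data["email"]
    if "@" not in email:
        raise ValueError("invalid")
    return email


def apply_hook(cleaned_data, hook):
    # Same assignment _clean_fields() performs after calling clean_<name>().
    result = dict(cleaned_data)
    result["email"] = hook(result)
    return result


data = {"email": "a@example.com"}
print(apply_hook(data, clean_email_missing_return))  # {'email': None}
print(apply_hook(data, clean_email_fixed))           # {'email': 'a@example.com'}
```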
Gotcha 3: Validation Order
Fields validate in declaration order:
class MyForm(forms.Form):
    first = forms.CharField()
    second = forms.CharField()

    def clean_second(self):
        # self.cleaned_data['first'] exists here
        # (if first field passed validation)
        return self.cleaned_data['second']
But in clean(), all fields are already processed:
def clean(self):
    # Access any field from cleaned_data
    # (if they passed validation)
    return self.cleaned_data
Gotcha 4: BooleanField Required
class MyForm(forms.Form):
    agree = forms.BooleanField()  # required=True by default!
An unchecked checkbox is simply absent from the POST data, so the field's value is False and it fails required validation.
Fix: required=False, or NullBooleanField if you need a separate "unknown" state
agree = forms.BooleanField(required=False)
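The missing key is visible as soon as you parse a form-encoded body by hand. The sketch uses only the standard library; `checkbox_value` is a simplified imitation of how a checkbox widget reads its value, and `validate_boolean` is a hypothetical stand-in for the required check.

```python
from urllib.parse import parse_qs

# Bodies a browser sends with "agree" unchecked vs. checked.
unchecked = parse_qs("name=Ada")
checked = parse_qs("name=Ada&agree=on")


def checkbox_value(data, name):
    """Simplified take on how a checkbox widget reads its value."""
    if name not in data:
        return False  # a missing key means "unchecked"
    value = data[name][0]
    mapped = {"true": True, "false": False, "on": True}.get(value.lower(), value)
    return bool(mapped)


def validate_boolean(value, required=True):
    # With required=True, a False value counts as "empty".
    if required and not value:
        raise ValueError("This field is required.")
    return value


print("agree" in unchecked)                # False
print(checkbox_value(unchecked, "agree"))  # False
print(checkbox_value(checked, "agree"))    # True
```

With required=True, the False from the unchecked box raises; with required=False it passes through as a plain False.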
What’s Next
This was the form and validation pipeline — 14 steps from POST to Python.
Next in the series: Authentication Backend Chain — how Django authenticates users, the backend resolution order, session mechanics, and why custom backends sometimes don’t work.
Series: Django Under the Hood
- What Actually Happens When a Request Hits Your Server
- The ORM Query Compiler
- Connection Management and the Database Wrapper
- Signal Dispatch Internals
- Template Engine Compilation
- Form and Validation Pipeline ← You are here
- Authentication Backend Chain (coming next)
- Static Files and WhiteNoise Internals
- Migration System Deep Dive
- Test Client and Request Factory Mechanics