Model validation in Django REST Framework
Django REST Framework's ModelSerializer doesn't run a model instance's full_clean() method as part of its validation routine. This behaviour differs from Django's ModelForm. It's reasonable to have assumed that the two would behave similarly in this regard, given both their overall similarity and Django REST Framework's ubiquity.
Django REST Framework used to run Model.full_clean() as part of its validation routine. It was removed in Django REST Framework 3.0:
We no longer use the .full_clean() method on model instances, but instead perform all validation explicitly on the serializer. This gives a cleaner separation, and ensures that there's no automatic validation behavior on ModelSerializer classes that can't also be easily replicated on regular Serializer classes.
The .clean() method will not be called as part of serializer validation, as it would be if using a ModelForm.
# But why?
Django REST Framework contributor Xavier Ordoquy justifies this design decision in reply to a Stack Overflow question. The justification is multifaceted and I encourage you to read it. I am, however, going to focus on one implication: that this change was implemented with the view that your Django project should have a 'business logic' layer that abstracts state changes.
This sounds entirely reasonable. It's a very common software engineering pattern. More generally, it speaks to some core software engineering principles (separation of concerns, DRY, etc.). You may instead have taken the approach that the 'data access layer' is handled entirely by Django internally, which then provides you with an 'open by default' API, with many ways to integrate your business logic. A lot of Django's batteries (like ModelForm and the Django admin interface) are built on this assumption.
This approach often works well, but it can be problematic for projects that are more complex, have more people working on them, or just exist in an environment where there's a desire to trade development velocity for fewer defects.
It takes a lot of scar-tissue wisdom to see these problems before you run into them first-hand. Some of you will already be nodding your heads. For everyone else, consider how you'd implement sending an email to a user whenever their email address was changed. Ensure that your solution:
- Allows for updating multiple records with a single SQL query, e.g. by using QuerySet.bulk_update() (no Model.save() or signals!), and
- Ensures that a hypothetical developer rewriting the 'change email address' form 6 months from now won't accidentally bypass this notification email in a way that won't be noticed.
If you're like me, you'll decide that you need to adopt an approach where all state changes happen in a way that gives you an opportunity to send the email. You'll also want to adopt this convention project-wide, as you'll want non-conformance to stick out like a sore thumb. Or you may come up with something completely different. Again, this is a mammoth topic unto itself. Reasonable people can disagree.
What could this look like? Django REST Framework author Tom Christie describes a convention where code never directly changes a model's state or triggers a database write. Instead, all state change operations occur via custom model methods and model manager methods:
Never write to a model field or call save() directly. Always use model methods and manager methods for state changing operations.
The convention is more clear-cut and easier to follow than "Fat models, thin views", and does not exclude your team from laying an additional business logic layer on top of your models if suitable.
Adopting this as part of your formal Django coding conventions will help your team ensure a good codebase style, and give you confidence in your application-level data integrity.
This seems like a reasonable execution of a 'business logic layer'. It may however be counter to how you currently work with Django.
# Relating this back to Django REST Framework
In his blog post, Tom Christie implies that the Django REST Framework behaviour change actually favours his proposed convention:
Django REST framework's Serializer API ... follows a similar approach to validation as Django's ModelForm implementation. In the upcoming 3.0 release the validation step will become properly decoupled from the object-creation step, allowing you to strictly enforce model class encapsulation while using REST framework serializers.
Let's look at the Django REST Framework documentation's summary of ModelSerializer's functionality:
- It will automatically generate a set of fields for you, based on the model.
- It will automatically generate validators for the serializer, such as unique_together validators.
- It includes simple default implementations of .create() and .update().
Providing all this without running model validation is, in my opinion, fence-sitting in a way that means nobody wins. Those that want to handle the state change process themselves will override create() and update(), or abandon ModelSerializer altogether in favour of a vanilla Serializer. Everyone else will reimplement anything they already have in Model.clean() just for Django REST Framework, or, worse, just not realise that Model.clean() isn't being called in the first place.
The third option, calling Model.clean() from ModelSerializer.validate(), needs to be addressed separately. Django REST Framework's 3.0 announcement post makes it clear that you should need a good reason to do this, and really pushes you to reimplement validation on your ModelSerializer.
I've used this approach for years in mature projects without much issue at all, so the strong language used in the announcement post feels unwarranted.
My sense is that Django REST Framework painted itself into a corner with some functionality, which I don't personally use, that is more fundamentally incompatible with supporting model validation. This was alluded to in Xavier's Stack Overflow answer from earlier. I acknowledge that I may be missing something, but I'm not seeing the justification for declaring validation within Model.clean() (something that is endorsed by Django's documentation) to be "improper", let alone in the documentation for a package that is considered so integral to Django that there are regular calls for Django to envelop it.
# Potential solutions
As mentioned, there are a few different ways you can go here. Let's take a look.
# Follow Django REST Framework's advice, and reimplement your validation logic on your ModelSerializer
First, check whether something like field-level validation is more appropriate.
The Django REST Framework 3.0 announcement blog post recommends overriding ModelSerializer.validate(). This is a hook for validation specifically, separate from the actual state change logic.
from rest_framework import serializers

class UserSerializer(serializers.ModelSerializer):
    # ...

    def validate(self, data):
        # Object-level hook: receives all of the validated field values.
        if data['name'] == 'Bruce' and data['age'] < 40:
            raise serializers.ValidationError("I don't believe you.")
        return data
# Go all-out and adopt Tom Christie's approach
You could extend the previous example by also overriding ModelSerializer.create() and ModelSerializer.update() to interact with your business logic layer. It might look something like this:
from rest_framework import serializers

from .models import User

class UserSerializer(serializers.ModelSerializer):
    # ...

    def validate(self, data):
        if data['name'] == 'Bruce' and data['age'] < 40:
            raise serializers.ValidationError("I don't believe you.")
        return data

    def create(self, validated_data):
        # Hand the write over to your own manager method.
        return User.objects.your_create_method(**validated_data)

    def update(self, instance, validated_data):
        # Hand the write over to your own model method.
        return instance.your_update_method(**validated_data)
Maybe your existing validation logic should be the responsibility of the business logic layer. There's no harm in performing validation outside of validate() - it's just offered as a convenient hook.
class UserSerializer(serializers.ModelSerializer):
    # ...

    def create(self, validated_data):
        return User.objects.your_create_method(**validated_data)

    def update(self, instance, validated_data):
        return instance.your_update_method(**validated_data)
This looks simple on the face of it, but it leaves plenty of questions unanswered.
- How does your business logic layer report that it's been provided invalid data? Does it raise ValidationError? Or do you raise some other exception, catch these in your ModelSerializer, and raise a ValidationError?
  Bonus question: If you raise ValidationError, do you raise Django's ValidationError or Django REST Framework's ValidationError? They are different!
- Some validation is still being handled by Django REST Framework as ModelSerializer is still generating fields and validators for you. What is the contract between your ModelSerializer and your services layer? What guarantees are being made regarding the validity of the provided data? Can we even assume that the data types are correct?
There are no one-size-fits-all answers to these questions. This is what software development is about. The important thing is to understand that these things do matter, and to be thoughtful and consistent in your approach.
# Run Model.clean() from your ModelSerializer's validation routine
Django REST Framework's 3.0 announcement provides a brief example of this:
def validate(self, attrs):
    instance = ExampleModel(**attrs)
    instance.clean()
    return attrs
Whilst the battle-tested version I use in my day job has grown to be more complex than this, the above snippet is sufficient for many purposes, and serves as a good starting point for anything more complex.