Python Descriptors Guide
Ever wonder how Django and SQLAlchemy make Python feel like magic?
You write user.name or session.query(User), and somehow, validation, lazy loading, and database magic just happen.
What if I told you this "magic" isn’t magic at all? It’s a secret weapon hidden in plain sight: Python descriptors.
Most developers use them daily without realizing it. But the top 1%? They weaponize descriptors to build frameworks, slash boilerplate, and write code that feels like cheating.
In this deep dive, I’ll pull back the curtain on the technique that separates senior engineers from the rest. No fluff — just battle-tested patterns you can steal to level up your Python architecture, starting today.
What Are Descriptors and Why Should You Care?
If you’ve used Python for any length of time, you’ve definitely used descriptors, even if you didn’t realize it.
Properties, methods, class methods, static methods — they’re all implemented using descriptors. But what exactly are they?
In the simplest terms, a descriptor is an object attribute with “binding behavior,” one whose attribute access has been overridden by three special methods. These methods are __get__(), __set__(), and __delete__(). If any of these methods are defined for an object, it’s said to be a descriptor.
Here’s where it gets interesting: descriptors give you complete control over how attributes are accessed, modified, and deleted. This isn’t just syntactic sugar — it’s a fundamental building block of Python’s object model that allows for incredibly elegant and powerful designs.
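To see that ordinary methods rely on this protocol, here is a minimal sketch. The Greeter class and the variable names are made up for illustration; the only thing taken from Python itself is that plain functions define __get__ but not __set__.

```python
class Greeter:
    def hello(self):
        return "hello"


g = Greeter()

# The class stores a plain function, not a bound method
raw_function = Greeter.__dict__["hello"]

# Calling __get__ by hand does what attribute lookup does for g.hello
bound = raw_function.__get__(g, Greeter)

# Functions define __get__ only, which makes them non-data descriptors
has_get = hasattr(raw_function, "__get__")
has_set = hasattr(raw_function, "__set__")

print(bound())           # same result as g.hello()
print(has_get, has_set)
```

The bound method you get from g.hello is simply the return value of the function's __get__.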
The Descriptor Protocol Deep Dive
Let’s roll up our sleeves and look at the descriptor protocol in detail:
class Descriptor:
    def __get__(self, obj, obj_type=None):
        # Called when the attribute is read (obj is None for class-level access)
        pass

    def __set__(self, obj, value):
        # Called when the attribute is assigned to
        pass

    def __delete__(self, obj):
        # Called when the attribute is deleted
        pass
The magic happens in how Python resolves attribute access. When you access an attribute on an object (e.g., obj.attr), Python follows this lookup chain:
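- Look up the name on the object’s class and its parent classes (the MRO). If it is found there and is a data descriptor, call its __get__ and return the result
- Otherwise, check whether the attribute exists in the object’s __dict__
- Otherwise, if the name found on the class is a non-data descriptor, call its __get__; if it is a plain class attribute, return it as-is
- Otherwise, call the class’s __getattr__ if it defines one
- Raise AttributeError if nothing was found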
The distinction between data descriptors (those with __set__ or __delete__) and non-data descriptors (those with only __get__) is crucial. Data descriptors take precedence over instance attributes, while instance attributes take precedence over non-data descriptors.
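The precedence rule is easy to check directly. The following is a small sketch with made-up class names (DataDesc, NonDataDesc, Demo); it writes into the instance __dict__ by hand so that both kinds of descriptor compete with an instance attribute of the same name.

```python
class DataDesc:
    """Data descriptor: defines both __get__ and __set__."""

    def __get__(self, obj, obj_type=None):
        if obj is None:
            return self
        return "from data descriptor"

    def __set__(self, obj, value):
        raise AttributeError("read-only")


class NonDataDesc:
    """Non-data descriptor: defines only __get__."""

    def __get__(self, obj, obj_type=None):
        if obj is None:
            return self
        return "from non-data descriptor"


class Demo:
    data = DataDesc()
    non_data = NonDataDesc()


demo = Demo()

# Write straight into the instance __dict__, bypassing __set__
demo.__dict__["data"] = "from instance dict"
demo.__dict__["non_data"] = "from instance dict"

data_result = demo.data          # the data descriptor wins
non_data_result = demo.non_data  # the instance attribute wins

print(data_result)
print(non_data_result)
```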
Practical Magic: Building a Validation Framework
Let’s put theory into practice. Here I’ll show you how to build a validation framework using descriptors that will make your code cleaner, more maintainable, and more robust.
class ValidatedAttribute:
    def __init__(self, validator=None, default=None):
        self.validator = validator
        self.default = default
        self._name = None

    def __set_name__(self, owner, name):
        # Store the value on the instance under a private name, e.g. "_age"
        self._name = f"_{name}"

    def __get__(self, obj, obj_type=None):
        if obj is None:
            return self
        return getattr(obj, self._name, self.default)

    def __set__(self, obj, value):
        if self.validator and not self.validator(value):
            raise ValueError(f"Invalid value for {self._name[1:]}: {value}")
        setattr(obj, self._name, value)

    def __delete__(self, obj):
        if hasattr(obj, self._name):
            delattr(obj, self._name)


def is_positive(value):
    return isinstance(value, (int, float)) and value > 0


def is_email(value):
    return isinstance(value, str) and "@" in value and "." in value.split("@")[-1]


class User:
    age = ValidatedAttribute(validator=is_positive, default=18)
    email = ValidatedAttribute(validator=is_email)

    def __init__(self, email, age=None):
        self.email = email
        if age is not None:
            self.age = age


# Usage
try:
    user = User("invalid-email", -5)
except ValueError as e:
    print(e)  # prints "Invalid value for email: invalid-email"
Notice how clean the User class is? All the validation logic is encapsulated in the descriptor, making it reusable across multiple classes. This is the power of descriptors: they let you extract cross-cutting concerns into reusable components.
Advanced Descriptor Patterns
Let’s explore some advanced patterns that showcase the true power of descriptors.
Lazy Loading with Descriptors
class LazyProperty:
    def __init__(self, func):
        self._func = func
        self._name = func.__name__

    def __get__(self, obj, obj_type=None):
        if obj is None:
            return self
        value = self._func(obj)
        # Cache the result on the instance under the same name
        setattr(obj, self._name, value)
        return value


class DataProcessor:
    def __init__(self, data):
        self.data = data

    @LazyProperty
    def processed_data(self):
        print("expensive computation happening.")
        # Simulate expensive processing
        result = [x * 2 for x in self.data]
        return result


# Usage
processor = DataProcessor([1, 2, 3])
print("Processor created")
print(processor.processed_data)  # prints "expensive computation happening." then [2, 4, 6]
print(processor.processed_data)  # [2, 4, 6] (no computation this time)
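This pattern is very useful for expensive computations that should only run when needed and then be cached for later access. It works because LazyProperty is a non-data descriptor: the setattr call stores the result in the instance’s __dict__ under the same name, and from then on the instance attribute shadows the descriptor.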
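If you are on Python 3.8 or later, the standard library already ships a descriptor built on the same idea: functools.cached_property. The sketch below reuses the DataProcessor example; the calls counter is a hypothetical addition, there only to show that the method body runs once.

```python
from functools import cached_property


class DataProcessor:
    def __init__(self, data):
        self.data = data
        self.calls = 0  # hypothetical counter, only here to show the caching

    @cached_property
    def processed_data(self):
        self.calls += 1
        return [x * 2 for x in self.data]


processor = DataProcessor([1, 2, 3])
first = processor.processed_data   # computed on first access
second = processor.processed_data  # served from the instance __dict__

print(first, processor.calls)
```

Writing your own LazyProperty is still a good exercise, but for production code the built-in is usually the safer choice.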
Cached Methods with Expiration
import time


class CachedMethod:
    def __init__(self, ttl=60):
        self.ttl = ttl  # time to live, in seconds

    def __call__(self, func):
        self._func = func
        # Name of the per-instance attribute that holds cached results
        self._cache_name = f"_{func.__name__}_cache"
        return self

    def __get__(self, obj, obj_type=None):
        if obj is None:
            return self
        # Keep the cache on the instance; a dict stored on the descriptor
        # would be shared by every instance of the class
        cache = obj.__dict__.setdefault(self._cache_name, {})

        def wrapper(*args, **kwargs):
            # Build a cache key from the arguments (they must be hashable)
            key = (args, frozenset(kwargs.items()))
            # Return the cached result if it is still valid
            if key in cache:
                result, timestamp = cache[key]
                if time.time() - timestamp < self.ttl:
                    return result
            # Otherwise compute the result and cache it
            result = self._func(obj, *args, **kwargs)
            cache[key] = (result, time.time())
            return result

        return wrapper


class API:
    @CachedMethod(ttl=30)  # cache results for 30 seconds
    def get_user_data(self, user_id):
        print(f"Fetching data for user {user_id}...")
        # Simulate API call
        time.sleep(1)
        return {"user_id": user_id, "data": "some data"}


# Usage
api = API()
print(api.get_user_data(123))  # prints "Fetching data for user 123..." then the data
print(api.get_user_data(123))  # returns cached data immediately
time.sleep(31)
print(api.get_user_data(123))  # prints "Fetching data for user 123..." again (cache has expired)
This pattern is perfect for caching results of expensive operations like API calls, database queries, or complex computations.
Descriptors vs. Properties: What’s the Difference?
Many developers wonder why they should write descriptors when properties seem to do the same job. The key difference is reusability. A property is itself a descriptor, but its logic is written inline in a single class, while a descriptor class can be reused across multiple classes.
Consider this example:
# Using properties
class Person:
    def __init__(self, name):
        self._name = name

    @property
    def name(self):
        return self._name.title()

    @name.setter
    def name(self, value):
        if not isinstance(value, str):
            raise TypeError("Name must be a string")
        self._name = value


class Product:
    def __init__(self, name):
        self._name = name

    @property
    def name(self):
        return self._name.title()

    @name.setter
    def name(self, value):
        if not isinstance(value, str):
            raise TypeError("Name must be a string")
        self._name = value
Notice the duplication? Now let’s see how descriptors solve this:
class NameAttribute:
    def __get__(self, obj, obj_type=None):
        if obj is None:
            return self
        return obj._name.title()

    def __set__(self, obj, value):
        if not isinstance(value, str):
            raise TypeError("Name must be a string")
        obj._name = value


class Person:
    name = NameAttribute()

    def __init__(self, name):
        self._name = name


class Product:
    name = NameAttribute()

    def __init__(self, name):
        self._name = name
Much cleaner and more maintainable!
Descriptors in the Wild: Real-World Examples
Descriptors aren’t just academic exercises — they’re used extensively in Python’s standard library and popular frameworks.
SQLAlchemy’s ORM
SQLAlchemy uses descriptors extensively for its ORM. When you define a column:
class User(Base):
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True)
    name = Column(String)
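Strictly speaking, the Column objects are only the declaration. When the class is mapped, SQLAlchemy replaces each one with an InstrumentedAttribute, a descriptor that tracks changes and handles the translation between Python objects and database rows.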
Django’s ORM
Similarly, Django uses descriptors for its model fields:
class Person(models.Model):
    first_name = models.CharField(max_length=30)
    last_name = models.CharField(max_length=30)
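Strictly speaking, the field object is not the descriptor itself. When the model class is built, Django places a descriptor on the class for each field (DeferredAttribute for simple fields such as CharField, with dedicated descriptor classes for relations). That descriptor supports features like deferred loading, while the field object handles validation, database conversion, and more.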
Python’s property
Even Python’s built-in property is implemented using descriptors. In fact, it’s just a convenient way to create a data descriptor without having to define a full class.
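To make that concrete, here is a simplified pure-Python stand-in for property. SimpleProperty and Circle are hypothetical names, and the sketch leaves out deleters and docstrings, so treat it as an illustration of the idea rather than a copy of the real implementation.

```python
class SimpleProperty:
    """Simplified, hypothetical stand-in for the built-in property."""

    def __init__(self, fget=None, fset=None):
        self.fget = fget
        self.fset = fset

    def __get__(self, obj, obj_type=None):
        if obj is None:
            return self
        if self.fget is None:
            raise AttributeError("unreadable attribute")
        return self.fget(obj)

    def __set__(self, obj, value):
        # Defining __set__ at all makes this a data descriptor,
        # even when no setter function was supplied
        if self.fset is None:
            raise AttributeError("can't set attribute")
        self.fset(obj, value)

    def setter(self, fset):
        # Return a new descriptor that keeps the getter and adds a setter
        return type(self)(self.fget, fset)


class Circle:
    def __init__(self, radius):
        self._radius = radius

    @SimpleProperty
    def radius(self):
        return self._radius

    @radius.setter
    def radius(self, value):
        if value < 0:
            raise ValueError("radius must be non-negative")
        self._radius = value


circle = Circle(2)
circle.radius = 5
print(circle.radius)
```

Because __set__ is always defined, a read-only property still counts as a data descriptor, which is why you cannot shadow it by assigning to the instance.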
Performance Considerations
One concern that often comes up with descriptors is performance. While it’s true that descriptor access is slightly slower than direct attribute access, the difference is negligible in most applications.
Let’s benchmark it:
import timeit


class DirectAccess:
    def __init__(self):
        self.value = 42


class DescriptorAccess:
    def __init__(self):
        self._value = 42

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, v):
        self._value = v


direct = DirectAccess()
descriptor = DescriptorAccess()

# Test direct access
direct_time = timeit.timeit('direct.value', globals=globals(), number=1000000)

# Test descriptor access
descriptor_time = timeit.timeit('descriptor.value', globals=globals(), number=1000000)

print(f"Direct access: {direct_time:.6f} seconds")
print(f"Descriptor access: {descriptor_time:.6f} seconds")
print(f"Ratio: {descriptor_time / direct_time:.2f}x slower")
On my machine, descriptor access is about 2-3x slower than direct access. That sounds significant, but the absolute cost is tiny: the whole million-access run finishes in a fraction of a second, so the overhead per access is well under a microsecond. In real-world applications, the benefits of descriptors usually outweigh this small performance cost.
Common Pitfalls and How to Avoid Them
After years of working with descriptors, I’ve seen developers make the same mistakes repeatedly. Here are the most common pitfalls and how to avoid them:
1. Forgetting __set_name__
Since Python 3.6, __set_name__ is called automatically on each descriptor in a class body when the owning class is created. It’s how a descriptor learns the name of the attribute it was assigned to; without it, you have to pass the name in by hand.
class MyDescriptor:
    def __set_name__(self, owner, name):
        self.name = name  # save the attribute name

    def __get__(self, obj, obj_type=None):
        if obj is None:
            return self
        return f"Value of {self.name}"
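One related gotcha: __set_name__ only runs as part of class creation. If you attach a descriptor to a class afterwards, Python does not call it for you. The sketch below uses made-up names (Named, Config, late) to show the difference and the manual workaround.

```python
class Named:
    name = None  # stays None until __set_name__ runs

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, obj_type=None):
        if obj is None:
            return self
        return f"Value of {self.name}"


class Config:
    timeout = Named()  # __set_name__ runs when Config is created


late = Named()
Config.retries = late  # attached afterwards: __set_name__ is NOT called
name_before_fix = late.name

# Call it by hand when attaching a descriptor after class creation
late.__set_name__(Config, "retries")

print(Config.timeout.name, name_before_fix, late.name)
```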
2. Mishandling instance vs. class access
A common mistake is not properly handling the case when a descriptor is accessed on the class rather than an instance:
class BrokenDescriptor:
    def __get__(self, obj, obj_type=None):
        return obj.some_attr  # this will fail if obj is None


class WorkingDescriptor:
    def __get__(self, obj, obj_type=None):
        if obj is None:
            return self  # return the descriptor itself
        return obj.some_attr
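Here is a quick way to see the failure, using a hypothetical Widget class that carries both descriptors. Instance access works for both; only class-level access tells them apart.

```python
class BrokenDescriptor:
    def __get__(self, obj, obj_type=None):
        return obj.some_attr  # obj is None on class-level access


class WorkingDescriptor:
    def __get__(self, obj, obj_type=None):
        if obj is None:
            return self
        return obj.some_attr


class Widget:
    broken = BrokenDescriptor()
    working = WorkingDescriptor()

    def __init__(self):
        self.some_attr = "value"


# Class-level access passes obj=None into __get__
try:
    Widget.broken
    class_access_error = None
except AttributeError as exc:
    class_access_error = str(exc)

print(class_access_error)
print(Widget.working)  # the descriptor object itself
```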
3. Storing state in the descriptor itself
Remember that descriptors are created once when the class is defined, not once per instance. If you need to store instance-specific state, store it in the instance:
class WrongDescriptor:
    def __init__(self, default):
        self.value = default  # this is shared across all instances!

    def __get__(self, obj, obj_type=None):
        return self.value

    def __set__(self, obj, value):
        self.value = value  # this affects all instances!


class RightDescriptor:
    def __init__(self, default):
        self.default = default

    def __set_name__(self, owner, name):
        self.name = f"_{name}"

    def __get__(self, obj, obj_type=None):
        if obj is None:
            return self
        return getattr(obj, self.name, self.default)

    def __set__(self, obj, value):
        setattr(obj, self.name, value)  # store in the instance
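To watch the shared-state bug happen, put both descriptors on one class and create two instances. Sensor is a made-up class for this sketch; the two descriptor classes are repeated so the snippet runs on its own.

```python
class WrongDescriptor:
    def __init__(self, default):
        self.value = default  # lives on the descriptor, so it is shared

    def __get__(self, obj, obj_type=None):
        return self.value

    def __set__(self, obj, value):
        self.value = value


class RightDescriptor:
    def __init__(self, default):
        self.default = default

    def __set_name__(self, owner, name):
        self.name = f"_{name}"

    def __get__(self, obj, obj_type=None):
        if obj is None:
            return self
        return getattr(obj, self.name, self.default)

    def __set__(self, obj, value):
        setattr(obj, self.name, value)  # lives on the instance


class Sensor:
    wrong = WrongDescriptor(0)
    right = RightDescriptor(0)


a, b = Sensor(), Sensor()
a.wrong = 10
a.right = 10

print(b.wrong)  # b sees a's value: the state leaked
print(b.right)  # b still has the default
```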
Closing Thoughts: The Real Magic
Descriptors aren’t an obscure trick. They’re a core part of Python’s design.
Once you understand them, you’ll see how Django and SQLAlchemy weave their magic — and you’ll realize you can do the same.
Start with something small, like validation or lazy loading. Then level up to caching, proxying, or even ORM-like abstractions.
Because the real magic of Python isn’t hidden in its frameworks.
It’s built right into the language — waiting for you to use it.
Happy coding.