Python Descriptors Guide

Ever wonder how Django and SQLAlchemy make Python feel like magic?

You write user.name or session.query(User), and somehow, validation, lazy loading, and database magic just happen.

What if I told you this "magic" isn’t magic at all? It’s a secret weapon hidden in plain sight: Python descriptors.

Most developers use them daily without realizing it. But the top 1%? They weaponize descriptors to build frameworks, slash boilerplate, and write code that feels like cheating.

In this deep dive, I’ll pull back the curtain on the technique that separates senior engineers from the rest. No fluff — just battle-tested patterns you can steal to level up your Python architecture, starting today.

What Are Descriptors and Why Should You Care?

If you’ve used Python for any length of time, you’ve definitely used descriptors, even if you didn’t realize it.

Properties, methods, class methods, static methods — they’re all implemented using descriptors. But what exactly are they?

In the simplest terms, a descriptor is an object attribute with “binding behavior,” one whose attribute access has been overridden by three special methods. These methods are __get__(), __set__(), and __delete__(). If any of these methods are defined for an object, it’s said to be a descriptor.

Here’s where it gets interesting: descriptors give you complete control over how attributes are accessed, modified, and deleted. This isn’t just syntactic sugar — it’s a fundamental building block of Python’s object model that allows for incredibly elegant and powerful designs.
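
To make the claim about methods and properties concrete, here is a small sketch. The `Greeter` class and its members are made up for illustration; the only things taken from Python itself are the `__get__`, `__set__`, and `__delete__` hooks.

```python
class Greeter:
    def hello(self):
        return "hello"

    @property
    def name(self):
        return "greeter"


g = Greeter()

# A plain function stored on the class defines __get__, so it is a descriptor.
func = Greeter.__dict__["hello"]
has_get = hasattr(func, "__get__")

# Calling __get__ by hand produces the same bound method as g.hello.
bound = func.__get__(g, Greeter)
manual_result = bound()

# A property object defines all three hooks.
prop = Greeter.__dict__["name"]
prop_methods = [m for m in ("__get__", "__set__", "__delete__") if hasattr(prop, m)]

print(has_get, manual_result, prop_methods)
```

Looking the attribute up through `Greeter.__dict__` matters here: it returns the raw object stored on the class, without triggering the descriptor machinery.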

The Descriptor Protocol Deep Dive

Let’s roll up our sleeves and look at the descriptor protocol in detail:

class Descriptor:
    def __get__(self, obj, obj_type=None):
        # called when the descriptor is accessed (obj is None for class access)
        pass
    
    def __set__(self, obj, value):
        # called when the descriptor is assigned to
        pass
    
    def __delete__(self, obj):
        # called when the descriptor is deleted
        pass

The magic happens in how Python resolves attribute access. When you access an attribute on an object (e.g., obj.attr), Python follows this lookup chain:

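  1. Look up the name on the object’s class and its base classes (the MRO). If it is found there and is a data descriptor, call its __get__ and stop
  2. Otherwise, check if the attribute exists in the object’s __dict__
  3. Otherwise, use what was found on the class in step 1: call __get__ if it is a non-data descriptor, or return the plain class attribute as-is
  4. Otherwise, call __getattr__ if the class defines it
  5. Raise AttributeError if nothing was found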

The distinction between data descriptors (those with __set__ or __delete__) and non-data descriptors (those with only __get__) is crucial. Data descriptors take precedence over instance attributes, while instance attributes take precedence over non-data descriptors.
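
The precedence rule is easy to check by hand. The sketch below uses two throwaway descriptors (`DataDesc` and `NonDataDesc` are hypothetical names) and writes directly into the instance `__dict__` to see which one wins the lookup.

```python
class DataDesc:
    """Data descriptor: defines both __get__ and __set__."""

    def __get__(self, obj, obj_type=None):
        if obj is None:
            return self
        return "from data descriptor"

    def __set__(self, obj, value):
        raise AttributeError("read-only")


class NonDataDesc:
    """Non-data descriptor: defines only __get__."""

    def __get__(self, obj, obj_type=None):
        if obj is None:
            return self
        return "from non-data descriptor"


class Demo:
    data = DataDesc()
    non_data = NonDataDesc()


d = Demo()

# Write straight into the instance __dict__, bypassing __set__.
d.__dict__["data"] = "from instance"
d.__dict__["non_data"] = "from instance"

data_result = d.data          # the data descriptor still wins
non_data_result = d.non_data  # the instance attribute shadows the descriptor

print(data_result)
print(non_data_result)
```

This is the same mechanism that lets an instance attribute shadow a method, and it is what the lazy-loading pattern later in this article relies on.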

Practical Magic: Building a Validation Framework

Let’s put theory into practice. Here I’ll show you how to build a validation framework using descriptors that keeps your classes cleaner, more maintainable, and more robust.

class ValidatedAttribute:
    def __init__(self, validator=None, default=None):
        self.validator = validator
        self.default = default
        self._name = None
    
    def __set_name__(self, owner, name):
        self._name = f"_{name}"
    
    def __get__(self, obj, obj_type=None):
        if obj is None:
            return self
        return getattr(obj, self._name, self.default)
    
    def __set__(self, obj, value):
        if self.validator and not self.validator(value):
            raise ValueError(f"Invalid value for {self._name[1:]}: {value}")
        setattr(obj, self._name, value)
    
    def __delete__(self, obj):
        if hasattr(obj, self._name):
            delattr(obj, self._name)

def is_positive(value):
    return isinstance(value, (int, float)) and value > 0

def is_email(value):
    return isinstance(value, str) and "@" in value and "." in value.split("@")[-1]

class User:
    age = ValidatedAttribute(validator=is_positive, default=18)
    email = ValidatedAttribute(validator=is_email)
    
    def __init__(self, email, age=None):
        self.email = email
        if age is not None:
            self.age = age

# Usage
try:
    user = User("invalid-email", -5)
except ValueError as e:
    # email is assigned first, so its validator fails before age is checked
    print(e)  # prints "Invalid value for email: invalid-email"

Notice how clean the User class is? All the validation logic is encapsulated in the descriptor, making it reusable across multiple classes. This is the power of descriptors: they let you extract cross-cutting concerns into reusable components.
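
To see the reuse in action, here is a sketch that applies the same descriptor to a second class. `Product`, `price`, and `stock` are hypothetical names, and the descriptor is a condensed copy of the one above so the snippet runs on its own.

```python
class ValidatedAttribute:
    # Condensed copy of the descriptor above (no __delete__), for illustration.
    def __init__(self, validator=None, default=None):
        self.validator = validator
        self.default = default

    def __set_name__(self, owner, name):
        self._name = f"_{name}"

    def __get__(self, obj, obj_type=None):
        if obj is None:
            return self
        return getattr(obj, self._name, self.default)

    def __set__(self, obj, value):
        if self.validator and not self.validator(value):
            raise ValueError(f"Invalid value for {self._name[1:]}: {value}")
        setattr(obj, self._name, value)


def is_positive(value):
    return isinstance(value, (int, float)) and value > 0


class Product:  # hypothetical second class reusing the same descriptor
    price = ValidatedAttribute(validator=is_positive)
    stock = ValidatedAttribute(validator=is_positive, default=1)


item = Product()
item.price = 9.99

stored = vars(item)         # only the values that were set live on the instance
default_stock = item.stock  # falls back to the descriptor's default

try:
    item.price = -1
    error = None
except ValueError as exc:
    error = str(exc)

print(stored, default_stock, error)
```

The values are stored under `_price` on each instance, while the validator and the default live on the descriptor and are shared by every `Product`.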

Advanced Descriptor Patterns

Let’s explore some advanced patterns that showcase the true power of descriptors.

Lazy Loading with Descriptors

class LazyProperty:
    def __init__(self, func):
        self._func = func
        self._name = func.__name__
    
    def __get__(self, obj, obj_type=None):
        if obj is None:
            return self
        value = self._func(obj)
        setattr(obj, self._name, value)
        return value

class DataProcessor:
    def __init__(self, data):
        self.data = data
    
    @LazyProperty
    def processed_data(self):
        print("expensive computation happening.")
        # simulate expensive processing
        result = [x * 2 for x in self.data]
        return result

# Usage
processor = DataProcessor([1, 2, 3])
print("Processor created")
print(processor.processed_data)  # prints "expensive computation happening." then [2, 4, 6]
print(processor.processed_data)  # [2, 4, 6] (no computation this time)

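This pattern is useful for expensive computations that should run only when needed and then be cached. It works because LazyProperty defines only __get__, which makes it a non-data descriptor: the setattr call stores the result in the instance’s __dict__, and that entry shadows the descriptor on every later access.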
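
If you are on Python 3.8 or newer, the standard library ships the same idea as `functools.cached_property`. The sketch below reworks the example with it; the `calls` counter is an addition of mine, there only to show how often the computation runs.

```python
from functools import cached_property


class DataProcessor:
    def __init__(self, data):
        self.data = data
        self.calls = 0  # illustration only: counts how often we recompute

    @cached_property
    def processed_data(self):
        self.calls += 1
        return [x * 2 for x in self.data]


processor = DataProcessor([1, 2, 3])
first = processor.processed_data   # computed on first access
second = processor.processed_data  # served from the instance __dict__

# Deleting the attribute clears the cached value and forces a recomputation.
del processor.processed_data
third = processor.processed_data

print(first, processor.calls)
```

Writing your own LazyProperty is still a good exercise, but for production code the built-in version is usually the safer choice.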

Cached Methods with Expiration

import time

class CachedMethod:
    def __init__(self, ttl=60):
        self.ttl = ttl  # time to live, in seconds
    
    def __call__(self, func):
        # used as @CachedMethod(ttl=...): receives the decorated function
        self._func = func
        self._cache_name = f"_{func.__name__}_cache"
        return self
    
    def __get__(self, obj, obj_type=None):
        if obj is None:
            return self
        
        # keep the cache on the instance, so instances never share results
        cache = obj.__dict__.setdefault(self._cache_name, {})
        
        def wrapper(*args, **kwargs):
            # build a cache key from the arguments (they must be hashable)
            key = (args, frozenset(kwargs.items()))
            
            # check if we have a cached result and if it's still valid
            if key in cache:
                result, timestamp = cache[key]
                if time.time() - timestamp < self.ttl:
                    return result
            
            # compute the result and cache it
            result = self._func(obj, *args, **kwargs)
            cache[key] = (result, time.time())
            return result
        
        return wrapper

class API:
    @CachedMethod(ttl=30)  # cache results for 30 seconds
    def get_user_data(self, user_id):
        print(f"Fetching data for user {user_id}...")
        # Simulate API call
        time.sleep(1)
        return {"user_id": user_id, "data": "some data"}

# Usage
api = API()
print(api.get_user_data(123))  # prints "Fetching data for user 123..." then the data
print(api.get_user_data(123))  # returns cached data immediately
time.sleep(31)
print(api.get_user_data(123))  # prints "Fetching data for user 123..." again (cache has expired)

This pattern is perfect for caching results of expensive operations like API calls, database queries, or complex computations.

Descriptors vs. Properties: What’s the Difference?

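Many developers wonder why they should write descriptors when properties seem to do the same job. The key difference is reusability. A property is itself a descriptor, but its getter and setter are written inside one class, so the logic has to be repeated in every class that needs it. A custom descriptor class packages that logic once and can be reused across any number of classes.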

Consider this example:

# Using properties
class Person:
    def __init__(self, name):
        self._name = name
    
    @property
    def name(self):
        return self._name.title()
    
    @name.setter
    def name(self, value):
        if not isinstance(value, str):
            raise TypeError("Name must be a string")
        self._name = value

class Product:
    def __init__(self, name):
        self._name = name
    
    @property
    def name(self):
        return self._name.title()
    
    @name.setter
    def name(self, value):
        if not isinstance(value, str):
            raise TypeError("Name must be a string")
        self._name = value

Notice the duplication? Now let’s see how descriptors solve this:

class NameAttribute:
    def __get__(self, obj, obj_type=None):
        if obj is None:
            return self
        return obj._name.title()
    
    def __set__(self, obj, value):
        if not isinstance(value, str):
            raise TypeError("Name must be a string")
        obj._name = value

class Person:
    name = NameAttribute()
    
    def __init__(self, name):
        self._name = name

class Product:
    name = NameAttribute()
    
    def __init__(self, name):
        self._name = name

Much cleaner and more maintainable!
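
NameAttribute still hard-codes the `_name` storage slot, so it only works for an attribute called name. As a sketch of how to lift that restriction, here is a hypothetical `TitleCased` descriptor that uses `__set_name__` to discover its own attribute name. The `Contact` class is made up for the example.

```python
class TitleCased:
    """Hypothetical generalization of NameAttribute for any attribute name."""

    def __set_name__(self, owner, name):
        self.public_name = name
        self.private_name = f"_{name}"

    def __get__(self, obj, obj_type=None):
        if obj is None:
            return self
        return getattr(obj, self.private_name).title()

    def __set__(self, obj, value):
        if not isinstance(value, str):
            raise TypeError(f"{self.public_name} must be a string")
        setattr(obj, self.private_name, value)


class Contact:
    first_name = TitleCased()
    last_name = TitleCased()

    def __init__(self, first_name, last_name):
        # Assigning through the public names runs __set__, so the
        # constructor arguments are validated too.
        self.first_name = first_name
        self.last_name = last_name


contact = Contact("ada", "lovelace")
full_name = f"{contact.first_name} {contact.last_name}"

try:
    Contact(42, "lovelace")
    type_error = None
except TypeError as exc:
    type_error = str(exc)

print(full_name, type_error)
```

Note the design choice in `__init__`: assigning to `self.first_name` rather than `self._first_name` means the type check also applies at construction time, which the Person and Product classes above skip by writing to `_name` directly.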

Descriptors in the Wild: Real-World Examples

Descriptors aren’t just academic exercises — they’re used extensively in Python’s standard library and popular frameworks.

SQLAlchemy’s ORM

SQLAlchemy uses descriptors extensively for its ORM. When you define a column:

class User(Base):
    __tablename__ = 'users'
    
    id = Column(Integer, primary_key=True)
    name = Column(String)

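Those Column objects are not descriptors themselves. When the class is mapped, SQLAlchemy replaces each one on the class with an InstrumentedAttribute, a descriptor that handles the translation between Python attributes and database columns.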

Django’s ORM

Similarly, Django uses descriptors for its model fields:

class Person(models.Model):
    first_name = models.CharField(max_length=30)
    last_name = models.CharField(max_length=30)

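Strictly speaking, models.CharField is not a descriptor itself. In current Django versions, each field installs a descriptor on the model class when the class is built (DeferredAttribute for plain fields, with dedicated descriptors for relations). The descriptor controls attribute access, while the field object handles validation, database conversion, and more.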

Python’s property

Even Python’s built-in property is implemented using descriptors. In fact, it’s just a convenient way to create a data descriptor without having to define a full class.
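
To demystify property, here is a deliberately simplified stand-in written in pure Python. `SimpleProperty` is a hypothetical name, it supports only a getter and a setter, and it is a sketch of the idea rather than the actual implementation. `Circle` is an example class of mine.

```python
class SimpleProperty:
    """Simplified stand-in for property: getter and setter only."""

    def __init__(self, fget=None, fset=None):
        self.fget = fget
        self.fset = fset

    def __get__(self, obj, obj_type=None):
        if obj is None:
            return self
        if self.fget is None:
            raise AttributeError("unreadable attribute")
        return self.fget(obj)

    def __set__(self, obj, value):
        if self.fset is None:
            raise AttributeError("can't set attribute")
        self.fset(obj, value)

    def setter(self, fset):
        # Return a new descriptor that keeps the getter and adds the setter.
        return type(self)(self.fget, fset)


class Circle:
    def __init__(self, radius):
        self._radius = radius

    @SimpleProperty
    def radius(self):
        return self._radius

    @radius.setter
    def radius(self, value):
        if value <= 0:
            raise ValueError("radius must be positive")
        self._radius = value


c = Circle(2)
c.radius = 5
current = c.radius

try:
    c.radius = -1
    error = None
except ValueError as exc:
    error = str(exc)

# The built-in property defines all three hooks, so it is a data descriptor.
is_data_descriptor = all(
    hasattr(property, m) for m in ("__get__", "__set__", "__delete__")
)

print(current, error, is_data_descriptor)
```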

Performance Considerations

One concern that often comes up with descriptors is performance. While it’s true that descriptor access is slightly slower than direct attribute access, the difference is negligible in most applications.

Let’s benchmark it:

import timeit

class DirectAccess:
    def __init__(self):
        self.value = 42

class DescriptorAccess:
    def __init__(self):
        self._value = 42
    
    @property
    def value(self):
        return self._value
    
    @value.setter
    def value(self, v):
        self._value = v

direct = DirectAccess()
descriptor = DescriptorAccess()

# Test direct access
direct_time = timeit.timeit('direct.value', globals=globals(), number=1000000)

# Test descriptor access
descriptor_time = timeit.timeit('descriptor.value', globals=globals(), number=1000000)

print(f"Direct access: {direct_time:.6f} seconds")
print(f"Descriptor access: {descriptor_time:.6f} seconds")
print(f"Ratio: {descriptor_time / direct_time:.2f}x slower")

On my machine, descriptor access is about 2–3x slower than direct access. That ratio sounds significant, but the absolute cost is what matters: the overhead is spread over a million accesses, so each individual access costs very little extra. In real-world applications, the benefits of descriptors usually far outweigh this small performance cost.

Common Pitfalls and How to Avoid Them

After years of working with descriptors, I’ve seen developers make the same mistakes repeatedly. Here are the most common pitfalls and how to avoid them:

1. Forgetting __set_name__

In Python 3.6+, __set_name__ is called automatically when the owning class is created, once for each descriptor assigned in the class body. It’s the easiest way to capture the name of the attribute the descriptor is bound to. Without it, you have to pass the name to the descriptor by hand.

class MyDescriptor:
    def __set_name__(self, owner, name):
        self.name = name  # save the attribute name
    
    def __get__(self, obj, obj_type=None):
        if obj is None:
            return self
        return f"Value of {self.name}"
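
One related trap: the hook only fires for descriptors present in the class body when the class is created. The sketch below shows what happens when a descriptor is attached afterwards. `Named` and `Config` are hypothetical names.

```python
class Named:
    """Hypothetical descriptor that records the name it was bound to."""

    def __init__(self):
        self.name = None

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, obj_type=None):
        if obj is None:
            return self
        return f"Value of {self.name}"


class Config:
    x = Named()      # assigned in the class body: __set_name__ runs


Config.y = Named()   # assigned after class creation: __set_name__ is not called

cfg = Config()
in_body = cfg.x
attached_late = cfg.y

# Calling the hook by hand fixes the late attachment.
vars(Config)["y"].__set_name__(Config, "y")
after_manual_call = cfg.y

print(in_body, attached_late, after_manual_call)
```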

2. Mishandling instance vs. class access

A common mistake is not properly handling the case when a descriptor is accessed on the class rather than an instance:

class BrokenDescriptor:
    def __get__(self, obj, obj_type=None):
        return obj.some_attr  # this will fail if obj is None

class WorkingDescriptor:
    def __get__(self, obj, obj_type=None):
        if obj is None:
            return self  # return the descriptor itself
        return obj.some_attr
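
Class-level access is more common than it looks, since introspection and documentation tools often read attributes straight off the class. This sketch reuses the two descriptors above on a hypothetical `Widget` class to show the difference.

```python
class BrokenDescriptor:
    def __get__(self, obj, obj_type=None):
        return obj.some_attr  # obj is None on class access


class WorkingDescriptor:
    def __get__(self, obj, obj_type=None):
        if obj is None:
            return self
        return obj.some_attr


class Widget:
    broken = BrokenDescriptor()
    working = WorkingDescriptor()

    def __init__(self):
        self.some_attr = "ok"


# Instance access works for both descriptors.
w = Widget()
instance_values = (w.broken, w.working)

# Class access only works for the one that checks for None.
try:
    Widget.broken
    class_access_error = None
except AttributeError as exc:
    class_access_error = type(exc).__name__

class_access_ok = isinstance(Widget.working, WorkingDescriptor)

print(instance_values, class_access_error, class_access_ok)
```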

3. Storing state in the descriptor itself

Remember that descriptors are created once when the class is defined, not once per instance. If you need to store instance-specific state, store it in the instance:

class WrongDescriptor:
    def __init__(self, default):
        self.value = default  # this is shared across all instances!
    
    def __get__(self, obj, obj_type=None):
        return self.value
    
    def __set__(self, obj, value):
        self.value = value  # this affects all instances!

class RightDescriptor:
    def __init__(self, default):
        self.default = default
    
    def __set_name__(self, owner, name):
        self.name = f"_{name}"
    
    def __get__(self, obj, obj_type=None):
        if obj is None:
            return self
        return getattr(obj, self.name, self.default)
    
    def __set__(self, obj, value):
        setattr(obj, self.name, value)  # store in the instance
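
Running the two descriptors side by side makes the problem obvious. The descriptor classes below are copied from above; `Shared` and `Isolated` are hypothetical host classes added for the demonstration.

```python
class WrongDescriptor:
    def __init__(self, default):
        self.value = default  # lives on the descriptor, shared by all instances

    def __get__(self, obj, obj_type=None):
        return self.value

    def __set__(self, obj, value):
        self.value = value


class RightDescriptor:
    def __init__(self, default):
        self.default = default

    def __set_name__(self, owner, name):
        self.name = f"_{name}"

    def __get__(self, obj, obj_type=None):
        if obj is None:
            return self
        return getattr(obj, self.name, self.default)

    def __set__(self, obj, value):
        setattr(obj, self.name, value)  # lives on the instance


class Shared:
    value = WrongDescriptor(0)


class Isolated:
    value = RightDescriptor(0)


a, b = Shared(), Shared()
a.value = 10
shared_result = b.value    # b sees the value written through a

c, d = Isolated(), Isolated()
c.value = 10
isolated_result = d.value  # d still sees the default

print(shared_result, isolated_result)
```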

Closing Thoughts: The Real Magic

Descriptors aren’t an obscure trick. They’re a core part of Python’s design.

Once you understand them, you’ll see how Django and SQLAlchemy weave their magic — and you’ll realize you can do the same.

Start with something small, like validation or lazy loading. Then level up to caching, proxying, or even ORM-like abstractions.

Because the real magic of Python isn’t hidden in its frameworks.
It’s built right into the language — waiting for you to use it.

Happy coding.
