Descriptors

About Me

Hi, I’m Simeon Franklin.

Technical instructor at Twitter teaching Python + other stuff to the #flock.

Find me @simeonfranklin or http://simeonfranklin.com/

What are descriptors and why do I care?

Descriptors give us a powerful technique to write reusable code that can be shared between classes.
If you are working with classes and objects you need to understand all the tools at your disposal to write Pythonic code.
I’ll try to show some implementation patterns/anti-patterns and common use cases.

Great! Show me some descriptor magic!

First some background knowledge.
Warning: there may be details.
But knowledge is power, right? Stick with me.

Classes, objects, and attributes

Everybody already knows about classes and objects, right?

>>> class Circle(object):
...     PI = 3.14
...     def __init__(self, radius):
...         self.radius = radius
...
>>> mycircle = Circle(2)
>>> mycircle.radius
2
>>> mycircle.PI
3.14

How about object attributes vs class attributes?

Object Attribute Access

When you access an attribute of an object like mycircle.radius you are actually getting back a value stored in a dict on the object.

>>> mycircle.__dict__
{'radius': 2}

Class attribute access

But of course you can fall back to class level attributes which are stored in a dict on the class.

>>> Circle.PI
3.14
>>> Circle.__dict__
dict_proxy({...'PI': 3.14...})
>>> mycircle.PI
3.14

Well, a dict-like thing at least. dict_proxy is used by Python where you need a dict but don’t want to allow modifications. You can use this yourself in Python 3.3 with types.MappingProxyType.

Just Three Simple Rules!

We can build some rules to model our understanding so far.

Accessing an attribute on an object like obj.foo gets you:

the corresponding value in obj.__dict__ if it exists
or else it falls back to look in the type(obj).__dict__
And assignment always creates an entry in obj.__dict__.
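These three rules can be checked directly against the Circle class from earlier. A minimal sketch, written so it runs on Python 3 as well as the Python 2 used on these slides; the variable names are just for illustration:

```python
class Circle(object):
    PI = 3.14

    def __init__(self, radius):
        self.radius = radius


mycircle = Circle(2)

# Rule 1: radius lives in the instance dict.
in_instance = "radius" in mycircle.__dict__        # True

# Rule 2: PI is not in the instance dict, so lookup falls back to the class.
pi_in_instance = "PI" in mycircle.__dict__         # False
pi_in_class = "PI" in type(mycircle).__dict__      # True

# Rule 3: assignment creates an instance entry and leaves the class alone.
mycircle.PI = 3.14159
shadowed = mycircle.__dict__["PI"]                 # 3.14159
class_value = Circle.PI                            # still 3.14
```

Note that the assignment did not change Circle.PI; it only hid it for this one object.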

Plus inheritance

Adding inheritance to the mix just means paying attention to the mro.

>>> class Widget(object):
...     copyright = "Witrett, inc."
...
>>> class Circle(Widget):
...     PI = 3.14
...     def __init__(self, radius):
...         self.radius = radius
...

>>> mycircle = Circle(2)
>>> type(mycircle).mro()
[<class '__main__.Circle'>, <class '__main__.Widget'>, <type 'object'>]
>>> mycircle.copyright
'Witrett, inc.'

Got it?

Four Simple Rules

Let’s update our rules:

Accessing an attribute on an object like obj.foo gets you:

the corresponding value in obj.__dict__ if it exists
or else it falls back to look in the type(obj).__dict__
repeating for each type in the mro until it finds a match
And assignment always creates an entry in obj.__dict__.
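The four rules so far can be modelled as a small function. This is only a sketch of the rules on this slide: simple_lookup is a hypothetical helper, and it deliberately knows nothing about properties or descriptors yet.

```python
class Widget(object):
    copyright = "Witrett, inc."


class Circle(Widget):
    PI = 3.14

    def __init__(self, radius):
        self.radius = radius


def simple_lookup(obj, name):
    """Hypothetical helper modelling the four rules (no descriptors yet)."""
    # The instance dict wins.
    if name in obj.__dict__:
        return obj.__dict__[name]
    # Otherwise try each class in the mro, in order.
    for klass in type(obj).mro():
        if name in klass.__dict__:
            return klass.__dict__[name]
    raise AttributeError(name)


mycircle = Circle(2)
radius = simple_lookup(mycircle, "radius")        # found on the instance
pi = simple_lookup(mycircle, "PI")                # found on Circle
owner = simple_lookup(mycircle, "copyright")      # found on Widget via the mro
```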

One more thing

then we’ll get to descriptors

Sometimes attributes aren’t enough.

>>> class Circle(Widget):
...     PI = 3.14
...     def __init__(self, radius):
...         self.radius = radius
...         self.circumference = 2 * radius * self.PI
...
>>> mycircle = Circle(2)
>>> mycircle.radius = 3
>>> mycircle.circumference # Whoops!
12.56

Classic OOP mistake - now I’ve got a broken class!

Yeah I’m stealing from Raymond Hettinger

See his PyCon 2013 talk @ http://pyvideo.org/video/1779/pythons-class-development-toolkit

Steal from the best, right?

@property to the rescue!

Everybody knows how to fix this:

>>> class Circle(Widget):
...     PI = 3.14
...     def __init__(self, radius):
...         self.radius = radius
...     @property
...     def circumference(self):
...         return 2 * self.radius * self.PI
...

>>> mycircle = Circle(2)
>>> mycircle.radius = 3
>>> mycircle.circumference # Fixed!
18.84

We can add getters and setters to our class while maintaining what looks like simple attribute access.

You gotta love properties.

I love making people forced to write Java envious with this feature.

But have you ever wondered how it works?
And is it always the right tool for the job?

Let’s review our attribute access rules.

Six Simple Rules?

Accessing an attribute on an object like obj.foo gets you:

The result of the property of the same name if it is defined
Or the corresponding value in obj.__dict__ if it exists
or else it falls back to look in the type(obj).__dict__
repeating for each type in the mro until it finds a match
And assignment always creates an entry in obj.__dict__.
Unless there was a setter property in which case you’re calling a function.

Rule #1.

Rule #1 is really:

Accessing an attribute on an object like obj.foo gets you:

the result of the __get__ method of the data descriptor of the same name attached to the class if it exists

What’s a data descriptor?

Heck - what’s a descriptor?

A descriptor is any object that implements at least one of the methods named __get__(), __set__(), and __delete__().
A data descriptor implements both __get__() and __set__() (strictly, defining __set__() or __delete__() is what makes it a data descriptor). Implementing only __get__() makes you a non-data descriptor.
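To make the distinction concrete, here is a rough sketch that classifies a few familiar objects. describe is a hypothetical helper, and it assumes the rule CPython applies: defining __set__ or __delete__ is what makes something a data descriptor.

```python
def describe(attr):
    """Hypothetical helper: classify an object by the methods its type defines."""
    klass = type(attr)
    if hasattr(klass, "__set__") or hasattr(klass, "__delete__"):
        return "data descriptor"
    if hasattr(klass, "__get__"):
        return "non-data descriptor"
    return "not a descriptor"


class Circle(object):
    PI = 3.14

    def __init__(self, radius):
        self.radius = radius

    @property
    def circumference(self):
        return 2 * self.radius * self.PI

    def area(self):
        return self.PI * self.radius ** 2


# Look in the class dict so the descriptors themselves are not triggered.
kinds = dict((name, describe(Circle.__dict__[name]))
             for name in ("PI", "circumference", "area"))
```

So a property is a data descriptor, a plain function is a non-data descriptor, and a float is neither.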

All Clear?

We’ll look at the implementation and signature of the methods in a moment…

but first… The Descriptor Protocol!

The Descriptor Protocol

or as we’ve been calling it: Rule #1.

And a new Rule #3.

Plus a few more details…

Seven Simple Rules?

Accessing an attribute on an object like obj.foo gets you:

The result of the __get__ method of the data descriptor of the same name attached to the class if it exists
Or the corresponding value in obj.__dict__ if it exists
Or the result of the __get__ method of the non-data descriptor of the same name on the class
or else it falls back to look in the type(obj).__dict__
repeating for each type in the mro until it finds a match.
And assignment always creates an entry in obj.__dict__.
Unless there was a setter property (which we now know is a descriptor) in which case you’re calling a function.
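The read side of these rules can be sketched as one lookup function. This is a simplified model, not CPython's actual implementation: find_attribute is a hypothetical name, and it ignores __getattr__, __slots__ and metaclass details.

```python
def find_attribute(obj, name):
    """Hypothetical model of the lookup rules above (not CPython's real code)."""
    klass = type(obj)

    # Search the mro for a class-level attribute of this name.
    class_attr = None
    found = False
    for base in klass.mro():
        if name in base.__dict__:
            class_attr = base.__dict__[name]
            found = True
            break

    attr_type = type(class_attr)
    is_descriptor = found and hasattr(attr_type, "__get__")
    is_data = is_descriptor and (hasattr(attr_type, "__set__")
                                 or hasattr(attr_type, "__delete__"))

    # A data descriptor on the class beats everything.
    if is_data:
        return attr_type.__get__(class_attr, obj, klass)
    # Then the instance dict.
    if name in getattr(obj, "__dict__", {}):
        return obj.__dict__[name]
    # Then a non-data descriptor on the class.
    if is_descriptor:
        return attr_type.__get__(class_attr, obj, klass)
    # Then a plain class attribute found via the mro.
    if found:
        return class_attr
    raise AttributeError(name)


class Circle(object):
    PI = 3.14

    def __init__(self, radius):
        self.radius = radius

    @property
    def circumference(self):
        return 2 * self.radius * self.PI

    def describe(self):
        return "circle of radius %s" % self.radius


mycircle = Circle(3)
# An instance dict entry cannot hide a data descriptor such as a property...
mycircle.__dict__["circumference"] = "ignored"
# ...but it does hide a non-data descriptor such as a method.
mycircle.__dict__["describe"] = "shadowed"
```

With this setup, find_attribute agrees with real attribute access: the property still computes the circumference, while describe now returns the string planted in the instance dict.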

Who knew

simple attribute access could be so complicated?

This is the most complicated thing ever!

Maybe not!

Writing Descriptors

The signatures of __get__, __set__ and __delete__ are fixed.

See http://docs.python.org/2/howto/descriptor.html#abstract

descr.__get__(self, obj, type=None) --> value

descr.__set__(self, obj, value) --> None

descr.__delete__(self, obj) --> None

We’ll ignore __delete__ for now.

Who wants to delete attributes anyways?

__get__ and __set__

Descriptors look weird - they’re attached to the class and the methods have a funky signature.

>>> class MyDescriptor(object):
...     def __get__(self, obj, type):
...         print self, obj, type
...     def __set__(self, obj, val):
...         print "Got %s" % val
...
>>> class MyClass(object):
...     x = MyDescriptor() # Attached at class definition time!
...

But they allow us to simulate attribute access with functions instead.

>>> obj = MyClass()
>>> obj.x # a function call is hiding here
<...MyDescriptor object ...> <....MyClass object ...> <class '__main__.MyClass'>
>>>
>>> MyClass.x # and here!
<...MyDescriptor object ...> None <class '__main__.MyClass'>
>>>
>>> obj.x = 4 # and here
Got 4

Method signature details:

self is the instance of the descriptor
obj is the object the attribute was accessed on (an instance of the class the descriptor is attached to)
type is the class the descriptor is attached to
__get__ can be called on the class or object, __set__ can only be called on the object.

obj and type are both provided on object attribute access; on class attribute access only type is provided and obj is None.

Why doesn’t MyClass.x = 5 call the __set__ method of the descriptor?
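One way to see the answer: assignment on a class is handled by the class's own type (its metaclass), which only consults descriptors defined on the metaclass, so the name in the class dict is simply rebound. A small sketch; the calls list is a hypothetical log added for illustration.

```python
class MyDescriptor(object):
    def __init__(self):
        self.calls = []            # hypothetical log, for illustration only

    def __get__(self, obj, type=None):
        return "from __get__"

    def __set__(self, obj, val):
        self.calls.append(val)


class MyClass(object):
    x = MyDescriptor()


descriptor = MyClass.__dict__["x"]

obj = MyClass()
obj.x = 4                          # instance assignment: __set__ runs
calls_after_instance_set = list(descriptor.calls)

MyClass.x = 5                      # class assignment: the name is simply rebound
calls_after_class_set = list(descriptor.calls)
replaced = MyClass.__dict__["x"]   # the plain int 5; the descriptor is gone
```

After the class-level assignment the descriptor is no longer attached, so obj.x is now just the class attribute 5.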

Ok, let’s do something useful

We could store values in the descriptor itself. But watch out!

What’s wrong with this code?

>>> class MyDescriptor(object):
...     def __get__(self, obj, type):
...         return self.data
...     def __set__(self, obj, val):
...         self.data = val
...

Whoops! We just re-implemented a class level attribute!

>>> class MyClass(object):
...     val = MyDescriptor()
...
>>> obj1 = MyClass()
>>> obj1.val = 10
>>> obj2 = MyClass()
>>> obj2.val
10

Try again

Possible strategies:

Maybe we should store stuff on obj.
That would be the instance of the class that the instance of our descriptor is attached to.
Sorry!
Or maybe data should live in the descriptor itself - storing in self?

Storing on self

We know we can’t use the same field name for all the pieces of data.

We have to vary by the instance.

Another classic pitfall

>>> class MyDescriptor(object):
...     def __init__(self):
...         self.data = {}
...     def __get__(self, obj, type):
...         return self.data[obj]
...     def __set__(self, obj, val):
...         self.data[obj] = val
...

This works!

But now the descriptor’s data dict holds a strong reference to every instance it has stored a value for, so those instances are never freed while the descriptor (and therefore the class) is alive.

So much for garbage collection.

Weak-references to the rescue!

Go read PEP 205 and then:

>>> from weakref import WeakKeyDictionary
>>> class MyDescriptor(object):
...     def __init__(self):
...         self.data = WeakKeyDictionary()
...     def __get__(self, obj, type):
...         return self.data.get(obj)
...     def __set__(self, obj, val):
...         self.data[obj] = val
...

Kinda sorta

This solves the reference problem… but not everything can be weakref’ed.

In particular, a class that uses __slots__ to optimize memory can’t be weakly referenced by default (unless it lists '__weakref__' among its slots), and your type must inherit from a type that is weakref-able.

Of course the type must be hashable to be used as a dict key. That means inheriting from mutable types like list or dict won’t work with this solution.
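Both limitations are easy to reproduce. A minimal sketch; the class names and the try_store helper are hypothetical and exist only to show which objects a WeakKeyDictionary will accept as keys.

```python
from weakref import WeakKeyDictionary


class Slotted(object):
    __slots__ = ("value",)         # no __weakref__ slot, so not weakref-able


class SlottedWithWeakref(object):
    __slots__ = ("value", "__weakref__")


class Tags(list):                  # list subclasses are unhashable by default
    pass


def try_store(obj):
    """Hypothetical helper: report whether obj can be a WeakKeyDictionary key."""
    data = WeakKeyDictionary()
    try:
        data[obj] = "stored"
    except TypeError:
        return False
    return True


results = [try_store(Slotted()),
           try_store(SlottedWithWeakref()),
           try_store(Tags())]
```

Only the class that explicitly adds '__weakref__' to its slots can be used as a key; the other two raise TypeError.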

What about storing values on the object itself?

Problem is - we don’t know the name of the attribute our descriptor is stored under.

val = MyDescriptor()

The descriptor constructor can’t know about "val" yet.

So sometimes we see just a little duplication

class MyClass(object):
    val = MyDescriptor("val") # must put in field name manually

Which makes the descriptor easy to write

>>> class MyDescriptor(object):
...     def __init__(self, field=""):
...         self.field = field
...     def __get__(self, obj, type):
...         print "Called __get__"
...         return obj.__dict__.get(self.field)
...     def __set__(self, obj, val):
...         print "Called __set__"
...         obj.__dict__[self.field] = val
...

Everybody gets that right?

If obj.x is always going to get you the descriptor then obj.__dict__['x'] is hidden from normal access and the descriptor can use it to store values…

Fortunately

A little bit of code duplication

… doesn’t bug anybody here, right? RIGHT?

If only we knew something about metaclasses…

Or maybe class decorators …

We could do something …

Like this

>>> def named_descriptors(klass):
...     for name, attr in klass.__dict__.items():
...         if isinstance(attr, MyDescriptor):
...             attr.field = name
...     return klass
...
>>> @named_descriptors
... class MyClass(object):
...     x = MyDescriptor()
...

Which works

>>> obj = MyClass()
>>> obj.x = 10
Called __set__
>>> obj.x
Called __get__
10

But might be too much magic…
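If you can rely on Python 3.6 or later, the language can do the naming for you: a descriptor may define __set_name__, which is called when the owning class is created. This goes beyond the Python 2 used on these slides, so treat it as a sketch; NamedField is a hypothetical name.

```python
class NamedField(object):
    """Hypothetical descriptor that learns its own name (Python 3.6+ only)."""

    def __set_name__(self, owner, name):
        # Called automatically when the owning class is created.
        self.field = name

    def __get__(self, obj, type=None):
        if obj is None:
            return self            # class access returns the descriptor itself
        return obj.__dict__.get(self.field)

    def __set__(self, obj, val):
        obj.__dict__[self.field] = val


class MyClass(object):
    x = NamedField()               # no name passed in, no decorator needed


obj = MyClass()
obj.x = 10
```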

What’s the point of all this?

Let’s abandon the details of how we might handle implementation

@property was cool.

Why do I need descriptors anyways?

@property is just sugar

So are @staticmethod and @classmethod.

It’s all the descriptor protocol underneath.
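You can see the protocol at work by pulling the raw objects out of the class dict and calling __get__ by hand. A small sketch using an extended version of the Circle class; the unit and describe methods are made up for this example.

```python
class Circle(object):
    PI = 3.14

    def __init__(self, radius):
        self.radius = radius

    @property
    def circumference(self):
        return 2 * self.radius * self.PI

    @classmethod
    def unit(cls):
        return cls(1)

    @staticmethod
    def describe():
        return "a round widget"


mycircle = Circle(3)

# Fetch the raw objects from the class dict, bypassing attribute lookup...
prop = Circle.__dict__["circumference"]
cmeth = Circle.__dict__["unit"]
smeth = Circle.__dict__["describe"]

# ...then call __get__ by hand, the way attribute access would.
via_property = prop.__get__(mycircle, Circle)         # like mycircle.circumference
via_classmethod = cmeth.__get__(mycircle, Circle)()   # like Circle.unit()
via_staticmethod = smeth.__get__(mycircle, Circle)()  # like Circle.describe()
```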

Great!

@property is doing the Pythonic thing and giving me a simple interface to a complicated API.

Do I ever have to write custom descriptors?

Yes!

@property doesn’t work for every case where you need to intercept attribute access.

Imagine a class that needs to store various dollar amounts in attributes. Better use decimal.Decimal! And fix the representation to 2 decimal places.

I know - @property to the rescue!

Just a little code duplication

>>> from decimal import Decimal, ROUND_UP
>>> class BankTransaction(object):
...     _cents = Decimal('.01')
...     def __init__(self, account, before, after, min, max):
...         self.account = account
...         self._before = before
...         self._after = after
...         self._min = min
...         self._max = max
...     @property
...     def before(self):
...         return Decimal(self._before).quantize(self._cents, ROUND_UP)
...     @before.setter
...     def before(self, val):
...         self._before = str(val)
...# repeat boilerplate getters and setters over and over and over...

Nope nope nope nope

I thought @property was supposed to save me from boilerplate code!

Takeaway

Descriptors let us write re-usable properties.

Isn’t this much nicer?

class BankTransaction(object):
    before = CurrencyField(0)
    after = CurrencyField(0)

    def __init__(self, account, before, after):
        self.account = account
        self.before = before
        self.after = after
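The slides don't show CurrencyField itself, so here is one way it might look. This is a sketch under assumptions: it reuses the WeakKeyDictionary storage strategy from earlier and the same quantize-to-cents logic as the @property version, and the details are illustrative rather than the original implementation.

```python
from decimal import Decimal, ROUND_UP
from weakref import WeakKeyDictionary


class CurrencyField(object):
    """Hypothetical descriptor: stores a value, returns a Decimal in cents."""
    _cents = Decimal(".01")

    def __init__(self, default=0):
        self.default = default
        self.data = WeakKeyDictionary()   # per-instance storage

    def __get__(self, obj, type=None):
        if obj is None:
            return self                   # class access returns the descriptor
        raw = self.data.get(obj, self.default)
        return Decimal(str(raw)).quantize(self._cents, ROUND_UP)

    def __set__(self, obj, val):
        self.data[obj] = val


class BankTransaction(object):
    before = CurrencyField(0)
    after = CurrencyField(0)

    def __init__(self, account, before, after):
        self.account = account
        self.before = before
        self.after = after


txn = BankTransaction("checking", "10.001", 7)
```

Here txn.before comes back as Decimal('10.01') and txn.after as Decimal('7.00'), with the rounding logic written exactly once.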

Descriptors are a great solution for attributes with common behaviour across multiple classes

Use cases

Think database fields: each has its own validation logic but might be attached to many different classes with many different names.

class Person(object):
    id = PrimaryKeyField()
    name = VarCharField(max_length=255)

class NickName(object):
    id = PrimaryKeyField()
    person_id = ForeignKey(Person)
    name = VarCharField(max_length=255)

This may look vaguely familiar

Or GUI fields that all need to fire off events when updated.

class PongBall(Widget):
    velocity_x = NumericProperty(0)
    velocity_y = NumericProperty(0)

That too.

It may be cool to simply provide a "declarative" API.

Or implement advanced attribute access patterns like "cached fields".

Every Framework Ever

>>> class LazyProperty(object):
...     def __init__(self, func):
...         self._func = func
...         self.__name__ = func.__name__
...
...     def __get__(self, obj, klass):
...         print "Called the func"
...         result = self._func(obj)
...         obj.__dict__[self.__name__] = result
...         return result
...
>>> class MyClass(object):
...     @LazyProperty
...     def x(self):
...         return 42
...

>>> obj = MyClass()
>>> obj.x
Called the func
42
>>> obj.x
42

Do you get why it works?
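Here is the reasoning spelled out. LazyProperty defines only __get__, so it is a non-data descriptor, and the instance dict outranks it; once the result is cached in obj.__dict__ under the same name, the descriptor is never consulted again for that object. The calls counter below is a hypothetical addition to make that visible.

```python
class LazyProperty(object):
    def __init__(self, func):
        self._func = func
        self.__name__ = func.__name__
        self.calls = 0                 # hypothetical counter, for illustration

    def __get__(self, obj, klass=None):
        self.calls += 1
        result = self._func(obj)
        # Cache under the same name; the instance dict now wins the lookup.
        obj.__dict__[self.__name__] = result
        return result


class MyClass(object):
    @LazyProperty
    def x(self):
        return 42


obj = MyClass()
descriptor = MyClass.__dict__["x"]

first = obj.x                          # runs __get__, fills obj.__dict__
second = obj.x                         # served straight from obj.__dict__
```

If LazyProperty also defined __set__, it would become a data descriptor and __get__ would run on every access, defeating the cache.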

Congratulations!

Go forth and wizard!

Helpful Resources

Luciano Ramalho at PyCon 2013 on descriptors - http://pyvideo.org/video/1760/encapsulation-with-descriptors
David Beazley on cool advanced OOP stuff in Python 3 - http://pyvideo.org/video/1716/python-3-metaprogramming
Or follow me on Twitter - @simeonfranklin
I’ll post slides at http://simeonfranklin.com/blog/
