Hi, I’m Simeon Franklin.
Technical instructor at Twitter teaching Python + other stuff to the #flock.
Find me @simeonfranklin or http://simeonfranklin.com/
Everybody already knows about classes and objects, right?
>>> class Circle(object):
...     PI = 3.14
...     def __init__(self, radius):
...         self.radius = radius
...
>>> mycircle = Circle(2)
>>> mycircle.radius
2
>>> mycircle.PI
3.14
How about object attributes vs class attributes?
When you access an attribute of an object like mycircle.radius you are actually getting back a value stored in a dict on the object.
>>> mycircle.__dict__
{'radius': 2}
But of course you can fall back to class level attributes which are stored in a dict on the class.
>>> Circle.PI
3.14
>>> Circle.__dict__
dict_proxy({...'PI': 3.14...})
>>> mycircle.PI
3.14
Well - a dict-like thing at least. dict_proxy is used by Python where you need a dict but don’t want to allow modifications. You can use this yourself in Python 3.3+ with types.MappingProxyType.
We can build some rules to model our understanding so far.
Accessing an attribute on an object like obj.foo gets you:
the corresponding value in obj.__dict__ if it exists
or else it falls back to look in the type(obj).__dict__
And assignment always creates an entry in obj.__dict__.
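A minimal sketch of these rules in action, reusing the Circle class from above. The replacement value 3.14159 is only there for illustration.

```python
class Circle(object):
    PI = 3.14  # class attribute: stored in Circle.__dict__

    def __init__(self, radius):
        self.radius = radius  # instance attribute: stored in self.__dict__


mycircle = Circle(2)

# Reading falls back to the class when the instance has no entry of its own.
assert "PI" not in mycircle.__dict__
assert mycircle.PI == 3.14

# Assigning through the instance creates an entry in the instance dict that
# shadows the class attribute; the class itself is left untouched.
mycircle.PI = 3.14159
assert mycircle.__dict__ == {"radius": 2, "PI": 3.14159}
assert Circle.PI == 3.14
```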
Adding inheritance to the mix just means paying attention to the mro.
>>> class Widget(object):
...     copyright = "Witrett, inc."
...
>>> class Circle(Widget):
...     PI = 3.14
...     def __init__(self, radius):
...         self.radius = radius
...
>>> mycircle = Circle(2)
>>> type(mycircle).mro()
[<class '__main__.Circle'>, <class '__main__.Widget'>, <type 'object'>]
>>> mycircle.copyright
'Witrett, inc.'
Got it?
Let’s update our rules:
Accessing an attribute on an object like obj.foo gets you:
the corresponding value in obj.__dict__ if it exists
or else it falls back to look in type(obj).__dict__
repeating for each type in the mro until it finds a match
And assignment always creates an entry in obj.__dict__.
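The updated rules can be written down as a toy function. This is only a sketch of our model so far: `simple_lookup` is a hypothetical helper name, and it deliberately ignores descriptors, which come next.

```python
def simple_lookup(obj, name):
    """Toy model of the rules so far; real attribute access does more."""
    # Rule 1: the instance dict wins if it has the name.
    if name in obj.__dict__:
        return obj.__dict__[name]
    # Rule 2: otherwise walk the MRO and take the first class that has it.
    for klass in type(obj).mro():
        if name in klass.__dict__:
            return klass.__dict__[name]
    raise AttributeError(name)


class Widget(object):
    copyright = "Witrett, inc."


class Circle(Widget):
    PI = 3.14

    def __init__(self, radius):
        self.radius = radius


mycircle = Circle(2)
found = [simple_lookup(mycircle, n) for n in ("radius", "PI", "copyright")]
```

Here `found` ends up as `[2, 3.14, 'Witrett, inc.']`: one hit in the instance dict, one on Circle and one on Widget.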
then we’ll get to descriptors
Sometimes attributes aren’t enough.
>>> class Circle(Widget):
...     PI = 3.14
...     def __init__(self, radius):
...         self.radius = radius
...         self.circumference = 2 * radius * self.PI
...
>>> mycircle = Circle(2)
>>> mycircle.radius = 3
>>> mycircle.circumference # Whoops!
12.56
Classic OOP mistake - now I’ve got a broken class!
See Raymond Hettinger’s PyCon 2013 talk “Python’s Class Development Toolkit” @ http://pyvideo.org/video/1779/pythons-class-development-toolkit
Everybody knows how to fix this:
>>> class Circle(Widget):
...     PI = 3.14
...     def __init__(self, radius):
...         self.radius = radius
...     @property
...     def circumference(self):
...         return 2 * self.radius * self.PI
...
>>> mycircle = Circle(2)
>>> mycircle.radius = 3
>>> mycircle.circumference # Fixed!
18.84
We can add getters and setters to our class while maintaining what looks like simple attribute access.
You gotta love properties.
I love making people forced to write Java envious with this feature.
Accessing an attribute on an object like obj.foo gets you:
The result of the property of the same name if it is defined
Or the corresponding value in obj.__dict__ if it exists
or else it falls back to look in the type(obj).__dict__
repeating for each type in the mro until it finds a match
And assignment always creates an entry in obj.__dict__.
Unless there was a setter property in which case you’re calling a function.
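A small sketch of that last rule, assuming we want to validate the radius. The `_radius` storage name and the non-negative check are illustrative choices, not part of the original Circle.

```python
class Circle(object):
    PI = 3.14

    def __init__(self, radius):
        self.radius = radius  # this assignment goes through the setter below

    @property
    def radius(self):
        return self._radius

    @radius.setter
    def radius(self, value):
        # Assignment to obj.radius is now a function call, so we can validate.
        if value < 0:
            raise ValueError("radius must be non-negative")
        self._radius = value

    @property
    def circumference(self):
        return 2 * self.radius * self.PI


mycircle = Circle(2)
mycircle.radius = 3

# No "radius" entry was created in the instance dict; only the setter's
# own storage shows up there.
assert mycircle.__dict__ == {"_radius": 3}
```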
Rule #1 is really:
Accessing an attribute on an object like obj.foo gets you:
the result of the __get__ method of the data descriptor of the same name attached to the class if it exists
Heck - what’s a descriptor?
We’ll look at the implementation and signature of the methods in a moment…
but first… The Descriptor Protocol!
or as we’ve been calling it: Rule #1.
And a new Rule #3.
Plus a few more details…
Accessing an attribute on an object like obj.foo gets you:
The result of the __get__ method of the data descriptor of the same name attached to the class if it exists
Or the corresponding value in obj.__dict__ if it exists
Or the result of the __get__ method of the non-data descriptor of the same name on the class
or else it falls back to look in the type(obj).__dict__
repeating for each type in the mro until it finds a match.
And assignment always creates an entry in obj.__dict__.
Unless there was a setter property (which we now know is a descriptor) in which case you’re calling a function.
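The ordering in these rules can be checked with a short sketch. `DataDesc`, `NonDataDesc` and `Demo` are made-up names for illustration; the only difference between the two descriptors is whether `__set__` is defined.

```python
class DataDesc(object):
    """Defines both __get__ and __set__, so it is a data descriptor."""

    def __get__(self, obj, type=None):
        return "from data descriptor"

    def __set__(self, obj, value):
        pass  # ignore assignments; irrelevant for this demo


class NonDataDesc(object):
    """Defines only __get__, so it is a non-data descriptor."""

    def __get__(self, obj, type=None):
        return "from non-data descriptor"


class Demo(object):
    a = DataDesc()
    b = NonDataDesc()


d = Demo()
# Write straight into the instance dict, bypassing normal assignment.
d.__dict__["a"] = "from instance dict"
d.__dict__["b"] = "from instance dict"

assert d.a == "from data descriptor"   # data descriptor beats the instance dict
assert d.b == "from instance dict"     # instance dict beats a non-data descriptor
assert Demo().b == "from non-data descriptor"  # used when no instance entry exists
```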
Who knew simple attribute access could be so complicated?
This is the most complicated thing ever!
The signatures of __get__, __set__ and __delete__ are fixed.
descr.__get__(self, obj, type=None) --> value
descr.__set__(self, obj, value) --> None
descr.__delete__(self, obj) --> None
We’ll ignore __delete__ for now.
Who wants to delete attributes anyways?
Descriptors look weird - they’re attached to the class and the methods have a funky signature.
>>> class MyDescriptor(object):
...     def __get__(self, obj, type):
...         print self, obj, type
...     def __set__(self, obj, val):
...         print "Got %s" % val
...
>>> class MyClass(object):
...     x = MyDescriptor() # Attached at class definition time!
...
But they allow us to simulate attribute access with functions instead.
>>> obj = MyClass()
>>> obj.x # a function call is hiding here
<...MyDescriptor object ...> <....MyClass object ...> <class '__main__.MyClass'>
>>>
>>> MyClass.x # and here!
<...MyDescriptor object ...> None <class '__main__.MyClass'>
>>>
>>> obj.x = 4 # and here
Got 4
Method signature details:
obj and type are both provided on instance attribute access; on class attribute access obj is None and only type is provided. (self is always the descriptor instance.)
Why doesn’t MyClass.x = 5 call the __set__ method of the descriptor?
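One way to poke at that question is the sketch below. The `calls` list is a hypothetical addition to MyDescriptor so we can see which methods actually fire.

```python
class MyDescriptor(object):
    def __init__(self):
        self.calls = []  # hypothetical log of which protocol methods ran

    def __get__(self, obj, type=None):
        self.calls.append("get")
        return "from descriptor"

    def __set__(self, obj, val):
        self.calls.append("set")


class MyClass(object):
    x = MyDescriptor()


descriptor = MyClass.__dict__["x"]  # grab it without triggering __get__
obj = MyClass()

obj.x = 4      # instance assignment: routed to descriptor.__set__
MyClass.x = 5  # class assignment: simply rebinds the name on the class

assert descriptor.calls == ["set"]
assert MyClass.__dict__["x"] == 5  # the descriptor has been replaced
assert obj.x == 5                  # so instances now see a plain class attribute
```

Assignment on the class is handled by type(MyClass), the metaclass, so only a data descriptor defined on the metaclass would get a say. The descriptor sitting in MyClass.__dict__ is just overwritten.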
We could store values in the descriptor itself. But watch out!
What’s wrong with this code?
>>> class MyDescriptor(object):
...     def __get__(self, obj, type):
...         return self.data
...     def __set__(self, obj, val):
...         self.data = val
...
>>> class MyClass(object):
...     val = MyDescriptor()
...
>>> obj1 = MyClass()
>>> obj1.val = 10
>>> obj2 = MyClass()
>>> obj2.val
10
Possible strategies:
We know we can’t use the same field name for all the pieces of data.
We have to vary by the instance.
>>> class MyDescriptor(object):
...     def __init__(self):
...         self.data = {}
...     def __get__(self, obj, type):
...         return self.data[obj]
...     def __set__(self, obj, val):
...         self.data[obj] = val
...
This works!
But now the descriptor’s data dict holds a strong reference to every instance of every class the descriptor is attached to.
So much for garbage collection.
Go read PEP 205 and then:
>>> from weakref import WeakKeyDictionary
>>> class MyDescriptor(object):
...     def __init__(self):
...         self.data = WeakKeyDictionary()
...     def __get__(self, obj, type):
...         return self.data.get(obj)
...     def __set__(self, obj, val):
...         self.data[obj] = val
...
This solves the reference problem… but not everything can be weakref’ed.
In particular, a class that uses __slots__ to optimize itself is not weakref-able unless it lists '__weakref__' among its slots, and in general your type must inherit from a type that supports weak references.
Of course the instances must also be hashable to be used as dict keys. That means inheriting from mutable types like list or dict won’t work with this solution.
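These limits are easy to probe. The sketch below uses made-up class names and a hypothetical `can_be_weak_key` helper that just tries to use an instance as a WeakKeyDictionary key.

```python
import weakref


class Plain(object):
    pass


class Slotted(object):
    __slots__ = ("value",)  # no "__weakref__" slot, so no weak references


class SlottedWeak(object):
    __slots__ = ("value", "__weakref__")  # opting back in to weak references


class MyList(list):
    pass  # weakref-able, but unhashable like its list parent


def can_be_weak_key(obj):
    """Return True if obj can be stored as a key in a WeakKeyDictionary."""
    try:
        weakref.WeakKeyDictionary()[obj] = 1
        return True
    except TypeError:
        return False


results = {
    "plain": can_be_weak_key(Plain()),
    "slotted": can_be_weak_key(Slotted()),
    "slotted_weak": can_be_weak_key(SlottedWeak()),
    "list_subclass": can_be_weak_key(MyList()),
}
```

Only the plain class and the slotted class that lists `__weakref__` pass. The other two raise TypeError, one for lack of weakref support and one for being unhashable.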
Problem is - we don’t know the name of the attribute our descriptor is stored under.
val = MyDescriptor()
The descriptor constructor can’t know about "val" yet.
class MyClass(object):
    val = MyDescriptor("val") # must put in field name manually
>>> class MyDescriptor(object):
...     def __init__(self, field=""):
...         self.field = field
...     def __get__(self, obj, type):
...         print "Called __get__"
...         return obj.__dict__.get(self.field)
...     def __set__(self, obj, val):
...         print "Called __set__"
...         obj.__dict__[self.field] = val
...
If obj.x is always going to get you the descriptor then obj.__dict__['x'] is hidden from normal access and the descriptor can use it to store values…
A little bit of code duplication
… doesn’t bug anybody here, right? RIGHT?
Or maybe class decorators …
We could do something …
>>> def named_descriptors(klass):
...     for name, attr in klass.__dict__.items():
...         if isinstance(attr, MyDescriptor):
...             attr.field = name
...     return klass
...
>>> @named_descriptors
... class MyClass(object):
...     x = MyDescriptor()
...
>>> obj = MyClass()
>>> obj.x = 10
Called __set__
>>> obj.x
Called __get__
10
But might be too much magic…
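If you are on Python 3.6 or later (not the Python 2 used in these examples), the language does this naming step for you: a descriptor may define `__set_name__`, which is called when the owning class is created. A minimal sketch, with `NamedDescriptor` as a made-up name:

```python
class NamedDescriptor(object):
    def __set_name__(self, owner, name):
        # Called automatically at class creation time (Python 3.6+)
        # with the attribute name the descriptor was assigned to.
        self.field = name

    def __get__(self, obj, type=None):
        if obj is None:
            return self  # class-level access returns the descriptor itself
        return obj.__dict__.get(self.field)

    def __set__(self, obj, val):
        obj.__dict__[self.field] = val


class MyClass(object):
    x = NamedDescriptor()  # no manual name, no class decorator


obj = MyClass()
obj.x = 10

assert MyClass.__dict__["x"].field == "x"
assert obj.__dict__ == {"x": 10}
assert obj.x == 10
```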
Let’s abandon the details of how we might handle implementation
@property was cool.
Why do I need descriptors anyways?
So are @staticmethod and @classmethod.
It’s all the descriptor protocol underneath.
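To see the protocol underneath, we can pull the raw objects out of the class dict and call `__get__` by hand. A small sketch; the `Greeter` class and its methods are invented for illustration.

```python
class Greeter(object):
    def hello(self):
        return "hello from %s" % type(self).__name__

    @staticmethod
    def add(a, b):
        return a + b

    @classmethod
    def make(cls):
        return cls()


g = Greeter()

# A plain function is a non-data descriptor: it has __get__ but no __set__.
raw = Greeter.__dict__["hello"]
assert hasattr(raw, "__get__") and not hasattr(raw, "__set__")

# Calling __get__ by hand produces the same bound method as g.hello.
bound = raw.__get__(g, Greeter)
assert bound() == g.hello() == "hello from Greeter"

# staticmethod and classmethod objects implement __get__ as well.
assert Greeter.__dict__["add"].__get__(None, Greeter)(1, 2) == 3
assert isinstance(Greeter.__dict__["make"].__get__(None, Greeter)(), Greeter)
```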
@property is doing the Pythonic thing and giving me a simple interface to a complicated API.
Do I ever have to write custom descriptors?
@property doesn’t work for every case where you need to intercept attribute access.
Imagine a class that needs to store various dollar amounts in attributes. Better use decimal.Decimal! And fix the representation to 2 decimal places.
I know - @property to the rescue!
>>> from decimal import Decimal, ROUND_UP
>>> class BankTransaction(object):
...     _cents = Decimal('.01')
...     def __init__(self, account, before, after, min, max):
...         self.account = account
...         self._before = before
...         self._after = after
...         self._min = min
...         self._max = max
...     @property
...     def before(self):
...         return Decimal(self._before).quantize(self._cents, ROUND_UP)
...     @before.setter
...     def before(self, val):
...         self._before = str(val)
...     # repeat boilerplate getters and setters over and over and over...
I thought @property was supposed to save me from boilerplate code!
Descriptors let us write re-usable properties.
Isn’t this much nicer?
class BankTransaction(object):
    before = CurrencyField(0)
    after = CurrencyField(0)
    def __init__(self, account, before, after):
        self.account = account
        self.before = before
        self.after = after
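CurrencyField isn’t defined on the slide. One possible sketch, assuming the WeakKeyDictionary storage strategy from earlier and the same quantize-to-cents logic as the @property version. The default handling and the sample values are illustrative guesses.

```python
from decimal import Decimal, ROUND_UP
from weakref import WeakKeyDictionary


class CurrencyField(object):
    """Sketch of a reusable descriptor that always returns 2-place Decimals."""

    _cents = Decimal(".01")

    def __init__(self, default=0):
        self.default = default
        self.data = WeakKeyDictionary()  # per-instance values, weakly keyed

    def __get__(self, obj, type=None):
        if obj is None:
            return self  # class-level access returns the descriptor itself
        raw = self.data.get(obj, self.default)
        return Decimal(str(raw)).quantize(self._cents, ROUND_UP)

    def __set__(self, obj, val):
        self.data[obj] = str(val)  # store as text, as the property version did


class BankTransaction(object):
    before = CurrencyField(0)
    after = CurrencyField(0)

    def __init__(self, account, before, after):
        self.account = account
        self.before = before
        self.after = after


tx = BankTransaction("acct-1", 10.001, "9.5")
assert tx.before == Decimal("10.01")  # rounded up to the next cent
assert str(tx.after) == "9.50"        # always two decimal places
```

The getter and setter logic now lives in one place, and each extra currency attribute costs one line on the class.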
Think database fields: each has its own validation logic but might be attached to many different classes with many different names.
class Person(object):
    id = PrimaryKeyField()
    name = VarCharField(max_length=255)
class NickName(object):
    id = PrimaryKeyField()
    person_id = ForeignKey(Person)
    name = VarCharField(max_length=255)
This may look vaguely familiar
Or GUI fields that all need to fire off events when updated.
class PongBall(Widget):
    velocity_x = NumericProperty(0)
    velocity_y = NumericProperty(0)
That too.
It may be cool to simply provide a "declarative" API.
Or implement advanced attribute access patterns like "cached fields".
>>> class LazyProperty(object):
...     def __init__(self, func):
...         self._func = func
...         self.__name__ = func.__name__
...
...     def __get__(self, obj, klass):
...         print "Called the func"
...         result = self._func(obj)
...         obj.__dict__[self.__name__] = result
...         return result
...
>>> class MyClass(object):
...     @LazyProperty
...     def x(self):
...         return 42
...
>>> obj = MyClass()
>>> obj.x
Called the func
42
>>> obj.x
42
Do you get why it works?
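A sketch that makes the mechanism visible, using a hypothetical `calls` list instead of the print. LazyProperty defines only `__get__`, so it is a non-data descriptor, and the entry it writes into the instance dict takes precedence on every later access.

```python
class LazyProperty(object):
    def __init__(self, func):
        self._func = func
        self.__name__ = func.__name__

    def __get__(self, obj, klass=None):
        if obj is None:
            return self  # class-level access returns the descriptor itself
        result = self._func(obj)
        # Cache under the same name; the instance dict now shadows us.
        obj.__dict__[self.__name__] = result
        return result


calls = []  # hypothetical log of how often the wrapped function runs


class MyClass(object):
    @LazyProperty
    def x(self):
        calls.append("computed")
        return 42


obj = MyClass()
assert obj.__dict__ == {}

first = obj.x   # descriptor runs, computes and caches
second = obj.x  # served straight from obj.__dict__, no __get__ call

assert (first, second) == (42, 42)
assert calls == ["computed"]
assert obj.__dict__ == {"x": 42}
```

Deleting the cached entry with `del obj.__dict__["x"]` hands control back to the descriptor, so the next access recomputes the value.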
Helpful Resources