[Day19] Wing 6 - A Guided Reading of the Descriptor HowTo Guide: Pure Python Equivalents

2023 iThome Ironman Contest

DAY 19

Software Development

Python Ten Wings (Python十翼): A Conversation with Your Future Self, part 19 of the series


Jerry Wu

2023-10-01 00:07:01


Today we keep following in the master's footsteps and read the Pure Python Equivalents section of the Descriptor HowTo Guide, to see how property, functions and bound methods, staticmethod, classmethod, and __slots__ can be implemented in Python.

property

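__init__ takes four optional arguments: fget, fset, fdel, and doc. If doc is not given but fget has a docstring, fget's docstring is used as doc. So when we decorate a function with @Property, we are really passing that function as Property's first argument, fget.
__set_name__ receives the name that the property instance is assigned to in the class.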

class Property:
    "Emulate PyProperty_Type() in Objects/descrobject.c"

    def __init__(self, fget=None, fset=None, fdel=None, doc=None):
        self.fget = fget
        self.fset = fset
        self.fdel = fdel
        if doc is None and fget is not None:
            doc = fget.__doc__
        self.__doc__ = doc
        self._name = ''

    def __set_name__(self, owner, name):
        self._name = name
    ...
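
To make the timing of __set_name__ concrete, here is a minimal sketch. The Named class and the attribute names are made up for illustration and are not part of the guide; the point is only that the hook fires during class creation and not on later assignment.

```python
class Named:
    """Hypothetical helper that only records the name it was given."""

    def __init__(self):
        self.name = None

    def __set_name__(self, owner, name):
        # Invoked while the owning class is being created.
        self.name = name


class A:
    field = Named()      # __set_name__(A, 'field') runs at class creation


late = Named()
A.other = late           # attached afterwards: __set_name__ is not called

print(A.__dict__['field'].name)
print(late.name)
```

This is why an emulation that builds new descriptor instances later on has to carry the recorded name over by hand.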

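Property implements __get__, __set__, and __delete__, so it is a data descriptor even when fset or fdel is not given.
__get__ first checks whether obj is None. If it is, the access came through the class, and the property instance itself is returned. It then checks whether self.fget has been set and raises AttributeError if not. Finally it calls self.fget(obj) to do the getter's work.
__set__ checks whether self.fset has been set and raises AttributeError if not. Otherwise it calls self.fset(obj, value) to do the setter's work.
__delete__ checks whether self.fdel has been set and raises AttributeError if not. Otherwise it calls self.fdel(obj) to do the deleter's work.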

class Property:
    ...
    
    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        if self.fget is None:
            raise AttributeError(f"property '{self._name}' has no getter")
        return self.fget(obj)

    def __set__(self, obj, value):
        if self.fset is None:
            raise AttributeError(f"property '{self._name}' has no setter")
        self.fset(obj, value)

    def __delete__(self, obj):
        if self.fdel is None:
            raise AttributeError(f"property '{self._name}' has no deleter")
        self.fdel(obj)
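
The behavior described above can be checked against the built-in property. This is a small sketch that assumes the built-in behaves like the Property emulation; the Circle class is hypothetical.

```python
class Circle:
    def __init__(self, radius):
        self._radius = radius

    @property
    def radius(self):
        "Read-only radius."
        return self._radius


c = Circle(2)

# Accessed through the class, obj is None and the property itself comes back.
prop = Circle.radius

# No setter was supplied, so assignment raises AttributeError.
try:
    c.radius = 5
    set_failed = False
except AttributeError:
    set_failed = True

# A data descriptor takes priority over an entry in the instance __dict__.
c.__dict__['radius'] = 99
shadowed_value = c.radius    # still computed by the getter
```

Even without a setter, the entry written directly into c.__dict__ does not shadow the property, which is what being a data descriptor means in practice.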

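getter, setter, and deleter are nearly identical. The principle is that each call builds a new property instance. Take getter as an example: type(self) is the Property class (or a subclass of it), the fget passed in becomes the first argument, and the remaining self.fset, self.fdel, and self.__doc__ are taken from self. The new instance's _name then has to be copied over by hand, because __set_name__ is called only once, when the owning class is defined (Note 1). So when we later add a function to a property through the getter, setter, or deleter interface, we have to carry _name over ourselves. This lets us layer on the functions we need, one at a time.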

class Property:
    ...
    
    def getter(self, fget):
        prop = type(self)(fget, self.fset, self.fdel, self.__doc__)
        prop._name = self._name
        return prop

    def setter(self, fset):
        prop = type(self)(self.fget, fset, self.fdel, self.__doc__)
        prop._name = self._name
        return prop

    def deleter(self, fdel):
        prop = type(self)(self.fget, self.fset, fdel, self.__doc__)
        prop._name = self._name
        return prop
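
As a rough illustration of "each call builds a new property instance", the sketch below uses the built-in property and keeps a reference to the getter-only version before adding a setter. The Temperature class and the read_only name are invented for this example.

```python
class Temperature:
    def __init__(self):
        self._celsius = 0.0

    @property
    def celsius(self):
        return self._celsius

    read_only = celsius          # keep the getter-only property around

    @celsius.setter
    def celsius(self, value):    # setter() returns a new property object
        self._celsius = float(value)


t = Temperature()
t.celsius = 21                   # goes through the new property's fset

same_object = Temperature.read_only is Temperature.celsius
shared_getter = Temperature.read_only.fget is Temperature.celsius.fget
```

The original property is left untouched (its fset is still None), while the new one reuses the same fget.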

function and bound method

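A method differs from a function in that the instance it is called on is passed in automatically as the first argument, the familiar self. When a function defined in a class is accessed through an instance, it becomes a bound method (bound to self).

types.MethodType lets us create a bound method:

MethodType's __init__ takes two arguments: the function and the object to bind it to.
MethodType's __call__ calls self.__func__ with self.__self__ as the first argument, passes along the *args and **kwargs it received as the remaining arguments, and returns the result.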

class MethodType:
    "Emulate PyMethod_Type in Objects/classobject.c"

    def __init__(self, func, obj):
        self.__func__ = func
        self.__self__ = obj

    def __call__(self, *args, **kwargs):
        func = self.__func__
        obj = self.__self__
        return func(obj, *args, **kwargs)
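
For comparison, the real types.MethodType can be used the same way to bind a plain function to an object by hand. A minimal sketch; greet and Person are hypothetical names.

```python
import types


def greet(self, greeting):
    return f"{greeting}, {self.name}"


class Person:
    def __init__(self, name):
        self.name = name


alice = Person("Alice")

# Bind the module-level function to one particular instance.
bound = types.MethodType(greet, alice)

# The bound method supplies alice as the first argument.
result = bound("Hello")
```

The function and the bound object remain available as bound.__func__ and bound.__self__, matching the two attributes stored by the emulation.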

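As for functions: because they implement __get__, they are non-data descriptors.

A function's __get__ first checks whether obj is None. If it is, the access came through the class, and the function itself is returned. Otherwise it returns a method created by MethodType. That method is a bound method, which ties the function to the instance it was accessed through.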

class Function:
    ...

    def __get__(self, obj, objtype=None):
        "Simulate func_descr_get() in Objects/funcobject.c"
        if obj is None:
            return self
        return MethodType(self, obj)

Suppose we have the following code. Let's break down everything that happens when we call my_inst.my_func(1, 2).

class MyClass:
    def my_func(self, a, b):
        ...

my_inst = MyClass()
my_inst.my_func(1, 2)

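Because my_func is a non-data descriptor, the lookup first checks whether my_inst.__dict__ contains my_func. It does not, so my_func's __get__ is used.
Because my_func is being accessed through my_inst, __get__ returns a bound method created by MethodType, which binds my_func to my_inst.
Actually calling my_inst.my_func(1, 2) then invokes the bound method's __call__, which passes my_inst as my_func's first argument and 1 and 2 as the remaining arguments, and returns the result. This is why we can write my_inst.my_func(1, 2) instead of my_inst.my_func(my_inst, 1, 2).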
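
The three steps above can be reproduced by hand. This sketch gives my_func a body that returns its arguments (an assumption made only so the result can be compared) and calls __get__ explicitly.

```python
class MyClass:
    def my_func(self, a, b):
        # Return the arguments so the calls below can be compared.
        return (self, a, b)


my_inst = MyClass()

# Step 1: the name is not in the instance __dict__, only in the class.
in_instance_dict = 'my_func' in vars(my_inst)
func = MyClass.__dict__['my_func']

# Step 2: the function's __get__ produces a bound method.
bound = func.__get__(my_inst, MyClass)

# Step 3: calling the bound method inserts my_inst as the first argument.
manual = bound(1, 2)
normal = my_inst.my_func(1, 2)

# Accessed with obj set to None (through the class), the function itself comes back.
from_class = func.__get__(None, MyClass)
```

Both manual and normal hold the same tuple, and from_class is the plain function.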

An interesting fact: every time a function's __get__ is invoked through an instance, it returns a new MethodType instance, which means:

>>> my_inst.my_func is my_inst.my_func # False
>>> my_inst.my_func.__func__ is my_inst.my_func.__func__ # True

This may surprise you, but it is exactly how Python is designed: the underlying function is the same, yet every access through my_inst.my_func creates a new bound method. Welcome to Python!
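
One practical consequence, sketched below with a hypothetical class: bound methods should be compared with == rather than is, since two bound methods wrapping the same function and the same instance compare equal even though they are distinct objects.

```python
class MyClass:
    def my_func(self):
        return "called"


my_inst = MyClass()

first = my_inst.my_func      # one bound method object
second = my_inst.my_func     # another one, created by a second __get__ call

identical = first is second
equal = first == second
```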

staticmethod

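A function suited to staticmethod is one whose behavior has nothing to do with the instance or the class. staticmethod wraps the underlying function so that it can be called with the same signature whether it is accessed through an instance or through the class.

__init__ takes a function and uses functools.update_wrapper to copy the function's metadata onto the staticmethod instance.
Because staticmethod implements __get__, it is a non-data descriptor. Whether accessed through an instance or through the class, it returns self.f.
In __call__, self.f does not need to be bound to any object. It is simply called with the *args and **kwds that __call__ received, and the result is returned.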

import functools

class StaticMethod:
    "Emulate PyStaticMethod_Type() in Objects/funcobject.c"

    def __init__(self, f):
        self.f = f
        functools.update_wrapper(self, f)

    def __get__(self, obj, objtype=None):
        return self.f

    def __call__(self, *args, **kwds):
        return self.f(*args, **kwds)
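
A quick check against the built-in staticmethod, as a sketch with a hypothetical MathUtils class: the object stored in the class __dict__ is the staticmethod wrapper, while attribute access hands back the plain underlying function.

```python
class MathUtils:
    @staticmethod
    def add(a, b):
        return a + b


# The raw class attribute is the staticmethod wrapper itself.
raw = MathUtils.__dict__['add']

# Attribute access goes through __get__ and yields the wrapped function.
via_class = MathUtils.add
via_instance = MathUtils().add

total_from_class = MathUtils.add(1, 2)
total_from_instance = MathUtils().add(1, 2)
```

Nothing is bound in either case, so the call signature is the same from the class and from an instance.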

classmethod

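classmethod changes what a function in a class is bound to, from the default instance to the class.

classmethod's __init__ is the same as staticmethod's: it takes a function and uses functools.update_wrapper to copy the function's metadata onto the classmethod instance.
Because classmethod implements __get__, it is a non-data descriptor. When cls is None, the access came through obj, so type(obj) is used to obtain the class. A bound method is then returned through MethodType as before, except that this time self.f is bound to cls.
__get__ contains a code path that has been deprecated. It was meant to allow several decorators to be chained, but after real-world use the Python community found that it leads to many latent problems, and Raymond has also pointed out that allowing this behavior may have been a mistake.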

import functools

class ClassMethod:
    "Emulate PyClassMethod_Type() in Objects/funcobject.c"

    def __init__(self, f):
        self.f = f
        functools.update_wrapper(self, f)

    def __get__(self, obj, cls=None):
        if cls is None:
            cls = type(obj)
        if hasattr(type(self.f), '__get__'):
            # This code path was added in Python 3.9
            # and was deprecated in Python 3.11.
            return self.f.__get__(cls, cls)
        return MethodType(self.f, cls)
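
The binding target can be observed with the built-in classmethod. In this sketch (Base, Child, and create are hypothetical names) the bound method's __self__ is always a class, including when the access goes through an instance or a subclass.

```python
class Base:
    @classmethod
    def create(cls):
        # cls is whichever class the method was reached through.
        return cls()


class Child(Base):
    pass


bound_via_class = Base.create
bound_via_instance = Base().create
bound_via_subclass = Child.create

made_from_subclass = Child.create()
made_from_sub_instance = Child().create()
```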

__slots__

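Because the real __slots__ implementation relies on C structures and memory allocation, Raymond says we can only approximate it, using a _slotvalues list in place of the actual slot structure.

The __slots__ emulation is more involved and has five parts:

Create a Member class, a data descriptor that controls access to the attributes listed in slot_names.
Create a Type metaclass, which creates a class variable for each name listed in slot_names and sets its value to the corresponding Member instance.
Create an Object class, which sets up _slotvalues (emulating the memory allocated for __slots__) and raises AttributeError when an attribute not in slot_names is set or deleted.
Create a usable class H that uses Type as its metaclass and inherits from Object.
Create an instance h of H and try it out.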

1. Member

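Member is a data descriptor with __get__, __set__, and __delete__.

__init__ takes three arguments: the attribute's name in the class, the class name, and its index in _slotvalues.
__get__ again first checks whether obj is None. If it is, the access came through the class, and the Member instance itself is returned. Otherwise it uses self.offset as the index into obj._slotvalues. If the value found is the sentinel null, the slot was never assigned or has already been deleted, and AttributeError is raised. If that check passes, the value is returned.
__set__ stores value directly at position self.offset of obj._slotvalues.
__delete__ is similar to __get__. It uses self.offset as the index into obj._slotvalues. If the value found is the sentinel null, the slot was never assigned or has already been deleted, and AttributeError is raised. If that check passes, obj._slotvalues[self.offset] is reset to null.
__repr__ defines how a Member instance is displayed.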

null = object()

class Member:

    def __init__(self, name, clsname, offset):
        'Emulate PyMemberDef in Include/structmember.h'
        # Also see descr_new() in Objects/descrobject.c
        self.name = name
        self.clsname = clsname
        self.offset = offset

    def __get__(self, obj, objtype=None):
        'Emulate member_get() in Objects/descrobject.c'
        # Also see PyMember_GetOne() in Python/structmember.c
        if obj is None:
            return self
        value = obj._slotvalues[self.offset]
        if value is null:
            raise AttributeError(self.name)
        return value

    def __set__(self, obj, value):
        'Emulate member_set() in Objects/descrobject.c'
        obj._slotvalues[self.offset] = value

    def __delete__(self, obj):
        'Emulate member_delete() in Objects/descrobject.c'
        value = obj._slotvalues[self.offset]
        if value is null:
            raise AttributeError(self.name)
        obj._slotvalues[self.offset] = null

    def __repr__(self):
        'Emulate member_repr() in Objects/descrobject.c'
        return f'<Member {self.name!r} of {self.clsname!r}>'
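
For reference, the descriptor that real __slots__ creates shows the same get and delete behavior that Member emulates. The sketch below uses a hypothetical Point class and a small helper, raises_attribute_error, written only for this example.

```python
class Point:
    __slots__ = ('x', 'y')


def raises_attribute_error(action):
    """Return True if calling action() raises AttributeError."""
    try:
        action()
    except AttributeError:
        return True
    return False


member = Point.__dict__['x']     # descriptor created for the slot 'x'
p = Point()

# Reading a slot that was never assigned fails, like hitting the null sentinel.
unset_read_fails = raises_attribute_error(lambda: p.x)

p.x = 1
value_after_set = p.x

# Deleting works once; deleting an already empty slot fails.
del p.x
second_delete_fails = raises_attribute_error(lambda: delattr(p, 'x'))
```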

2. Type

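Type is a metaclass that inherits from type. For each name listed in slot_names it creates the corresponding Member instance and adds it to mapping, then calls type.__new__ to create cls. This amounts to creating a class variable for each name in slot_names and setting its value to the corresponding Member instance.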

class Type(type):
    'Simulate how the type metaclass adds member objects for slots'

    def __new__(mcls, clsname, bases, mapping, **kwargs):
        'Emulate type_new() in Objects/typeobject.c'
        # type_new() calls PyTypeReady() which calls add_methods()
        slot_names = mapping.get('slot_names', [])
        for offset, name in enumerate(slot_names):
            mapping[name] = Member(name, clsname, offset)
        return type.__new__(mcls, clsname, bases, mapping, **kwargs)

3. Object

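The Object class is meant to be inherited by the classes that follow.

__new__ first creates the instance with super().__new__(cls). It then checks whether cls has slot_names; if so, it builds a list of length len(slot_names) with every element set to null. It then uses object.__setattr__ to store that list as an instance variable named _slotvalues and returns the instance. Note that using object.__setattr__ here is necessary (Note 2).
__setattr__ checks whether cls has slot_names. If it does and the name is not in cls.slot_names, AttributeError is raised. If the check passes, it delegates to super().__setattr__.
__delattr__ follows the same logic as __setattr__: it raises AttributeError if the check fails and otherwise delegates to super().__delattr__.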

class Object:
    'Simulate how object.__new__() allocates memory for __slots__'

    def __new__(cls, *args, **kwargs):
        'Emulate object_new() in Objects/typeobject.c'
        inst = super().__new__(cls)
        if hasattr(cls, 'slot_names'):
            empty_slots = [null] * len(cls.slot_names)
            object.__setattr__(inst, '_slotvalues', empty_slots)
        return inst

    def __setattr__(self, name, value):
        'Emulate _PyObject_GenericSetAttrWithDict() Objects/object.c'
        cls = type(self)
        if hasattr(cls, 'slot_names') and name not in cls.slot_names:
            raise AttributeError(
                f'{cls.__name__!r} object has no attribute {name!r}'
            )
        super().__setattr__(name, value)

    def __delattr__(self, name):
        'Emulate _PyObject_GenericSetAttrWithDict() Objects/object.c'
        cls = type(self)
        if hasattr(cls, 'slot_names') and name not in cls.slot_names:
            raise AttributeError(
                f'{cls.__name__!r} object has no attribute {name!r}'
            )
        super().__delattr__(name)

4. H(class)

The H class uses Type as its metaclass and inherits from Object. slot_names plays the role of __slots__: we put the allowed instance variable names into the slot_names list.

class H(Object, metaclass=Type):
    'Instance variables stored in slots'

    slot_names = ['x', 'y']

    def __init__(self, x, y):
        self.x = x
        self.y = y

Inspecting H.__dict__ shows that slot_names, x, and y are all in place.

>>> from pprint import pp
>>> pp(dict(vars(H)))
{'__module__': '__main__',
 '__doc__': 'Instance variables stored in slots',
 'slot_names': ['x', 'y'],
 '__init__': <function H.__init__ at 0x7fb5d302f9d0>,
 'x': <Member 'x' of 'H'>,
 'y': <Member 'y' of 'H'>}

5. h(instance)

The instance h works as expected; the slot values are stored under _slotvalues in the instance's __dict__.

>>> h = H(10, 20)
>>> vars(h)
{'_slotvalues': [10, 20]}
>>> h.x = 55
>>> vars(h)
{'_slotvalues': [55, 20]}

Using a name that is not in slot_names raises AttributeError, similar to the effect of __slots__.

>>> h.xz
Traceback (most recent call last):
    ...
AttributeError: 'H' object has no attribute 'xz'
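
Reading a missing attribute fails on any class, so the slot-specific part is that assignment to an unlisted name is rejected too. The sketch below shows this with real __slots__ on a class shaped like H; it also shows one difference from the emulation, namely that a real slotted instance has no __dict__ at all.

```python
class H:
    __slots__ = ('x', 'y')

    def __init__(self, x, y):
        self.x = x
        self.y = y


h = H(10, 20)
h.x = 55                     # listed in __slots__, so this is allowed

try:
    h.xz = 1                 # not listed in __slots__
    assignment_rejected = False
except AttributeError:
    assignment_rejected = True

# The emulation keeps _slotvalues in __dict__; real slots need no __dict__.
has_instance_dict = hasattr(h, '__dict__')
```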

Rewriting __slots__ with __init_subclass__

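Metaclasses are very powerful, so we think very carefully before deciding that a problem really requires one. Decorating the class with a decorator is a common way to avoid a metaclass. Since Python 3.6 added __init_subclass__, the need to write a metaclass has dropped considerably.

Below we try to rework the __slots__ emulation above using __init_subclass__.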
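
As a rough sketch of the class decorator route, the example below attaches one object per name after the class has been created. Slot is a hypothetical stand-in for a descriptor such as Member, and with_slots is an invented decorator name; neither comes from the guide.

```python
class Slot:
    """Hypothetical stand-in for a descriptor; it only stores its metadata."""

    def __init__(self, name, clsname, offset):
        self.name = name
        self.clsname = clsname
        self.offset = offset


def with_slots(cls):
    """Class decorator: add one Slot per name listed in cls.slot_names."""
    for offset, name in enumerate(cls.__dict__.get('slot_names', [])):
        setattr(cls, name, Slot(name, cls.__name__, offset))
    return cls               # a class decorator must return the class


@with_slots
class H:
    slot_names = ['x', 'y']
```

The drawback of this approach is that every class has to be decorated explicitly; subclasses do not pick up the behavior on their own.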

MyObject

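MyObject inherits from Object and implements __init_subclass__.

In __init_subclass__:

First call super().__init_subclass__() so that any class in the MRO that implements __init_subclass__ is sure to be called.
The remaining steps are similar to Type.__new__, except that here we mutate the class after it has been created, whereas Type.__new__ puts these entries into mapping before the class is created.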

# 01
...

class MyObject(Object):
    def __init_subclass__(cls):
        'Add member objects for slots'
        super().__init_subclass__()
        slot_names = cls.__dict__.get('slot_names', [])
        clsname = cls.__name__
        for offset, name in enumerate(slot_names):
            setattr(cls, name, Member(name, clsname, offset))
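
The difference in timing between a metaclass __new__ and __init_subclass__ can be made visible by logging both. This is an illustrative sketch; Meta, Base, Child, and the events list are hypothetical names.

```python
events = []


class Meta(type):
    def __new__(mcls, name, bases, mapping, **kwargs):
        # Runs before the class object exists; mapping can still be edited.
        events.append(f"Meta.__new__ start: {name}")
        cls = super().__new__(mcls, name, bases, mapping, **kwargs)
        events.append(f"Meta.__new__ end: {name}")
        return cls


class Base(metaclass=Meta):
    def __init_subclass__(cls, **kwargs):
        # Runs for each subclass, once the subclass object has been built.
        super().__init_subclass__(**kwargs)
        events.append(f"__init_subclass__: {cls.__name__}")


class Child(Base):
    pass


for line in events:
    print(line)
```

The hook is not called for Base itself, only for Child, and it runs inside type.__new__ after the class object has been created. Its return value is not used.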

H(class)

H now only needs to inherit from MyObject; no custom metaclass is required.

# 01
...

class H(MyObject):
    'Instance variables stored in slots'

    slot_names = ['x', 'y']

    def __init__(self, x, y):
        self.x = x
        self.y = y

Inspecting H.__dict__ shows that slot_names, x, and y are set up just as before.

>>> from pprint import pp
>>> pp(dict(vars(H)))
{'__module__': '__main__',
 '__doc__': 'Instance variables stored in slots',
 'slot_names': ['x', 'y'],
 '__init__': <function H.__init__ at 0x00000132D34D9300>,
 'x': <Member 'x' of 'H'>,
 'y': <Member 'y' of 'H'>}

h(instance)

The instance h works just as before, and _slotvalues is set correctly.

>>> h = H(10, 20)
>>> vars(h)
{'_slotvalues': [10, 20]}
>>> h.x = 55
>>> vars(h)
{'_slotvalues': [55, 20]}

Using a name that is not in slot_names again raises AttributeError.

>>> h.xz
Traceback (most recent call last):
    ...
AttributeError: 'H' object has no attribute 'xz'. Did you mean: 'x'?

Notes

Note 1: See the Python docs' description of __set_name__.

Note 2: We cannot use inst._slotvalues = empty_slots or setattr(inst, '_slotvalues', empty_slots) here, because both go through the instance's __setattr__. Object implements its own __setattr__, whose check raises AttributeError because _slotvalues is indeed not in cls.slot_names. Nor can we use super().__setattr__('_slotvalues', empty_slots): we are inside __new__, so this is equivalent to super(Object, cls).__setattr__('_slotvalues', empty_slots), which is not the behavior we want. If super() must be used, super(Object, inst).__setattr__('_slotvalues', empty_slots) is an option, but that is rather roundabout, and calling object.__setattr__ directly is probably simpler.
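
The situation in Note 2 can be reproduced in a few lines. The Guarded class, its allowed tuple, and the rejected helper are hypothetical and stand in for Object and slot_names.

```python
class Guarded:
    allowed = ('x',)

    def __new__(cls, *args, **kwargs):
        inst = super().__new__(cls)
        # Bypass Guarded.__setattr__, which would reject '_storage'.
        object.__setattr__(inst, '_storage', [])
        return inst

    def __setattr__(self, name, value):
        if name not in type(self).allowed:
            raise AttributeError(
                f'{type(self).__name__!r} object has no attribute {name!r}'
            )
        super().__setattr__(name, value)


def rejected(action):
    """Return True if calling action() raises AttributeError."""
    try:
        action()
    except AttributeError:
        return True
    return False


g = Guarded()
g.x = 1                                      # allowed name

# setattr() goes through Guarded.__setattr__ and is refused.
plain_assignment_rejected = rejected(lambda: setattr(g, '_storage', []))

# The two-argument super() form with the instance also skips the check.
super(Guarded, g).__setattr__('_storage', ['via super'])
```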