Qahirah <https://github.com/ldo/qahirah> is a Pythonic API binding for the 
Cairo 2D graphics library. When designing it, I tried to imagine how the 
graphics API would look if it had been created for Python in the first place, 
rather than for C.

One of the important decisions I made was to implement the binding in pure 
Python, using ctypes <https://docs.python.org/3/library/ctypes.html>. ctypes is 
a wonderful library, and I recommend it to anyone else looking to create a 
Python API binding.

There was already a commonly-used Python binding for Cairo, called PyCairo, 
written as an extension module in C. But it was missing important features, and 
had been neglected for some time. I did look at filling in some of the gaps 
<https://github.com/ldo/pycairo>, before realizing that it would be easier and 
quicker to do it over again in Python.

A common principle in most ctypes bindings is: you encapsulate a lower-level 
API object in a Python object. For example, in Qahirah, a Cairo drawing context 
of type “cairo_t” is wrapped in a Python object of type “Context”. All such 
wrapper objects have an internal field “_cairobj” which contains the pointer to 
the lower-level object.

But sometimes you need to map back the other way. For example, a Context has a 
“source” property, which is the current Pattern to use when setting pixel 
values. (A Pattern can be as simple as a single solid colour, or something more 
complex like a gradient or a repeating image.) In particular, I want the 
following result (assuming “ctx” is a Context and “pat” is a pattern):

    >>> ctx.source = pat
      # calls cairo_set_source
      # <https://www.cairographics.org/manual/cairo-cairo-t.html#cairo-set-source>
    >>> ctx.source is pat
    True

One way to achieve this would be by cheating: each such settable property would 
save its value in an internal field, and the property getter would simply 
return this field value, rather than calling the underlying Cairo getter. This 
does run the risk of the Python wrapper getting out of sync with the underlying 
API object...
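
As a rough illustration of that risk, here is a small simulation. Everything in 
it is made up for the example: “FakeContext” stands in for the low-level 
object, and I am assuming its state can change behind the wrapper’s back, as 
would happen with a save/restore of the graphics state.

```python
class FakeContext:
    "hypothetical stand-in for the low-level context, with a stack of saved sources."

    def __init__(self):
        self.source = "default"
        self._saved = []

    def save(self):
        self._saved.append(self.source)

    def restore(self):
        self.source = self._saved.pop()


class CachingWrapper:
    "wrapper that remembers the last source it set instead of asking the library."

    def __init__(self):
        self._ctx = FakeContext()
        self._source = "default"

    @property
    def source(self):
        return self._source  # cached value, never re-read from self._ctx

    @source.setter
    def source(self, value):
        self._ctx.source = value
        self._source = value

    def save(self):
        self._ctx.save()

    def restore(self):
        self._ctx.restore()


w = CachingWrapper()
w.save()
w.source = "red"
w.restore()           # the low-level source reverts to "default"
print(w.source)       # red (stale cached value)
print(w._ctx.source)  # default
```

The wrapper still reports “red” even though the underlying object has gone back 
to its earlier source.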

The better way is for the class to maintain a dictionary mapping low-level 
pointers back to Python objects. The property getter would, in this case, call 
cairo_get_source 
<https://www.cairographics.org/manual/cairo-cairo-t.html#cairo-get-source> as 
expected, then look up the result in this dictionary, and return the 
corresponding object.

Entries in this dictionary would automatically be made every time a new object 
was created. To arrange this, the class must define a __new__ method 
<https://docs.python.org/3/reference/datamodel.html#object.__new__> to take 
control of object creation. Note the following from the docs:

    If __new__() returns an instance of cls, then the new instance’s
    __init__() method will be invoked.

Since in this case __new__ might be returning either a new object or an 
existing one, you only want initialization to be done in the former case, not 
the latter one. But the __new__ method is the only place which knows which is 
the case; therefore it has to handle all the initialization, and you cannot 
have an __init__ method at all.
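
To see why an __init__ method gets in the way, here is a minimal sketch with a 
made-up class (“Cached”, “_cache” and “init_calls” are illustrative names 
only, not part of Qahirah). Because __new__ returns an instance of the class in 
both cases, __init__ runs on every call, including when an existing object is 
handed back:

```python
class Cached:
    "hypothetical example class, not part of Qahirah."

    _cache = {}     # maps key -> existing instance
    init_calls = 0  # counts how many times __init__ has run

    def __new__(cls, key):
        self = cls._cache.get(key)
        if self is None:
            self = super().__new__(cls)
            cls._cache[key] = self
        return self

    def __init__(self, key):
        # runs on EVERY Cached(key) call, even when __new__ returned a cached instance
        type(self).init_calls += 1
        self.key = key


a = Cached(1)
b = Cached(1)
print(a is b)             # True
print(Cached.init_calls)  # 2
```

Any setup placed in __init__ would therefore be repeated on an object that was 
already fully initialized.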

However, the problem with such a dictionary is that it leads to memory leaks: 
without some way of clearing entries from it, objects will never get deleted, 
so if the caller creates and forgets lots of them, memory usage will grow and 
grow. Clearing out unneeded objects is not a chore that the caller should have 
to manage.

Luckily, Python, in common with most sanely-defined dynamic languages, supports 
something called “weak” references 
<https://docs.python.org/3/library/weakref.html>. These are object references 
that do not count towards the usual reference management, so they do not 
prevent the object from being deleted. In particular, that module defines a 
handy class called a “WeakValueDictionary”, where the values are weak object 
references. This means I can maintain my mapping from low-level API references 
to Python objects, and when the caller forgets about the latter, they will also 
disappear automatically from my mapping.
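
A minimal sketch of that behaviour, using a made-up “Wrapper” class and an 
arbitrary integer in place of a real low-level pointer (both are assumptions 
for the example):

```python
import gc
from weakref import WeakValueDictionary


class Wrapper:
    "hypothetical stand-in for a wrapper class."

    def __init__(self, address):
        self.address = address


instances = WeakValueDictionary()

obj = Wrapper(0x1000)
instances[obj.address] = obj
print(len(instances))  # 1

del obj       # drop the only strong reference
gc.collect()  # not needed on CPython, but harmless on other implementations
print(len(instances))  # 0
```

The entry vanishes without anybody having to remove it explicitly.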

One minor caveat is that weak references need an attribute called 
“__weakref__” on your objects. If you define __slots__ (as I have 
done here), you will need to include “__weakref__” among the slots.
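
A short sketch of the difference, with two made-up classes (“NoWeak” and 
“WithWeak” are illustrative names only):

```python
import weakref


class NoWeak:
    __slots__ = ("value",)  # no room for weak references


class WithWeak:
    __slots__ = ("value", "__weakref__")  # explicitly allows weak references


try:
    weakref.ref(NoWeak())
    no_weak_ok = True
except TypeError:
    # instances of NoWeak cannot be weakly referenced
    no_weak_ok = False

w = WithWeak()
r = weakref.ref(w)
with_weak_ok = r() is w

print(no_weak_ok, with_weak_ok)  # False True
```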

So, with all this, the basic object-creation mechanism in the Qahirah wrapper 
classes looks like

    _instances = WeakValueDictionary()
      # class variable mapping low-level API references back to Python objects

    def __new__(celf, _cairobj) :
        self = celf._instances.get(_cairobj)
        if self is None :
            self = super().__new__(celf)
            self._cairobj = _cairobj
            ... other internal setup ...
            celf._instances[_cairobj] = self
        else :
            cairo.cairo_xxx_destroy(self._cairobj)
              # lose extra reference created by caller
        #end if
        return \
            self
    #end __new__

Then (continuing the same example) the Context method that calls 
cairo_get_source and returns a Pattern looks like this:

    @property
    def source(self) :
        "the current source Pattern."
        return \
            Pattern(cairo.cairo_pattern_reference(cairo.cairo_get_source(self._cairobj)))
    #end source

Note that Cairo maintains its own reference counting of objects. Naturally I 
rely on Python’s reference counting for my Python objects. But I still need to 
make sure Cairo objects do not leak, or get prematurely disposed. This is why 
the getter always calls the cairo_xxx_reference function to increment the 
reference count before instantiating the class: that way, the __new__ method 
doesn’t have to distinguish between genuinely new API object creation and 
retrieving a previously-created API object. In both situations, it must either 
save the reference it is given or (if a corresponding Python object already 
exists) get rid of it. Saving may be the only possible case when creating a new 
API object, but __new__ doesn’t have to care.
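
The reference bookkeeping can be tried out without Cairo at all. The following 
is only a simulation under stated assumptions: “FakeLib” is a made-up stand-in 
for the C library that hands out integer handles with reference counts, and 
this “Pattern” is a cut-down imitation of the mechanism, not the real Qahirah 
class. It also leaves out what happens when the wrapper itself is deleted.

```python
from weakref import WeakValueDictionary


class FakeLib:
    "hypothetical stand-in for the C library: integer handles with reference counts."

    def __init__(self):
        self.refcounts = {}
        self._next = 1

    def create(self):
        handle = self._next
        self._next += 1
        self.refcounts[handle] = 1  # the creator owns one reference
        return handle

    def reference(self, handle):
        self.refcounts[handle] += 1
        return handle

    def destroy(self, handle):
        self.refcounts[handle] -= 1
        if self.refcounts[handle] == 0:
            del self.refcounts[handle]


lib = FakeLib()


class Pattern:
    "cut-down imitation of a wrapper class, for illustration only."

    __slots__ = ("_handle", "__weakref__")
    _instances = WeakValueDictionary()

    def __new__(cls, handle):
        self = cls._instances.get(handle)
        if self is None:
            self = super().__new__(cls)
            self._handle = handle  # take over the caller's reference
            cls._instances[handle] = self
        else:
            lib.destroy(handle)    # drop the extra reference made by the caller
        return self


pat = Pattern(lib.create())
same = Pattern(lib.reference(pat._handle))  # what a getter would do
print(same is pat)                  # True
print(lib.refcounts[pat._handle])   # 1
```

However many times the getter path runs, the wrapper ends up holding exactly 
one reference to the low-level object.
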
-- 
https://mail.python.org/mailman/listinfo/python-list
