Cython – Tips and Tricks

  • Don’t overthink it – your .pyx file is translated to a .c file, which is then compiled into an extension module that Python can import. This can be confusing at first, starting with the question of which language you are actually coding in. Essentially you are coding in Python, with some extensions and limitations that let you mix in C-like constructs and code. In the end, all of the Cython code is converted to C/C++.
  • cdef – learn it, love it, use it. This keyword is used in several contexts. It declares variables of primitive C types that have no equivalent Python type, or that you want to use without creating Python wrappers. It also declares static types for variables and parameters, which lets Cython optimize and speed up the code. When you use it to define a class, it adds some restrictions but produces an extension type that is more memory efficient and faster than a regular Python class. When you use it for functions (methods or free functions), it defines native-only functions that have lower call overhead than a Python function.
  • A cdef function is accessible only from within Cython modules and from native code. It cannot be called from Python code.
  • cpdef can be used to generate functions that are accessible from both native and Python code.
  • Use __cinit__ and __dealloc__ for native resource management – native libraries often return opaque pointers/handles that are managed with custom alloc/free functions. A great way to handle these without leaks, while also taking care of allocation failures, is to create a cdef wrapper class with the resource as a cdef member variable. Do the allocation and the allocation-failure handling in the special method __cinit__, and free the resource in __dealloc__.
  • You can use simple slicing with a length to convert a raw, native chunk of memory into a Python bytes object – for example, if you have cdef unsigned char* buf and cdef size_t sz, which together describe a native byte buffer and its size, and you need to pass it to a Python function foo that takes bytes, simply call foo(buf[:sz]).
  • Avoid buffer copying by using typed memory views – the approach above still copies the native buffer into a Python bytes object during the call. If you know that you do not need the copy, you can create a typed memory view with cdef unsigned char[:] mem_view = <unsigned char [:sz]>buf and then call foo(mem_view). The view does not own the memory, so the native buffer must stay alive for as long as the view is in use.
  • When wrapping a C library, you only have to declare the parts you actually use, and that includes struct members. This keeps .pxd files concise and simple and speeds up development.
  • You cannot use the * operator for pointer dereferencing – use array syntax instead. So instead of dereferencing unsigned int* sz with *sz = 0, write sz[0] = 0.
  • Some good links on Cython
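
Several of the tips above (cdef vs. cpdef, __cinit__/__dealloc__, slicing to bytes, typed memory views, and pointer dereferencing) can be sketched together in one small .pyx file. This is only a minimal sketch under stated assumptions: the class NativeBuffer, its method names, and the use of malloc/free as a stand-in for a native library's custom alloc/free functions are all made up for illustration.

```cython
# native_buffer.pyx -- hypothetical module, for illustration only
from libc.stdlib cimport malloc, free
from libc.string cimport memset


cdef class NativeBuffer:
    # cdef members live in the C struct of the object and are not
    # visible from Python code.
    cdef unsigned char* buf
    cdef size_t sz

    def __cinit__(self, size_t sz):
        # Allocate the native resource and handle allocation failure here.
        # malloc() stands in for a library-specific alloc function.
        if sz == 0:
            raise ValueError("size must be positive")
        self.buf = <unsigned char*>malloc(sz)
        if self.buf == NULL:
            raise MemoryError()
        memset(self.buf, 0, sz)
        self.sz = sz

    def __dealloc__(self):
        # buf starts out as NULL, and free(NULL) is a no-op, so this is
        # safe even when __cinit__ raised before allocating.
        free(self.buf)

    cdef void get_size(self, size_t* out):
        # Native-only method. Pointers are dereferenced with array
        # syntax, because "*out = ..." is not valid Cython.
        out[0] = self.sz

    cpdef bytes to_bytes(self):
        # Callable from both Cython and Python. Slicing with a length
        # copies the native buffer into a new Python bytes object.
        return self.buf[:self.sz]

    def view(self):
        # Typed memory view over the same memory, without a copy.
        # The view does not own the memory: it must not be used after
        # this NativeBuffer object has been deallocated.
        cdef Py_ssize_t n = <Py_ssize_t>self.sz
        cdef unsigned char[:] mem_view = <unsigned char[:n]>self.buf
        return mem_view

    def size(self):
        # Python-visible wrapper around the native-only get_size().
        cdef size_t n = 0
        self.get_size(&n)
        return n
```

Once this is compiled, Python code should be able to call NativeBuffer(16).to_bytes(), .view() and .size(), while get_size stays reachable only from Cython and native code.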
