| .. highlight:: c |
| |
| .. _extension-modules: |
| |
| Defining extension modules |
| -------------------------- |
| |
| A C extension for CPython is a shared library (for example, a ``.so`` file |
| on Linux or a ``.pyd`` DLL on Windows) that is loadable into the Python process |
| (for example, because it is compiled with compatible compiler settings) and that |
| exports an :ref:`initialization function <extension-export-hook>`. |
| |
| To be importable by default (that is, by |
| :py:class:`importlib.machinery.ExtensionFileLoader`), |
| the shared library must be available on :py:attr:`sys.path`, |
| and must be named after the module name plus an extension listed in |
| :py:attr:`importlib.machinery.EXTENSION_SUFFIXES`. |
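
As a rough illustration of the naming rule, the following Python sketch lists the file names that would be accepted for a top-level module on the current build. The helper ``candidate_filenames`` and the module name ``spam`` are hypothetical; the actual suffixes vary by platform and Python version.

```python
import importlib.machinery


def candidate_filenames(module_name):
    """Return the file names accepted for a top-level *module_name*.

    Hypothetical helper for illustration; it is not part of importlib.
    """
    return [module_name + suffix
            for suffix in importlib.machinery.EXTENSION_SUFFIXES]


# Print every file name under which an extension called "spam" could be found.
for filename in candidate_filenames("spam"):
    print(filename)
```

Any one of the printed names, placed in a directory on ``sys.path``, should make ``import spam`` find the library.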
| |
| .. note:: |
| |
| Building, packaging and distributing extension modules is best done with |
| third-party tools, and is out of scope of this document. |
| One suitable tool is Setuptools, whose documentation can be found at |
| https://setuptools.pypa.io/en/latest/setuptools.html. |
| |
| Normally, the initialization function returns a module definition initialized |
| using :c:func:`PyModuleDef_Init`. |
| This allows splitting the creation process into several phases: |
| |
| - Before any substantial code is executed, Python can determine which |
| capabilities the module supports, and it can adjust the environment or |
| refuse loading an incompatible extension. |
| - By default, Python itself creates the module object -- that is, it does |
| the equivalent of :py:meth:`object.__new__` for classes. |
| It also sets initial attributes like :attr:`~module.__package__` and |
| :attr:`~module.__loader__`. |
| - Afterwards, the module object is initialized using extension-specific |
| code -- the equivalent of :py:meth:`~object.__init__` on classes. |
| |
| This is called *multi-phase initialization* to distinguish it from the legacy |
| (but still supported) *single-phase initialization* scheme, |
| where the initialization function returns a fully constructed module. |
| See the :ref:`section on single-phase initialization <single-phase-initialization>` |
| below for details. |
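
The split between creating and initializing a module has a Python-level counterpart in the loader protocol, which may make the phases easier to picture. The sketch below is only an analogy written with ``importlib``, not the C API itself; ``DemoLoader``, the module name ``demo`` and the attribute ``answer`` are hypothetical.

```python
import importlib.abc
import importlib.util


class DemoLoader(importlib.abc.Loader):
    """Hypothetical loader that mimics the create/execute split."""

    def create_module(self, spec):
        # Returning None asks the import system to create a default
        # module object, similar to the default for extension modules.
        return None

    def exec_module(self, module):
        # Runs after the module object exists and its import-related
        # attributes are set; module-specific setup belongs here.
        module.answer = 42


loader = DemoLoader()
spec = importlib.util.spec_from_loader("demo", loader)
module = importlib.util.module_from_spec(spec)  # creation phase
spec.loader.exec_module(module)                 # execution phase
print(module.__name__, module.answer)
```

Here ``module_from_spec`` plays the role of the ``object.__new__`` step, and ``exec_module`` plays the role of ``__init__``.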
| |
| .. versionchanged:: 3.5 |
| |
| Added support for multi-phase initialization (:pep:`489`). |
| |
| |
| Multiple module instances |
| ......................... |
| |
| By default, extension modules are not singletons. |
| For example, if the :py:attr:`sys.modules` entry is removed and the module |
| is re-imported, a new module object is created, and typically populated with |
| fresh method and type objects. |
| The old module is subject to normal garbage collection. |
| This mirrors the behavior of pure-Python modules. |
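
The pure-Python behavior that extension modules mirror can be observed with a short experiment. This is a minimal sketch: the module ``reimport_demo`` and its function ``f`` are hypothetical and are written to a temporary directory only for the demonstration.

```python
import importlib
import pathlib
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create a throwaway pure-Python module (hypothetical name).
    pathlib.Path(tmp, "reimport_demo.py").write_text("def f():\n    pass\n")
    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        first = importlib.import_module("reimport_demo")
        # Drop the cached entry so the next import builds a new module.
        del sys.modules["reimport_demo"]
        second = importlib.import_module("reimport_demo")
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("reimport_demo", None)

print(first is second)      # False: two distinct module objects
print(first.f is second.f)  # False: each has its own function object
```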
| |
| Additional module instances may be created in |
| :ref:`sub-interpreters <sub-interpreter-support>` |
| or after Python runtime reinitialization |
| (:c:func:`Py_Finalize` and :c:func:`Py_Initialize`). |
| In these cases, sharing Python objects between module instances would likely |
| cause crashes or undefined behavior. |
| |
| To avoid such issues, each instance of an extension module should |
| be *isolated*: changes to one instance should not implicitly affect the others, |
| and all state owned by the module, including references to Python objects, |
| should be specific to a particular module instance. |
| See :ref:`isolating-extensions-howto` for more details and a practical guide. |
| |
| A simpler way to avoid these issues is |
| :ref:`raising an error on repeated initialization <isolating-extensions-optout>`. |
| |
| All modules are expected to support |
| :ref:`sub-interpreters <sub-interpreter-support>`, or otherwise explicitly |
| signal a lack of support. |
| This is usually achieved by isolation or blocking repeated initialization, |
| as above. |
| A module may also be limited to the main interpreter using |
| the :c:data:`Py_mod_multiple_interpreters` slot. |
| |
| |
| .. _extension-export-hook: |
| |
| Initialization function |
| ....................... |
| |
| The initialization function defined by an extension module has the |
| following signature: |
| |
| .. c:function:: PyObject* PyInit_modulename(void) |
| |
| For modules with ASCII-only names, the function must be named |
| :samp:`PyInit_{<name>}`, with ``<name>`` replaced by the name of the module. |
| When using :ref:`multi-phase-initialization`, non-ASCII module names |
| are allowed. In this case, the initialization function name is |
| :samp:`PyInitU_{<name>}`, with ``<name>`` encoded using Python's |
| *punycode* encoding with hyphens replaced by underscores. In Python: |
| |
| .. code-block:: python |
| |
|     def initfunc_name(name): |
|         try: |
|             suffix = b'_' + name.encode('ascii') |
|         except UnicodeEncodeError: |
|             suffix = b'U_' + name.encode('punycode').replace(b'-', b'_') |
|         return b'PyInit' + suffix |
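
For instance, the naming function above can be applied to an ASCII and a non-ASCII name. The function is repeated here only so that the snippet runs on its own; the module names ``spam`` and ``café`` are hypothetical.

```python
def initfunc_name(name):
    # Same logic as the function shown above, repeated so this runs alone.
    try:
        suffix = b'_' + name.encode('ascii')
    except UnicodeEncodeError:
        suffix = b'U_' + name.encode('punycode').replace(b'-', b'_')
    return b'PyInit' + suffix


print(initfunc_name('spam'))   # b'PyInit_spam'
print(initfunc_name('café'))   # a name that starts with b'PyInitU_'
```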
| |
| It is recommended to define the initialization function using a helper macro: |
| |
| .. c:macro:: PyMODINIT_FUNC |
| |
| Declare an extension module initialization function. |
| This macro: |
| |
| * specifies the :c:expr:`PyObject*` return type, |
| * adds any special linkage declarations required by the platform, and |
| * for C++, declares the function as ``extern "C"``. |
| |
| For example, a module called ``spam`` would be defined like this:: |
| |
|     static struct PyModuleDef spam_module = { |
|         .m_base = PyModuleDef_HEAD_INIT, |
|         .m_name = "spam", |
|         ... |
|     }; |
| |
|     PyMODINIT_FUNC |
|     PyInit_spam(void) |
|     { |
|         return PyModuleDef_Init(&spam_module); |
|     } |
| |
| It is possible to export multiple modules from a single shared library by |
| defining multiple initialization functions. However, importing them requires |
| using symbolic links or a custom importer, because by default only the |
| function corresponding to the filename is found. |
| See the `Multiple modules in one library <https://peps.python.org/pep-0489/#multiple-modules-in-one-library>`__ |
| section in :pep:`489` for details. |
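
One way to import a module whose name does not match the file name is to build the loader by hand, along the lines suggested in :pep:`489`. The following is a minimal sketch, not a complete importer: the helper ``load_from_library`` is hypothetical, and it does not register the module in ``sys.modules``.

```python
import importlib.machinery
import importlib.util


def load_from_library(name, path):
    """Load the extension module *name* from the shared library at *path*.

    Hypothetical helper; the file name does not need to match *name*.
    """
    loader = importlib.machinery.ExtensionFileLoader(name, path)
    spec = importlib.util.spec_from_loader(name, loader)
    # Creating the module looks up the initialization function for *name*.
    module = importlib.util.module_from_spec(spec)
    loader.exec_module(module)
    return module
```

Assuming a library that exports both ``PyInit_foo`` and ``PyInit_bar``, a call such as ``load_from_library("bar", path_to_library)`` would be expected to return the second module.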
| |
| The initialization function is typically the only non-\ ``static`` |
| item defined in the module's C source. |
| |
| |
| .. _multi-phase-initialization: |
| |
| Multi-phase initialization |
| .......................... |
| |
| Normally, the :ref:`initialization function <extension-export-hook>` |
| (``PyInit_modulename``) returns a :c:type:`PyModuleDef` instance with |
| non-``NULL`` :c:member:`~PyModuleDef.m_slots`. |
| Before it is returned, the ``PyModuleDef`` instance must be initialized |
| using the following function: |
| |
| |
| .. c:function:: PyObject* PyModuleDef_Init(PyModuleDef *def) |
| |
| Ensure a module definition is a properly initialized Python object that |
| correctly reports its type and reference count. |
| |
| Return *def* cast to ``PyObject*``, or ``NULL`` if an error occurred. |
| |
| Calling this function is required for :ref:`multi-phase-initialization`. |
| It should not be used in other contexts. |
| |
| Note that Python assumes that ``PyModuleDef`` structures are statically |
| allocated. |
| This function may return either a new reference or a borrowed one; |
| this reference must not be released. |
| |
| .. versionadded:: 3.5 |
| |
| |
| .. _single-phase-initialization: |
| |
| Legacy single-phase initialization |
| .................................. |
| |
| .. attention:: |
| Single-phase initialization is a legacy mechanism to initialize extension |
| modules, with known drawbacks and design flaws. Extension module authors |
| are encouraged to use multi-phase initialization instead. |
| |
| In single-phase initialization, the |
| :ref:`initialization function <extension-export-hook>` (``PyInit_modulename``) |
| should create, populate and return a module object. |
| This is typically done using :c:func:`PyModule_Create` and functions like |
| :c:func:`PyModule_AddObjectRef`. |
| |
| Single-phase initialization differs from the :ref:`default <multi-phase-initialization>` |
| in the following ways: |
| |
| * Single-phase modules are, or rather *contain*, “singletons”. |
| |
| When the module is first initialized, Python saves the contents of |
| the module's ``__dict__`` (that is, typically, the module's functions and |
| types). |
| |
| For subsequent imports, Python does not call the initialization function |
| again. |
| Instead, it creates a new module object with a new ``__dict__``, and copies |
| the saved contents to it. |
| For example, given a single-phase module ``_testsinglephase`` |
| [#testsinglephase]_ that defines a function ``sum`` and an exception class |
| ``error``: |
| |
| .. code-block:: python |
| |
| >>> import sys |
| >>> import _testsinglephase as one |
| >>> del sys.modules['_testsinglephase'] |
| >>> import _testsinglephase as two |
| >>> one is two |
| False |
| >>> one.__dict__ is two.__dict__ |
| False |
| >>> one.sum is two.sum |
| True |
| >>> one.error is two.error |
| True |
| |
| The exact behavior should be considered a CPython implementation detail. |
| |
| * To work around the fact that ``PyInit_modulename`` does not take a *spec* |
| argument, some state of the import machinery is saved and applied to the |
| first suitable module created during the ``PyInit_modulename`` call. |
| Specifically, when a sub-module is imported, this mechanism prepends the |
| parent package name to the name of the module. |
| |
| A single-phase ``PyInit_modulename`` function should create “its” module |
| object as soon as possible, before any other module objects can be created. |
| |
| * Non-ASCII module names (``PyInitU_modulename``) are not supported. |
| |
| * Single-phase modules support module lookup functions like |
| :c:func:`PyState_FindModule`. |
| |
| .. [#testsinglephase] ``_testsinglephase`` is an internal module used |
| in CPython's self-test suite; your installation may or may not |
| include it. |