aboutsummaryrefslogtreecommitdiff
path: root/docs/userguide/package_discovery.rst
blob: ee8e98365985bb61d344052ec1c378dfde19508f (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
.. _`package_discovery`:

========================================
Package Discovery and Namespace Package
========================================

.. note::
    a full specification for the keyword supplied to ``setup.cfg`` or
    ``setup.py`` can be found at :doc:`keywords reference <keywords>`

.. note::
    the examples provided here are only to demonstrate the functionality
    introduced. More metadata and options arguments need to be supplied
    if you want to replicate them on your system. If you are completely
    new to setuptools, the :doc:`quickstart section <quickstart>` is a good
    place to start.

``Setuptools`` provide powerful tools to handle package discovery, including
support for namespace package.

Normally, you would specify the package to be included manually in the following manner:

.. tab:: setup.cfg

    .. code-block:: ini

        [options]
        #...
        packages =
            mypkg1
            mypkg2

.. tab:: setup.py

    .. code-block:: python

        setup(
            # ...
            packages=['mypkg1', 'mypkg2']
        )

.. tab:: pyproject.toml (**EXPERIMENTAL**) [#experimental]_

    .. code-block:: toml

        # ...
        [tool.setuptools]
        packages = ["mypkg1", "mypkg2"]
        # ...


If your packages are not in the root of the repository you also need to
configure ``package_dir``:

.. tab:: setup.cfg

    .. code-block:: ini

        [options]
        # ...
        package_dir =
            = src
            # directory containing all the packages (e.g.  src/mypkg1, src/mypkg2)
        # OR
        package_dir =
            mypkg1 = lib1
            # mypkg1.mod corresponds to lib1/mod.py
            # mypkg1.subpkg.mod corresponds to lib1/subpkg/mod.py
            mypkg2 = lib2
            # mypkg2.mod corresponds to lib2/mod.py
            mypkg2.subpkg = lib3
            # mypkg2.subpkg.mod corresponds to lib3/mod.py

.. tab:: setup.py

    .. code-block:: python

        setup(
            # ...
            package_dir = {"": "src"}
            # directory containing all the packages (e.g.  src/mypkg1, src/mypkg2)
        )

        # OR

        setup(
            # ...
            package_dir = {
                "mypkg1": "lib1",   # mypkg1.mod corresponds to lib1/mod.py
                                    # mypkg1.subpkg.mod corresponds to lib1/subpkg/mod.py
                "mypkg2": "lib2",   # mypkg2.mod corresponds to lib2/mod.py
                "mypkg2.subpkg": "lib3"  # mypkg2.subpkg.mod corresponds to lib3/mod.py
                # ...
        )

.. tab:: pyproject.toml (**EXPERIMENTAL**) [#experimental]_

    .. code-block:: toml

        [tool.setuptools]
        # ...
        package-dir = {"" = "src"}
            # directory containing all the packages (e.g.  src/mypkg1, src/mypkg2)

        # OR

        [tool.setuptools.package-dir]
        mypkg1 = "lib1"
            # mypkg1.mod corresponds to lib1/mod.py
            # mypkg1.subpkg.mod corresponds to lib1/subpkg/mod.py
        mypkg2 = "lib2"
            # mypkg2.mod corresponds to lib2/mod.py
        "mypkg2.subpkg" = "lib3"
            # mypkg2.subpkg.mod corresponds to lib3/mod.py
        # ...

This can get tiresome really quickly. To speed things up, you can rely on
setuptools automatic discovery, or use the provided tools, as explained in
the following sections.


.. _auto-discovery:

Automatic discovery
===================

.. warning:: Automatic discovery is an **experimental** feature and might change
   (or be completely removed) in the future.
   See :ref:`custom-discovery` for a stable way of configuring ``setuptools``.

By default ``setuptools`` will consider 2 popular project layouts, each one with
its own set of advantages and disadvantages [#layout1]_ [#layout2]_ as
discussed in the following sections.

Setuptools will automatically scan your project directory looking for these
layouts and try to guess the correct values for the :ref:`packages <declarative
config>` and :doc:`py_modules </references/keywords>` configuration.

.. important::
   Automatic discovery will **only** be enabled if you **don't** provide any
   configuration for ``packages`` and ``py_modules``.
   If at least one of them is explicitly set, automatic discovery will not take place.

   **Note**: specifying ``ext_modules`` might also prevent auto-discover from
   taking place, unless your opt into :doc:`pyproject_config` (which will
   disable the backward compatible behaviour).

.. _src-layout:

src-layout
----------
The project should contain a ``src`` directory under the project root and
all modules and packages meant for distribution are placed inside this
directory::

    project_root_directory
    ├── pyproject.toml
    ├── setup.cfg  # or setup.py
    ├── ...
    └── src/
        └── mypkg/
            ├── __init__.py
            ├── ...
            └── mymodule.py

This layout is very handy when you wish to use automatic discovery,
since you don't have to worry about other Python files or folders in your
project root being distributed by mistake. In some circumstances it can be
also less error-prone for testing or when using :pep:`420`-style packages.
On the other hand you cannot rely on the implicit ``PYTHONPATH=.`` to fire
up the Python REPL and play with your package (you will need an
`editable install`_ to be able to do that).

.. _flat-layout:

flat-layout
-----------
*(also known as "adhoc")*

The package folder(s) are placed directly under the project root::

    project_root_directory
    ├── pyproject.toml
    ├── setup.cfg  # or setup.py
    ├── ...
    └── mypkg/
        ├── __init__.py
        ├── ...
        └── mymodule.py

This layout is very practical for using the REPL, but in some situations
it can be can be more error-prone (e.g. during tests or if you have a bunch
of folders or Python files hanging around your project root)

To avoid confusion, file and folder names that are used by popular tools (or
that correspond to well-known conventions, such as distributing documentation
alongside the project code) are automatically filtered out in the case of
*flat-layout*:

.. autoattribute:: setuptools.discovery.FlatLayoutPackageFinder.DEFAULT_EXCLUDE

.. autoattribute:: setuptools.discovery.FlatLayoutModuleFinder.DEFAULT_EXCLUDE

.. warning::
   If you are using auto-discovery with *flat-layout*, ``setuptools`` will
   refuse to create :term:`distribution archives <Distribution Package>` with
   multiple top-level packages or modules.

   This is done to prevent common errors such as accidentally publishing code
   not meant for distribution (e.g. maintenance-related scripts).

   Users that purposefully want to create multi-package distributions are
   advised to use :ref:`custom-discovery` or the ``src-layout``.

There is also a handy variation of the *flat-layout* for utilities/libraries
that can be implemented with a single Python file:

single-module distribution
^^^^^^^^^^^^^^^^^^^^^^^^^^

A standalone module is placed directly under the project root, instead of
inside a package folder::

    project_root_directory
    ├── pyproject.toml
    ├── setup.cfg  # or setup.py
    ├── ...
    └── single_file_lib.py


.. _custom-discovery:

Custom discovery
================

If the automatic discovery does not work for you
(e.g., you want to *include* in the distribution top-level packages with
reserved names such as ``tasks``, ``example`` or ``docs``, or you want to
*exclude* nested packages that would be otherwise included), you can use
the provided tools for package discovery:

.. tab:: setup.cfg

    .. code-block:: ini

        [options]
        packages = find:
        #or
        packages = find_namespace:

.. tab:: setup.py

    .. code-block:: python

        from setuptools import find_packages
        # or
        from setuptools import find_namespace_packages

.. tab:: pyproject.toml (**EXPERIMENTAL**) [#experimental]_

    .. code-block:: toml

        # ...
        [tool.setuptools.packages]
        find = {}  # Scanning implicit namespaces is active by default
        # OR
        find = {namespace = false}  # Disable implicit namespaces


Finding simple packages
-----------------------
Let's start with the first tool. ``find:`` (``find_packages()``) takes a source
directory and two lists of package name patterns to exclude and include, and
then return a list of ``str`` representing the packages it could find. To use
it, consider the following directory::

    mypkg
    ├── setup.cfg  # and/or setup.py, pyproject.toml
    └── src
        ├── pkg1
        │   └── __init__.py
        ├── pkg2
        │   └── __init__.py
        ├── aditional
        │   └── __init__.py
        └── pkg
            └── namespace
                └── __init__.py

To have setuptools to automatically include packages found
in ``src`` that starts with the name ``pkg`` and not ``additional``:

.. tab:: setup.cfg

    .. code-block:: ini

        [options]
        packages = find:
        package_dir =
            =src

        [options.packages.find]
        where = src
        include = pkg*
        exclude = additional

    .. note::
        ``pkg`` does not contain an ``__init__.py`` file, therefore
        ``pkg.namespace`` is ignored by ``find:`` (see ``find_namespace:`` below).

.. tab:: setup.py

    .. code-block:: python

        setup(
            # ...
            packages=find_packages(
                where='src',
                include=['pkg*'],
                exclude=['additional'],
            ),
            package_dir={"": "src"}
            # ...
        )


    .. note::
        ``pkg`` does not contain an ``__init__.py`` file, therefore
        ``pkg.namespace`` is ignored by ``find_packages()``
        (see ``find_namespace_packages()`` below).

.. tab:: pyproject.toml (**EXPERIMENTAL**) [#experimental]_

    .. code-block:: toml

        [tool.setuptools.packages.find]
        where = ["src"]
        include = ["pkg*"]
        exclude = ["additional"]
        namespaces = false

    .. note::
        When using ``tool.setuptools.packages.find`` in ``pyproject.toml``,
        setuptools will consider :pep:`implicit namespaces <420>` by default when
        scanning your project directory.
        To avoid ``pkg.namespace`` from being added to your package list
        you can set ``namespaces = false``. This will prevent any folder
        without an ``__init__.py`` file from being scanned.

.. important::
   ``include`` and ``exclude`` accept strings representing :mod:`glob` patterns.
   These patterns should match the **full** name of the Python module (as if it
   was written in an ``import`` statement).

   For example if you have ``util`` pattern, it will match
   ``util/__init__.py`` but not ``util/files/__init__.py``.

   The fact that the parent package is matched by the pattern will not dictate
   if the submodule will be included or excluded from the distribution.
   You will need to explicitly add a wildcard (e.g. ``util*``)
   if you want the pattern to also match submodules.

.. _Namespace Packages:

Finding namespace packages
--------------------------
``setuptools``  provides the ``find_namespace:`` (``find_namespace_packages()``)
which behaves similarly to ``find:`` but works with namespace package.

Before diving in, it is important to have a good understanding of what
:pep:`namespace packages <420>` are. Here is a quick recap.

When you have two packages organized as follows:

.. code-block:: bash

    /Users/Desktop/timmins/foo/__init__.py
    /Library/timmins/bar/__init__.py

If both ``Desktop`` and ``Library`` are on your ``PYTHONPATH``, then a
namespace package called ``timmins`` will be created automatically for you when
you invoke the import mechanism, allowing you to accomplish the following:

.. code-block:: pycon

    >>> import timmins.foo
    >>> import timmins.bar

as if there is only one ``timmins`` on your system. The two packages can then
be distributed separately and installed individually without affecting the
other one.

Now, suppose you decide to package the ``foo`` part for distribution and start
by creating a project directory organized as follows::

   foo
   ├── setup.cfg  # and/or setup.py, pyproject.toml
   └── src
       └── timmins
           └── foo
               └── __init__.py

If you want the ``timmins.foo`` to be automatically included in the
distribution, then you will need to specify:

.. tab:: setup.cfg

    .. code-block:: ini

        [options]
        package_dir =
            =src
        packages = find_namespace:

        [options.packages.find]
        where = src

    ``find:`` won't work because timmins doesn't contain ``__init__.py``
    directly, instead, you have to use ``find_namespace:``.

    You can think of ``find_namespace:`` as identical to ``find:`` except it
    would count a directory as a package even if it doesn't contain ``__init__.py``
    file directly.

.. tab:: setup.py

    .. code-block:: python

        setup(
            # ...
            packages=find_namespace_packages(where='src'),
            package_dir={"": "src"}
            # ...
        )

    When you use ``find_packages()``, all directories without an
    ``__init__.py`` file will be disconsidered.
    On the other hand, ``find_namespace_packages()`` will scan all
    directories.

.. tab:: pyproject.toml (**EXPERIMENTAL**) [#experimental]_

    .. code-block:: toml

        [tool.setuptools.packages.find]
        where = ["src"]

    When using ``tool.setuptools.packages.find`` in ``pyproject.toml``,
    setuptools will consider :pep:`implicit namespaces <420>` by default when
    scanning your project directory.

After installing the package distribution, ``timmins.foo`` would become
available to your interpreter.

.. warning::
   Please have in mind that ``find_namespace:`` (setup.cfg),
   ``find_namespace_packages()`` (setup.py) and ``find`` (pyproject.toml) will
   scan **all** folders that you have in your project directory if you use a
   :ref:`flat-layout`.

   If used naïvely, this might result in unwanted files being added to your
   final wheel. For example, with a project directory organized as follows::

       foo
       ├── docs
       │   └── conf.py
       ├── timmins
       │   └── foo
       │       └── __init__.py
       └── tests
           └── tests_foo
               └── __init__.py

   final users will end up installing not only ``timmins.foo``, but also
   ``docs`` and ``tests.tests_foo``.

   A simple way to fix this is to adopt the aforementioned :ref:`src-layout`,
   or make sure to properly configure the ``include`` and/or ``exclude``
   accordingly.

.. tip::
   After :ref:`building your package <building>`, you can have a look if all
   the files are correct (nothing missing or extra), by running the following
   commands:

   .. code-block:: bash

      tar tf dist/*.tar.gz
      unzip -l dist/*.whl

   This requires the ``tar`` and ``unzip`` to be installed in your OS.
   On Windows you can also use a GUI program such as 7zip_.


Legacy Namespace Packages
=========================
The fact you can create namespace package so effortlessly above is credited
to `PEP 420 <https://www.python.org/dev/peps/pep-0420/>`_. It use to be more
cumbersome to accomplish the same result. Historically, there were two methods
to create namespace packages. One is the ``pkg_resources`` style supported by
``setuptools`` and the other one being ``pkgutils`` style offered by
``pkgutils`` module in Python. Both are now considered deprecated despite the
fact they still linger in many existing packages. These two differ in many
subtle yet significant aspects and you can find out more on `Python packaging
user guide <https://packaging.python.org/guides/packaging-namespace-packages/>`_


``pkg_resource`` style namespace package
----------------------------------------
This is the method ``setuptools`` directly supports. Starting with the same
layout, there are two pieces you need to add to it. First, an ``__init__.py``
file directly under your namespace package directory that contains the
following:

.. code-block:: python

    __import__("pkg_resources").declare_namespace(__name__)

And the ``namespace_packages`` keyword in your ``setup.cfg`` or ``setup.py``:

.. tab:: setup.cfg

    .. code-block:: ini

        [options]
        namespace_packages = timmins

.. tab:: setup.py

    .. code-block:: python

        setup(
            # ...
            namespace_packages=['timmins']
        )

And your directory should look like this

.. code-block:: bash

   foo
   ├── setup.cfg  # and/or setup.py, pyproject.toml
   └── src
       └── timmins
           ├── __init__.py
           └── foo
               └── __init__.py

Repeat the same for other packages and you can achieve the same result as
the previous section.

``pkgutil`` style namespace package
-----------------------------------
This method is almost identical to the ``pkg_resource`` except that the
``namespace_packages`` declaration is omitted and the ``__init__.py``
file contains the following:

.. code-block:: python

    __path__ = __import__('pkgutil').extend_path(__path__, __name__)

The project layout remains the same and ``setup.cfg`` remains the same.


----


.. [#experimental]
   Support for specifying package metadata and build configuration options via
   ``pyproject.toml`` is experimental and might change (or be completely
   removed) in the future. See :doc:`/userguide/pyproject_config`.
.. [#layout1] https://blog.ionelmc.ro/2014/05/25/python-packaging/#the-structure
.. [#layout2] https://blog.ionelmc.ro/2017/09/25/rehashing-the-src-layout/

.. _editable install: https://pip.pypa.io/en/stable/cli/pip_install/#editable-installs
.. _7zip: https://www.7-zip.org