diff options
author | Jason R. Coombs <jaraco@jaraco.com> | 2021-11-12 19:37:53 -0500 |
---|---|---|
committer | Jason R. Coombs <jaraco@jaraco.com> | 2021-11-12 19:37:53 -0500 |
commit | 26d7411464115c6170ed666e72e388d53d4b7fd2 (patch) | |
tree | 40a141e309c856e1af1df3dc2ec7699288d3d384 /docs/userguide/datafiles.rst | |
parent | 342e02e7a4dbedf0e3a04c4d2d213b5340d56010 (diff) | |
parent | 77678abf97b4a8ee5e6e67b14cb21f543cd6bfd9 (diff) | |
download | python-setuptools-git-feature/local-schemes.tar.gz |
Merge branch 'main' into feature/local-schemesfeature/local-schemes
Diffstat (limited to 'docs/userguide/datafiles.rst')
-rw-r--r-- | docs/userguide/datafiles.rst | 177 |
1 files changed, 177 insertions, 0 deletions
diff --git a/docs/userguide/datafiles.rst b/docs/userguide/datafiles.rst new file mode 100644 index 00000000..69cf36e6 --- /dev/null +++ b/docs/userguide/datafiles.rst @@ -0,0 +1,177 @@ +==================== +Data Files Support +==================== + +The distutils have traditionally allowed installation of "data files", which +are placed in a platform-specific location. However, the most common use case +for data files distributed with a package is for use *by* the package, usually +by including the data files in the package directory. + +Setuptools offers three ways to specify data files to be included in your +packages. First, you can simply use the ``include_package_data`` keyword, +e.g.:: + + from setuptools import setup, find_packages + setup( + ... + include_package_data=True + ) + +This tells setuptools to install any data files it finds in your packages. +The data files must be specified via the distutils' ``MANIFEST.in`` file. +(They can also be tracked by a revision control system, using an appropriate +plugin. See the section below on :ref:`Adding Support for Revision +Control Systems` for information on how to write such plugins.) + +If you want finer-grained control over what files are included (for example, +if you have documentation files in your package directories and want to exclude +them from installation), then you can also use the ``package_data`` keyword, +e.g.:: + + from setuptools import setup, find_packages + setup( + ... + package_data={ + # If any package contains *.txt or *.rst files, include them: + "": ["*.txt", "*.rst"], + # And include any *.msg files found in the "hello" package, too: + "hello": ["*.msg"], + } + ) + +The ``package_data`` argument is a dictionary that maps from package names to +lists of glob patterns. The globs may include subdirectory names, if the data +files are contained in a subdirectory of the package. For example, if the +package tree looks like this:: + + setup.py + src/ + mypkg/ + __init__.py + mypkg.txt + data/ + somefile.dat + otherdata.dat + +The setuptools setup file might look like this:: + + from setuptools import setup, find_packages + setup( + ... + packages=find_packages("src"), # include all packages under src + package_dir={"": "src"}, # tell distutils packages are under src + + package_data={ + # If any package contains *.txt files, include them: + "": ["*.txt"], + # And include any *.dat files found in the "data" subdirectory + # of the "mypkg" package, also: + "mypkg": ["data/*.dat"], + } + ) + +Notice that if you list patterns in ``package_data`` under the empty string, +these patterns are used to find files in every package, even ones that also +have their own patterns listed. Thus, in the above example, the ``mypkg.txt`` +file gets included even though it's not listed in the patterns for ``mypkg``. + +Also notice that if you use paths, you *must* use a forward slash (``/``) as +the path separator, even if you are on Windows. Setuptools automatically +converts slashes to appropriate platform-specific separators at build time. + +If datafiles are contained in a subdirectory of a package that isn't a package +itself (no ``__init__.py``), then the subdirectory names (or ``*``) are required +in the ``package_data`` argument (as shown above with ``"data/*.dat"``). + +When building an ``sdist``, the datafiles are also drawn from the +``package_name.egg-info/SOURCES.txt`` file, so make sure that this is removed if +the ``setup.py`` ``package_data`` list is updated before calling ``setup.py``. + +(Note: although the ``package_data`` argument was previously only available in +``setuptools``, it was also added to the Python ``distutils`` package as of +Python 2.4; there is `some documentation for the feature`__ available on the +python.org website. If using the setuptools-specific ``include_package_data`` +argument, files specified by ``package_data`` will *not* be automatically +added to the manifest unless they are listed in the MANIFEST.in file.) + +__ https://docs.python.org/3/distutils/setupscript.html#installing-package-data + +Sometimes, the ``include_package_data`` or ``package_data`` options alone +aren't sufficient to precisely define what files you want included. For +example, you may want to include package README files in your revision control +system and source distributions, but exclude them from being installed. So, +setuptools offers an ``exclude_package_data`` option as well, that allows you +to do things like this:: + + from setuptools import setup, find_packages + setup( + ... + packages=find_packages("src"), # include all packages under src + package_dir={"": "src"}, # tell distutils packages are under src + + include_package_data=True, # include everything in source control + + # ...but exclude README.txt from all packages + exclude_package_data={"": ["README.txt"]}, + ) + +The ``exclude_package_data`` option is a dictionary mapping package names to +lists of wildcard patterns, just like the ``package_data`` option. And, just +as with that option, a key of ``""`` will apply the given pattern(s) to all +packages. However, any files that match these patterns will be *excluded* +from installation, even if they were listed in ``package_data`` or were +included as a result of using ``include_package_data``. + +In summary, the three options allow you to: + +``include_package_data`` + Accept all data files and directories matched by ``MANIFEST.in``. + +``package_data`` + Specify additional patterns to match files that may or may + not be matched by ``MANIFEST.in`` or found in source control. + +``exclude_package_data`` + Specify patterns for data files and directories that should *not* be + included when a package is installed, even if they would otherwise have + been included due to the use of the preceding options. + +NOTE: Due to the way the distutils build process works, a data file that you +include in your project and then stop including may be "orphaned" in your +project's build directories, requiring you to run ``setup.py clean --all`` to +fully remove them. This may also be important for your users and contributors +if they track intermediate revisions of your project using Subversion; be sure +to let them know when you make changes that remove files from inclusion so they +can run ``setup.py clean --all``. + + +.. _Accessing Data Files at Runtime: + +Accessing Data Files at Runtime +------------------------------- + +Typically, existing programs manipulate a package's ``__file__`` attribute in +order to find the location of data files. However, this manipulation isn't +compatible with PEP 302-based import hooks, including importing from zip files +and Python Eggs. It is strongly recommended that, if you are using data files, +you should use the :ref:`ResourceManager API` of ``pkg_resources`` to access +them. The ``pkg_resources`` module is distributed as part of setuptools, so if +you're using setuptools to distribute your package, there is no reason not to +use its resource management API. See also `Importlib Resources`_ for +a quick example of converting code that uses ``__file__`` to use +``pkg_resources`` instead. + +.. _Importlib Resources: https://docs.python.org/3/library/importlib.html#module-importlib.resources + + +Non-Package Data Files +---------------------- + +Historically, ``setuptools`` by way of ``easy_install`` would encapsulate data +files from the distribution into the egg (see `the old docs +<https://github.com/pypa/setuptools/blob/52aacd5b276fedd6849c3a648a0014f5da563e93/docs/setuptools.txt#L970-L1001>`_). As eggs are deprecated and pip-based installs +fall back to the platform-specific location for installing data files, there is +no supported facility to reliably retrieve these resources. + +Instead, the PyPA recommends that any data files you wish to be accessible at +run time be included in the package. |