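How do you add additional files to a wheel?

How do I control which files are included in a wheel? It appears that MANIFEST.in isn't used by python setup.py bdist_wheel.

UPDATE:

I was wrong about the difference between installing from a source tarball and installing from a wheel. The source distribution includes the files specified in MANIFEST.in, but the installed package only has Python files. Steps are needed to identify additional files that should be installed, whether the install is via source distribution, egg, or wheel. Namely, package_data is needed for additional package files, and data_files for files outside your package, such as command-line scripts or system config files.

Original Question

I have a project where I've been using python setup.py sdist to build my package, MANIFEST.in to control the files included and excluded, and pyroma and check-manifest to confirm my settings.

I recently converted it to dual Python 2/3 code and added a setup.cfg with: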

[bdist_wheel]
universal = 1

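I can build a wheel with python setup.py bdist_wheel, and it appears to be a universal wheel, as desired. However, it doesn't include all of the files specified in MANIFEST.in.

What gets installed?

I dug deeper, and now I know more about packaging and wheels. Here's what I learned:

I uploaded two package files to the multigtfs project on PyPI:

  • multigtfs-0.4.2.tar.gz - the source tarball, which includes all the files in MANIFEST.in.
  • multigtfs-0.4.2-py2.py3-none-any.whl - the binary distribution in question.

I created two new virtual environments with Python 2.7.5 and installed each package (pip install multigtfs-0.4.2.tar.gz). The two environments are almost identical. They have different .pyc files, which are the "compiled" Python files. There are log files that record the different paths on disk. The install from the source tarball includes a folder, multigtfs-0.4.2-py27.egg-info, detailing the installation, and the wheel install has a multigtfs-0.4.2.dist-info folder, detailing that process. However, from the point of view of code using the multigtfs project, there is no difference between the two installation methods.

Notably, neither installation has the .zip files used by my tests, so the test suite fails: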

$ django-admin startproject demo
$ cd demo
$ pip install psycopg2  # DB driver for PostGIS project
$ createdb demo         # Create PostgreSQL database
$ psql -d demo -c "CREATE EXTENSION postgis" # Make it a PostGIS database
$ vi demo/settings.py   # Add multigtfs to INSTALLED_APPS,
# Update DATABASE to set ENGINE to django.contrib.gis.db.backends.postgis
# Update DATABASE to set NAME to test
$ ./manage.py test multigtfs.tests  # Run the tests
...
IOError: [Errno 2] No such file or directory: u'/Users/john/.virtualenvs/test/lib/python2.7/site-packages/multigtfs/tests/fixtures/test3.zip'

Specifying additional files

Using the suggestions from the answers, I added some additional directives to setup.py:

from __future__ import unicode_literals
# setup.py now requires some funky binary strings
...
setup(
    name='multigtfs',
    packages=find_packages(),
    # Pattern is relative to the package directory; the failing path above
    # is multigtfs/tests/fixtures/test3.zip
    package_data={b'multigtfs': ['tests/fixtures/*.zip']},
    include_package_data=True,
    ...
)

This installs the .zip files (as well as the README) into the package folder, and the tests now run correctly. Thanks for the suggestions!


You can specify extra files to install using the data_files directive. Is that what you're looking for? Here's a small example:

from setuptools import setup
from glob import glob

setup(
    name='extra',
    version='0.0.1',
    py_modules=['extra'],
    data_files=[
        ('images', glob('assets/*.png')),
    ],
)
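As a rough sketch of where such files end up: when a wheel built from an example like the one above is installed with pip, a relative data_files directory such as images is generally placed under the environment's data root, which is usually sys.prefix. The helper name installed_data_dir is hypothetical, and other install methods may put the files elsewhere.

```python
import os
import sysconfig


def installed_data_dir(relative_dir):
    """Return where a relative data_files directory is expected after a wheel install.

    Assumption: the package was installed from a wheel into the current
    environment, so data files live under the "data" install path.
    """
    data_root = sysconfig.get_path("data")  # usually the same as sys.prefix
    return os.path.join(data_root, relative_dir)


print(installed_data_dir("images"))
```

Because this location is outside the package, code that needs the files at runtime has to build the path itself, which is one reason package_data is often easier for files that belong to a package.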

Have you tried using package_data in your setup.py? MANIFEST.in seems targeted at Python versions <= 2.6; I'm not sure if higher versions even look at it.

Looking at https://github.com/pypa/sampleproject, their MANIFEST.in says:

# If using Python 2.6 or less, then have to include package data, even though
# it's already declared in setup.py
include sample/*.dat

which seems to imply this method is outdated. Meanwhile, in setup.py they declare:

setup(
    name='sample',
    ...
    # If there are data files included in your packages that need to be
    # installed, specify them here.  If using Python 2.6 or less, then these
    # have to be included in MANIFEST.in as well.
    include_package_data=True,
    package_data={
        'sample': ['package_data.dat'],
    },
    ...
)

(I'm not sure why they chose a wildcard in MANIFEST.in and a filename in setup.py. They refer to the same file)

Which, along with being simpler, again seems to imply that the package_data route is superior to the MANIFEST.in method. Well, unless you have to support 2.6 that is, in which case my prayers go out to you.

You can use package_data and data_files in setup.py to specify additional files, but they are ridiculously hard to get right (and buggy).

An alternative is to use MANIFEST.in and add include_package_data=True in setup() of your setup.py as indicated here.

With this directive, MANIFEST.in is used to specify the files to include not only in the source tarball/zip, but also in the wheel and the win32 installer. This also works with any Python version (I tested it on a project supporting py2.6 to py3.6).

UPDATE 2020: it seems that MANIFEST.in is no longer honored when building a wheel in Python 3, even if you set include_package_data=True, although it is still honored for the tar.gz.

Here is how to fix that: you need to specify both include_package_data and packages.

If your Python module is inside a folder "pymod", here's the appropriate setup:

setup(
    ...
    include_package_data=True,
    packages=['pymod'],
)

If your Python scripts are at the root, use:

setup(
    ...
    include_package_data=True,
    packages=['.'],
)

Then you can open your .whl file with a zip archival software such as 7-zip to check that all the files you want are indeed inside.
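If you would rather check this from a script than from a GUI tool, a wheel is an ordinary zip archive, so the standard zipfile module can read it. The following is a minimal sketch; the function names and the example paths in the usage note are hypothetical.

```python
import zipfile


def wheel_file_names(wheel_path):
    """Return the sorted list of file paths stored in a wheel (a zip archive)."""
    with zipfile.ZipFile(wheel_path) as wheel:
        return sorted(wheel.namelist())


def missing_from_wheel(wheel_path, expected):
    """Return the expected paths that are not present in the wheel."""
    present = set(wheel_file_names(wheel_path))
    return [name for name in expected if name not in present]
```

For example, missing_from_wheel("dist/pymod-1.0-py3-none-any.whl", ["pymod/data.json"]) would return an empty list when the data file made it into the wheel.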

Before you make any changes in MANIFEST.in or setup.py you must remove old output directories. Setuptools is caching some of the data and this can lead to unexpected results.

rm -rf build *.egg-info

If you don't do this, expect nothing to work correctly.
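On systems without rm (or inside a build script), the same cleanup can be sketched in Python. This is only an approximation of the command above: the function name clean_build_artifacts is hypothetical, and it removes directories only, directly under the project root.

```python
import shutil
from pathlib import Path


def clean_build_artifacts(project_root="."):
    """Delete build/ and any *.egg-info directories directly under project_root."""
    root = Path(project_root)
    removed = []
    for path in [root / "build", *root.glob("*.egg-info")]:
        if path.is_dir():
            shutil.rmtree(path)
            removed.append(path.name)
    # Return the names that were removed so the caller can log them.
    return sorted(removed)
```

Calling clean_build_artifacts() before each build gives the same clean starting point as the shell command.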

With that out of the way:

  1. If you are building a source distribution (sdist) then you can use any method below.

  2. If you are building a wheel (bdist_wheel), then include_package_data and MANIFEST.in are ignored and you must use package_data and data_files.
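To see how the two build formats differ for your own project, you can compare the built archives directly. The sketch below uses only the standard library; the function names are hypothetical, and it assumes a .tar.gz sdist with the usual top-level name-version/ folder and a flat (non-src) layout, since paths are compared literally.

```python
import tarfile
import zipfile


def sdist_files(sdist_path):
    """Return file paths in a .tar.gz sdist, without the top-level name-version/ folder."""
    with tarfile.open(sdist_path, "r:gz") as archive:
        names = [member.name for member in archive.getmembers() if member.isfile()]
    return {name.split("/", 1)[1] for name in names if "/" in name}


def wheel_files(wheel_path):
    """Return file paths in a wheel, skipping the *.dist-info metadata folder."""
    with zipfile.ZipFile(wheel_path) as archive:
        return {
            name
            for name in archive.namelist()
            if not name.endswith("/") and ".dist-info/" not in name
        }


def only_in_sdist(sdist_path, wheel_path):
    """Return sorted paths that the sdist ships but the wheel does not."""
    return sorted(sdist_files(sdist_path) - wheel_files(wheel_path))
```

Files such as setup.py and PKG-INFO will always show up in the result, because they are never copied into a wheel; the entries to look for are data files you expected to be installed.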

INCLUDE_PACKAGE_DATA

This is a good option, but bdist_wheel does not honor it.

setup(
    ...
    include_package_data=True
)

# MANIFEST.in
include package/data.json

DATA_FILES for non-package data

This is the most flexible option because you can add any file from your repo to an sdist or bdist_wheel.

setup(
    ...
    data_files=[
        ('output_dir', ['conf/data.json']),
    ],
    # For sdist, output_dir is ignored!
    #
    # For bdist_wheel, data.json is taken from the conf dir in the root of
    # your repo and stored at `output_dir/` inside of the wheel package.
)

PACKAGE_DATA for non-python files inside of the package

Similar to the above, but for a bdist_wheel it lets you put your data files inside of the package. It is identical for sdist, but it has more limitations than data_files because files can only be sourced from your package subdirectory.

setup(
    ...
    # Each value must be a list of glob patterns, not a bare string
    package_data={'package': ['data.json']},
    # data.json must be inside of your actual package
)
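Since package_data patterns are easy to get wrong, it can help to preview what a pattern would match before building. The helper below is a rough sketch, not setuptools' own logic: it assumes patterns are resolved relative to the package directory, and the name preview_package_data is hypothetical.

```python
import glob
import os


def preview_package_data(package_dir, patterns):
    """List files under package_dir matched by package_data-style glob patterns.

    Approximation only: setuptools resolves each pattern relative to the
    package directory, which this mimics with the standard glob module.
    """
    matches = set()
    for pattern in patterns:
        for path in glob.glob(os.path.join(package_dir, pattern)):
            if os.path.isfile(path):
                matches.add(os.path.relpath(path, package_dir))
    return sorted(matches)
```

An empty result for a pattern such as 'data.json' is a quick hint that the file is not where the package_data entry expects it to be.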

I had a config/ directory with JSON files in it, which I needed to add to the wheel package. So I added this line to MANIFEST.in:

recursive-include config/ *.json

And the following directive to setup.py:

setup(
    ...
    include_package_data=True,
)

Nothing worked until I created an empty file called __init__.py inside the config/ directory.

(Python 3.6.7, wheel 3.6.7, setuptools 39.0.1)

include_package_data is the way to go, and it works for sdist and wheels.

However you have to do it right, and it took me months to figure this out, so here is what I learned.

The trick is essentially given in the name of the option, include_PACKAGE_data: the data files need to be in a package subfolder.

If and only if

  • include_package_data is True
  • the data files are listed in MANIFEST.in (*see also my note at the end about setuptools_scm)
  • and the data files are under a package directory

then the data files will be included.
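The third condition is the one that is easiest to miss, so here is a small sketch for checking it. It treats a directory as a package only if it contains __init__.py (namespace packages are not handled), and the function name enclosing_package is hypothetical.

```python
from pathlib import Path


def enclosing_package(data_file, project_root="."):
    """Return the nearest ancestor directory (relative to project_root) that
    contains __init__.py, or None if the file is not under a regular package."""
    root = Path(project_root).resolve()
    current = (root / data_file).resolve().parent
    # Walk upwards from the file's directory, stopping at the project root.
    while root in current.parents:
        if (current / "__init__.py").is_file():
            return current.relative_to(root).as_posix()
        current = current.parent
    return None
```

A result of None for a file listed in MANIFEST.in suggests it sits outside any package, in which case include_package_data alone is not expected to pick it up.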

Working Example:

Given the project has the following structure and files:

|- MANIFEST.in
|- setup.cfg
|- setup.py
|
\---foo
    |- __init__.py
    |
    \---data
          - example.png


And the following configuration:

MANIFEST.in:

recursive-include foo/data *

setup.py

import setuptools


setuptools.setup()

setup.cfg

[metadata]
name = wheel-data-files-example
url = www.example.com
maintainer = None
maintainer_email = none@example.com


[options]
packages =
    foo
include_package_data = True

Both the sdist packages and the wheels you build will then contain the example.png data file.

(of course, instead of setup.cfg the config can also be directly specified in setup.py. But this is not relevant for the example.)

Update: For src layout projects

This should also work for projects that use a src layout, looking like this:

|- MANIFEST.in
|- setup.cfg
|- setup.py
|
\---src
    |
    \---foo
        |- __init__.py
        |
        \---data
              - example.png

To make it work, tell setuptools about the src directory using package_dir:

setup.cfg

[metadata]
name = wheel-data-files-example
url = www.example.com
maintainer = None
maintainer_email = none@example.com


[options]
packages =
    foo
include_package_data = True
package_dir =
    =src

And in the manifest adjust the path:

MANIFEST.in:

recursive-include src/foo/data *

Note: No MANIFEST.in is necessary if you use setuptools_scm

If you happen to use setuptools and add the setuptools_scm plugin (on PyPI), then you don't need to manage a MANIFEST.in file. Instead, setuptools_scm will take care that all files that are tracked by git are added to the package.

So in this case, the rule for whether a file is added to the sdist/wheel is: if and only if

  • include_package_data is True
  • the file is tracked by git (or another setuptools_scm supported tool)
  • and the data files are under a package directory

then the data files will be included.

To add files directly into the top level of a wheel (and not under a folder inside the wheel) simply use Poetry. Create a pyproject.toml with:

poetry init

Port your dependencies with:

cat requirements.txt | xargs poetry add

Add a line like this to your pyproject.toml:

include = ["Yourfile"]

Then run:

poetry build

Note: IntelliJ products make it easy and fast to browse your wheels with this plugin.