deb-pkg-tools: Debian packaging tools

The Python package deb-pkg-tools is a collection of functions to build and inspect Debian binary packages and repositories of binary packages. Its primary use case is to automate builds.

Some of the functionality is exposed in the command line interface (documented below) because it’s very convenient to use in shell scripts, while other functionality is meant to be used as a Python API. The package is currently tested on CPython 2.6, 2.7, 3.4, 3.5, 3.6 and PyPy (2.7).

Please note that deb-pkg-tools is quite opinionated about how Debian binary packages should be built and it enforces some of these opinions on its users. Most of this can be avoided with optional function arguments and/or environment variables. If you find something that doesn’t work to your liking and you can’t work around it, feel free to ask for an additional configuration option; I try to keep an open mind about the possible use cases of my projects.

Status

On the one hand the deb-pkg-tools package is based on my experiences with Debian packages and repositories over the past couple of years; on the other hand deb-pkg-tools itself is quite young. Then again, most functionality is covered by automated tests: at the time of writing coverage is around 90%. Some of the error handling is quite tricky to test if we also want to test the non-error case, which is of course the main focus :-).

Installation

The deb-pkg-tools package is available on PyPI which means installation should be as simple as:

$ pip install deb-pkg-tools

There’s actually a multitude of ways to install Python packages (e.g. the per user site-packages directory, virtual environments or just installing system wide) and I have no intention of getting into that discussion here, so if this intimidates you then read up on your options before returning to these instructions ;-).

Under the hood deb-pkg-tools uses several programs provided by Debian; the details are available in the dependencies section. To install these programs:

$ sudo apt-get install dpkg-dev fakeroot lintian

Usage

There are two ways to use the deb-pkg-tools package: as a command line program and as a Python API. For details about the Python API please refer to the API documentation available on Read the Docs. The command line interface is described below.

Usage: deb-pkg-tools [OPTIONS] ...

Wrapper for the deb-pkg-tools Python project that implements various tools to inspect, build and manipulate Debian binary package archives and related entities like trivial repositories.

Supported options:

Option Description
-i, --inspect=FILE Inspect the metadata in the Debian binary package archive given by FILE (similar to “dpkg --info”).
-c, --collect=DIR Copy the package archive(s) given as positional arguments (and all package archives required by the given package archives) into the directory given by DIR.
-C, --check=FILE Perform static analysis on a package archive and its dependencies in order to recognize common errors as soon as possible.
-p, --patch=FILE Patch fields into the existing control file given by FILE. To be used together with the -s, --set option.
-s, --set=LINE A line to patch into the control file (syntax: “Name: Value”). To be used together with the -p, --patch option.
-b, --build=DIR Build a Debian binary package with “dpkg-deb --build” (and lots of intermediate Python magic, refer to the API documentation of the project for full details) based on the binary package template in the directory given by DIR. The resulting archive is located in the system wide temporary directory (usually /tmp).
-u, --update-repo=DIR Create or update the trivial Debian binary package repository in the directory given by DIR.
-a, --activate-repo=DIR Enable “apt-get” to install packages from the trivial repository (requires root/sudo privilege) in the directory given by DIR. Alternatively you can use the -w, --with-repo option.
-d, --deactivate-repo=DIR Cleans up after --activate-repo (requires root/sudo privilege). Alternatively you can use the -w, --with-repo option.
-w, --with-repo=DIR Create or update a trivial package repository, activate the repository, run the positional arguments as an external command (usually “apt-get install”) and finally deactivate the repository.
--gc, --garbage-collect Force removal of stale entries from the persistent (on disk) package metadata cache. Garbage collection is performed automatically by the deb-pkg-tools command line interface when the last garbage collection cycle was more than 24 hours ago, so you only need to do it manually when you want to control when it happens (for example by a daily cron job scheduled during idle hours :-).
-y, --yes Assume the answer to interactive questions is yes.
-v, --verbose Make more noise! (useful during debugging)
-h, --help Show this message and exit.

One thing to note is that the operation of deb-pkg-tools --update-repo can be influenced by a configuration file. For details about this, please refer to the documentation on deb_pkg_tools.repo.select_gpg_key().

Dependencies

The following external programs are required by deb-pkg-tools (depending on which functionality you want to use of course):

Program Package
apt-ftparchive apt-utils
apt-get apt
cp coreutils
dpkg-deb dpkg
dpkg-architecture dpkg-dev
du coreutils
fakeroot fakeroot
gpg gnupg
gzip gzip
lintian lintian

The majority of these programs/packages will already be installed on most Debian based systems so you should only need the following to get started:

$ sudo apt-get install dpkg-dev fakeroot lintian

Platform compatibility

Several things can be tweaked via environment variables if they don’t work for your system or platform. For example on Mac OS X the cp command doesn’t have an -l parameter and the root user and group may not exist, but despite these things it can still be useful to test package builds on Mac OS X. The following environment variables can be used to adjust such factors:

Variable Default Description
DPT_CHOWN_FILES true Normalize ownership of files during packaging.
DPT_ROOT_USER root During package builds the ownership of all directories and files is reset to this user.
DPT_ROOT_GROUP root During package builds the ownership of all directories and files is reset to this group.
DPT_RESET_SETGID true Reset the setgid bit on directories inside package templates before building.
DPT_ALLOW_FAKEROOT_OR_SUDO true Run commands using either fakeroot or sudo (depending on which is available).
DPT_SUDO true Enable the usage of sudo during operations that normally require elevated privileges.
DPT_HARD_LINKS true Allow the usage of hard links to speed up file copies between directories on the same file system.
DPT_FORCE_ENTROPY false Force the system to generate entropy based on disk I/O.
SHELL /bin/bash Shell to use for the deb-pkg-tools --with-repo command.

Environment variables for boolean options support the strings yes, true, 1, no, false and 0 (case is ignored).
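The parsing rule above can be sketched in a few lines of Python. This is only an illustration and not the package's own implementation: the names coerce_boolean and read_boolean_option are hypothetical, and raising ValueError for any other string is an assumption.

```python
import os

TRUE_STRINGS = ("yes", "true", "1")
FALSE_STRINGS = ("no", "false", "0")


def coerce_boolean(text):
    """Convert one of the six documented strings to a bool (case is ignored)."""
    normalized = text.lower()
    if normalized in TRUE_STRINGS:
        return True
    if normalized in FALSE_STRINGS:
        return False
    # Assumption: unsupported strings are rejected instead of guessed at.
    raise ValueError("Unsupported boolean string: %r" % text)


def read_boolean_option(name, default, environ=os.environ):
    """Look up a boolean option, falling back to the default when it is unset."""
    value = environ.get(name)
    return default if value is None else coerce_boolean(value)
```

With this sketch, read_boolean_option("DPT_SUDO", True) gives True when the variable is unset and False when it is set to "no", "False" or "0".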

Disabling sudo usage

To disable any use of sudo you can use the following:

export DPT_ALLOW_FAKEROOT_OR_SUDO=false
export DPT_CHOWN_FILES=false
export DPT_RESET_SETGID=false
export DPT_SUDO=false

Contact

The latest version of deb-pkg-tools is available on PyPI and GitHub. The documentation is hosted on Read the Docs. For bug reports please create an issue on GitHub. If you have questions, suggestions, etc. feel free to send me an e-mail at peter@peterodding.com.

License

This software is licensed under the MIT license.

© 2017 Peter Odding.

Function reference

The following documentation is based on the source code of version 4.2 of the deb-pkg-tools package.

Note

Most of the functions defined by deb-pkg-tools depend on external programs. If these programs fail unexpectedly (end with a nonzero exit code) executor.ExternalCommandFailed is raised.

Package metadata cache

Debian binary package metadata cache.

The PackageCache class implements a persistent, multiprocess cache for Debian binary package metadata. The cache supports the following binary package metadata:

  • The control fields of packages;
  • The files installed by packages;
  • The MD5, SHA1 and SHA256 sums of packages.

The package metadata cache can speed up the following functions:

Because a lot of functionality in deb-pkg-tools uses inspect_package() and its variants, the package metadata cache almost always provides a speedup compared to recalculating metadata on demand.

The cache is especially useful when you’re manipulating large package repositories where relatively little metadata changes (which is a pretty common use case if you’re using deb-pkg-tools seriously).

Internals

For several years the package metadata cache was based on SQLite and this worked fine. Then I started experimenting with concurrent builds on the same build server and I ran into SQLite raising lock timeout errors. I switched SQLite to use the Write-Ahead Log (WAL) and things seemed to improve until I experienced several corrupt databases in situations where multiple writers and multiple readers were all hitting the cache at the same time.

At this point I looked around for alternative cache backends with the following requirements:

  • Support for concurrent reading and writing without any locking or blocking.
  • It should not be possible to corrupt the cache, regardless of concurrency.
  • To keep system requirements to a minimum, it should not be required to have a server (daemon) process running just for the cache to function.

These conflicting requirements left me with basically no options :-). Based on previous good experiences I decided to try using the filesystem to store the cache, with individual files representing cache entries. Through atomic filesystem operations this strategy basically delegates all locking to the filesystem, which should be guaranteed to do the right thing (POSIX).
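To make the idea of atomic filesystem operations concrete, here is a minimal sketch for Python 3. It is not the code used by deb-pkg-tools: the function names are hypothetical and the choice of a temporary file followed by os.replace() is an assumption about how such a cache entry could be written safely.

```python
import os
import tempfile


def write_atomically(filename, data):
    """Write bytes to a file so readers see the old or the new content, never a mix."""
    directory = os.path.dirname(os.path.abspath(filename))
    # The temporary file lives in the same directory so the rename
    # below never has to cross a filesystem boundary.
    fd, temporary = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        # On POSIX a rename within one filesystem is atomic.
        os.replace(temporary, filename)
    except BaseException:
        # Don't leave temporary files behind when something goes wrong.
        if os.path.exists(temporary):
            os.unlink(temporary)
        raise


def read_entry(filename):
    """Read a cache entry, returning None when it doesn't exist."""
    try:
        with open(filename, "rb") as handle:
            return handle.read()
    except FileNotFoundError:
        return None
```

Because concurrent writers each use their own temporary file, the last rename simply wins and no explicit locking is needed.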

Storing the cache on the filesystem like this has indeed appeared to solve all locking and corruption issues, but when the filesystem cache is cold (for example because you’ve just run a couple of heavy builds) it’s still damn slow to scan the package metadata of a full repository with hundreds of archives...

As a pragmatic performance optimization memcached was added to the mix. Any errors involving memcached are silently ignored which means memcached isn’t required to use the cache; it’s an optional optimization.
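The "optional optimization" pattern can be sketched as follows. This is an illustration only: the class names are hypothetical, plain dictionaries stand in for the filesystem cache, and UnreachableLayer simulates a memcached server that cannot be reached.

```python
class LayeredCache(object):
    """Consult an optional fast layer first and fall back to a reliable slow layer."""

    def __init__(self, fast_layer, slow_layer):
        self.fast_layer = fast_layer
        self.slow_layer = slow_layer

    def get(self, key):
        try:
            value = self.fast_layer.get(key)
            if value is not None:
                return value
        except Exception:
            # Errors in the fast layer are ignored: it is only an optimization.
            pass
        return self.slow_layer.get(key)

    def set(self, key, value):
        # The slow layer is the source of truth, so it is written first.
        self.slow_layer[key] = value
        try:
            self.fast_layer[key] = value
        except Exception:
            pass


class UnreachableLayer(object):
    """Stand-in for a fast layer that fails on every operation."""

    def get(self, key):
        raise IOError("fast layer is unreachable")

    def __setitem__(self, key, value):
        raise IOError("fast layer is unreachable")
```

A LayeredCache(UnreachableLayer(), {}) behaves exactly like the slow layer on its own, which is the property described above.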

deb_pkg_tools.cache.CACHE_FORMAT_REVISION = 2

The version number of the cache format (an integer).

deb_pkg_tools.cache.get_default_cache()

Load the default package cache stored inside the user’s home directory.

The location of the cache is configurable using the option package_cache_directory. Make sure you set that option before calling get_default_cache(), because the cache is initialized only once.

Returns: A PackageCache object.

class deb_pkg_tools.cache.PackageCache(directory)

A persistent, multiprocess cache for Debian binary package metadata.

__init__(directory)

Initialize a package cache.

Parameters: directory – The pathname of the package cache directory (a string).

__getstate__()

Save a pickle compatible PackageCache representation.

The __getstate__() and __setstate__() methods make PackageCache objects compatible with multiprocessing (which uses pickle). This capability is used by deb_pkg_tools.cli.collect_packages() to enable concurrent package collection.

__setstate__(state)

Load a pickle compatible PackageCache representation.
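As a rough illustration of what such a pair of methods can look like, the sketch below keeps only the directory when pickling and rebuilds everything else afterwards. The class PicklableCache and its lock attribute are hypothetical; which attributes the real PackageCache saves is not documented here.

```python
import threading


class PicklableCache(object):
    """Hypothetical cache object that holds a resource pickle can't serialize."""

    def __init__(self, directory):
        self.directory = directory
        # Lock objects can't be pickled, so this attribute is left out
        # of the pickled state and recreated in __setstate__().
        self.lock = threading.Lock()

    def __getstate__(self):
        """Return only the state needed to recreate the object."""
        return {"directory": self.directory}

    def __setstate__(self, state):
        """Rebuild the object (including the lock) from the saved state."""
        self.__init__(**state)
```

With these methods in place, pickle.loads(pickle.dumps(cache)) yields a working copy in another process, which is what multiprocessing needs.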

get_entry(category, pathname)

Get an object representing a cache entry.

Parameters:
  • category – The type of metadata that this cache entry represents (a string like ‘control-fields’, ‘package-fields’ or ‘contents’).
  • pathname – The pathname of the package archive (a string).
Returns:

A CacheEntry object.

collect_garbage(force=False, interval=86400)

Delete any entries in the persistent cache that refer to deleted archives.

Parameters:
  • force – True to force a full garbage collection run (defaults to False which means garbage collection is performed only once per interval).
  • interval – The number of seconds to delay garbage collection when force is False (a number, defaults to the equivalent of 24 hours).
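
The interplay between force and interval can be sketched as shown below. This is a simplified illustration, not the method's actual code: both function names are hypothetical, and representing the cache as a mapping of keys to archive pathnames is an assumption.

```python
import os
import time


def garbage_collection_due(last_run, force=False, interval=86400, now=None):
    """Decide whether a garbage collection run should happen."""
    if force:
        return True
    now = time.time() if now is None else now
    return (now - last_run) >= interval


def find_stale_entries(entries):
    """Return the keys of cache entries whose archive no longer exists.

    The entries argument maps a cache key to the pathname of the archive
    that the entry describes (an assumption made for this sketch).
    """
    return sorted(key for key, pathname in entries.items()
                  if not os.path.exists(pathname))
```
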
class deb_pkg_tools.cache.CacheEntry(cache, category, pathname)

An entry in the package metadata cache provided by PackageCache.

__init__(cache, category, pathname)

Initialize a CacheEntry object.

Parameters:
  • cache – The PackageCache that created this entry.
  • category – The type of metadata that this cache entry represents (a string like ‘control-fields’, ‘package-fields’ or ‘contents’).
  • pathname – The pathname of the package archive (a string).
get_value()

Get the cache entry’s value.

Returns: A previously cached value or None (when the value isn’t available in the cache).

set_value(value)

Set the cache entry’s value.

Parameters: value – The metadata to save in the cache.

set_memcached()

Helper for get_value() and set_value() to write to memcached.

up_to_date(value)

Helper for get_value() to validate cached values.

write_file(filename)

Helper for set_value() to cache values on the filesystem.

Static analysis of package archives

Static analysis of Debian binary packages to detect common problems.

The deb_pkg_tools.checks module attempts to detect common problems in Debian binary package archives using static analysis. Currently there’s a check that detects duplicate files in dependency sets and a check that detects version conflicts in repositories.

deb_pkg_tools.checks.check_package(archive, cache=None)

Perform static checks on a package’s dependency set.

Parameters:
  • archive – The pathname of an existing *.deb archive (a string).
  • cache – The PackageCache to use (defaults to None).
Raises:

BrokenPackage when one or more checks failed.

deb_pkg_tools.checks.check_duplicate_files(dependency_set, cache=None)

Check a collection of Debian package archives for conflicts.

Parameters:
  • dependency_set – A list of filenames (strings) of *.deb files.
  • cache – The PackageCache to use (defaults to None).
Raises:
  • exceptions.ValueError when fewer than two package archives are given (the duplicate check obviously only works if there are packages to compare :-).
  • DuplicateFilesFound when duplicate files are found within a group of package archives.

This check looks for duplicate files in package archives that concern different packages. It ignores groups of packages that have their ‘Provides’ and ‘Replaces’ fields set to a common value. Other variants of ‘Conflicts’ are not supported yet.

Because this analysis involves both the package control file fields and the pathnames of files installed by packages it can be really slow. To make it faster you can use the PackageCache.
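The core of such a check is an inverted index from pathnames to the packages that install them. The sketch below shows just that step. It is not the implementation of check_duplicate_files(): the function name is hypothetical, the input is assumed to list regular files only (no shared directories), and the ‘Provides’ / ‘Replaces’ exemption described above is left out.

```python
import collections


def find_duplicate_files(package_contents):
    """Map each pathname installed by more than one package to those packages.

    The package_contents argument maps package names to lists of pathnames.
    """
    owners = collections.defaultdict(set)
    for package, pathnames in package_contents.items():
        for pathname in pathnames:
            owners[pathname].add(package)
    return dict((pathname, sorted(packages))
                for pathname, packages in owners.items()
                if len(packages) > 1)
```

Given two packages that both ship /usr/bin/tool, the result contains a single key, /usr/bin/tool, listing both package names.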

deb_pkg_tools.checks.check_version_conflicts(dependency_set, cache=None)

Check for version conflicts in a dependency set.

Parameters:
  • dependency_set – A list of filenames (strings) of *.deb files.
  • cache – The PackageCache to use (defaults to None).
Raises:

VersionConflictFound when one or more version conflicts are found.

For each Debian binary package archive given, check if a newer version of the same package exists in the same repository (directory). This analysis can be very slow. To make it faster you can use the PackageCache.

exception deb_pkg_tools.checks.BrokenPackage

Base class for exceptions raised by the checks defined in deb_pkg_tools.checks.

exception deb_pkg_tools.checks.DuplicateFilesFound

Raised by check_duplicate_files() when duplicates are found.

exception deb_pkg_tools.checks.VersionConflictFound

Raised by check_version_conflicts() when version conflicts are found.

Command line interface

deb_pkg_tools.cli.main()

Command line interface for the deb-pkg-tools program.

deb_pkg_tools.cli.show_package_metadata(archive)

Show the metadata and contents of a Debian archive on the terminal.

Parameters: archive – The pathname of an existing *.deb archive (a string).

deb_pkg_tools.cli.highlight(text)

Highlight a piece of text using ANSI escape sequences.

Parameters: text – The text to highlight (a string).
Returns: The highlighted text (when standard output is connected to a terminal) or the original text (when standard output is not connected to a terminal).

deb_pkg_tools.cli.collect_packages(archives, directory, prompt=True, cache=None, concurrency=None)

Interactively copy packages and their dependencies.

Parameters:
  • archives – An iterable of strings with the filenames of one or more *.deb files.
  • directory – The pathname of a directory where the package archives and dependencies should be copied to (a string).
  • prompt – True (the default) to ask confirmation from the operator (using a confirmation prompt rendered on the terminal), False to skip the prompt.
  • cache – The PackageCache to use (defaults to None).
  • concurrency – Override the number of concurrent processes (defaults to the number of archives given or to the value of multiprocessing.cpu_count(), whichever is smaller).
Raises:

ValueError when no archives are given.

When more than one archive is given a multiprocessing pool is used to collect related archives concurrently, in order to speed up the process of collecting large dependency sets.
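The documented default for the concurrency argument can be expressed as a small helper. This is a sketch under assumptions: the name default_concurrency is hypothetical and the error message is made up for the example.

```python
import multiprocessing


def default_concurrency(archives, concurrency=None):
    """Pick the number of worker processes for a list of archives."""
    if not archives:
        raise ValueError("At least one package archive is required!")
    if concurrency is None:
        # Never start more processes than there are archives or CPU cores.
        concurrency = min(len(archives), multiprocessing.cpu_count())
    return concurrency
```
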

deb_pkg_tools.cli.collect_packages_worker(args)

Helper for collect_packages() that enables concurrent collection.

deb_pkg_tools.cli.smart_copy(src, dst)

Create a hard link to or copy of a file.

Parameters:
  • src – The pathname of the source file (a string).
  • dst – The pathname of the target file (a string).

This function first tries to create a hard link dst pointing to src and if that fails it will perform a regular file copy from src to dst. This is used by collect_packages() in an attempt to conserve disk space when copying package archives between repositories on the same filesystem.

deb_pkg_tools.cli.with_repository_wrapper(directory, command, cache)

Command line wrapper for deb_pkg_tools.repo.with_repository().

Parameters:
  • directory – The pathname of a directory with *.deb archives (a string).
  • command – The command to execute (a list of strings).
  • cache – The PackageCache to use (defaults to None).
deb_pkg_tools.cli.check_directory(argument)

Make sure a command line argument points to an existing directory.

Parameters: argument – The original command line argument.
Returns: The absolute pathname of an existing directory.

deb_pkg_tools.cli.say(text, *args, **kw)

Reliably print Unicode strings to the terminal (standard output stream).

Configuration defaults

Configuration defaults for the deb-pkg-tools package.

deb_pkg_tools.config.system_config_directory = '/etc/deb-pkg-tools'

The pathname of the global (system wide) configuration directory used by deb-pkg-tools (a string).

deb_pkg_tools.config.system_cache_directory = '/var/cache/deb-pkg-tools'

The pathname of the global (system wide) package cache directory (a string).

deb_pkg_tools.config.user_config_directory = '/home/docs/.deb-pkg-tools'

The pathname of the current user’s configuration directory used by deb-pkg-tools (a string).

Default: The expanded value of ~/.deb-pkg-tools.

deb_pkg_tools.config.user_cache_directory = '/home/docs/.cache/deb-pkg-tools'

The pathname of the current user’s package cache directory (a string).

Default: The expanded value of ~/.cache/deb-pkg-tools.

deb_pkg_tools.config.package_cache_directory = '/home/docs/.cache/deb-pkg-tools'

The pathname of the selected package cache directory (a string).

Default: The value of system_cache_directory when running as root, the value of user_cache_directory otherwise.

deb_pkg_tools.config.repo_config_file = 'repos.ini'

The base name of the configuration file with user-defined Debian package repositories (a string).

This configuration file is loaded from system_config_directory and/or user_config_directory.

Default: The string repos.ini.

Control file manipulation

Functions to manipulate Debian control files.

The functions in the deb_pkg_tools.control module can be used to manipulate Debian control files. The module was developed specifically for control files of binary packages, but the code is very generic. It builds on top of the debian.deb822.Deb822 class from the python-debian package.

deb_pkg_tools.control.MANDATORY_BINARY_CONTROL_FIELDS = ('Architecture', 'Description', 'Maintainer', 'Package', 'Version')

A tuple of strings with the canonical names of the mandatory binary control file fields as defined by the Debian policy manual.

deb_pkg_tools.control.DEFAULT_CONTROL_FIELDS = {'Priority': 'optional', 'Section': 'misc', 'Architecture': 'all'}

A dictionary with string key/value pairs. Each key is the canonical name of a binary control file field and each value is the default value given to that field by create_control_file() when the caller hasn’t defined a value for the field.

deb_pkg_tools.control.DEPENDS_LIKE_FIELDS = ('Breaks', 'Conflicts', 'Depends', 'Enhances', 'Pre-Depends', 'Provides', 'Recommends', 'Replaces', 'Suggests', 'Build-Conflicts', 'Build-Conflicts-Arch', 'Build-Conflicts-Indep', 'Build-Depends', 'Build-Depends-Arch', 'Build-Depends-Indep', 'Built-Using')

A tuple of strings with the canonical names of control file fields that are similar to the Depends field (in the sense that they contain a comma separated list of package names with optional version specifications).

deb_pkg_tools.control.load_control_file(control_file)

Load a control file and return the parsed control fields.

Parameters: control_file – The filename of the control file to load (a string).
Returns: A dictionary created by parse_control_fields().

deb_pkg_tools.control.create_control_file(control_file, control_fields)

Create a Debian control file.

Parameters:
  • control_file – The filename of the control file to create (a string).
  • control_fields – A dictionary with control file fields. This dictionary is merged with the values in DEFAULT_CONTROL_FIELDS.
Raises:

ValueError when a mandatory binary control field is not present in the provided control fields (see also MANDATORY_BINARY_CONTROL_FIELDS).
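The preparation step described above (apply defaults, then verify the mandatory fields) can be sketched like this. It is an illustration rather than the function's real code: prepare_control_fields is a hypothetical name, the merge is a plain dictionary update, and checking the mandatory fields after the defaults have been applied is an assumption.

```python
MANDATORY_FIELDS = ('Architecture', 'Description', 'Maintainer', 'Package', 'Version')
DEFAULT_FIELDS = {'Priority': 'optional', 'Section': 'misc', 'Architecture': 'all'}


def prepare_control_fields(control_fields):
    """Apply the default fields and verify that all mandatory fields are set."""
    merged = dict(DEFAULT_FIELDS)
    merged.update(control_fields)
    missing = [name for name in MANDATORY_FIELDS if not merged.get(name)]
    if missing:
        raise ValueError("Missing mandatory field(s): %s" % ", ".join(missing))
    return merged
```

In this sketch a caller who provides Package, Version, Maintainer and Description gets Architecture, Priority and Section filled in from the defaults.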

deb_pkg_tools.control.patch_control_file(control_file, overrides)

Patch the fields of a Debian control file.

Parameters:
  • control_file – The filename of the control file to patch (a string).
  • overrides – A dictionary with fields that should override default name/value pairs. Values of the fields Depends, Provides, Replaces and Conflicts are merged while values of other fields are overwritten.
deb_pkg_tools.control.merge_control_fields(defaults, overrides)

Merge the fields of two Debian control files.

Parameters:
  • defaults – A dictionary with existing control field name/value pairs (may be an instance of debian.deb822.Deb822 but doesn’t have to be).
  • overrides – A dictionary with fields that should override default name/value pairs. Values of the fields Depends, Provides, Replaces and Conflicts are merged while values of other fields are overwritten.
Returns:

An instance of debian.deb822.Deb822 that contains the merged control field name/value pairs.
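The difference between merged and overwritten fields is easiest to see in code. The sketch below works on plain dictionaries with string values instead of debian.deb822.Deb822 objects; the function names are hypothetical, and the ordering and de-duplication of merged relationships are assumptions.

```python
MERGED_FIELDS = ('Depends', 'Provides', 'Replaces', 'Conflicts')


def split_relationships(value):
    """Split a comma separated relationship line into stripped items."""
    return [item.strip() for item in value.split(',') if item.strip()]


def merge_fields(defaults, overrides):
    """Merge two dictionaries of control fields into a new dictionary."""
    merged = dict(defaults)
    for name, value in overrides.items():
        if name in MERGED_FIELDS and name in merged:
            # Relationship fields are combined instead of replaced.
            combined = split_relationships(merged[name])
            for item in split_relationships(value):
                if item not in combined:
                    combined.append(item)
            merged[name] = ', '.join(combined)
        else:
            # All other fields are simply overwritten.
            merged[name] = value
    return merged
```
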

deb_pkg_tools.control.parse_control_fields(input_fields)

Parse Debian control file fields.

Parameters: input_fields – The dictionary to convert (may be an instance of debian.deb822.Deb822 but doesn’t have to be).
Returns: A dict object with the converted fields.

The debian.deb822.Deb822 class can be used to parse Debian control files but the result is a simple dict with string name/value pairs. This function takes an existing debian.deb822.Deb822 instance and converts the following fields into friendlier formats:

  • The values of the fields given by DEPENDS_LIKE_FIELDS are parsed into Python data structures using parse_depends().
  • The value of the Installed-Size field is converted to an integer.

Let’s look at an example. We start with the raw control file contents so you can see the complete input:

>>> from deb_pkg_tools.control import deb822_from_string
>>> unparsed_fields = deb822_from_string('''
... Package: python3.4-minimal
... Version: 3.4.0-1+precise1
... Architecture: amd64
... Installed-Size: 3586
... Pre-Depends: libc6 (>= 2.15)
... Depends: libpython3.4-minimal (= 3.4.0-1+precise1), libexpat1 (>= 1.95.8), libgcc1 (>= 1:4.1.1), zlib1g (>= 1:1.2.0), foo | bar
... Recommends: python3.4
... Suggests: binfmt-support
... Conflicts: binfmt-support (<< 1.1.2)
... ''')

Here are the control file fields as parsed by the debian.deb822 module:

>>> print(repr(unparsed_fields))
{'Architecture': u'amd64',
 'Conflicts': u'binfmt-support (<< 1.1.2)',
 'Depends': u'libpython3.4-minimal (= 3.4.0-1+precise1), libexpat1 (>= 1.95.8), libgcc1 (>= 1:4.1.1), zlib1g (>= 1:1.2.0), foo | bar',
 'Installed-Size': u'3586',
 'Package': u'python3.4-minimal',
 'Pre-Depends': u'libc6 (>= 2.15)',
 'Recommends': u'python3.4',
 'Suggests': u'binfmt-support',
 'Version': u'3.4.0-1+precise1'}

Notice the value of the Depends line is a comma separated string, i.e. it hasn’t been parsed. Now here are the control file fields parsed by the parse_control_fields() function:

>>> from deb_pkg_tools.control import parse_control_fields
>>> parsed_fields = parse_control_fields(unparsed_fields)
>>> print(repr(parsed_fields))
{'Architecture': u'amd64',
 'Conflicts': RelationshipSet(VersionedRelationship(name=u'binfmt-support', operator=u'<<', version=u'1.1.2')),
 'Depends': RelationshipSet(VersionedRelationship(name=u'libpython3.4-minimal', operator=u'=', version=u'3.4.0-1+precise1'),
                            VersionedRelationship(name=u'libexpat1', operator=u'>=', version=u'1.95.8'),
                            VersionedRelationship(name=u'libgcc1', operator=u'>=', version=u'1:4.1.1'),
                            VersionedRelationship(name=u'zlib1g', operator=u'>=', version=u'1:1.2.0'),
                            AlternativeRelationship(Relationship(name=u'foo'), Relationship(name=u'bar'))),
 'Installed-Size': 3586,
 'Package': u'python3.4-minimal',
 'Pre-Depends': RelationshipSet(VersionedRelationship(name=u'libc6', operator=u'>=', version=u'2.15')),
 'Recommends': u'python3.4',
 'Suggests': RelationshipSet(Relationship(name=u'binfmt-support')),
 'Version': u'3.4.0-1+precise1'}

For more information about fields like Depends and Suggests please refer to the documentation of parse_depends().

deb_pkg_tools.control.unparse_control_fields(input_fields)

Unparse (undo the parsing of) Debian control file fields.

Parameters: input_fields – A dict object previously returned by parse_control_fields().
Returns: A debian.deb822.Deb822 object.

This function converts dictionaries created by parse_control_fields() back into debian.deb822.Deb822 objects. Fields with an empty value are omitted. This makes it possible to delete fields from a control file with patch_control_file() by setting the value of a field to None in the overrides...

deb_pkg_tools.control.normalize_control_field_name(name)

Normalize the case of a field name in a Debian control file.

Parameters: name – The name of a control file field (a string).
Returns: The normalized name (a string).

Normalization of control file field names is useful to simplify control file manipulation and in particular the merging of control files.

According to the Debian Policy Manual (section 5.1, Syntax of control files) field names are not case-sensitive; however, in my experience deviating from the standard capitalization can break things. That is why this function exists (it is used by the other functions in the deb_pkg_tools.control module).

Note

This function doesn’t adhere 100% to the Debian policy because it lacks special casing (no pun intended ;-) for fields like DM-Upload-Allowed. It’s not clear to me if this will ever become a relevant problem for building simple binary packages... (which explains why I didn’t bother to implement special casing)
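A minimal sketch of this kind of normalization is shown below. It is an assumption about the approach (capitalize every hyphen separated word), not a copy of the function's code, and the name normalize_field_name is hypothetical. It also demonstrates the limitation mentioned in the note.

```python
def normalize_field_name(name):
    """Capitalize each hyphen separated word in a control field name."""
    return '-'.join(word.capitalize() for word in name.split('-'))
```

With this sketch installed-size and PRE-DEPENDS become Installed-Size and Pre-Depends, while DM-Upload-Allowed becomes Dm-Upload-Allowed because there is no special casing for such fields.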

deb_pkg_tools.control.deb822_from_string(string)

Create a debian.deb822.Deb822 object from a string.

Parameters: string – The string containing the control fields to parse.
Returns: A debian.deb822.Deb822 object.

Relationship parsing and evaluation

Parsing and evaluation of Debian package relationship declarations.

The deb_pkg_tools.deps module provides functions to parse and evaluate Debian package relationship declarations as defined in chapter 7 of the Debian policy manual. The most important function is parse_depends() which returns a RelationshipSet object. The RelationshipSet.matches() function can be used to evaluate relationship expressions. The relationship parsing is implemented in pure Python (no external dependencies) but relationship evaluation uses the external command dpkg --compare-versions to ensure compatibility with Debian’s package version comparison algorithm.

To give you an impression of how to use this module:

>>> from deb_pkg_tools.deps import parse_depends
>>> dependencies = parse_depends('python (>= 2.6), python (<< 3) | python (>= 3.4)')
>>> dependencies.matches('python', '2.5')
False
>>> dependencies.matches('python', '3.0')
False
>>> dependencies.matches('python', '2.6')
True
>>> dependencies.matches('python', '3.4')
True
>>> print(repr(dependencies))
RelationshipSet(VersionedRelationship(name='python', operator='>=', version='2.6', architectures=()),
                AlternativeRelationship(VersionedRelationship(name='python', operator='<<', version='3', architectures=()),
                                        VersionedRelationship(name='python', operator='>=', version='3.4', architectures=())))
>>> print(str(dependencies))
python (>= 2.6), python (<< 3) | python (>= 3.4)

As you can see, the repr() output of the relationship set shows the object tree and the str() output is the dependency line.

deb_pkg_tools.deps.parse_depends(relationships)

Parse a Debian package relationship declaration line.

Parameters:relationships – A string containing one or more comma separated package relationships or a list of strings with package relationships.
Returns:A RelationshipSet object.
Raises:ValueError when parsing fails.

This function parses a list of package relationships of the form python (>= 2.6), python (<< 3), i.e. a comma separated list of relationship expressions. Uses parse_alternatives() to parse each comma separated expression.

Here’s an example:

>>> from deb_pkg_tools.deps import parse_depends
>>> dependencies = parse_depends('python (>= 2.6), python (<< 3)')
>>> print(repr(dependencies))
RelationshipSet(VersionedRelationship(name='python', operator='>=', version='2.6'),
                VersionedRelationship(name='python', operator='<<', version='3'))
>>> dependencies.matches('python', '2.5')
False
>>> dependencies.matches('python', '2.6')
True
>>> dependencies.matches('python', '2.7')
True
>>> dependencies.matches('python', '3.0')
False
deb_pkg_tools.deps.parse_alternatives(expression)

Parse an expression containing one or more alternative relationships.

Parameters:expression – A relationship expression (a string).
Returns:A Relationship object.
Raises:ValueError when parsing fails.

This function parses an expression containing one or more alternative relationships of the form python2.6 | python2.7, i.e. a list of relationship expressions separated by | tokens. Uses parse_relationship() to parse each | separated expression.

An example:

>>> from deb_pkg_tools.deps import parse_alternatives
>>> parse_alternatives('python2.6')
Relationship(name='python2.6')
>>> parse_alternatives('python2.6 | python2.7')
AlternativeRelationship(Relationship(name='python2.6'),
                        Relationship(name='python2.7'))
deb_pkg_tools.deps.parse_relationship(expression)

Parse an expression containing a package name and optional version/architecture restrictions.

Parameters:expression – A relationship expression (a string).
Returns:A Relationship object.
Raises:ValueError when parsing fails.

This function parses relationship expressions containing a package name and (optionally) a version relation of the form python (>= 2.6) and/or an architecture restriction (refer to the Debian policy manual’s documentation on the syntax of relationship fields for details). Here’s an example:

>>> from deb_pkg_tools.deps import parse_relationship
>>> parse_relationship('python')
Relationship(name='python')
>>> parse_relationship('python (<< 3)')
VersionedRelationship(name='python', operator='<<', version='3')
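
The three parsing levels described above (comma separated expressions, | separated alternatives and single relationships) can be sketched in a few lines of pure Python. This is only an illustration under simplifying assumptions: the sketch_* function names and the regular expression are hypothetical, plain tuples are returned instead of Relationship objects, and architecture restrictions are not handled at all.

```python
import re

# Hypothetical pattern: a package name optionally followed by "(operator version)".
RELATIONSHIP_PATTERN = re.compile(
    r'^\s*(?P<name>[^\s(]+)\s*'
    r'(?:\(\s*(?P<operator><<|<=|=|>=|>>)\s*(?P<version>[^\s)]+)\s*\))?\s*$'
)


def sketch_parse_relationship(expression):
    """Parse 'name' or 'name (operator version)' into a (name, operator, version) tuple."""
    match = RELATIONSHIP_PATTERN.match(expression)
    if not match:
        raise ValueError("Failed to parse relationship expression: %r" % expression)
    return (match.group('name'), match.group('operator'), match.group('version'))


def sketch_parse_alternatives(expression):
    """Split an expression on | tokens and parse each alternative."""
    return [sketch_parse_relationship(part) for part in expression.split('|')]


def sketch_parse_depends(line):
    """Split a dependency line on commas and parse each comma separated expression."""
    return [sketch_parse_alternatives(part) for part in line.split(',') if part.strip()]
```

An unversioned relationship yields None for the operator and version, so callers can tell the two kinds of relationships apart.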
deb_pkg_tools.deps.cache_matches(f)

High performance memoizing decorator for overrides of Relationship.matches().

Before writing this function I tried out several caching decorators from PyPI, but all of them turned out to be bloated. I benchmarked them using collect_related_packages(): with this decorator the total runtime was about 8 seconds, whereas with the other caching decorators it was something like 40 seconds.
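
To give an idea of what a lightweight memoizing decorator for matches() style methods can look like, here is a minimal sketch. It is not the actual cache_matches() implementation: the names memoize_matches and CountingRelationship are hypothetical, and storing the cache in the instance dictionary is an assumption made for this example.

```python
import functools


def memoize_matches(function):
    """Cache the results of a matches(self, name, version=None) style method."""
    @functools.wraps(function)
    def wrapper(self, name, version=None):
        # Each instance gets its own cache, keyed on the method arguments.
        cache = self.__dict__.setdefault('_matches_cache', {})
        key = (name, version)
        try:
            return cache[key]
        except KeyError:
            # Results of None are cached as well (a plain .get() would miss them).
            value = cache[key] = function(self, name, version)
            return value
    return wrapper


class CountingRelationship(object):

    """Toy relationship that counts how often matches() is really evaluated."""

    def __init__(self, name):
        self.name = name
        self.calls = 0

    @memoize_matches
    def matches(self, name, version=None):
        self.calls += 1
        return True if name == self.name else None
```

Repeated calls with the same arguments are answered from the dictionary, which is where the speedup during package collection would come from.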

class deb_pkg_tools.deps.AbstractRelationship(**kw)

Abstract base class for the various types of relationship objects defined in deb_pkg_tools.deps.

names

The name(s) of the packages in the relationship.

Returns:A set of package names (strings).

Note

This property needs to be implemented by subclasses.

matches(name, version=None)

Check if the relationship matches a given package and version.

Parameters:
  • name – The name of a package (a string).
  • version – The version number of a package (a string, optional).
Returns:

One of the values True, False or None meaning the following:

  • True if the name matches and the version doesn’t invalidate the match,
  • False if the name matches but the version invalidates the match,
  • None if the name doesn’t match at all.

Note

This method needs to be implemented by subclasses.
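
The three possible return values can be illustrated with a small pure Python sketch. Everything here is an assumption made for illustration: relationships are plain (name, operator, version) tuples, the function names are hypothetical, versions are compared as dotted integers instead of calling dpkg --compare-versions, and the way results are combined is merely chosen to be consistent with the example results shown in the introduction of this module.

```python
import operator

# Stand-in for Debian's version comparison: handles dotted integers only.
OPERATORS = {'<<': operator.lt, '<=': operator.le, '=': operator.eq,
             '>=': operator.ge, '>>': operator.gt}


def version_key(version):
    """Convert a version like '2.6' to a comparable tuple like (2, 6)."""
    return tuple(int(part) for part in version.split('.'))


def match_versioned(relationship, name, version):
    """Evaluate one (name, operator, version) tuple: True, False or None."""
    rel_name, rel_operator, rel_version = relationship
    if name != rel_name:
        return None
    return OPERATORS[rel_operator](version_key(version), version_key(rel_version))


def match_alternatives(alternatives, name, version):
    """True if any alternative matches, False if the name only mismatches, else None."""
    results = [match_versioned(r, name, version) for r in alternatives]
    if True in results:
        return True
    return False if False in results else None


def match_set(relationship_set, name, version):
    """Every relevant group of alternatives must match; None if the name never occurs."""
    results = [match_alternatives(a, name, version) for a in relationship_set]
    relevant = [r for r in results if r is not None]
    return all(relevant) if relevant else None
```

Distinguishing None from False is what allows a caller to tell "this package is unrelated" apart from "this package is related but the version is wrong".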

class deb_pkg_tools.deps.Relationship(**kw)

A simple package relationship referring only to the name of a package.

Created by parse_relationship().

name

The name of a package (a string).

Note

The name property is a key_property. You are required to provide a value for this property by calling the constructor of the class that defines the property with a keyword argument named name (unless a custom constructor is defined, in this case please refer to the documentation of that constructor). Once this property has been assigned a value you are not allowed to assign a new value to the property.

architectures

The architecture restriction(s) on the relationship (a tuple of strings).

Note

The architectures property is a key_property. You are required to provide a value for this property by calling the constructor of the class that defines the property with a keyword argument named architectures (unless a custom constructor is defined, in this case please refer to the documentation of that constructor). Once this property has been assigned a value you are not allowed to assign a new value to the property.

names

The name(s) of the packages in the relationship.

matches(name, version=None)

Check if the relationship matches a given package name.

Parameters:
  • name – The name of a package (a string).
  • version – The version number of a package (this parameter is ignored).
Returns:

True if the name matches, None otherwise.

Raises:

NotImplementedError when architectures is not empty (because evaluation of architecture restrictions hasn’t been implemented).

__repr__()

Serialize a Relationship object to a Python expression.

__unicode__()

Serialize a Relationship object to a Debian package relationship expression.

class deb_pkg_tools.deps.VersionedRelationship(**kw)

A conditional package relationship that refers to a package and certain versions of that package.

Created by parse_relationship().

operator

An operator that compares Debian package version numbers (a string).

Note

The operator property is a key_property. You are required to provide a value for this property by calling the constructor of the class that defines the property with a keyword argument named operator (unless a custom constructor is defined, in this case please refer to the documentation of that constructor). Once this property has been assigned a value you are not allowed to assign a new value to the property.

version

The version number of a package (a string).

Note

The version property is a key_property. You are required to provide a value for this property by calling the constructor of the class that defines the property with a keyword argument named version (unless a custom constructor is defined, in this case please refer to the documentation of that constructor). Once this property has been assigned a value you are not allowed to assign a new value to the property.

__repr__()

Serialize a VersionedRelationship object to a Python expression.

__unicode__()

Serialize a VersionedRelationship object to a Debian package relationship expression.

class deb_pkg_tools.deps.AlternativeRelationship(*relationships)

A package relationship that refers to one of several alternative packages.

Created by parse_alternatives().

__init__(*relationships)

Initialize an AlternativeRelationship object.

Parameters:relationships – One or more Relationship objects.
relationships

A tuple of Relationship objects.

Note

The relationships property is a key_property. You are required to provide a value for this property by calling the constructor of the class that defines the property with a keyword argument named relationships (unless a custom constructor is defined, in this case please refer to the documentation of that constructor). Once this property has been assigned a value you are not allowed to assign a new value to the property.

names

Get the name(s) of the packages in the alternative relationship.

Returns:A set of package names (strings).
__repr__()

Serialize an AlternativeRelationship object to a Python expression.

__unicode__()

Serialize an AlternativeRelationship object to a Debian package relationship expression.

class deb_pkg_tools.deps.RelationshipSet(*relationships)

A set of package relationships. Created by parse_depends().

__init__(*relationships)

Initialize a RelationshipSet object.

Parameters:relationships – One or more Relationship objects.
relationships

A tuple of Relationship objects.

Note

The relationships property is a key_property. You are required to provide a value for this property by calling the constructor of the class that defines the property with a keyword argument named relationships (unless a custom constructor is defined, in this case please refer to the documentation of that constructor). Once this property has been assigned a value you are not allowed to assign a new value to the property.

names

Get the name(s) of the packages in the relationship set.

Returns:A set of package names (strings).
__repr__(pretty=False, indent=0)

Serialize a RelationshipSet object to a Python expression.

__unicode__()

Serialize a RelationshipSet object to a Debian package relationship expression.

__iter__()

Iterate over the relationships in a relationship set.

GPG key pair handling

GPG key pair generation and signing of Release files.

The deb_pkg_tools.gpg module is used to manage GPG key pairs. It allows callers to specify which GPG key pair and/or key ID they want to use and will automatically generate GPG key pairs that don’t exist yet.

deb_pkg_tools.gpg.GPG_AGENT_VARIABLE = 'GPG_AGENT_INFO'

The name of the environment variable used to communicate between the GPG agent and gpg processes (a string).

deb_pkg_tools.gpg.initialize_gnupg()

Make sure the ~/.gnupg directory exists.

Older versions of GPG can/will fail when the ~/.gnupg directory doesn’t exist (e.g. in a newly created chroot). GPG itself creates the directory after noticing that it’s missing, but then still fails! Later runs work fine however. To avoid this problem we make sure ~/.gnupg exists before we run GPG.
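
The workaround boils down to creating the directory up front. Here is a minimal sketch; the function name ensure_gnupg_directory() is hypothetical, the home directory is passed in explicitly to keep the example easy to try out, and the owner-only permissions are an assumption of this sketch rather than something the documentation above promises.

```python
import os


def ensure_gnupg_directory(home_directory):
    """Create <home>/.gnupg if it is missing and return its pathname."""
    directory = os.path.join(home_directory, '.gnupg')
    if not os.path.isdir(directory):
        # Assumption: restrict access to the owner, GPG is picky about this.
        os.makedirs(directory, 0o700)
    return directory
```

Calling the function a second time is harmless because the directory is only created when it does not exist yet.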

class deb_pkg_tools.gpg.GPGKey(name=None, description=None, secret_key_file=None, public_key_file=None, key_id=None)

Container for generating GPG key pairs and signing release files.

This class is used to sign Release files in Debian package repositories. If the given GPG key pair doesn’t exist yet it will be automatically created without user interaction (except gathering of entropy, which is not something I can automate :-).

__init__(name=None, description=None, secret_key_file=None, public_key_file=None, key_id=None)

Initialize a GPG key object.

Parameters:
  • name – The name of the GPG key pair (a string). Used only when the key pair is generated because it doesn’t exist yet.
  • description – The description of the GPG key pair (a string). Used only when the key pair is generated because it doesn’t exist yet.
  • secret_key_file – The absolute pathname of the secret key file (a string). Defaults to ~/.gnupg/secring.gpg.
  • public_key_file – The absolute pathname of the public key file (a string). Defaults to ~/.gnupg/pubring.gpg.
  • key_id – The key ID of an existing key pair to use (a string). If this argument is provided then the key pair’s secret and public key files must already exist.

This method initializes a GPG key object in one of several ways:

  1. If key_id is specified then the GPG key must have been created previously. If secret_key_file and public_key_file are not specified they default to ~/.gnupg/secring.gpg and ~/.gnupg/pubring.gpg. In this case key_id is the only required argument.

    The following example assumes that the provided GPG key ID is defined in the default keyring of the current user:

    >>> from deb_pkg_tools.gpg import GPGKey
    >>> key = GPGKey(key_id='58B6B02B')
    >>> key.gpg_command
    'gpg --no-default-keyring --secret-keyring /home/peter/.gnupg/secring.gpg --keyring /home/peter/.gnupg/pubring.gpg --recipient 58B6B02B'
    
  2. If secret_key_file and public_key_file are specified but the files don’t exist yet, a GPG key will be generated for you. In this case name and description are required arguments and key_id must be None (the default). An example:

    >>> name = 'deb-pkg-tools'
    >>> description = 'Automatic signing key for deb-pkg-tools'
    >>> secret_key_file = '/home/peter/.deb-pkg-tools/automatic-signing-key.sec'
    >>> public_key_file = '/home/peter/.deb-pkg-tools/automatic-signing-key.pub'
    >>> key = GPGKey(name, description, secret_key_file, public_key_file)
    >>> key.gpg_command
    'gpg --no-default-keyring --secret-keyring /home/peter/.deb-pkg-tools/automatic-signing-key.sec --keyring /home/peter/.deb-pkg-tools/automatic-signing-key.pub'
    
gpg_command

The GPG command line that can be used to sign using the key, export the key, etc (a string).

The documentation of GPGKey.__init__() contains two examples.

use_agent

Whether to enable the use of the GPG agent (a boolean).

This property checks whether the environment variable given by GPG_AGENT_VARIABLE is set to a nonempty value. If it is then gpg_command will include the --use-agent option. This makes it possible to integrate repository signing with the GPG agent, so that a password is asked for once instead of every time something is signed.
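
The check described above can be sketched as follows. This is an illustration only: sketch_gpg_command() is a hypothetical helper, the command is built as a list instead of a string, and the position of --use-agent on the command line is an assumption.

```python
import os

GPG_AGENT_VARIABLE = 'GPG_AGENT_INFO'


def sketch_gpg_command(secret_key_file, public_key_file, environment=None):
    """Build a gpg command line, adding --use-agent when an agent is advertised."""
    environment = os.environ if environment is None else environment
    command = ['gpg', '--no-default-keyring',
               '--secret-keyring', secret_key_file,
               '--keyring', public_key_file]
    # An unset or empty variable means no agent is available.
    if environment.get(GPG_AGENT_VARIABLE):
        command.append('--use-agent')
    return command
```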

class deb_pkg_tools.gpg.EntropyGenerator

Force the system to generate entropy based on disk I/O.

The deb-pkg-tools test suite runs on Travis CI which uses virtual machines to isolate tests. Because the deb-pkg-tools test suite generates several GPG keys it risks getting stuck and being killed after 10 minutes of inactivity. This happens because of a lack of entropy which is a very common problem in virtualized environments. There are tricks to use fake entropy to avoid this problem:

  • The rng-tools package/daemon can feed /dev/random based on /dev/urandom. Unfortunately this package doesn’t work on Travis CI because they use OpenVZ which uses read only /dev/random devices.
  • GPG version 2 supports the --debug-quick-random option but I haven’t investigated how easy it is to switch.

Instances of this class can be used as a context manager to generate endless disk I/O which is one of the few sources of entropy on virtualized systems. Entropy generation is enabled when the environment variable $DPT_FORCE_ENTROPY is set to yes, true or 1.
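
The general shape of such a context manager can be sketched as follows. This is not the EntropyGenerator implementation: the class name BackgroundProcess is hypothetical, the child process merely sleeps instead of generating disk I/O, and accepting upper case spellings of the environment variable's value is an assumption.

```python
import os
import subprocess
import sys


class BackgroundProcess(object):

    """Context manager that runs a placeholder child process while active."""

    def __init__(self, environment=None):
        environment = os.environ if environment is None else environment
        value = environment.get('DPT_FORCE_ENTROPY', '')
        # Assumption: the value is compared case insensitively.
        self.enabled = value.strip().lower() in ('yes', 'true', '1')
        self.process = None

    def __enter__(self):
        if self.enabled:
            # A real worker would generate disk I/O; this one just sleeps.
            self.process = subprocess.Popen(
                [sys.executable, '-c', 'import time; time.sleep(3600)'])
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        if self.process is not None:
            # The worker never finishes by itself, so it has to be killed.
            self.process.terminate()
            self.process.wait()
            self.process = None
```

Wrapping the key generation in a with statement guarantees that the worker is cleaned up even when key generation raises an exception.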

__init__()

Initialize an EntropyGenerator object.

__enter__()

Enable entropy generation.

__exit__(exc_type, exc_value, traceback)

Disable entropy generation.

deb_pkg_tools.gpg.generate_entropy()

Force the system to generate entropy based on disk I/O.

This function is run in a separate process by EntropyGenerator. It scans the complete file system and reads every file it finds in blocks of 1 KB. This function never returns; it has to be killed.
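
The kind of disk I/O described above can be sketched like this. Unlike generate_entropy() this hypothetical helper is limited to a given directory and returns once it has read everything, which makes it safe to experiment with; skipping symbolic links and unreadable files is an assumption of the sketch.

```python
import os


def read_tree_in_blocks(root, block_size=1024):
    """Read every regular file below root in fixed size blocks; return the bytes read."""
    total = 0
    for directory, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            pathname = os.path.join(directory, filename)
            if os.path.islink(pathname) or not os.path.isfile(pathname):
                continue
            try:
                with open(pathname, 'rb') as handle:
                    while True:
                        block = handle.read(block_size)
                        if not block:
                            break
                        total += len(block)
            except (IOError, OSError):
                # Unreadable files are simply skipped.
                continue
    return total
```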

Package manipulation

Functions to build and inspect Debian binary package archives (*.deb files).

deb_pkg_tools.package.parse_filename(filename)

Parse the filename of a Debian binary package archive.

Parameters:filename – The pathname of a Debian binary package archive (a string).
Returns:A PackageFile object.
Raises:ValueError when the given filename cannot be parsed.

This function parses the filename of a Debian binary package archive into three fields: the name of the package, its version and its architecture. See also determine_package_archive().

Here’s an example:

>>> from deb_pkg_tools.package import parse_filename
>>> components = parse_filename('/var/cache/apt/archives/python2.7_2.7.3-0ubuntu3.4_amd64.deb')
>>> print(repr(components))
PackageFile(name='python2.7',
            version='2.7.3-0ubuntu3.4',
            architecture='amd64',
            filename='/var/cache/apt/archives/python2.7_2.7.3-0ubuntu3.4_amd64.deb')
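
The naming convention that makes this possible is name_version_architecture.deb. A minimal sketch of splitting such a filename, using the hypothetical helper split_archive_filename() which returns a plain tuple instead of a PackageFile object:

```python
import os


def split_archive_filename(pathname):
    """Split the filename of a *.deb archive into (name, version, architecture)."""
    basename = os.path.basename(pathname)
    root, extension = os.path.splitext(basename)
    if extension != '.deb':
        raise ValueError("Not a *.deb archive: %r" % pathname)
    fields = root.split('_')
    if len(fields) != 3:
        raise ValueError("Expected name_version_architecture.deb: %r" % pathname)
    return tuple(fields)
```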
class deb_pkg_tools.package.PackageFile

A named tuple with the result of parse_filename().

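The function parse_filename() reports the fields of a package archive’s filename as a PackageFile object (a named tuple). Here are the fields supported by these named tuples (as shown in the example above):

  • name – The name of the package (a string).
  • version – The version of the package.
  • architecture – The architecture of the package (a string).
  • filename – The pathname of the package archive that was given to parse_filename() (a string).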

The values of the directory, other_versions and newer_versions properties are generated on demand.

PackageFile objects support sorting according to Debian’s package version comparison algorithm as implemented in dpkg --compare-versions.

directory

The absolute pathname of the directory containing the package archive (a string).

other_versions

A list of PackageFile objects with other versions of the same package in the same directory.

newer_versions

A list of PackageFile objects with newer versions of the same package in the same directory.

deb_pkg_tools.package.find_package_archives(directory)

Find the Debian package archive(s) in the given directory.

Parameters:directory – The pathname of a directory (a string).
Returns:A list of PackageFile objects.

deb_pkg_tools.package.collect_related_packages(filename, cache=None, interactive=None)

Collect the package archive(s) related to the given package archive.

Parameters:
  • filename – The filename of an existing *.deb archive (a string).
  • cache – The PackageCache to use (defaults to None).
  • interactive – True to draw an interactive spinner on the terminal (see Spinner), False to skip the interactive spinner or None to detect whether we’re connected to an interactive terminal.
Returns:

A list of PackageFile objects.

This works by parsing and resolving the dependencies of the given package to filenames of package archives, then parsing and resolving the dependencies of those package archives, etc. until no more relationships can be resolved to existing package archives.

Known limitations / sharp edges of this function:

  • Only Depends and Pre-Depends relationships are processed; Provides is ignored. I’m not yet sure whether it makes sense to add support for Conflicts, Provides and Replaces (and how to implement it).
  • Unsatisfied relationships don’t trigger a warning or error because this function doesn’t know in what context a package can be installed (e.g. which additional repositories a given apt client has access to).
  • Please thoroughly test this functionality before you start to rely on it. What this function tries to do is a complex operation to do correctly (given the limited information this function has to work with) and the implementation is far from perfect. Bugs have been found and fixed in this code and more bugs will undoubtedly be discovered. You’ve been warned :-).
  • This function can be rather slow on large package repositories and dependency sets due to the incremental nature of the related package collection. It’s a known issue / limitation.

This function is used to implement the deb-pkg-tools --collect command:

$ deb-pkg-tools -c /tmp python-deb-pkg-tools_1.13-1_all.deb
2014-05-18 08:33:42 deb_pkg_tools.package INFO Collecting packages related to ~/python-deb-pkg-tools_1.13-1_all.deb ..
2014-05-18 08:33:42 deb_pkg_tools.package INFO Scanning ~/python-deb-pkg-tools_1.13-1_all.deb ..
2014-05-18 08:33:42 deb_pkg_tools.package INFO Scanning ~/python-coloredlogs_0.4.8-1_all.deb ..
2014-05-18 08:33:42 deb_pkg_tools.package INFO Scanning ~/python-chardet_2.2.1-1_all.deb ..
2014-05-18 08:33:42 deb_pkg_tools.package INFO Scanning ~/python-humanfriendly_1.7.1-1_all.deb ..
2014-05-18 08:33:42 deb_pkg_tools.package INFO Scanning ~/python-debian_0.1.21-1_all.deb ..
Found 5 package archives:
 - ~/python-chardet_2.2.1-1_all.deb
 - ~/python-coloredlogs_0.4.8-1_all.deb
 - ~/python-deb-pkg-tools_1.13-1_all.deb
 - ~/python-humanfriendly_1.7.1-1_all.deb
 - ~/python-debian_0.1.21-1_all.deb
Copy 5 package archives to /tmp? [Y/n] y
2014-05-18 08:33:44 deb_pkg_tools.cli INFO Done! Copied 5 package archives to /tmp.
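
The incremental resolution described above (resolve dependencies to archives, then resolve the dependencies of those archives, and so on) is essentially a worklist algorithm. Here is a heavily simplified sketch: the function collect_related() is hypothetical, dependencies are given as a dictionary of package names instead of being read from *.deb archives, versions and alternatives are ignored, and the starting package is included in the result.

```python
def collect_related(start, dependencies, available):
    """
    Collect the packages reachable from start.

    dependencies maps package names to lists of package names and
    available is the set of package names for which an archive exists.
    """
    collected = []
    seen = set()
    queue = [start]
    while queue:
        name = queue.pop(0)
        # Unsatisfied relationships are silently skipped (no warning or error).
        if name in seen or name not in available:
            continue
        seen.add(name)
        collected.append(name)
        queue.extend(dependencies.get(name, ()))
    return collected
```

The seen set makes sure that every package is scanned only once, even when several packages depend on it or when dependencies are circular.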

Internal helper for package collection to enable simple conflict resolution.

deb_pkg_tools.package.match_relationships(package_archive, relationship_sets)

Internal helper for package collection to validate that all relationships are satisfied.

This function enables collect_related_packages_helper() to validate that all relationships are satisfied while the set of related package archives is being collected and again afterwards to make sure that no previously drawn conclusions were invalidated by additionally collected package archives.

exception deb_pkg_tools.package.CollectedPackagesConflict(conflicts)

Exception raised by collect_related_packages_helper().

__init__(conflicts)

Construct a CollectedPackagesConflict exception.

Parameters:conflicts – A list of conflicting PackageFile objects.
deb_pkg_tools.package.find_latest_version(packages)

Find the package archive with the highest version number.

Parameters:packages – A list of filenames (strings) and/or PackageFile objects.
Returns:The PackageFile with the highest version number.
Raises:ValueError when not all of the given package archives share the same package name.

This function uses Version objects for version comparison.

deb_pkg_tools.package.group_by_latest_versions(packages)

Group package archives by name of package and find latest version of each.

Parameters:packages – A list of filenames (strings) and/or PackageFile objects.
Returns:A dictionary with package names as keys and PackageFile objects as values.
deb_pkg_tools.package.inspect_package(archive, cache=None)

Get the metadata and contents from a *.deb archive.

Parameters:
  • archive – The pathname of an existing *.deb archive.
  • cache – The PackageCache to use (defaults to None).
Returns:

A tuple with two dictionaries:

  1. The result of inspect_package_fields().
  2. The result of inspect_package_contents().

deb_pkg_tools.package.inspect_package_fields(archive, cache=None)

Get the fields (metadata) from a *.deb archive.

Parameters:
  • archive – The pathname of an existing *.deb archive.
  • cache – The PackageCache to use (defaults to None).
Returns:

A dictionary with control file fields (the result of parse_control_fields()).

Here’s an example:

>>> from deb_pkg_tools.package import inspect_package_fields
>>> print(repr(inspect_package_fields('python3.4-minimal_3.4.0-1+precise1_amd64.deb')))
{'Architecture': u'amd64',
 'Conflicts': RelationshipSet(VersionedRelationship(name=u'binfmt-support', operator=u'<<', version=u'1.1.2')),
 'Depends': RelationshipSet(VersionedRelationship(name=u'libpython3.4-minimal', operator=u'=', version=u'3.4.0-1+precise1'),
                            VersionedRelationship(name=u'libexpat1', operator=u'>=', version=u'1.95.8'),
                            VersionedRelationship(name=u'libgcc1', operator=u'>=', version=u'1:4.1.1'),
                            VersionedRelationship(name=u'zlib1g', operator=u'>=', version=u'1:1.2.0')),
 'Description': u'Minimal subset of the Python language (version 3.4)\n This package contains the interpreter and some essential modules.  It can\n be used in the boot process for some basic tasks.\n See /usr/share/doc/python3.4-minimal/README.Debian for a list of the modules\n contained in this package.',
 'Installed-Size': 3586,
 'Maintainer': u'Felix Krull <f_krull@gmx.de>',
 'Multi-Arch': u'allowed',
 'Original-Maintainer': u'Matthias Klose <doko@debian.org>',
 'Package': u'python3.4-minimal',
 'Pre-Depends': RelationshipSet(VersionedRelationship(name=u'libc6', operator=u'>=', version=u'2.15')),
 'Priority': u'optional',
 'Recommends': u'python3.4',
 'Section': u'python',
 'Source': u'python3.4',
 'Suggests': RelationshipSet(Relationship(name=u'binfmt-support')),
 'Version': u'3.4.0-1+precise1'}
deb_pkg_tools.package.inspect_package_contents(archive, cache=None)

Get the contents from a *.deb archive.

Parameters:
  • archive – The pathname of an existing *.deb archive.
  • cache – The PackageCache to use (defaults to None).
Returns:

A dictionary with the directories and files contained in the package. The dictionary keys are the absolute pathnames and the dictionary values are ArchiveEntry objects (see the example below).

An example:

>>> from deb_pkg_tools.package import inspect_package_contents
>>> print(repr(inspect_package_contents('python3.4-minimal_3.4.0-1+precise1_amd64.deb')))
{u'/': ArchiveEntry(permissions=u'drwxr-xr-x', owner=u'root', group=u'root', size=0, modified=u'2014-03-20 23:54', target=u''),
 u'/usr/': ArchiveEntry(permissions=u'drwxr-xr-x', owner=u'root', group=u'root', size=0, modified=u'2014-03-20 23:52', target=u''),
 u'/usr/bin/': ArchiveEntry(permissions=u'drwxr-xr-x', owner=u'root', group=u'root', size=0, modified=u'2014-03-20 23:54', target=u''),
 u'/usr/bin/python3.4': ArchiveEntry(permissions=u'-rwxr-xr-x', owner=u'root', group=u'root', size=3536680, modified=u'2014-03-20 23:54', target=u''),
 u'/usr/bin/python3.4m': ArchiveEntry(permissions=u'hrwxr-xr-x', owner=u'root', group=u'root', size=0, modified=u'2014-03-20 23:54', target=u'/usr/bin/python3.4'),
 u'/usr/share/': ArchiveEntry(permissions=u'drwxr-xr-x', owner=u'root', group=u'root', size=0, modified=u'2014-03-20 23:53', target=u''),
 u'/usr/share/binfmts/': ArchiveEntry(permissions=u'drwxr-xr-x', owner=u'root', group=u'root', size=0, modified=u'2014-03-20 23:53', target=u''),
 u'/usr/share/binfmts/python3.4': ArchiveEntry(permissions=u'-rw-r--r--', owner=u'root', group=u'root', size=72, modified=u'2014-03-20 23:53', target=u''),
 u'/usr/share/doc/': ArchiveEntry(permissions=u'drwxr-xr-x', owner=u'root', group=u'root', size=0, modified=u'2014-03-20 23:53', target=u''),
 u'/usr/share/doc/python3.4-minimal/': ArchiveEntry(permissions=u'drwxr-xr-x', owner=u'root', group=u'root', size=0, modified=u'2014-03-20 23:54', target=u''),
 u'/usr/share/doc/python3.4-minimal/README.Debian': ArchiveEntry(permissions=u'-rw-r--r--', owner=u'root', group=u'root', size=3779, modified=u'2014-03-20 23:52', target=u''),
 u'/usr/share/doc/python3.4-minimal/changelog.Debian.gz': ArchiveEntry(permissions=u'-rw-r--r--', owner=u'root', group=u'root', size=28528, modified=u'2014-03-20 22:32', target=u''),
 u'/usr/share/doc/python3.4-minimal/copyright': ArchiveEntry(permissions=u'-rw-r--r--', owner=u'root', group=u'root', size=51835, modified=u'2014-03-20 20:37', target=u''),
 u'/usr/share/man/': ArchiveEntry(permissions=u'drwxr-xr-x', owner=u'root', group=u'root', size=0, modified=u'2014-03-20 23:52', target=u''),
 u'/usr/share/man/man1/': ArchiveEntry(permissions=u'drwxr-xr-x', owner=u'root', group=u'root', size=0, modified=u'2014-03-20 23:54', target=u''),
 u'/usr/share/man/man1/python3.4.1.gz': ArchiveEntry(permissions=u'-rw-r--r--', owner=u'root', group=u'root', size=5340, modified=u'2014-03-20 23:30', target=u''),
 u'/usr/share/man/man1/python3.4m.1.gz': ArchiveEntry(permissions=u'lrwxrwxrwx', owner=u'root', group=u'root', size=0, modified=u'2014-03-20 23:54', target=u'python3.4.1.gz')}
class deb_pkg_tools.package.ArchiveEntry

A named tuple with the result of inspect_package().

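The function inspect_package() reports the contents of package archives as a dictionary containing named tuples. Here are the fields supported by those named tuples (as shown in the example above):

  • permissions – The entry type and permission bits in the style of ls -l, e.g. drwxr-xr-x (a string).
  • owner – The name of the user that owns the entry (a string).
  • group – The name of the group that owns the entry (a string).
  • size – The size of the entry (an integer).
  • modified – The last modified time of the entry, e.g. 2014-03-20 23:54 (a string).
  • target – The target of a symbolic link or hard link, empty for other entries (a string).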

deb_pkg_tools.package.build_package(directory, repository=None, check_package=True, copy_files=True)

Create a Debian package using the dpkg-deb --build command.

Parameters:
  • directory – The pathname of a directory tree suitable for packaging with dpkg-deb --build.
  • repository

    The pathname of the directory where the generated *.deb archive should be stored.

    By default a temporary directory is created to store the generated archive; in this case the caller is responsible for cleaning up the directory.

    Before deb-pkg-tools 2.0 this defaulted to the system wide temporary directory which could result in corrupted archives during concurrent builds.

  • check_package – If True (the default) Lintian is run to check the resulting package archive for possible issues.
  • copy_files – If True (the default) the package’s files are copied to a temporary directory before being modified. You can set this to False if you’re already working on a copy and don’t want yet another copy to be made.
Returns:

The pathname of the generated *.deb archive.

Raises:

executor.ExternalCommandFailed if any of the external commands invoked by this function fail.

The dpkg-deb --build command requires a certain directory tree layout and specific files; for more information about this topic please refer to the Debian Binary Package Building HOWTO. The build_package() function performs the following steps to build a package:

  1. Copies the files in the source directory to a temporary build directory.
  2. Updates the Installed-Size field in the DEBIAN/control file based on the size of the given directory (using update_installed_size()).
  3. Sets the owner and group of all files to root because this is the only user account guaranteed to always be available. This uses the fakeroot command so you don’t actually need root access to use build_package().
  4. Runs the command fakeroot dpkg-deb --build to generate a Debian package from the files in the build directory.
  5. Runs Lintian to check the resulting package archive for possible issues. The result of Lintian is purely informational: If ‘errors’ are reported and Lintian exits with a nonzero status code, this is ignored by build_package().
deb_pkg_tools.package.determine_package_archive(directory)

Determine the name of a package archive before building it.

Parameters:directory – The pathname of a directory tree suitable for packaging with dpkg-deb --build.
Returns:The filename of the *.deb archive to be built.

This function determines the name of the *.deb package archive that will be generated from a directory tree suitable for packaging with dpkg-deb --build. See also parse_filename().

deb_pkg_tools.package.copy_package_files(from_directory, to_directory, hard_links=True)

Copy package files to a temporary directory, using hard links when possible.

Parameters:
  • from_directory – The pathname of a directory tree suitable for packaging with dpkg-deb --build.
  • to_directory – The pathname of a temporary build directory.
  • hard_links – Use hard links to speed up copying when possible.

This function copies a directory tree suitable for packaging with dpkg-deb --build to a temporary build directory so that individual files can be replaced without changing the original directory tree. If the build directory is on the same file system as the source directory, hard links are used to speed up the copy. This function is used by build_package().
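
The idea of preferring hard links and falling back to regular copies can be sketched as follows. This is not the copy_package_files() implementation: the helper name copy_tree_with_hard_links() is hypothetical, it works file by file in pure Python and it ignores symbolic links to directories.

```python
import os
import shutil


def copy_tree_with_hard_links(from_directory, to_directory, hard_links=True):
    """Recreate a directory tree, hard linking files when that is possible."""
    for directory, _dirnames, filenames in os.walk(from_directory):
        relative = os.path.relpath(directory, from_directory)
        target_directory = os.path.normpath(os.path.join(to_directory, relative))
        if not os.path.isdir(target_directory):
            os.makedirs(target_directory)
        for filename in filenames:
            source = os.path.join(directory, filename)
            target = os.path.join(target_directory, filename)
            if hard_links:
                try:
                    os.link(source, target)
                    continue
                except OSError:
                    # For example when crossing file systems: copy instead.
                    pass
            shutil.copy2(source, target)
```

Keep in mind that a hard linked file shares its contents with the original, so files in the build directory should be replaced (unlinked and created again) rather than modified in place.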

deb_pkg_tools.package.clean_package_tree(directory, remove_dirs=('.bzr', '.git', '.hg', '.svn', '__pycache__'), remove_files=('*.pyc', '*.pyo', '*~', '.*.s??', '.bzrignore', '.DS_Store', '.DS_Store.gz', '._*', '.gitignore', '.hg_archival.txt', '.hgignore', '.hgtags', '.s??'))

Clean up files that should not be included in a Debian package from the given directory.

Parameters:
  • directory – The pathname of the directory to clean (a string).
  • remove_dirs – An iterable with filename patterns of directories that should not be included in the package (e.g. version control directories like .git and .hg).
  • remove_files – An iterable with filename patterns of files that should not be included in the package (e.g. version control files like .gitignore and .hgignore).

Uses the fnmatch module for directory and filename matching. Matching is done on the base name of each directory and file. This function assumes it is safe to unlink files from the given directory (which it should be when copy_package_files() was previously called, e.g. by build_package()).
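
Base name matching with fnmatch can be sketched as follows. The function clean_tree() is a hypothetical, simplified stand-in with shortened default patterns; it returns the removed pathnames, which is a choice made for this example.

```python
import fnmatch
import os
import shutil


def clean_tree(directory, remove_dirs=('.git', '__pycache__'), remove_files=('*.pyc', '*~')):
    """Remove directories and files whose base name matches one of the patterns."""
    removed = []
    for root, dirnames, filenames in os.walk(directory):
        # Iterate over a copy because matching directories are pruned in place.
        for name in list(dirnames):
            if any(fnmatch.fnmatch(name, pattern) for pattern in remove_dirs):
                pathname = os.path.join(root, name)
                shutil.rmtree(pathname)
                dirnames.remove(name)
                removed.append(pathname)
        for name in filenames:
            if any(fnmatch.fnmatch(name, pattern) for pattern in remove_files):
                pathname = os.path.join(root, name)
                os.unlink(pathname)
                removed.append(pathname)
    return removed
```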

deb_pkg_tools.package.update_conffiles(directory)

Make sure the DEBIAN/conffiles file is up to date.

Parameters:directory – The pathname of a directory tree suitable for packaging with dpkg-deb --build.

Given a directory tree suitable for packaging with dpkg-deb --build this function updates the entries in the DEBIAN/conffiles file. This function is used by build_package().

deb_pkg_tools.package.update_installed_size(directory)

Make sure the Installed-Size field in DEBIAN/control is up to date.

Parameters: directory – The pathname of a directory tree suitable for packaging with dpkg-deb --build.

Given a directory tree suitable for packaging with dpkg-deb --build this function updates the Installed-Size field in the DEBIAN/control file. This function is used by build_package().

Repository management

Create, update and activate trivial Debian package repositories.

The functions in the deb_pkg_tools.repo module make it possible to transform a directory of *.deb archives into a (temporary) Debian package repository.

All of the functions in this module can raise executor.ExternalCommandFailed.

You can configure the GPG key(s) used by this module through a configuration file; please refer to the documentation of select_gpg_key().

deb_pkg_tools.repo.scan_packages(repository, packages_file=None, cache=None)

A reimplementation of the dpkg-scanpackages -m command in Python.

Updates a Packages file based on the Debian package archive(s) found in the given directory. Uses PackageCache to (optionally) speed up the process significantly by caching package metadata and hashes on disk. This explains why this function can be much faster than dpkg-scanpackages -m.

Parameters:
  • repository – The pathname of a directory containing Debian package archives (a string).
  • packages_file – The pathname of the Packages file to update (a string). Defaults to the Packages file in the given directory.
  • cache – The PackageCache to use (defaults to None).

deb_pkg_tools.repo.get_packages_entry(pathname, cache=None)

Get a dictionary with the control fields required in a Packages file.

Parameters:
  • pathname – The pathname of the package archive (a string).
  • cache – The PackageCache to use (defaults to None).
Returns:

A dictionary with control fields (see below).

Used by scan_packages() to generate Packages files. The format of Packages files (part of the Debian binary package repository format) is fairly simple:

  • All of the fields extracted from a package archive’s control file using inspect_package_fields() are listed (you have to get these fields yourself and combine the dictionaries returned by inspect_package_fields() and get_packages_entry());

  • The field Filename contains the filename of the package archive relative to the Packages file (which is in the same directory in our case, because update_repository() generates trivial repositories);

  • The field Size contains the size of the package archive in bytes;

  • The following fields contain package archive checksums:

    MD5sum

    Calculated using the md5() constructor of the hashlib module.

    SHA1

    Calculated using the sha1() constructor of the hashlib module.

    SHA256

    Calculated using the sha256() constructor of the hashlib module.

The three checksums are calculated simultaneously by reading the package archive once, in blocks of a kilobyte. This is probably why this function seems to be faster than dpkg-scanpackages -m (even when used without caching).
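The single-pass approach described above can be sketched as follows. This is an illustration using only hashlib, not the body of get_packages_entry(); the function name compute_checksums() and its block_size parameter are assumptions made for this example.

```python
import hashlib


def compute_checksums(pathname, block_size=1024):
    """Compute MD5, SHA1 and SHA256 of a file while reading it only once."""
    md5, sha1, sha256 = hashlib.md5(), hashlib.sha1(), hashlib.sha256()
    with open(pathname, 'rb') as handle:
        # iter() with a sentinel keeps calling read() until it returns b''.
        for block in iter(lambda: handle.read(block_size), b''):
            md5.update(block)
            sha1.update(block)
            sha256.update(block)
    return {
        'MD5sum': md5.hexdigest(),
        'SHA1': sha1.hexdigest(),
        'SHA256': sha256.hexdigest(),
    }
```

Feeding every block to all three hash objects means a large archive is read from disk once rather than three times.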

deb_pkg_tools.repo.update_repository(directory, release_fields={}, gpg_key=None, cache=None)

Create or update a trivial repository.

Parameters:
  • directory – The pathname of a directory with *.deb packages.
  • release_fields – An optional dictionary with fields to set inside the Release file.
  • gpg_key – The GPGKey object used to sign the repository. Defaults to the result of select_gpg_key().
  • cache – The PackageCache to use (defaults to None).
Raises:

ResourceLockedException when the given repository directory is being updated by another process.

This function is based on the Debian commands dpkg-scanpackages (reimplemented as scan_packages()) and apt-ftparchive (also uses the external programs gpg and gzip).

deb_pkg_tools.repo.activate_repository(directory, gpg_key=None)

Activate a local trivial repository.

Parameters:
  • directory – The pathname of a directory with *.deb packages.
  • gpg_key – The GPGKey object used to sign the repository. Defaults to the result of select_gpg_key().

This function sets everything up so that a trivial Debian package repository can be used to install packages without a webserver. This uses the file:// URL scheme to point apt-get to a directory on the local file system.

Warning

This function requires root privileges to:

  1. create the directory /etc/apt/sources.list.d,
  2. create a *.list file in /etc/apt/sources.list.d and
  3. run apt-get update.

This function will use sudo to gain root privileges when it’s not already running as root.

deb_pkg_tools.repo.deactivate_repository(directory)

Deactivate a local repository that was previously activated using activate_repository().

Parameters: directory – The pathname of a directory with *.deb packages.

Warning

This function requires root privileges to:

  1. delete a *.list file in /etc/apt/sources.list.d and
  2. run apt-get update.

This function will use sudo to gain root privileges when it’s not already running as root.

deb_pkg_tools.repo.with_repository(directory, *command, **kw)

Execute an external command while a repository is activated.

Parameters:
  • directory – The pathname of a directory containing *.deb archives (a string).
  • command – The command to execute (a tuple of strings, passed verbatim to executor.execute()).
  • cache – The PackageCache to use (defaults to None).
Raises:

executor.ExternalCommandFailed if any external commands fail.

This function creates or updates a trivial package repository, activates the repository, runs an external command (usually apt-get install) and finally deactivates the repository again. The repository is also deactivated when the external command fails and executor.ExternalCommandFailed is raised.
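The guarantee that the repository is deactivated even when the command fails is the usual try/finally pattern. The sketch below shows only that control flow; run_while_activated() and its four zero-argument callables are hypothetical stand-ins for the repository functions documented in this section, not part of the deb-pkg-tools API.

```python
def run_while_activated(update, activate, run_command, deactivate):
    """Call run_command() between activate() and deactivate().

    All four arguments are hypothetical zero-argument callables.
    """
    update()
    activate()
    try:
        return run_command()
    finally:
        # Runs when run_command() returns normally and when it raises.
        deactivate()
```

The exception raised by the command still propagates to the caller; the finally clause only makes sure the cleanup step runs first.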

deb_pkg_tools.repo.apt_supports_trusted_option()

Figure out whether apt supports the [trusted=yes] option.

Returns: True if the option is supported, False if it is not.

Since apt version 0.8.16~exp3 the option [trusted=yes] can be used in a sources.list file to disable GPG key checking (see Debian bug #596498). This version of apt is included with Ubuntu 12.04 and later, but deb-pkg-tools also has to support older versions of apt. The apt_supports_trusted_option() function checks if the installed version of apt supports the [trusted=yes] option, so that deb-pkg-tools can use it when possible.

deb_pkg_tools.repo.select_gpg_key(directory)

Select a suitable GPG key for repository signing.

Parameters: directory – The pathname of the directory that contains the package repository to sign (a string).
Returns: A GPGKey object or None.

This function is used by update_repository() and activate_repository() to select a default GPG key.

First the following locations are checked for a configuration file:

  1. ~/.deb-pkg-tools/repos.ini
  2. /etc/deb-pkg-tools/repos.ini

If both files exist the first one is used. Here is an example configuration with an explicit repository/key pair and a default key:

[default]
public-key-file = ~/.deb-pkg-tools/default.pub
secret-key-file = ~/.deb-pkg-tools/default.sec

[test]
public-key-file = ~/.deb-pkg-tools/test.pub
secret-key-file = ~/.deb-pkg-tools/test.sec
directory = /tmp

Hopefully this is self explanatory: If the repository directory is /tmp the ‘test’ key pair is used, otherwise the ‘default’ key pair is used. The ‘directory’ field can contain globbing wildcards like ? and *. Of course you’re free to put the actual *.pub and *.sec files anywhere you like; that’s the point of having them be configurable :-)

If no GPG keys are configured but apt requires local repositories to be signed then this function falls back to selecting an automatically generated signing key. The generated public key and secret key are stored in the directory ~/.deb-pkg-tools.
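To make the matching of the ‘directory’ field in the example configuration concrete, here is a minimal sketch using configparser and fnmatch from the Python 3 standard library. It is not the implementation of select_gpg_key(): the function select_key_section() is hypothetical, it returns a section name rather than a GPGKey object, and it assumes the repository pathname is compared to the pattern as given, without normalization.

```python
import configparser
import fnmatch

EXAMPLE_CONFIG = """
[default]
public-key-file = ~/.deb-pkg-tools/default.pub
secret-key-file = ~/.deb-pkg-tools/default.sec

[test]
public-key-file = ~/.deb-pkg-tools/test.pub
secret-key-file = ~/.deb-pkg-tools/test.sec
directory = /tmp
"""


def select_key_section(config_text, repository):
    """Return the name of the section whose key pair applies to repository."""
    parser = configparser.RawConfigParser()
    parser.read_string(config_text)
    for section in parser.sections():
        if parser.has_option(section, 'directory'):
            # The pattern may contain globbing wildcards like ? and *.
            if fnmatch.fnmatch(repository, parser.get(section, 'directory')):
                return section
    # No explicit match: fall back to the default key pair when configured.
    return 'default' if parser.has_section('default') else None
```

With the example configuration, select_key_section(EXAMPLE_CONFIG, '/tmp') selects the ‘test’ section and any other directory selects ‘default’.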

deb_pkg_tools.repo.load_config(repository)

Load repository configuration from a repos.ini file.

Miscellaneous functions

Utility functions.

The functions in the deb_pkg_tools.utils module are not directly related to Debian packages/repositories, however they are used by the other modules in the deb-pkg-tools package.

deb_pkg_tools.utils.compact(text, *args, **kw)

Alias for backwards compatibility.

deb_pkg_tools.utils.sha1(text)

Calculate the SHA1 fingerprint of text.

Parameters: text – The text to fingerprint (a string).
Returns: The fingerprint of the text (a string).

deb_pkg_tools.utils.makedirs(directory)

Create a directory and any missing parent directories.

It is not an error if the directory already exists.

Parameters: directory – The pathname of a directory (a string).
Returns: True if the directory was created, False if it already exists.
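The behavior described above (create missing parents, tolerate an existing directory, report which of the two happened) can be sketched like this. The name ensure_directory() is an assumption chosen to avoid confusion with the real function; the sketch is not copied from deb-pkg-tools.

```python
import errno
import os


def ensure_directory(directory):
    """Create directory and any missing parents; return True if created."""
    try:
        os.makedirs(directory)
        return True
    except OSError as e:
        # Only "already exists" is tolerated; other errors (e.g. permission
        # denied) are re-raised for the caller to handle.
        if e.errno != errno.EEXIST:
            raise
        return False
```

Catching the error instead of checking os.path.isdir() first avoids a race with other processes creating the same directory.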
deb_pkg_tools.utils.optimize_order(package_archives)

Shuffle a list of package archives in random order.

Usually when scanning a large group of package archives, it really doesn’t matter in which order we scan them. However the progress reported using humanfriendly.Spinner can be more accurate when we shuffle the order. This helps when both of the following conditions are met:

  1. The package repository contains multiple versions of the same packages;
  2. The package repository contains both small and (very) big packages.

If you scan the package archives in usual sorting order you will first hit a batch of multiple versions of the same small package which can be scanned very quickly (the progress counter will jump). Then you’ll hit a batch of multiple versions of the same big package and scanning becomes much slower (the progress counter will hang). Shuffling mostly avoids this effect.

deb_pkg_tools.utils.find_debian_architecture()

Find the Debian architecture of the current environment.

Uses os.uname() to determine the current machine architecture (the fifth value returned by os.uname()) and translates it into one of the machine architecture labels used in the Debian packaging system:

Machine architecture    Debian architecture
i686                    i386
x86_64                  amd64
armv6l                  armhf

When the machine architecture is not listed above, this function falls back to the external command dpkg-architecture -qDEB_BUILD_ARCH (provided by the dpkg-dev package). This command is not used by default because:

  1. deb-pkg-tools doesn’t have a strict dependency on dpkg-dev.
  2. The dpkg-architecture program enables callers to set the current architecture and the exact semantics of this are unclear to me at the time of writing (it can’t automagically provide a cross compilation environment, so what exactly does it do?).

Returns: The Debian architecture (a string like i386, amd64, armhf, etc).
Raises: ExternalCommandFailed when the dpkg-architecture program is not available or reports an error.
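The lookup and fallback described above can be sketched as follows. This is an approximation under stated assumptions: guess_debian_architecture() and its optional machine argument are hypothetical, and the fallback uses subprocess directly, so a failure raises an exception from the standard library rather than ExternalCommandFailed.

```python
import os
import subprocess

# The translation table from the documentation above.
KNOWN_ARCHITECTURES = {
    'i686': 'i386',
    'x86_64': 'amd64',
    'armv6l': 'armhf',
}


def guess_debian_architecture(machine=None):
    """Translate a machine architecture into a Debian architecture label."""
    if machine is None:
        # The fifth value returned by os.uname() is the machine architecture.
        machine = os.uname()[4]
    if machine in KNOWN_ARCHITECTURES:
        return KNOWN_ARCHITECTURES[machine]
    # Fallback: requires the dpkg-architecture program (dpkg-dev package).
    output = subprocess.check_output(['dpkg-architecture', '-qDEB_BUILD_ARCH'])
    return output.decode('ascii').strip()
```

Passing the machine name explicitly makes the table lookup easy to exercise without depending on the host the code runs on.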
deb_pkg_tools.utils.find_installed_version(package_name)

Find the installed version of a Debian system package.

Uses the dpkg-query --show --showformat='${Version}' ... command.

Parameters: package_name – The name of the package (a string).
Returns: The installed version of the package (a string) or None if the version can’t be found.

class deb_pkg_tools.utils.atomic_lock(pathname, wait=True)

Context manager for atomic locking of files and directories.

This context manager exploits the fact that os.mkdir() on UNIX is an atomic operation, which means it will only work on UNIX.

Intended to be used with Python’s with statement:

from deb_pkg_tools.utils import atomic_lock

with atomic_lock('/var/www/apt-archive/some/repository'):
    # Inside the with block you have exclusive access.
    pass

__init__(pathname, wait=True)

Prepare to atomically lock the given pathname.

Parameters:
  • pathname – The pathname of a file or directory (a string).
  • wait – Block until the lock can be claimed (a boolean, defaults to True).

If wait=False and the file or directory cannot be locked, ResourceLockedException will be raised when entering the with block.

__enter__()

Atomically lock the given pathname.

__exit__(exc_type=None, exc_value=None, traceback=None)

Unlock the previously locked pathname.

exception deb_pkg_tools.utils.ResourceLockedException

Raised by atomic_lock() when the lock can’t be claimed.
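The idea of locking through os.mkdir() can be illustrated with a small context manager for Python 3. This is a sketch of the technique, not the atomic_lock source: the names MkdirLock and LockUnavailable, the interval parameter and the choice to create the lock directory next to the resource as "<pathname>.lock" are all assumptions made for this example.

```python
import os
import time


class LockUnavailable(Exception):
    """Hypothetical stand-in for ResourceLockedException."""


class MkdirLock(object):
    """Lock a pathname by creating a sibling directory with os.mkdir()."""

    def __init__(self, pathname, wait=True, interval=0.1):
        # Assumption: the lock lives next to the resource as "<pathname>.lock".
        self.lock_directory = os.path.normpath(pathname) + '.lock'
        self.wait = wait
        self.interval = interval

    def __enter__(self):
        while True:
            try:
                # mkdir either creates the directory or fails, so two
                # processes can never both claim the same lock.
                os.mkdir(self.lock_directory)
                return self
            except FileExistsError:
                if not self.wait:
                    raise LockUnavailable(self.lock_directory)
                time.sleep(self.interval)

    def __exit__(self, exc_type=None, exc_value=None, traceback=None):
        # Removing the directory releases the lock.
        os.rmdir(self.lock_directory)
```

One caveat of this approach is that a process killed inside the with block leaves the lock directory behind, so a real implementation may want a way to detect stale locks.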

Version comparison

Version sorting according to Debian semantics.

This module supports version comparison and sorting according to section 5.6.12 of the Debian Policy Manual. It does so by using the python-apt binding (see compare_versions_with_python_apt()) and/or the external command dpkg --compare-versions (see compare_versions_with_dpkg()).

deb_pkg_tools.version.compare_versions_with_python_apt(version1, operator, version2)

Compare Debian package versions using the python-apt binding.

Parameters:
  • version1 – The version on the left side of the comparison (a string).
  • operator – The operator to use in the comparison (a string).
  • version2 – The version on the right side of the comparison (a string).
Returns:

True if the comparison succeeds, False if it fails.

Raises:

NotImplementedError if python-apt is not available (neither of the functions mentioned below can be imported).

This function is compatible with newer versions of python-apt (apt_pkg.version_compare()) and older versions (apt.VersionCompare()).

deb_pkg_tools.version.compare_versions_with_dpkg(version1, operator, version2)

Compare Debian package versions using the external command dpkg --compare-versions ....

Parameters:
  • version1 – The version on the left side of the comparison (a string).
  • operator – The operator to use in the comparison (a string).
  • version2 – The version on the right side of the comparison (a string).
Returns:

True if the comparison succeeds, False if it fails.

deb_pkg_tools.version.compare_versions(version1, operator, version2)

Compare Debian package versions using the best available method.

Parameters:
  • version1 – The version on the left side of the comparison (a string).
  • operator – The operator to use in the comparison (a string).
  • version2 – The version on the right side of the comparison (a string).
Returns:

True if the comparison succeeds, False if it fails.

This function prefers using the python-apt binding (see compare_versions_with_python_apt()) but will fall back to the external command dpkg --compare-versions when required (see compare_versions_with_dpkg()).

class deb_pkg_tools.version.Version

Rich comparison of Debian package versions as first-class Python objects.

The Version class is a subclass of the built in str type that implements rich comparison according to the version sorting order defined in the Debian Policy Manual. Use it to sort Debian package versions like this:

>>> from deb_pkg_tools.version import Version
>>> unsorted = ['0.1', '0.5', '1.0', '2.0', '3.0', '1:0.4', '2:0.3']
>>> print(sorted(Version(s) for s in unsorted))
['0.1', '0.5', '1.0', '2.0', '3.0', '1:0.4', '2:0.3']

This example uses ‘epoch’ numbers (the numbers before the colons) to demonstrate that this version sorting order is different from regular sorting and ‘natural order sorting’.
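For contrast, this is what regular string sorting does with the same list. The snippet uses only built in functions, so it runs without deb-pkg-tools installed.

```python
unsorted = ['0.1', '0.5', '1.0', '2.0', '3.0', '1:0.4', '2:0.3']

# Plain string comparison looks at one character at a time, so the epoch
# prefix "1:" is compared against "1." instead of being treated as a number
# that outranks the rest of the version.
plain = sorted(unsorted)
print(plain)
# ['0.1', '0.5', '1.0', '1:0.4', '2.0', '2:0.3', '3.0']
```

Here 1:0.4 lands between 1.0 and 2.0, whereas under Debian semantics every version with an epoch of 1 or higher sorts after all versions without an epoch.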

__hash__()

Enable adding objects to sets and using them as dictionary keys.

__eq__(other)

Enable equality comparison between version objects.

__ne__(other)

Enable non-equality comparison between version objects.

__lt__(other)

Enable less-than comparison between version objects.

__le__(other)

Enable less-than-or-equal comparison between version objects.

__gt__(other)

Enable greater-than comparison between version objects.

__ge__(other)

Enable greater-than-or-equal comparison between version objects.