Python Project Tooling Explained

When you first learn Python, somebody explains to you that you can add the folder containing your source files to the PYTHONPATH environment variable, and that this code will then be importable from other locations. Too often this person forgets to mention that this is a very bad idea in most scenarios. Some people discover this on the internet, others figure it out by themselves. Too many people (especially non-programmers) simply believe there's no alternative.

This blog post is for all of them, because even if you know that an alternative exists, it's not always easy to grasp. Python tooling is confusing because there are many tools, built on top of each other, with a lot of overlap in their concerns. It's hard to understand how they fit into the big picture.

For this reason I've decided to create a post that lists the most important tools, when and why they are used and what problem they solve. I will try to explain in simple words how you should approach each one of these tools. If a tool is here, it means that, as a Python programmer, you're supposed to at least know of its existence. I will list only tools that can be applied to any project or workflow and that you should consider every time you start a new project. This doesn't mean you always have to use each one of them on every single project. Too much tooling can easily be overkill and become hard to maintain in some cases.

Basic

Setuptools

Setuptools is the standard way to create packages in Python. It's everywhere, it works and it fulfills its role.

For: building egg, zip or wheel files from source, defining metadata for your project, sharing code in a structured and standardized way

When: basically every time you want to write code that should run on somebody else's machine

Alternatives: Poetry, Flit
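To make the idea of project metadata concrete, here is a minimal sketch of what a setup.py could look like. The package name, version and dependency are hypothetical placeholders, and wrapping the call in a main() function is just a choice made for this sketch.

```python
# setup.py (sketch): every name and value below is a hypothetical placeholder.

PROJECT_METADATA = {
    "name": "example-package",              # hypothetical distribution name
    "version": "0.1.0",
    "description": "A short summary of what the package does",
    "python_requires": ">=3.8",
    "install_requires": ["requests>=2.0"],  # hypothetical runtime dependency
}


def main():
    # Imported lazily so the metadata above can be inspected
    # even where setuptools is not installed.
    from setuptools import find_packages, setup

    # find_packages() collects every folder that contains an __init__.py.
    setup(packages=find_packages(), **PROJECT_METADATA)
```

A real setup.py would finish by calling main(). After that, running something like `pip install .` from the project folder installs the package, with no PYTHONPATH tricks involved.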

virtualenv

Virtualenv is a virtual environment manager. These isolated environments can be understood as self-contained Python installations, each with its own independent set of installed packages. Using virtualenv means that you don't need to (and you shouldn't anyway) install packages into your system's default Python installation.

For: keeping your dependencies separate, supporting multiple Python versions in the same system, moving dependencies around easily

When: you want to develop code, or you want to use a Python version different from your default without going crazy

Alternatives: Docker or equivalent
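As a rough illustration of what an isolated environment is, the sketch below uses venv, the standard library module that covers the core of what virtualenv does, rather than virtualenv itself. The folder name is arbitrary.

```python
import tempfile
import venv
from pathlib import Path


def create_environment(target: Path) -> Path:
    """Create a bare virtual environment and return the path to its config file."""
    # with_pip=False keeps the sketch fast; a real environment usually wants pip.
    venv.EnvBuilder(with_pip=False).create(target)
    return target / "pyvenv.cfg"


with tempfile.TemporaryDirectory() as tmp:
    config_file = create_environment(Path(tmp) / "example-env")
    # pyvenv.cfg is the file that marks the folder as a virtual environment.
    environment_created = config_file.is_file()
    config_text = config_file.read_text()
```

On the command line the same thing is usually done with `python -m venv example-env` or `virtualenv example-env`, followed by activating the environment.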

Pip

Pip is the most common package management tool for Python. It allows you to take local or remote packages and install them in your virtual environment or system's Python.

For: installing and uninstalling packages, tracking versions of the packages you're using

When: always

Alternatives: Poetry, Conda
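Pip is normally used from the shell, but calling it from a script shows what a typical invocation is made of. This is only a sketch: the helper names are made up for this post, and the package and version are arbitrary examples.

```python
import subprocess
import sys


def pip_install_command(package, version=None):
    """Build the command line that installs a package with pip."""
    # "name==version" pins an exact version; a bare name lets pip pick the latest.
    requirement = f"{package}=={version}" if version else package
    # Running pip through sys.executable targets the interpreter (and therefore
    # the virtual environment) that is executing this script.
    return [sys.executable, "-m", "pip", "install", requirement]


def install(package, version=None):
    """Run pip; raises CalledProcessError if the installation fails."""
    subprocess.run(pip_install_command(package, version), check=True)


command = pip_install_command("requests", version="2.31.0")
```

Going through `python -m pip` instead of a bare `pip` is a small habit that avoids installing into the wrong environment by accident.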

Packaging and Distribution

For a more detailed overview, python.org has a dedicated [page](https://packaging.python.org/).

distutils

distutils is a precursor of setuptools. The latter makes heavy use of features from distutils so it's not uncommon to have to interact with this tool. It's not something you should pick for your tool belt directly but you should be aware of where it fits in the big picture.

PyPI

PyPI is the Python Package Index. It's a big repository of all your favourite Python packages. Pip, for example, downloads packages from here.

For: publishing your code

When: you have a package that you want to make publicly available

Pypiserver

Pypiserver is one implementation of the Package Index API used by PyPI. You can set up your own repository, for example for your whole company, and publish packages there without releasing them to the public.

For: sharing code inside an organization

When: this code shouldn't go public and you want to have control

Alternatives: Warehouse (the one used by PyPI), djangopypi

Poetry

Poetry is an alternative packaging system that replaces setuptools, pip and some of the tools built on top of them. It's an attempt to do a complete overhaul of how the Python packaging system works. So far it has gained some traction and a lot of positive feedback, but it's far from being the prevalent option.

For: handling and distributing packages, managing your dependencies, avoiding dependency resolution problems

When: you have a fresh project and you're not afraid to use a relatively niche tool

Alternatives: Pipenv

Pipenv

Pipenv, like Poetry, is a tool to structure your Python project dependencies and configurations in a saner way. Through a Pipfile, it manages the dependencies for your project and ensures consistency and ease of use.

For: handling and distributing packages, managing your dependencies

When: you want something like Poetry that will raise fewer questions

Alternatives: Poetry

Documentation

Sphinx

Sphinx is a tool to build documentation. It was originally created to handle Python's own documentation but has since grown into a general-purpose documentation tool. It remains the most common option for Python projects.

For: producing PDF or HTML documents from reStructuredText sources

When: you want to have external documentation for your project, your APIs and your code

Alternatives: Docutils, Doxygen

autodoc

autodoc is a fundamental extension to Sphinx that generates reStructuredText documentation from your Python source code and its docstrings, with entries for each class, function, module and so on.

For: documenting your code or APIs

When: probably every time you're using Sphinx for a project

Alternatives: autosummary
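What autodoc picks up is mostly the docstrings. The function below is a made-up example of a docstring written with reStructuredText field lists, one of the styles Sphinx understands.

```python
def area(width, height):
    """Return the area of a rectangle.

    :param width: horizontal size of the rectangle
    :param height: vertical size of the rectangle
    :returns: the product of ``width`` and ``height``
    """
    return width * height
```

Assuming this function lives in a module called mymodule (a hypothetical name), a directive such as `.. autofunction:: mymodule.area` in one of your .rst files would pull this docstring into the generated pages.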

Testing

py.test

py.test (nowadays simply called pytest) is, in my opinion, the best testing framework available in Python. It has plenty of features, even though not all of them are presented clearly, so it takes some time to discover the many possibilities supported by this software.

For: testing your code

When: always, you lazy ass

Alternatives: unittest, nose
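Part of the appeal is how little ceremony a test needs: plain functions whose names start with test_ and plain assert statements. A minimal sketch, where slugify is a made-up function used only for illustration:

```python
def slugify(title):
    """Turn a title into a lowercase, dash-separated identifier."""
    return "-".join(title.lower().split())


def test_slugify_joins_words_with_dashes():
    assert slugify("Python Project Tooling") == "python-project-tooling"


def test_slugify_collapses_whitespace():
    # split() without arguments drops leading, trailing and repeated spaces.
    assert slugify("  hello   world ") == "hello-world"
```

Saved in a file such as test_slugify.py, these functions are discovered and executed by running `pytest` in the same folder.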

Hypothesis

Hypothesis is a tool for property-based testing. In brief, it generates random test scenarios according to your specifications until it finds a case that makes your test fail. Take some time to learn the principles behind it before starting to use this tool.

For: testing code, especially data processing

When: you need to test non-trivial logic with a wide input space (numbers, strings, structured data)
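The principle can be sketched with nothing but the standard library. To be clear, this is not Hypothesis's API: all the names below are invented for this post, and the real library does much more, such as shrinking failing inputs to minimal examples.

```python
import random


def find_counterexample(prop, generate, attempts=200, seed=0):
    """Return the first generated input for which prop is false, else None."""
    rng = random.Random(seed)  # fixed seed so that runs are reproducible
    for _ in range(attempts):
        case = generate(rng)
        if not prop(case):
            return case
    return None


def random_int_list(rng):
    """Specification of the input space: short lists of small integers."""
    return [rng.randint(-100, 100) for _ in range(rng.randint(0, 10))]


def sorting_is_idempotent(xs):
    # Holds for every list: sorting an already sorted list changes nothing.
    return sorted(sorted(xs)) == sorted(xs)


def reversing_changes_nothing(xs):
    # Deliberately wrong: only holds for palindromic lists.
    return xs[::-1] == xs


holds = find_counterexample(sorting_is_idempotent, random_int_list)
breaks = find_counterexample(reversing_changes_nothing, random_int_list)
```

Here holds stays None because no generated list violates the first property, while breaks ends up containing a list that differs from its own reverse. You describe the shape of the inputs and a property that must always hold, and the machine hunts for the exception.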

tox

tox is at its core a virtualenv manager for testing. It means that you can configure it to run your tests in a series of clean, customizable virtual environments to ensure that your code will be able to work under different conditions. All of this without any manual work required.

For: code that needs to run in different conditions and environments. Also useful for CI.

When: your code needs to support different Python versions, run in different environments and on different operating systems

Alternatives: bash scripts, CI pipelines

Other

pyenv

pyenv is a Python version manager. It aims to simplify the local workflow of developers when handling multiple versions.

For: running different projects with different Python versions

When: you need to switch between many globally installed Python versions

Alternatives: manual management, virtualenv, Poetry, Pipenv

PyScaffold

PyScaffold is a tool to initialize your project structure in a standardized way and to provide some of the tools we listed before without the need to configure them manually. It's heavily customizable.

For: bootstrapping projects, keeping multiple projects with uniform tooling and structure

When: always (as long as you know the tool, don't use it for the first time if you're in a rush)

Alternatives: python-project-template, Cookiecutter

flake8

flake8 is one of the most used linters for Python. It runs several checkers to verify the compliance of your code with Python's style guide (PEP 8).

For: verifying and guaranteeing good code style in your project

When: every time your project needs to be read by somebody, including yourself

Alternatives: pylint

Black

Black is an automatic code formatter. It means that instead of just checking your code for compliance, Black will actually modify it to make it compliant.

For: formatting your code automatically

When: you have no problem giving up manual control over how your code looks

Alternatives: autopep8, yapf