From quickly written python script to shared package¶
Techdays, Strasbourg, March 25, 2025
Matthieu Boileau
Common situation in a mathematics laboratory¶
To solve a problem, a researcher has written a Python script, usually in the form of a Jupyter notebook. They wish to:
- turn it into a publication,
- develop it as a team,
- make it available to collaborators.
This script contains lots of brilliant ideas, but...
- it is monolithic: no modularity,
- it mixes code and data:
  - changing the input data means changing the code, which multiplies the versions,
  - the output data ends up among the source files,
- it is only validated by its author: no review, no tests, no documentation.
This presentation proposes to follow a path that leads from the isolated script to a tested, documented, installable, and published Python package.
The Stairway of Competence¶
This path can be seen as a stairway where each step corresponds to a skill to acquire:

First step: splitting the script into functions and CLI¶
from IPython.display import Code
Code(filename='linewave/linewave.py', language="python")
"""Solve the 1D wave equation using the leap-frog scheme"""
import argparse

import numpy as np
import matplotlib.pyplot as plt

c: float = 1.0  # Wave speed


def sinus(x: np.ndarray, t: float) -> np.ndarray:
    """
    Compute the analytical solution of the 1D wave equation
    Args:
        x: Grid points
        t: Time
    Returns:
        u: Solution of the wave equation
    """
    return np.sin(2 * np.pi * (x - c * t))


def compute_wave(
    L: float, T: float, CFL: float, N: int
) -> tuple[float, np.ndarray, np.ndarray]:
    """
    Compute the solution of the 1D wave equation using the leap-frog scheme
    Args:
        L: Length of the domain
        T: Final time
        CFL: CFL number
        N: Number of grid points
    Returns:
        t: Final time
        x: Grid points
        u: Solution of the wave equation
    """
    # Discretization (we remove the endpoint because of periodic boundary conditions)
    x, dx = np.linspace(0, L, N, endpoint=False, retstep=True)
    dt = CFL * dx / c  # Time step
    # Set initial solution
    un = sinus(x, 0.0)
    unm1 = sinus(x, -dt)
    # Leap-frog scheme
    t: float = 0.0
    while t < T:
        t += dt
        unp1 = (
            -unm1
            + 2 * un
            + CFL**2 * (np.roll(un, 1) - 2 * un + np.roll(un, -1))
        )
        # Exchange array references for avoiding a copy
        unm1, un, unp1 = un, unp1, unm1
    return t, x, un


def L2_error(t: float, x: np.ndarray, u: np.ndarray) -> float:
    """
    Compute the L2 error norm
    Args:
        t: Final time
        x: Grid points
        u: Solution of the wave equation
    Returns:
        L2 error norm
    """
    return np.linalg.norm(u - sinus(x, t)) / np.linalg.norm(sinus(x, t))


def plot(t: float, x: np.ndarray, u: np.ndarray):
    """
    Plot the solution using matplotlib
    Args:
        t: Final time
        x: Grid points
        u: Solution of the wave equation
    """
    plt.plot(x, u, "o", label=f"t = {t:.2f}")
    plt.plot(x, sinus(x, t), label="Analytical")
    plt.title(f"Leap-frog solution with N = {len(x)}")
    plt.xlabel("x")
    plt.ylabel("u")
    plt.legend()
    plt.show()


def main():
    """Main function with CLI"""
    parser = argparse.ArgumentParser(
        description=__doc__,
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    parser.add_argument(
        "--L", type=float, default=1.0, help="Length of the domain"
    )
    parser.add_argument("--T", type=float, default=100.0, help="Final time")
    parser.add_argument("--CFL", type=float, default=0.99, help="CFL number")
    parser.add_argument(
        "--N", type=int, default=40, help="Number of grid points"
    )
    args = parser.parse_args()
    t, x, u = compute_wave(**vars(args))
    plot(t, x, u)
    print(f"L2 error norm: {L2_error(t, x, u):.3e}")


if __name__ == "__main__":
    main()
We can now run the script from the command line. We display the help:
%run linewave/linewave.py --help
usage: linewave.py [-h] [--L L] [--T T] [--CFL CFL] [--N N]

Solve the 1D wave equation using the leap-frog scheme

options:
  -h, --help  show this help message and exit
  --L L       Length of the domain (default: 1.0)
  --T T       Final time (default: 100.0)
  --CFL CFL   CFL number (default: 0.99)
  --N N       Number of grid points (default: 40)
And run it with its default values:
%run linewave/linewave.py
L2 error norm: 1.289e-02
Or with other parameters:
%run linewave/linewave.py --T 1000 --N 20
L2 error norm: 5.134e-01
In this 1D configuration, the method is exact for CFL = 1:
%run linewave/linewave.py --T 1000 --N 20 --CFL 1.
L2 error norm: 2.273e-09
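Why the scheme is exact at CFL = 1 can be checked on a single time step. The sketch below is a standalone, pure-Python rewrite of the update (it does not import linewave, and the helper names `leapfrog_step`, `exact` and `one_step_error` are illustrative only). At CFL = 1 the update reduces to u_j^{n+1} = u_{j-1}^n + u_{j+1}^n - u_j^{n-1}, which the travelling sine wave satisfies identically, so only round-off remains:

```python
import math


def leapfrog_step(u_prev, u_curr, cfl):
    """One leap-frog update on a periodic grid (same formula as compute_wave)."""
    n = len(u_curr)
    return [
        -u_prev[j]
        + 2 * u_curr[j]
        + cfl**2 * (u_curr[j - 1] - 2 * u_curr[j] + u_curr[(j + 1) % n])
        for j in range(n)
    ]


def exact(x, t, c=1.0):
    """Analytical travelling sine wave, evaluated point by point."""
    return [math.sin(2 * math.pi * (xj - c * t)) for xj in x]


def one_step_error(cfl, n=20, length=1.0, c=1.0):
    """Maximum error after one step started from exact data at t = -dt and t = 0."""
    dx = length / n
    dt = cfl * dx / c
    x = [j * dx for j in range(n)]
    u_next = leapfrog_step(exact(x, -dt), exact(x, 0.0), cfl)
    return max(abs(a - b) for a, b in zip(u_next, exact(x, dt)))


print(f"CFL = 1.0: {one_step_error(1.0):.1e}")  # round-off level
print(f"CFL = 0.5: {one_step_error(0.5):.1e}")  # truncation error, much larger
```

For CFL < 1 each step adds a small truncation error, which accumulates over the many steps needed to reach T = 1000.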
Unit tests with pytest¶
We test several functions of the linewave module by writing a file named test_linewave.py:
Code(filename='test_linewave.py')
import numpy as np
from pytest import approx

from linewave.linewave import sinus, compute_wave, L2_error


def test_linewave():
    t, x, u = compute_wave(T=50, N=80, CFL=0.99, L=2.0)
    assert x.max() == 2.0 - 2.0 / 80
    assert t >= 50
    assert L2_error(t, x, u) < 0.01


def test_analytical_solution():
    x = np.linspace(0.0, 1.5, 50)
    assert sinus(x, t=3) == approx(np.sin(2 * np.pi * x), abs=1e-14)


def test_L2_error():
    x = np.linspace(0.0, 1.5, 50)
    u = np.sin(2 * np.pi * x)
    # arrays are the same
    assert L2_error(t=0, x=x, u=u) == approx(0.0, abs=1e-16)
    # arrays are shifted by a phase
    assert L2_error(t=3, x=x, u=u) == approx(0.0, abs=1e-14)
Now let's have it tested.
!pytest -v
============================= test session starts ==============================
platform darwin -- Python 3.11.11, pytest-8.3.5, pluggy-1.5.0 -- /Users/boileau/Documents/Conf/2025/Techdays2025/script2package/.venv/bin/python
cachedir: .pytest_cache
rootdir: /Users/boileau/Documents/Conf/2025/Techdays2025/script2package
plugins: anyio-4.9.0
collected 3 items

test_linewave.py::test_linewave PASSED                                   [ 33%]
test_linewave.py::test_analytical_solution PASSED                        [ 66%]
test_linewave.py::test_L2_error PASSED                                   [100%]

============================== 3 passed in 0.17s ===============================
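The tests above compare floating-point results with a tolerance (pytest's approx) rather than with ==. A minimal standard-library sketch of the reason, using math.isclose as a stand-in for approx (the scalar `sinus` and the helper `max_phase_shift_gap` are illustrative rewrites, not part of linewave): shifting the sine wave by a whole number of periods gives the same value mathematically, but the computed values may differ at round-off level.

```python
import math


def sinus(x: float, t: float, c: float = 1.0) -> float:
    """Scalar version of the analytical solution used in linewave.py."""
    return math.sin(2 * math.pi * (x - c * t))


def max_phase_shift_gap(t: float, n: int = 50, x_max: float = 1.5) -> float:
    """Largest gap between sinus(x, t) and sinus(x, 0) on n points of [0, x_max]."""
    xs = [x_max * i / (n - 1) for i in range(n)]
    return max(abs(sinus(x, t) - sinus(x, 0.0)) for x in xs)


gap = max_phase_shift_gap(t=3)
print(f"largest gap after 3 periods: {gap:.1e}")

# A tolerance-based comparison accepts a round-off-level gap,
# whereas an exact == comparison would be fragile.
assert math.isclose(gap, 0.0, abs_tol=1e-13)
```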
Some guides to start with pytest:
- pyopensci guide: why you should write tests,
- pytest guide: some good practices.
GitLab & Co¶
At this stage, we take a HUGE shortcut: we use cookiecutter to create a GitLab project from a Python package skeleton:
cookiecutter https://gitlab.math.unistra.fr/boileau/cookiecutter.git --directory python/irma
This command:
- creates a Python project containing:
  - the python sources in src/linewave,
  - the tests in tests/,
  - the Sphinx documentation in docs/,
  - a LICENSE file,
  - a README.md file,
  - a CHANGELOG.md file,
  - the pyproject.toml file which describes the Python project,
  - the .gitlab-ci.yml file which describes the continuous integration pipeline,
- publishes this project on GitLab at the address https://gitlab.math.unistra.fr/boileau/linewave
Project structure¶
linewave
├── pyproject.toml
├── LICENSE
├── README.md
├── CHANGELOG.md
├── docs
│   ├── make.bat
│   ├── Makefile
│   └── source
│       ├── conf.py
│       ├── installation.md
│       └── index.md
├── src
│   └── linewave
│       ├── linewave.py
│       └── __init__.py
└── tests
    └── test_linewave.py
Benefits of putting the linewave package in a src directory:
- the package is isolated from the tests and the documentation
- other python files won't be considered as part of the package
- it helps prevent accidental import of test modules
- it encourages a clearer separation of concerns within the project structure
See the setuptools documentation for more details.
README file¶
The README.md file is the entry point of the project. It should contain at least:
- a description of the project,
- an invitation to test the project,
- a link to the documentation.
A common practice is to include badges in the README.md file. For example:
- the build status badge,
- the coverage badge,
- the license badge,
- the version badge.
LICENSE file¶
The LICENSE file contains the license under which the project is distributed.
A license is required if you want others to be able to use your project.
Without a license, default copyright applies: nobody else is allowed to copy, modify, or redistribute the code.
Find your license on choosealicense.com.
CHANGELOG file¶
The CHANGELOG.md file is a log of changes made to the project. It should contain:
- the version number,
- the date of the change,
- a description of the change.
keepachangelog.com is a simple guide that promotes a consistent format for changelogs.
In particular, it recommends using Semantic Versioning where version numbers are in the form MAJOR.MINOR.PATCH and are incremented as follows:
- MAJOR version when you make incompatible API changes,
- MINOR version when you add functionality in a backwards compatible manner, and
- PATCH version when you make backwards compatible bug fixes.
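The increment rules above can be sketched in a few lines of Python. The function `bump_version` is a hypothetical helper written for illustration; it handles plain MAJOR.MINOR.PATCH strings only, not pre-release or build suffixes.

```python
def bump_version(version: str, part: str) -> str:
    """Increment a MAJOR.MINOR.PATCH version string; lower-order parts reset to 0."""
    major, minor, patch = (int(n) for n in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")


print(bump_version("0.1.0", "patch"))  # 0.1.1 (backwards compatible bug fix)
print(bump_version("0.1.1", "minor"))  # 0.2.0 (new backwards compatible feature)
print(bump_version("0.2.0", "major"))  # 1.0.0 (incompatible API change)
```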
Anatomy of pyproject.toml¶
Code(filename='../linewave/pyproject.toml')
[build-system]
requires = ["setuptools>=61.2"]
build-backend = "setuptools.build_meta"
[project]
name = "linewave"
authors = [
{ name = "Matthieu Boileau", email = "matthieu.boileau@math.unistra.fr" },
]
description = "A 1D linear wave solver"
classifiers = [
"Programming Language :: Python :: 3",
"License :: OSI Approved :: MIT License",
"Operating System :: OS Independent",
]
requires-python = ">=3.8"
dependencies = ["numpy", "matplotlib"] # Add your python dependencies here
dynamic = ["version"] # the version is defined in the [tool.setuptools.dynamic.version] section
[project.optional-dependencies]
test = ["pytest", "pytest-cov"]
doc = [
"Sphinx >= 7.2.2", # Sphinx 7.2 and later require Python >= 3.9
"myst-parser", # Markdown support for Sphinx
"furo", # A modern theme for Sphinx
"sphinx-copybutton", # Add copy buttons to code blocks
"sphinx-autobuild", # Auto-rebuild Sphinx documentation when editing
]
[project.license]
text = "MIT"
[project.readme]
file = "README.md"
content-type = "text/markdown"
[project.urls]
Homepage = "https://gitlab.math.unistra.fr/boileau/linewave"
Documentation = "https://boileau.pages.math.unistra.fr/linewave" # Hosted on GitHub or GitLab using Pages
Repository = "https://gitlab.math.unistra.fr/boileau/linewave.git"
Issues = "https://gitlab.math.unistra.fr/boileau/linewave/-/issues"
Changelog = "https://gitlab.math.unistra.fr/boileau/linewave/-/blob/main/CHANGELOG.md"
[project.scripts]
# An entry for the command line interface
"linewave" = "linewave.linewave:main"
[tool.setuptools]
include-package-data = true # Include non-python files in the package
license-files = ["LICENSE"] # Include the license file in the package
[tool.setuptools.package-dir]
"" = "src" # The package is in the src directory
[tool.setuptools.packages.find]
where = ["src"] # Look for packages in the src directory
namespaces = false # Do not use namespace packages
[tool.setuptools.dynamic.version] # The version is defined in the __init__.py file
attr = "linewave.__version__"
A build-system section that depends on the build backend used:
[build-system]
requires = ["setuptools>=61.2"]
build-backend = "setuptools.build_meta"
Here, we use setuptools as a build backend but we could use flit, poetry, hatchling, etc.
A project section that describes the project:
[project]
name = "linewave"
authors = [
{ name = "Matthieu Boileau", email = "matthieu.boileau@math.unistra.fr" },
]
description = "A 1D linear wave solver"
classifiers = [
"Programming Language :: Python :: 3",
"License :: OSI Approved :: MIT License",
"Operating System :: OS Independent",
]
requires-python = ">=3.8"
dependencies = ["numpy", "matplotlib"] # Add your python dependencies here
dynamic = ["version"] # the version is defined in the __init__.py file
A project.optional-dependencies section that lists the optional dependencies:
[project.optional-dependencies]
test = ["pytest", "pytest-cov"]
doc = [
"Sphinx >= 7.2.2", # Sphinx 7.2 and later require Python >= 3.9
"myst-parser", # Markdown support for Sphinx
"furo", # A modern theme for Sphinx
"sphinx-copybutton", # Add copy buttons to code blocks
"sphinx-autobuild", # Auto-rebuild Sphinx documentation when editing
]
So the optional dependencies can be installed with:
pip install -e ".[doc,test]"
Information sections that describe the project:
[project.license]
text = "MIT"
[project.readme]
file = "README.md"
content-type = "text/markdown"
[project.urls]
Homepage = "https://gitlab.math.unistra.fr/boileau/linewave"
Documentation = "https://boileau.pages.math.unistra.fr/linewave"
Repository = "https://gitlab.math.unistra.fr/boileau/linewave.git"
Issues = "https://gitlab.math.unistra.fr/boileau/linewave/-/issues"
Changelog = "https://gitlab.math.unistra.fr/boileau/linewave/-/blob/main/CHANGELOG.md"
A scripts section that lists the scripts to be installed:
[project.scripts]
# An entry for the command line interface
"linewave" = "linewave.linewave:main"
So the command line interface can be run from anywhere with:
linewave
This command will run the main() function in the linewave module of the linewave package.
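The entry "linewave.linewave:main" follows the pattern module:function. As a simplified sketch of what the generated command does with such a string (this is not the actual code written by pip or setuptools, and `resolve_entry_point` is a hypothetical name), the module is imported and the named attribute is looked up:

```python
from importlib import import_module
from typing import Callable


def resolve_entry_point(spec: str) -> Callable:
    """Turn a 'package.module:function' string into the function it names."""
    module_name, _, attr = spec.partition(":")
    module = import_module(module_name)
    return getattr(module, attr)


# "linewave.linewave:main" would resolve to main() once linewave is installed.
# A standard-library target is used here so that the sketch runs anywhere.
basename = resolve_entry_point("os.path:basename")
print(basename("/tmp/linewave/linewave.py"))  # linewave.py
```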
Some setuptools options:
[tool.setuptools.package-dir]
"" = "src" # The package is in the src directory
[tool.setuptools.packages.find]
where = ["src"] # Look for packages in the src directory
namespaces = false # Do not use namespace packages
[tool.setuptools.dynamic.version] # The version is defined in the __init__.py file
attr = "linewave.__version__"
Installing the package¶
Thanks to the pyproject.toml file, the project can be installed with:
pip install .
Doing so, the project is installed in the current environment and can be imported in any Python module from anywhere:
import linewave
or run from the command line from anywhere:
linewave
This plays an important role in separating the data from the code.
Editable mode¶
When developing the project, it is recommended to install it in editable mode:
pip install -e .
So that the changes made to the project sources are visible without having to reinstall it.
Creating a documentation¶
The standard documentation tool of the Python ecosystem is Sphinx.
Remember that you are writing for two kinds of readers:
- the users of your package,
- the developers of your package.
A Sphinx documentation is composed essentially of:
- a source directory containing the sources of the documentation,
- a conf.py file that configures the documentation,
- an index.rst/index.md file that is the entry point of the documentation,
- .rst/.md files that contain the documentation.
Auto-generating the documentation¶
By using the autodoc extension, you can automatically generate the documentation of the modules, classes and functions of your package.
The main benefit is that the documentation source is very close to the code: if you write (carefully) the docstrings and keep them updated, your documentation will always reflect the latest changes in your code. Otherwise...
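The project's actual conf.py is not reproduced here, so the following is only a minimal sketch of a configuration that enables autodoc. The extension list is an assumption derived from the doc dependencies in pyproject.toml; sphinx.ext.napoleon is added because the docstrings of linewave.py use "Args:" and "Returns:" sections.

```python
# docs/source/conf.py (illustrative sketch, not the project's actual file)
project = "linewave"
author = "Matthieu Boileau"

extensions = [
    "sphinx.ext.autodoc",   # pull documentation from the docstrings
    "sphinx.ext.napoleon",  # parse the "Args:" / "Returns:" docstring sections
    "myst_parser",          # Markdown sources (index.md, installation.md)
    "sphinx_copybutton",    # copy buttons on code blocks
]

html_theme = "furo"
```

A documentation page can then use the automodule directive on linewave.linewave with the :members: option to render the docstrings of all its functions.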
Building the documentation¶
To build the documentation locally, you may use:
sphinx-autobuild docs/source docs/build
This command will:
- build the documentation in the docs/build directory,
- start a web server that serves the documentation at http://localhost:8000,
- rebuild the documentation each time a file in the docs/source directory is modified.
Publishing the documentation¶
Your documentation can be published online using GitLab Pages (see below) or Read the Docs.
Starting with Sphinx¶
Sphinx is a vast tool with many features and possible extensions. You may start with this nice guide from pyopensci. And for more information, see the Sphinx documentation.
An overview of GitLab-CI¶
CI stands for Continuous Integration. It is a practice that consists of verifying each code integration by an automated build (including tests) to detect errors as quickly as possible.
Principles of the GitLab CI/CD pipeline:
- it is defined in a .gitlab-ci.yml file at the root of the project,
- it is triggered by a push on GitLab,
- it is composed of stages (build, test, deploy, etc.),
- each stage is composed of jobs,
- each job must be picked up by a GitLab runner to be executed.
A GitLab runner is a service that can be installed on any machine (even your laptop) to execute the jobs of the pipeline. There are two types of runners:
- shared runners: managed by the GitLab administrator,
- specific runners: managed by the project administrator.
The Docker GitLab runner is a type of runner that uses Docker containers to run jobs in isolation. This allows for a consistent and reproducible environment for all jobs in the pipeline.
A .gitlab-ci.yml file¶
Code(filename="../linewave/.gitlab-ci.yml")
default:
  image: "python:3.12" # use the specified docker image
  tags:
    - docker

stages:
  - test
  - build
  - deploy
  - release

before_script:
  ## Prepare python virtual environment
  - python -m venv .venv/
  - source .venv/bin/activate
  - pip install -U pip

test:
  stage: test
  script:
    - pip install -e .[test] # install test dependencies
    - pytest --durations=0 --cov=src/linewave -sv # run pytest with code coverage
    - coverage html -d public/coverage # generate coverage html report
  coverage: '/(?i)total.*? (100(?:\.0+)?\%|[1-9]?\d(?:\.\d+)?\%)$/' # extract the coverage rate for the badge
  artifacts:
    paths:
      - public # export the coverage html report

doc:
  stage: test # so the documentation is built in parallel with the tests
  script:
    - pip install -e .[doc] # install doc dependencies
    - sphinx-build -b html docs/source/ public/ # build the documentation
  artifacts:
    paths:
      - public # export the documentation

pages:
  stage: deploy
  before_script: [] # no need to prepare python virtual environment
  script:
    - echo 'Nothing to do...' # public dir is already prepared by the previous jobs
  only:
    - main # deploy the documentation only for the main branch
  artifacts:
    paths:
      - public # export the coverage and documentation

release_to_gitlab: # Create a release on GitLab from the git tag
  stage: release
  before_script: [] # no need to prepare python virtual environment
  image: registry.gitlab.com/gitlab-org/release-cli:latest
  rules:
    - if: $CI_COMMIT_TAG # Run this job when a tag is created
  script:
    - echo "running release_job"
  release: # See https://docs.gitlab.com/ee/ci/yaml/#release for available properties
    tag_name: "$CI_COMMIT_TAG"
    description: "$CI_COMMIT_TAG_MESSAGE"
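The coverage keyword of the test job holds a regular expression that GitLab applies to the job log to find the coverage rate. GitLab uses its own regex engine; Python's re module is used below only as a stand-in to show what the pattern captures. The function `extract_coverage` is a hypothetical helper and the log excerpt is made up for illustration:

```python
import re
from typing import Optional

# Pattern from the `coverage:` keyword, without the surrounding slashes
COVERAGE_PATTERN = r"(?i)total.*? (100(?:\.0+)?\%|[1-9]?\d(?:\.\d+)?\%)$"


def extract_coverage(log: str) -> Optional[str]:
    """Return the percentage found on the TOTAL line of a coverage report."""
    for line in log.splitlines():
        match = re.search(COVERAGE_PATTERN, line)
        if match:
            return match.group(1)
    return None


# Made-up excerpt of a pytest-cov terminal report (numbers are illustrative)
sample_log = """\
Name                       Stmts   Miss  Cover
----------------------------------------------
src/linewave/__init__.py       1      0   100%
src/linewave/linewave.py      45      9    80%
----------------------------------------------
TOTAL                         46      9    80%
"""

print(extract_coverage(sample_log))  # 80%
```

Only the line starting with TOTAL matches, so the per-file percentages are ignored.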
Editing the project skeleton¶
We copy our code into the skeleton:
!cp linewave/linewave.py ../linewave/src/linewave/
!cp test_linewave.py ../linewave/tests/
- we install linewave locally in editable mode,
- we verify that the tests pass,
- we push it to GitLab.
Steps to create a release¶
- In the develop branch:
  - update the CHANGELOG.md file with the changes made since the last release,
  - increase the version number in the __init__.py file,
  - commit and push the changes,
  - create a merge request to the main branch,
  - merge the merge request if the pipeline is successful.
- In the main branch on GitLab, create a tag with the version number and the CHANGELOG entry as a message. It triggers the release_to_gitlab CI job.
- Once the gitlab-ci pipeline is finished, the release is created on GitLab and the release badge is updated.
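Since the version number is edited by hand in `__init__.py` while the release is driven by the Git tag, the two can drift apart. A minimal sketch of a consistency check that could be run before tagging, assuming tags of the form v0.1.0 (the form used in the pip install commands of the conclusion); the function name `tag_matches_version` is hypothetical:

```python
def tag_matches_version(tag: str, version: str) -> bool:
    """Return True if a Git tag such as 'v0.1.0' names the given package version."""
    bare_tag = tag[1:] if tag.startswith("v") else tag
    return bare_tag == version


print(tag_matches_version("v0.1.0", "0.1.0"))  # True
print(tag_matches_version("v0.2.0", "0.1.0"))  # False: __init__.py was not updated
```

In a CI job, the tag would typically come from the CI_COMMIT_TAG variable already used in the pipeline above.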
Conclusion¶
Once you have separated the code from the data, your code can use:
- a version control system: Git and GitLab,
- a testing framework: pytest using the tests/ directory,
- a documentation framework: README, LICENSE, CHANGELOG, docstrings + Sphinx,
- a packaging tool: setuptools using pyproject.toml,
- a CI pipeline: GitLab-CI using .gitlab-ci.yml.
Your python project is now a package that can be installed with:
# Private gitlab project
pip install git+ssh://git@gitlab.math.unistra.fr/boileau/linewave.git@v0.1.0
# Public gitlab project
pip install git+https://gitlab.math.unistra.fr/boileau/linewave.git@v0.1.0
# Official PyPI repository if a PyPI job is defined in the pipeline (see Extra)
pip install linewave
A significant amount of boilerplate code is added but can be easily automated with a templating tool such as cookiecutter.
However this apparent ease can be deceptive as it requires mastering:
- version tracking with git,
- the environment of a GitLab forge,
- the principles of the GitLab-CI workflow,
- the basics of packaging, testing, and documentation in python.
Next steps to climb¶
- Add contribution guidelines and code of conduct: require a consensus among the early contributors. See this little guide.
- Enforce a code style: See Black and the pre-commit tool.
- Reference your code on Software Heritage: easy, no excuse!
  - Only a codemeta.json file in the project root is needed
  - Software Heritage will automatically archive the code and metadata
  - It will produce a citation and a unique identifier for the code and its versions
- Make your code reproducible: pretty hard but should be considered! See Guix.
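For the Software Heritage step, a codemeta.json file is a small JSON-LD document. The sketch below builds one with the standard json module. The choice of fields is an assumption about a reasonable minimum, and the values are copied from the pyproject.toml shown earlier (the version is the one used in the install commands); the CodeMeta documentation remains the reference for the full vocabulary.

```python
import json

# Illustrative metadata, taken from the pyproject.toml of the linewave project
codemeta = {
    "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
    "@type": "SoftwareSourceCode",
    "name": "linewave",
    "description": "A 1D linear wave solver",
    "version": "0.1.0",
    "license": "https://spdx.org/licenses/MIT",
    "codeRepository": "https://gitlab.math.unistra.fr/boileau/linewave",
    "programmingLanguage": "Python",
    "author": [
        {"@type": "Person", "givenName": "Matthieu", "familyName": "Boileau"}
    ],
}

# The resulting text would be saved as codemeta.json in the project root
codemeta_text = json.dumps(codemeta, indent=2)
print(codemeta_text)
```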
Extra: deploying the package on PyPI¶
Another job in the pipeline can deploy the package to the Python Package Index (PyPI) or to a private repository.
deploy_to_pypi: # Deploy the package to PyPI
  stage: release
  before_script: [] # no need to prepare python virtual environment
  image: python:3.12
  rules:
    - if: $CI_COMMIT_TAG # Run this job when a tag is created
  script:
    - pip install --upgrade pip
    - pip install build twine
    - python -m build
    - TWINE_PASSWORD=${PYPI_TOKEN} TWINE_USERNAME=__token__ python -m twine upload dist/*
This requires to:

