Reproducible environment in nbdev with Pipenv

Tags: nbdev, pipenv

Author: David Dobrinskiy

Published: December 19, 2022

Introduction

If you’re reading this, you’re probably familiar with nbdev, a system for creating Python libraries from Jupyter notebooks.

The only trouble I have with it is keeping track of dependencies, especially across multiple projects or when other people are expected to use the code.

In this post, I’ll show you how to set up a new nbdev project using Pipenv, a tool for managing Python dependencies. This will allow you to easily create a new project, install dependencies, and run your code without having to worry about setting up a virtual environment.

Some background on Pipenv

If you’re not familiar with Pipenv, it’s a tool for managing Python dependencies. It’s similar to virtualenv, conda or poetry, but it’s a bit more user-friendly.

Setting up a new project with Pipenv is easy. You just need to run pipenv install in the project directory. This will create a new virtual environment and install all the dependencies listed in the Pipfile (if there is no Pipfile yet, Pipenv creates one).

If you want to add a new dependency, you can run pipenv install <package_name>. This will install the package and add it to the Pipfile. Unless you pin a version, the Pipfile entry is "*" (any version), while the exact version that was installed is recorded in Pipfile.lock.

If you want to install a specific version of a package, you can run pipenv install <package_name>==<version>. This will install the package and add it to the Pipfile with the specified version.

Creating a Pipenv environment

Install pipenv:

pip install pipenv

Install some generic dependencies for our project (optional):

pipenv install "requests==2.28.1" "pandas==1.5.2"

Now install the required dependencies for our guide:

pipenv install pipenv-setup "vistir==0.6.1" nbdev --dev

There are now a Pipfile and a Pipfile.lock in the root of our project. Check that the Pipfile looks like this, editing it if needed:

Pipfile
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[packages]
requests = "==2.28.1"
pandas = "==1.5.2"

[dev-packages]
nbdev = "*"
pipenv-setup = "*"
vistir = "==0.6.1"

[requires]
python_version = "3.10"
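The Pipfile only holds what we asked for; the exact versions that were resolved live in Pipfile.lock, which is plain JSON. Here is a minimal sketch of reading it, assuming the usual layout with "default" and "develop" sections. The sample content is a trimmed-down, hypothetical lock file, since a real one also carries hashes, a "_meta" section and every transitive dependency.

```python
import json

# Trimmed-down, hypothetical Pipfile.lock content for illustration only.
SAMPLE_LOCK = """
{
  "default": {
    "pandas": {"version": "==1.5.2"},
    "requests": {"version": "==2.28.1"}
  },
  "develop": {
    "vistir": {"version": "==0.6.1"}
  }
}
"""


def locked_versions(lock_text):
    """Map each section ("default", "develop") to {package: pinned version}."""
    lock = json.loads(lock_text)
    result = {}
    for section in ("default", "develop"):
        packages = lock.get(section, {})
        result[section] = {
            name: info.get("version", "") for name, info in packages.items()
        }
    return result


if __name__ == "__main__":
    for section, packages in locked_versions(SAMPLE_LOCK).items():
        for name, version in sorted(packages.items()):
            print(f"{section}: {name}{version}")
```

To try it on a real project, pass the text of your own Pipfile.lock instead of SAMPLE_LOCK.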

Activate your new environment by running:

pipenv shell
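If you are ever unsure whether the environment is really active, a quick check from Python helps. This is a small sketch: the sys.prefix comparison is the standard way to detect a virtual environment, while the PIPENV_ACTIVE variable is my assumption about what pipenv sets when it spawns the shell, so treat that line as a hint only.

```python
import os
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a virtual environment."""
    # Inside a virtual environment sys.prefix points at the environment,
    # while sys.base_prefix still points at the interpreter it was built from.
    return sys.prefix != sys.base_prefix


def environment_report():
    """Return a short human-readable summary of the active interpreter."""
    lines = [
        f"python executable  : {sys.executable}",
        f"inside a virtualenv: {in_virtualenv()}",
        # Assumption: pipenv exports PIPENV_ACTIVE in shells it starts.
        f"PIPENV_ACTIVE      : {os.environ.get('PIPENV_ACTIVE', '<not set>')}",
    ]
    return "\n".join(lines)


print(environment_report())
```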

Syncing your Pipfile to setup.py

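nbdev packages the library through setup.py, so the dependencies in our Pipfile have to end up there too. Lucky for us, this has already been solved by the pipenv-setup package.

We installed it in the previous section; if you skipped that step, install it by running: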

pipenv install pipenv-setup "vistir==0.6.1" --dev

Now let’s sync our setup.py with Pipfile:

$ pipenv-setup sync --pipfile --dev

No setup() call found in setup.py
can not perform sync

Huh, an error. What’s going on here? Let’s check the setup.py file that was generated by nbdev:

setup.py
setuptools.setup(
    ...
    install_requires = requirements,
    extras_require={ 'dev': dev_requirements },
    dependency_links = cfg.get('dep_links','').split(),
    ...
    )

The issue is that pipenv-setup expects us to call setup(...), not setuptools.setup(...).

That is easy enough to fix by changing our function call:

setup.py
from setuptools import setup
setup(
    ...
    install_requires = requirements,
    extras_require={ 'dev': dev_requirements },
    dependency_links = cfg.get('dep_links','').split(),
    ...
    )
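Why would the spelling of the call matter? I have not read the pipenv-setup source, so take this as an illustration rather than its actual implementation: a tool that inspects setup.py as source code sees setup(...) and setuptools.setup(...) as two different node types in the syntax tree. The 'demo' package name below is made up.

```python
import ast


def find_setup_calls(source):
    """Describe every call to something named `setup` in the given source."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if isinstance(func, ast.Name) and func.id == "setup":
            found.append("bare call: setup(...)")
        elif isinstance(func, ast.Attribute) and func.attr == "setup":
            found.append("attribute call: <module>.setup(...)")
    return found


# Hypothetical one-line stand-ins for the two versions of setup.py.
GENERATED = "import setuptools\nsetuptools.setup(name='demo')\n"
PATCHED = "from setuptools import setup\nsetup(name='demo')\n"

print(find_setup_calls(GENERATED))
print(find_setup_calls(PATCHED))
```

A tool that only looks for the bare form would report "no setup() call found" for the generated file, which matches the error we saw.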

Running the command, we get another error:

$ pipenv-setup sync --pipfile --dev

Error parsing setup.py: install_requires is not a list
can not perform sync

Let’s make our setup.py compatible with pipenv-setup by replacing the computed values of three arguments with literals:

  • install_requires should be a list literal
  • extras_require should be a dictionary literal
  • dependency_links should be a list literal

setup.py
from setuptools import setup
setup(
    ...
    install_requires=[],
    extras_require={},
    dependency_links=[],
    ...
    )
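The difference between the two versions is easy to see with ast.literal_eval from the standard library, which only accepts literal values. Again, this is a sketch of the idea under the assumption that the tool reads setup.py without executing it; the one-line setup() calls are simplified stand-ins for the real file.

```python
import ast

NOT_LITERAL = "<not a literal>"
WATCHED = ("install_requires", "extras_require", "dependency_links")


def literal_keywords(source):
    """Report the literal value of each watched keyword, or NOT_LITERAL."""
    report = {}
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        for keyword in node.keywords:
            if keyword.arg not in WATCHED:
                continue
            try:
                # literal_eval rejects names, calls and anything computed.
                report[keyword.arg] = ast.literal_eval(keyword.value)
            except ValueError:
                report[keyword.arg] = NOT_LITERAL
    return report


BEFORE = "setup(install_requires=requirements, extras_require={'dev': dev_requirements})"
AFTER = "setup(install_requires=[], extras_require={}, dependency_links=[])"

print(literal_keywords(BEFORE))
print(literal_keywords(AFTER))
```

In the first version the values are variables, so nothing can be read from the source alone; in the second, every argument is a plain empty container that a tool can safely fill in.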

Running again:

$ pipenv-setup sync --pipfile --dev

reformatted setup.py
setup.py was successfully updated
5 default packages from Pipfile synced to setup.py
13 dev packages from Pipfile synced to setup.py

🎉 SUCCESS! 🎉

Our setup.py is now in sync with our Pipfile.

Now every time someone installs our package, they will get the same dependencies as we do.
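Conceptually, the sync turns Pipfile entries into the requirement strings that setuptools understands. The sketch below shows that mapping for the Pipfile from this post. It is my own simplified version, not the pipenv-setup code, and it assumes Python 3.11+ for tomllib (or the tomli backport on older versions).

```python
try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:  # older interpreters: assumes `pip install tomli`
    import tomli as tomllib

PIPFILE = """
[packages]
requests = "==2.28.1"
pandas = "==1.5.2"

[dev-packages]
nbdev = "*"
pipenv-setup = "*"
vistir = "==0.6.1"
"""


def to_requirements(section):
    """Convert one Pipfile section into a list of requirement strings."""
    requirements = []
    for name, spec in section.items():
        if isinstance(spec, dict):
            # Table form, e.g. {version = "==1.0", extras = ["x"]}
            version = spec.get("version", "*")
            extras = spec.get("extras", [])
            if extras:
                name = f"{name}[{','.join(extras)}]"
        else:
            version = spec
        # "*" means "any version", so only the bare name is emitted.
        requirements.append(name if version == "*" else f"{name}{version}")
    return requirements


pipfile = tomllib.loads(PIPFILE)
install_requires = to_requirements(pipfile.get("packages", {}))
extras_require = {"dev": to_requirements(pipfile.get("dev-packages", {}))}

print(install_requires)
print(extras_require)
```

The sketch ignores things a real tool has to handle, such as environment markers and git or path dependencies.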

Installation with only the default dependencies:

pip install -e .

OR

pip install your_package_name # if you've already published your package

Installation with all the dependencies, including the dev ones:

pip install -e '.[dev]'
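Once the package is installed, you can check what it actually declares. importlib.metadata.requires returns the requirement strings from the package metadata, and anything that belongs to an extra carries an extra == "..." marker. A minimal sketch that splits the two groups; the SAMPLE list is made up to mirror the Pipfile from this post, and your_package_name is the same placeholder as above.

```python
import re
from importlib import metadata

EXTRA_MARKER = re.compile(r"""extra\s*==\s*['"]([^'"]+)['"]""")


def split_by_extra(requirements):
    """Split requirement strings into default ones and per-extra groups."""
    default, extras = [], {}
    for requirement in requirements or []:
        spec, _, marker = requirement.partition(";")
        match = EXTRA_MARKER.search(marker)
        if match:
            extras.setdefault(match.group(1), []).append(spec.strip())
        else:
            default.append(requirement.strip())
    return default, extras


def installed_requirements(package_name):
    """Return the declared requirements of an installed package, or []."""
    try:
        return metadata.requires(package_name) or []
    except metadata.PackageNotFoundError:
        return []


# Hypothetical metadata, shaped like what an installed package reports.
SAMPLE = [
    "requests==2.28.1",
    "pandas==1.5.2",
    'nbdev; extra == "dev"',
    'vistir==0.6.1; extra == "dev"',
]

print(split_by_extra(SAMPLE))
# For a real package: split_by_extra(installed_requirements("your_package_name"))
```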

Example of an updated setup.py (the package list reflects a larger Pipfile than the one shown above):

setup.py
    install_requires=["requests==2.28", "pandas", "bleak", "rich", "aranet4"],
    extras_require={
        "dev": [
            "black[jupyter]==22.12.0",
            "blacken-docs==1.12.1",
            "isort==5.10.1",
            "jupyter==1.*",
            "mypy",
            "nbdev==2.3.9",
            "nbqa==1.5.3",
            "pre-commit==2.20",
            "types-requests",
            "types-toml",
            "ipykernel",
            "pipenv-setup",
            "vistir==0.6.1",
        ],
    },
    dependency_links=[],

Pre-commit hook

Odds are you’ll forget to run pipenv-setup sync every time you add a new dependency.

To fix this, we can add a pre-commit hook to our project. This will run pipenv-setup sync every time we commit changes to our Pipfile.

First, install and activate pre-commit in your repo:

pipenv install pre-commit --dev

pre-commit install

Then, add the following to your .pre-commit-config.yaml file:

.pre-commit-config.yaml
repos:
  - repo: https://github.com/Madoshakalaka/pipenv-setup
    rev: v3.2.0
    hooks:
      - id: pipenv-setup
        args: [--dev, --pipfile]
        additional_dependencies: ["vistir==0.6.1"]

Now on every commit, pipenv-setup will check that your Pipfile and setup.py are in sync.

If you forget to run pipenv-setup sync --pipfile --dev, you’ll get an error and won’t be able to commit until you fix it.
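To get a feel for what "in sync" means, here is a simplified stand-in for such a check. It is not what pipenv-setup does internally: it only compares package names (normalized the way PEP 503 describes) and ignores versions, extras and markers. The function names are my own.

```python
import re


def normalize(name):
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_name(requirement):
    """Extract the project name from a string like 'black[jupyter]==22.12.0'."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    if match is None:
        raise ValueError(f"cannot parse requirement: {requirement!r}")
    return normalize(match.group(1))


def find_drift(pipfile_packages, install_requires):
    """Compare Pipfile package names with the names listed in setup.py."""
    declared = {normalize(name) for name in pipfile_packages}
    synced = {requirement_name(req) for req in install_requires}
    return {
        "missing_from_setup": sorted(declared - synced),
        "missing_from_pipfile": sorted(synced - declared),
    }


# pandas was added to the Pipfile, but setup.py was never re-synced.
drift = find_drift(
    {"requests": "==2.28.1", "pandas": "==1.5.2"},
    ["requests==2.28.1"],
)
print(drift)
```

A hook built on this idea would fail the commit whenever either list is non-empty.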

Extra reading: