Four Kitchens

How to write maintainable Python packages (for Node.js developers)


Full disclosure: I’m a Node.js developer. I’ve worked with Node.js — the open-source JavaScript runtime built on Chrome’s V8 engine — for years. I like it there. I’m very comfortable with package development within its robust (even if ever-changing) ecosystem of tools.

So when I had reason recently to write a Python package, I admit I felt a little lost. Where was eslint? What about prettier? And how was I supposed to run unit tests? Linting, automatic formatting, testing, static typing, and so on: these are all essential components of maintainable package development. Surely there were solutions to be found in Python’s similarly robust development community.

Good news! There were! With some exploration, I found the Python counterparts to all the Node.js tools I had come to love over the years. And now I’ll share them with you!

What’s a package?

But first, let’s make sure we’re all on the same page about packages.

A package is a collection of code files that work together to perform a function or set of related functions.

If you’ve ever written a require or import statement to use someone else’s code in your own, then you’ve likely used packages. Packages are usually distributed and installed through some kind of package manager and repository. In Node.js, that’s npm or Yarn. In Python, the most popular pairing is pip (the installer) and PyPI (the repository).

Packages should be maintainable

The whole point of a package is that it’s bundled up for easy re-use. Hopefully your package is popular and gets used many many times over! However, that would mean at some point, you’re likely to need to fix some bugs, add new features, or adjust to changing requirements. Hard experience has taught many of us that over time, your packaged code will become more and more fragile, and more and more difficult to change without breaking something else. 

The Python tools I’m about to discuss — and their Node.js counterparts — are great for helping you keep your code clean and correct. I’m not going to share every possible tool for each job, but these are the ones that I’ve used and loved for the tasks that will help maintain your package for the long haul.

Manifest

A manifest is the first essential component of any package. It is a file that contains important information about your package like name, description, and version.

Manifests provide a standardized way for other software to interact with your package. A package manager or repository, like npm or PyPI, uses a package manifest to publish information about your package and to enable other people to install and use your package in their own projects.

In Node.js: The manifest is the package.json file. It handles everything from metadata, to scripts, to package dependencies.

In Python: The corresponding file is named setup.py. Package dependencies, however, are often listed in separate files like requirements.txt, or requirements-dev.txt for development dependencies. That makes it easy to install dependencies in a development environment with `pip install -r requirements.txt`. Any dependencies that must be installed whenever pip installs your package need to be specified in the `install_requires` argument to the `setuptools.setup()` function.
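As a rough sketch of how those pieces can fit together, the snippet below turns requirements.txt-style text into a list suitable for `install_requires` and assembles the keyword arguments for `setuptools.setup()`. The package name, version, description, and both helper functions are hypothetical, made up for this illustration, and reusing requirements.txt for `install_requires` is just one convention among several.

```python
def parse_requirements(text):
    """Return requirement specifiers from requirements.txt-style text."""
    requirements = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and pip options such as "-r other.txt".
        if not line or line.startswith("#") or line.startswith("-"):
            continue
        requirements.append(line)
    return requirements


def build_setup_kwargs(requirements_text):
    """Assemble keyword arguments for setuptools.setup() (all values hypothetical)."""
    return {
        "name": "example-package",
        "version": "0.1.0",
        "description": "A short summary of what the package does.",
        "packages": ["example_package"],
        "install_requires": parse_requirements(requirements_text),
    }


# In a real setup.py, the file would typically end with something like:
#
#     from setuptools import setup
#     with open("requirements.txt") as handle:
#         setup(**build_setup_kwargs(handle.read()))
```

The `setup()` call is left as a comment so the sketch can be read and run without triggering a build.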

Unit tests

Probably the most important component after a manifest, unit testing is a code-testing methodology in which tests are written to ensure the correct execution of a small unit within your package — a function, say, or a module — in isolation. Unit tests provide a high degree of assurance that each small part of your package is performing its intended task. These tests often uncover edge-cases and hidden requirements that you might otherwise miss.

In Node.js: My go-to unit test runner is Jest. This JavaScript testing framework takes the batteries-included approach to running tests, providing mocking and spying utilities, coverage reports, live-reloading of your test suite, and more.

In Python: I ended up using not one but three packages to get a similar unit test experience:

pytest works a lot like Jest, but it’s much less opinionated and doesn’t do everything out of the box like Jest does. For example, if you want coverage reports, you’ll also need pytest-cov. And if you want live-reloading, you’ll need pytest-watch.
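To make that concrete, here is a minimal sketch of what a pytest-style test file can look like. The `slugify` function and the test names are hypothetical, invented for this illustration; the only pytest conventions relied on are that it collects functions whose names start with `test_` and treats a failing `assert` as a failing test.

```python
import re


def slugify(title):
    """Turn a title into a lowercase, hyphen-separated slug (hypothetical helper)."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)


def test_slugify_joins_words_with_hyphens():
    assert slugify("Hello World") == "hello-world"


def test_slugify_drops_punctuation():
    assert slugify("Node.js, meet Python!") == "node-js-meet-python"


def test_slugify_handles_empty_string():
    assert slugify("") == ""
```

Saved as, say, test_slugify.py, running `pytest` should pick these tests up. With pytest-cov installed, `pytest --cov` adds a coverage report, and pytest-watch provides a `ptw` command that reruns the suite when files change.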

Linting

We all make mistakes. Some of them are really easy for computers to catch. This is where a linter comes in.

A linter analyzes the code in your package to identify common known mistakes and to enforce code quality or style rules. Working on a code base when multiple engineers are contributing wildly different code is no fun. Linters help you and your code have fun.

In Node.js: ESLint is my favorite linter. It’s popular, easy to configure, and runs with multiple parsers (e.g., Flow, TypeScript).

In Python: I discovered that Flake8 scratches all the same itches as ESLint. And it was easy to get running. Not much else to say — but in this case, no news really is good news.
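For a feel of the kind of mistake a linter catches without ever running your code, here is a toy check for unused imports built on Python’s standard `ast` module. This is only an illustration of static analysis in general; it is not how Flake8 is implemented, and the function name is made up for this sketch.

```python
import ast


def find_unused_imports(source):
    """Return (line, name) pairs for imported names that are never referenced."""
    tree = ast.parse(source)
    imported = {}  # bound name -> line number of the import
    used = set()   # every bare name referenced anywhere in the module

    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import a.b" binds the top-level name "a".
                name = alias.asname or alias.name.split(".")[0]
                imported[name] = node.lineno
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                if alias.name != "*":  # star imports bind unknown names
                    imported[alias.asname or alias.name] = node.lineno
        elif isinstance(node, ast.Name):
            used.add(node.id)

    return sorted((line, name) for name, line in imported.items() if name not in used)


example = "import os\nimport sys\n\nprint(sys.argv)\n"
print(find_unused_imports(example))
```

Flake8 reports this class of problem (and many more) for you, so in practice you would run `flake8` over your package rather than write checks like this by hand.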

Automatic code formatting

Have you ever had a fruitless debate with a fellow engineer about indentation or single quotes versus double? Stop wasting time: Let the computers decide! Actually, just let the computers fix things without asking!

Automatic code formatting is a bit like linting, except it’s often even more opinionated about nitpicky stylistic rules. And it changes your code automatically on file save, commit, or really any other event you choose and configure. Just write your code however you want and it’ll automatically be formatted like the rest of the code base. Now we’ve really ramped up the fun!

In Node.js: I use Prettier to never make another formatting decision again.

In Python: I use Black to leave the formatting up to the machines. They know best!
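Here is a small sketch of what that looks like in practice. The `after` string below is hand-written in the style Black typically produces (double quotes, normalized spacing); it is not captured tool output. The `same_ast` helper is hypothetical, and simply shows that the two versions parse to the same syntax tree, which is the property that makes automatic formatting safe to apply without review.

```python
import ast

# Code as someone might type it in a hurry.
before = "def greet( name,greeting = 'Hello' ):\n    return greeting+', '+name\n"

# The same code, written by hand in Black's usual style.
after = 'def greet(name, greeting="Hello"):\n    return greeting + ", " + name\n'


def same_ast(first, second):
    """Return True if two source strings parse to identical syntax trees."""
    return ast.dump(ast.parse(first)) == ast.dump(ast.parse(second))


# Formatting changed the presentation only, not the program.
print(same_ast(before, after))
```

To format a real project, running `black .` from the project root rewrites the files in place.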

Static types

Right up there with unit tests, static types are one of the primary lines of defense against bugs in my code. Static types provide type safety not only within your package, but also for all the people who write code that depends on your package.

A little explanation: Both JavaScript and Python are dynamically typed languages, meaning that types are associated with values (e.g., variables, function arguments and returns) in your code during execution. This approach might enable you to write code more quickly, and it’s certainly more flexible, but in my experience it can be a little too flexible. 

By contrast, static types are associated with values at compile time. So when using a statically typed language, you write type definitions along with your code. If you try to do something in your code that would violate your type definitions, you get a build or compile error. This might sound like a pain, but it’s better to feel that pain upfront — before you’ve shipped broken code — than to spend your weekend debugging.

In Node.js: As I mentioned, both JavaScript and Python are dynamically typed. So how do you include static types in your JavaScript package? The answer is compilation! TypeScript is actually a different programming language; it’s JavaScript with static types included. After you write TypeScript, you use the TypeScript compiler to validate types and transform your source code into pure JavaScript, which then serves as the package’s executable code.

In Python: You don’t need to compile your code to take advantage of static type checking. Mypy reads the type annotations (type hints) in standard Python 3 programs and checks them for consistency. No compilation required!
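As a minimal sketch, here is what annotated Python can look like. The function and data are hypothetical. The annotations are ignored at runtime, so the file runs as ordinary Python, while a type checker such as Mypy can use them to flag the commented-out call at the bottom.

```python
from typing import Dict, Optional


def find_user_name(user_id: int, users: Dict[int, str]) -> Optional[str]:
    """Look up a user's name, returning None when the id is unknown."""
    return users.get(user_id)


users = {1: "Ada", 2: "Grace"}

print(find_user_name(1, users))  # a valid call: int id, Dict[int, str] mapping

# A type checker should reject the next line, because "1" is a str and the
# annotation asks for an int. At runtime it would quietly return None instead.
# find_user_name("1", users)
```

Running `mypy` against the file performs the check; nothing about how the package is executed or shipped changes.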

And a few more for the road

Configuration, release management, and continuous integration/deployment are all necessary elements of maintainable packages. 

  • If you’re going to use all these other tools in your package, you’re also going to want to configure them! In Node.js, configuration is done in a number of different ways from the package.json file to [name].config.js files. Best bet is to follow the instructions for each individual tool. In Python, things seem to be a touch more standardized with configuration living in a setup.cfg file.
  • For release management in Node.js, I love semantic-release, a tool that automates the release workflow by evaluating commit messages and determining the appropriate version number bump according to the Semantic Versioning specification. It turns out someone built a Python implementation of the exact same tool! Check out python-semantic-release to enjoy the same automated release workflow in Python.
  • For continuous integration and deployment, the most popular services in both the Node.js and Python communities are Travis CI and CircleCI.
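To illustrate the idea behind commit-driven releases, here is a simplified sketch of deriving a version bump from commit messages. It assumes a commit convention in which `fix:` means a patch release, `feat:` a minor release, and a `!` after the type or the words `BREAKING CHANGE` a major release. The function names are hypothetical; this is not the API of semantic-release or python-semantic-release, which handle far more cases.

```python
import re

# Matches headers such as "fix: ...", "feat(api): ...", or "feat!: ...".
_HEADER = re.compile(r"^(?P<type>\w+)(\([^)]*\))?(?P<bang>!)?:")
_RANK = {"patch": 1, "minor": 2, "major": 3}


def bump_for(message):
    """Return "major", "minor", "patch", or None for a single commit message."""
    lines = message.splitlines()
    match = _HEADER.match(lines[0]) if lines else None
    if match is None:
        return None
    if match.group("bang") or "BREAKING CHANGE" in message:
        return "major"
    if match.group("type") == "feat":
        return "minor"
    if match.group("type") == "fix":
        return "patch"
    return None  # docs, chore, and so on do not trigger a release


def next_version(current, messages):
    """Apply the largest bump implied by the commits to a MAJOR.MINOR.PATCH string."""
    major, minor, patch = (int(part) for part in current.split("."))
    bumps = [bump for bump in map(bump_for, messages) if bump]
    if not bumps:
        return current
    biggest = max(bumps, key=_RANK.get)
    if biggest == "major":
        return f"{major + 1}.0.0"
    if biggest == "minor":
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"


print(next_version("1.2.3", ["fix: handle empty input", "feat: add CSV export"]))
```

The payoff of this approach is that the version number stops being a decision someone makes by hand: it follows mechanically from the history.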

Whether you’re writing packages in Node.js or Python, these tools are bound to make your life easier. Use them and create more maintainable packages!