
Python Code Best Practices

This page outlines the best practices for writing Python code in 2024.

You will learn how to expose public interfaces, format and lint your code, and add type hints.

Define public interfaces

Python codebases should expose end-user functionality via clearly defined public interfaces.

The public interfaces should be easy to import. Let's take a look at how this works in the pandas library, for example:

import pandas as pd

This single import gives access to public functions like pd.read_parquet and pd.read_csv.

Python codebases should document only the public interface. Don't make the mistake of structuring your code poorly and failing to define a clear public interface.

Python code autoformatting tools

Python codebases can be autoformatted with Black or Ruff.

Repos should contain clear instructions on how to autoformat the code via text editor integrations, pre-commit hooks, or CI. The project should be set up so that autoformatting is straightforward for all developers.

This saves the whole team from wasting effort on code style discussions and formatting nits in PR reviews.

Python code linting

Ruff is a good tool for linting code.

Code linters flag code that violates PEP 8 style conventions and can also catch likely bugs, such as unused imports or undefined names.

You should lint all your code and prevent new code from getting merged unless it is properly linted.
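The formatting and linting checks described above can be enforced with a small script that CI runs on every pull request. This is a sketch, not a definitive setup: the helper `run_checks` is hypothetical, and it assumes Ruff is installed and that the project is checked from the repository root.

```python
import shutil
import subprocess

# Commands assumed for a CI check; adjust paths and flags to your project.
CHECKS = [
    ["ruff", "format", "--check", "."],  # fails if any file would be reformatted
    ["ruff", "check", "."],  # fails if any lint rule is violated
]


def run_checks(commands: list[list[str]]) -> int:
    """Run each command and return how many of them failed."""
    failures = 0
    for command in commands:
        # Give a clear error if the tool itself is missing.
        if shutil.which(command[0]) is None:
            raise RuntimeError(f"{command[0]} is not installed or not on PATH")
        result = subprocess.run(command)
        if result.returncode != 0:
            failures += 1
    return failures
```

A CI job could call `sys.exit(run_checks(CHECKS))` so that the build fails whenever a check fails.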

Python type hints

Modern Python code should use type hints.

Here's an example of a function without type hints:

TODO

Here's the same function with type hints:

TODO

Type hints have a variety of advantages:

  • make it easier to invoke the function
  • let static type checkers such as mypy flag calls with improper arguments before the code runs
  • provide higher quality documentation for end users with clear inputs and outputs
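To make the difference concrete, here is a minimal sketch of one function written both ways. The function names and the pricing logic are hypothetical and chosen only for illustration.

```python
# Without type hints: the caller has to read the body to learn what is expected.
def total_price_untyped(prices, tax_rate):
    return sum(prices) * (1 + tax_rate)


# With type hints: the inputs and the output are documented in the signature.
def total_price(prices: list[float], tax_rate: float) -> float:
    return sum(prices) * (1 + tax_rate)
```

Python does not enforce type hints at runtime, so both versions behave the same when called. The benefit comes from tooling: a static type checker can report a call like `total_price("abc", 0.1)` as an error before the code ever runs.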

Python build tools

There are many popular build tools for Python projects like Poetry and TODO.

Poetry has nice features and encourages coding best practices.

  • users can specify different dependency groups
  • dependencies can be listed in the pyproject.toml file
  • the exact versions of all dependencies and transitive dependencies are specified in the poetry.lock file to allow for deterministic builds
  • developer quality-of-life features, like single commands to build wheel files and publish to PyPI

Using Poetry to properly specify dependencies is better than listing dependencies in a requirements.txt file for the following reasons:

  • a hand-maintained requirements.txt file usually doesn't pin exact versions of dependencies and transitive dependencies, so builds are not deterministic
  • requirements.txt files come with no quality-of-life helper commands

You should build your Python projects with a proper build tool.

Python documentation

You should have well-documented instructions to generate documentation for the public interface of your project.

The public facing documentation should provide examples and give users a solid understanding of the functionality provided by your library.

The best approach is a user guide written by humans that gives a high-level overview of the project, combined with programmatically generated API documentation that provides easily accessible details for each component of the public interface.

Limiting Python dependencies

You should limit the number of dependencies and transitive dependencies in your projects, especially if you're building a library.

Remember that if you depend on one library which in turn depends on 10 other libraries, then you have 11 dependencies, not just one.

Dependencies can cause dependency hell for end users, so you always need to thoroughly vet every dependency. You also need to analyze the transitive dependencies that are pulled in and make sure those projects are well maintained, for the long-term viability of your codebase.
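The arithmetic above can be checked with a short sketch that walks a dependency graph. The graph here is a hypothetical hand-written dictionary, not the output of any real packaging tool, and `transitive_dependencies` is an illustrative helper.

```python
def transitive_dependencies(graph: dict[str, list[str]], root: str) -> set[str]:
    """Return every package reachable from `root`, excluding `root` itself."""
    seen: set[str] = set()
    stack = list(graph.get(root, []))
    while stack:
        package = stack.pop()
        if package in seen:
            continue  # already counted, also guards against cycles
        seen.add(package)
        stack.extend(graph.get(package, []))
    seen.discard(root)
    return seen


# Hypothetical graph: the project depends on one library,
# which in turn depends on 10 other libraries.
graph = {
    "my-project": ["big-lib"],
    "big-lib": [f"dep-{i}" for i in range(10)],
}

total = len(transitive_dependencies(graph, "my-project"))  # 1 direct + 10 transitive
```

Here `total` is 11, even though the project declares a single direct dependency.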

Optional Python dependencies

Some dependencies in Python projects can be specified as optional if the functionality is only relevant for a subset of users.

Use optional dependencies whenever possible rather than requiring all users to install all dependencies, even when they're not needed.

pip has a convenient syntax for installing optional dependencies, which are known as extras. Here's an example of how to install Polars with the optional deltalake dependency:

pip install 'polars[deltalake]'

This optional dependency powers the pl.read_delta and pl.scan_delta functionality. It's only required for Polars users that want to read Delta Lake tables.
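On the library side, an optional dependency is typically imported only when the feature that needs it is used, with an error message that tells the user which extra to install. The helper below is a hypothetical sketch of that pattern and is not how Polars implements it.

```python
import importlib
import importlib.util
from types import ModuleType


def require_optional(module_name: str, package: str, extra: str) -> ModuleType:
    """Import an optional module, or raise an error naming the extra to install.

    `module_name` is expected to be a top-level module name.
    """
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(
            f"{module_name!r} is required for this feature. "
            f"Install it with: pip install '{package}[{extra}]'"
        )
    return importlib.import_module(module_name)
```

A function that needs the optional dependency would call `require_optional` at the start of its body, so users who never touch that feature never need the extra package installed.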

Supported Python versions

Your Python library should support Python versions applicable to your users.

You can't just support the latest Python version because that will make your code inaccessible to users running older versions of Python.

This unfortunately means that Python library developers generally can't use the latest features of Python (unless they've been backported to earlier Python versions).
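When a newer feature has an older equivalent, a library can pick between them based on the running interpreter version. The sketch below uses `functools.cache`, which was added in Python 3.9, and falls back to `functools.lru_cache(maxsize=None)` on older versions. The `fib` function is only there to exercise the decorator.

```python
import functools
import sys

# functools.cache exists from Python 3.9 onward; lru_cache(maxsize=None)
# behaves the same way and is available on older versions.
if sys.version_info >= (3, 9):
    cache = functools.cache
else:
    cache = functools.lru_cache(maxsize=None)


@cache
def fib(n: int) -> int:
    """Return the n-th Fibonacci number, memoizing intermediate results."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)
```

The rest of the codebase uses `cache` without caring which Python version is running, which keeps the version check in one place.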

Python code performance

TODO

Python code style

There are various Python code style guides that recommend an object-oriented or functional programming style, a maximum number of lines per function, and a maximum number of lines per file.

You can even quantify the code quality with tools like TODO.

Nice code style is preferable, but it's comparatively less important than a clean public interface, a proper number of dependencies, and code that meets performance needs.