PyData London 2025

Package Your Python Code as a CLI
06-06, 11:00–12:30 (Europe/London), Hardwick Hub

Learn how to transform your Python code into a command-line tool. Jeroen Janssens, author of Data Science at the Command Line, guides you through the process of turning your scripts into reusable, executable tools, integrating them into your data workflows and harnessing the power of the Unix command line.


If you're not sure whether this tutorial is for you, we recommend you watch Jeroen's talk Embrace the Unix Command Line and Supercharge Your PyData Workflow.

Note: This tutorial assumes that you're using macOS or a Linux distribution. If you're using Windows, please install WSL or a suitable Docker image.

As your Python scripts evolve, turning them into command-line tools makes them easier to reuse, easier to test, and more efficient to run. The Unix command line is a powerful environment designed for combining tools, running them in parallel, and processing large amounts of data.

This hands-on tutorial will cover:

  • The Unix philosophy and its relevance to data science
  • How to convert Python code into a command-line tool
    • Preparing your code for reuse
    • Parsing command-line arguments
    • Reading from standard input
    • Making your tool executable and adding help options
  • Best practices for designing command-line interfaces
  • Upgrading from sys.argv to argparse or Typer
  • Self-contained tools with uv
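
To give a flavour of the steps listed above, here is a minimal sketch using only Python's standard library. It is not the tool developed in the tutorial: the file name countword.py, the function names count_matches, build_parser, and main, and the task itself (counting lines that contain a word) are hypothetical and chosen purely for illustration. The sketch assumes the convention that a file argument of "-" means standard input.

```python
#!/usr/bin/env python3
"""Count the lines that contain a given word (hypothetical example tool)."""
import argparse
import sys


def count_matches(lines, word, ignore_case=False):
    """Return how many lines contain `word`.

    Kept free of any command-line handling so it can be imported
    and tested like ordinary Python code.
    """
    if ignore_case:
        word = word.lower()
        return sum(1 for line in lines if word in line.lower())
    return sum(1 for line in lines if word in line)


def build_parser():
    """Describe the interface; argparse adds -h/--help automatically."""
    parser = argparse.ArgumentParser(
        description="Count the lines that contain WORD."
    )
    parser.add_argument("word", help="text to search for")
    parser.add_argument(
        "file",
        nargs="?",
        default="-",
        help="file to read; '-' or omitted means standard input",
    )
    parser.add_argument(
        "-i", "--ignore-case",
        action="store_true",
        help="match regardless of upper or lower case",
    )
    return parser


def main(argv=None):
    """Parse arguments (sys.argv[1:] when argv is None) and print the count."""
    args = build_parser().parse_args(argv)
    if args.file == "-":
        # No file given: read from a pipe or the keyboard.
        count = count_matches(sys.stdin, args.word, args.ignore_case)
    else:
        with open(args.file, encoding="utf-8") as handle:
            count = count_matches(handle, args.word, args.ignore_case)
    print(count)
    return 0
```

To run this as a script, one would typically end the file with an `if __name__ == "__main__":` guard that calls `sys.exit(main())`, make the file executable with `chmod +x countword.py`, and then use it in a pipeline, for example `cat notes.txt | ./countword.py -i data`. Keeping the counting logic in its own function, separate from argument parsing, is what makes the code reusable and testable outside the command line.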

Throughout the tutorial, we’ll develop an actual command-line tool, starting with Python’s standard library and later incorporating additional libraries. This tutorial is ideal for developers and researchers looking to enhance their workflows. No prior Unix knowledge is needed; essential concepts will be covered.

Prior Knowledge Expected

No previous knowledge expected

Jeroen Janssens, PhD, is a Senior Developer Relations Engineer at Posit, PBC. His expertise lies in visualizing data, implementing machine learning models, and building solutions using Python, R, JavaScript, and Bash. He’s passionate about open source and sharing knowledge. He’s the author of Python Polars: The Definitive Guide (O’Reilly, 2025) and Data Science at the Command Line (O’Reilly, 2021). Jeroen holds a PhD in machine learning from Tilburg University and an MSc in artificial intelligence from Maastricht University. He lives with his wife and two kids in Rotterdam, the Netherlands.

Thijs Nieuwdorp is the Lead Data Scientist at Xomnia in Amsterdam. His interest in the interaction between humans and computers led him to study artificial intelligence at Radboud University, after which he dove straight into the field of data science. At Xomnia, he witnessed the birth of Polars when Ritchie Vink started working on it during his employment there, and Thijs has been using it in his projects ever since. He enjoys figuring out complex data problems, optimizing existing solutions, and putting them to good use by implementing them into business processes. Outside work, Thijs enjoys exploring our world through hiking and traveling and exploring other worlds through books, games, and movies. He lives in Amsterdam with his partner, Paula.