PyData London 2025

CUDA in Python: A New Era for GPU Acceleration
06-08, 11:45–12:30 (Europe/London), Grand Hall

We discuss bringing Python natively to the CUDA ecosystem. From low-level bindings to domain-specific applications, CUDA now supports Python standards and the wider Python ecosystem. New libraries include nvmath-python for managing optimized mathematics libraries, cccl-python for cooperative threading and device parallelism, cuda-core for managing the complete CUDA toolstack from Python with no need for C++, and numba-cuda for generating device-side kernels with integration of C++ device libraries and LTO IR.


CUDA has been accessible to Python developers for over a decade, but often through third-party abstractions that lag behind the latest CUDA releases. That is changing: over the next year, NVIDIA is making Python a first-class CUDA language.

In this talk, we’ll explore how Python programmers can leverage the CUDA platform today and how native Python support is evolving across the entire CUDA stack.

We begin with an overview of the CUDA programming model and how to manage accelerator devices as a core part of a Python application. Then, we dive into three practical examples:

Image Processing for Machine Learning Pipelines – Launching, executing, and streaming transformations directly from Python.
Neural Network Primitives – Implementing operations like softmax with blockwise parallelism.
High-Performance Deep Learning – Integrating with optimized libraries that leverage low-level, highly tuned CUDA kernels.

To showcase the power of these Python interfaces, we conclude with a hands-on demonstration: implementing GPT-2 (inspired by llm.c) entirely in Python, with performance nearly identical to its C counterpart.

Join us to discover the joy of CUDA from Python, and unlock new possibilities in GPU acceleration with a familiar, high-level language!


Prior Knowledge Expected

No previous knowledge expected

I lead CUDA Python Product Management, working to make CUDA a Python native.

I received my Ph.D. from the University of Chicago in 2010, where I built domain-specific languages to generate high-performance code for physics simulations with the PETSc and FEniCS projects. After spending a brief time as a research professor at the University of Texas and the Texas Advanced Computing Center, I have been a serial startup executive, including as a founding team member of Anaconda.

I am a leader in the Python open data science community (PyData). A contributor to Python's scientific computing stack since 2006, I am most notably a co-creator of the popular Dask distributed computing framework, the Conda package manager, and the SymPy symbolic computing library. I was a founder of the NumFOCUS foundation. At NumFOCUS, I served as the president and director, leading the development of programs supporting open-source codes such as Pandas, NumPy, and Jupyter.