06-07, 16:15–17:00 (Europe/London), Doddington Forum
Have you ever wondered how to find connections in your data and to gain insights from them?
Come discover how NetworkX makes this easy (and fast!).
This talk is broadly divided into two parts. First, we will talk about the power of graph analytics and how you can use tools like NetworkX to extract information from your data. Then we will talk about how we made the machinery behind NetworkX work with heterogeneous backends like GraphBLAS (CPU-optimized) and cuGraph (GPU-optimized).
Part I
NetworkX is the most popular library in Python for graph theory and applied network science thanks to its extensive API and beginner-friendly documentation. NetworkX is used "everywhere", because graphs are everywhere. Don't believe me? We surveyed more than 300 Python packages to understand how they use NetworkX in domains including geoscience, neuroscience, genomics, biology, chemistry, quantum computing, text and language, machine learning, causal inference, optimization, and more. We will summarize what we learned to help you apply graph analytics to your data.
Once you start using NetworkX on larger graphs, you will soon find that its pure-Python implementation becomes a roadblock to scalable graph analytics.
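As a small taste of what "extracting information from your data" can look like, here is a minimal sketch using standard NetworkX calls. The names and links in the edge list are made up purely for illustration and do not come from the survey.

```python
import networkx as nx

# Hypothetical data: each pair is one "worked together" link.
edges = [
    ("Ana", "Ben"),
    ("Ben", "Cat"),
    ("Cat", "Dan"),
    ("Ben", "Dan"),
    ("Eve", "Fay"),
]
G = nx.Graph(edges)

# Which groups of people are connected, directly or indirectly?
groups = [sorted(component) for component in nx.connected_components(G)]

# Who has the most direct connections?
most_connected = max(G.degree, key=lambda pair: pair[1])[0]

# What is the shortest chain of links between two people?
path = nx.shortest_path(G, "Ana", "Dan")

print(groups)
print(most_connected)
print(path)
```

The same three questions (groups, hubs, shortest chains) apply whether the nodes are people, genes, road junctions, or qubits.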
Part II
What should you do when your graph data becomes too large or NetworkX becomes too slow? Simple: use an accelerated NetworkX backend!
NetworkX 3.0 added the ability to dispatch to other implementations. This means you can call other highly tuned libraries through NetworkX and achieve speedups ranging from 100x to more than 10_000x! As "the API for graphs", NetworkX now makes it easy to accelerate your graph workflows on CPUs with GraphBLAS and on NVIDIA GPUs with nx-cugraph. Other backends are welcome, and we plan to support distributed graphs soon for extreme scalability 🚀🚀🚀
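To make the idea of dispatching concrete, here is a minimal sketch, not the talk's own demo code. It assumes a recent NetworkX release in which dispatchable functions accept a `backend=` keyword, and that the backend package (for example nx-cugraph, assumed here to register under the name "cugraph") is installed separately. The helper `betweenness` is a hypothetical wrapper written for this example; it falls back to the default pure-Python implementation when the requested backend is unavailable.

```python
import networkx as nx


def betweenness(G, backend=None):
    """Compute betweenness centrality, optionally on a named backend.

    Hypothetical helper for illustration: tries the requested backend
    and falls back to the default NetworkX implementation.
    """
    if backend is not None:
        try:
            # Assumption: this NetworkX version supports `backend=`.
            return nx.betweenness_centrality(G, backend=backend)
        except (ImportError, TypeError):
            # ImportError: the named backend is not installed.
            # TypeError: this NetworkX version has no `backend=` keyword.
            pass
    return nx.betweenness_centrality(G)


G = nx.path_graph(3)
# "no_such_backend" is deliberately not installed, so this falls back.
result = betweenness(G, backend="no_such_backend")
print(result)
```

With a GPU backend installed, the call would presumably look like `betweenness(G, backend="cugraph")`; the rest of the workflow stays the same, which is the point of dispatching.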
Outline:
10 mins - Introduction to the world of network data, modeling with NetworkX, and needs of graph data in the world.
10 mins - How do backends work? Trade-offs of using backends
10 mins - Live demos
No previous knowledge expected
I currently work at the European Spallation Source, where I make sure the data reduction pipelines process the experiment data correctly. I am also on the board of NumFOCUS, and I have been involved with various projects such as Scientific Python, NetworkX, and Econ-ARK. I am broadly interested in the development and maintenance of the open source data and science software ecosystem, and I try to help out wherever possible!