PyData Tel Aviv 2025

Talk Less, Graph More: NLP, Networks and Musicals
2025-11-05, ML+analytics

Hamilton isn't just a groundbreaking musical: it's a web of lyrical motifs, callbacks, high fives over our heads, and repeated phrases that echo across songs and acts.
In this talk, we explore how Natural Language Processing and graph theory can be combined to uncover the hidden structure of the corpus.


Lin-Manuel Miranda is a remarkably creative writer who uses recurring phrases and lyrics to subtly connect his songs. We're going to take this collection of lyrics and explore it through the lens of graph theory — partly for fun, and partly to see if we can build a framework for understanding how different parts of a story link together.
We will go over the process of breaking a song into tokens, using the ol' n-gram method and some translations to find the most important keywords, and then see how those keywords repeat themselves in a graph (using networkx).
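
To make the pipeline above concrete, here is a minimal sketch of one way it could look, not necessarily the approach taken in the talk. The song titles and texts are made-up placeholders rather than real lyrics, and the helper names (`tokenize`, `ngrams`, `shared_phrase_graph`) are hypothetical. Each song becomes a node, and two songs are linked when they share at least one n-gram.

```python
import re
from collections import defaultdict
from itertools import combinations

import networkx as nx

# Placeholder texts (assumption): stand-ins for lyrics, not real lines.
songs = {
    "song_a": "wait for the signal then wait for the signal again",
    "song_b": "we wait for the signal at dawn",
    "song_c": "nothing here repeats at all",
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, n):
    """Return the set of distinct n-grams (as tuples) in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def shared_phrase_graph(songs, n=3):
    """Link two songs whenever they share an n-gram.

    The edge weight counts the distinct n-grams the two songs share.
    """
    phrase_to_songs = defaultdict(set)
    for title, text in songs.items():
        for gram in ngrams(tokenize(text), n):
            phrase_to_songs[gram].add(title)

    graph = nx.Graph()
    graph.add_nodes_from(songs)
    for gram, titles in phrase_to_songs.items():
        for a, b in combinations(sorted(titles), 2):
            if graph.has_edge(a, b):
                graph[a][b]["weight"] += 1
                graph[a][b]["phrases"].append(" ".join(gram))
            else:
                graph.add_edge(a, b, weight=1, phrases=[" ".join(gram)])
    return graph


graph = shared_phrase_graph(songs, n=3)
for a, b, data in graph.edges(data=True):
    print(a, b, data["weight"], sorted(data["phrases"]))
```

Using sets of n-grams means a phrase repeated inside a single song counts once. That keeps the edge weight focused on links between songs rather than on repetition within one song.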

We will fold our graph a couple of times and see how a word graph and a visibility graph can be used.
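
As a rough illustration of the visibility graph idea, the sketch below builds a natural visibility graph from a numeric series. The abstract does not say which series the talk uses, so the input here is an assumption: a made-up list of how often some phrase appears in consecutive songs. Two positions are connected when the straight line between their values passes strictly above every value in between. The function name `natural_visibility_graph` is hypothetical.

```python
import networkx as nx


def natural_visibility_graph(series):
    """Build a natural visibility graph from a sequence of numbers.

    Nodes are positions in the series. Positions i < j are connected when
    every value between them lies strictly below the line joining
    (i, series[i]) and (j, series[j]).
    """
    graph = nx.Graph()
    graph.add_nodes_from(range(len(series)))
    for i in range(len(series)):
        for j in range(i + 1, len(series)):
            visible = all(
                series[k] < series[i] + (series[j] - series[i]) * (k - i) / (j - i)
                for k in range(i + 1, j)
            )
            if visible:
                graph.add_edge(i, j)
    return graph


# Made-up counts (assumption): occurrences of one phrase per song, in order.
counts = [3, 1, 2, 0, 4]
vg = natural_visibility_graph(counts)
print(sorted(vg.edges()))
```

This brute-force version checks every pair of positions and every value between them, which is fine for a few dozen songs but slow for long series.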


Prior Knowledge Expected:

No previous knowledge expected

Tal Mizrachi (a.k.a. Analysis Paralysis) is a data scientist, educator, and mentor on a mission to make data science, analytical thinking, and programming more approachable, and more fun than they already are. He loves helping people connect the dots, ask better questions, and build cool stuff with data. When he’s not wrangling datasets or teaching Python at TAU, Tal is usually hanging out with his wife Adi, their two daughters, and a very large black dog.