Have you ever wondered how Shazam identifies a song in just a few seconds, even in a noisy environment? In this talk, we'll dig into the core technology behind Shazam's magic: audio fingerprinting.
We'll explore how raw audio is processed with techniques like the Fast Fourier Transform (FFT) to create spectrograms, how prominent peaks are selected from those spectrograms to form compact audio "fingerprints", and how those fingerprints can be stored in a database and searched efficiently. This pipeline enables accurate music recognition from just a few seconds of input.
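As a taste of the first step, here is a minimal sketch of turning an audio file into spectrogram peaks with librosa and scipy. The function name, parameter values, and the -40 dB threshold are illustrative choices for this abstract, not values from any canonical Shazam implementation:

```python
import numpy as np
import librosa
from scipy.ndimage import maximum_filter

def spectrogram_peaks(path, n_fft=2048, hop_length=512, neighborhood=20):
    """Return (freq_bin, time_bin) pairs for local maxima of the spectrogram."""
    y, sr = librosa.load(path, sr=None, mono=True)
    # Magnitude spectrogram via the Short-Time Fourier Transform (an FFT per frame)
    spec = np.abs(librosa.stft(y, n_fft=n_fft, hop_length=hop_length))
    spec_db = librosa.amplitude_to_db(spec, ref=np.max)
    # A point is a peak if it equals the maximum over its local neighborhood
    local_max = maximum_filter(spec_db, size=neighborhood) == spec_db
    # Discard quiet points so silence does not produce spurious peaks
    peaks = np.argwhere(local_max & (spec_db > -40))
    return [(int(f), int(t)) for f, t in peaks]
```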
Through a step-by-step Python implementation, I’ll demonstrate how to build a simplified Shazam-like system using libraries such as librosa, numpy and scipy. You’ll see how to extract fingerprints, build a mini database of tracks, and recognize an unknown audio snippet, all in code.
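To preview the matching side, here is a sketch that hashes peak pairs into fingerprints, indexes a couple of tracks, and recognizes a snippet by voting on time-offset alignment. It builds on the `spectrogram_peaks` helper above; the file names, the `fan_out` pairing, and the hash layout are simplified assumptions rather than Shazam's actual scheme:

```python
from collections import Counter, defaultdict

def fingerprints(peaks, fan_out=5):
    """Pair each peak with a few successors: hash (f1, f2, dt) anchored at time t1."""
    peaks = sorted(peaks, key=lambda p: p[1])  # sort by time bin
    for i, (f1, t1) in enumerate(peaks):
        for f2, t2 in peaks[i + 1 : i + 1 + fan_out]:
            yield (f1, f2, t2 - t1), t1

# Index known tracks: hash -> list of (track_id, offset in the track)
database = defaultdict(list)
for track_id, path in [("track-a", "song_a.wav"), ("track-b", "song_b.wav")]:
    for h, t in fingerprints(spectrogram_peaks(path)):
        database[h].append((track_id, t))

def recognize(path):
    """Vote on (track, offset difference); the true match piles up aligned votes."""
    votes = Counter()
    for h, t_snippet in fingerprints(spectrogram_peaks(path)):
        for track_id, t_track in database.get(h, []):
            votes[(track_id, t_track - t_snippet)] += 1
    if not votes:
        return None, 0
    (track_id, _), count = votes.most_common(1)[0]
    return track_id, count
```

The key trick is voting on the *difference* between snippet time and track time: hashes from the correct song all agree on one offset, while coincidental hash collisions scatter their votes.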
This talk is ideal for developers interested in audio processing, real-world applications of signal processing, or reverse-engineering clever systems. No advanced math or audio background is needed, just curiosity and a love of music (and Python)!
