As AI becomes embedded in nearly every piece of software, user adoption will ultimately depend on the accuracy and reliability of these applications. As with any Machine Learning (ML) system, a meaningful evaluation framework is crucial to avoid structural biases in your data or models.
This talk identifies common pitfalls and illustrates them with real-world examples from nearly two decades of experience in the data science field. We explore the hidden story behind the metrics, moving beyond a single performance score to delve into the intricacies of the data set and its domain. We discuss how to detect artificial biases in your data and share strategies to prevent them through rigorous data collection and annotation practices.
The talk concludes with a list of practical recommendations for building ML projects on strong foundations, giving developers the knowledge and tools they need to turn ambitious AI ideas into reliable, production-ready solutions.
