The Personalization team makes deciding what to play next easier and more enjoyable for every listener. From Discover Weekly to AI DJ, we’re behind some of Spotify’s most-loved features. We built them by understanding the world of music and podcasts better than anyone else. Join us and you’ll keep millions of users listening by making great recommendations – and providing valuable context – to each and every one of them.
Do you want to help Spotify invent new personalized sessions with generative voice AI to delight users? In this role, you’ll work with Spotify’s Text-to-Speech (TTS) team, Speak, to create generated voice audio that enriches users’ experience of music and podcast recommendations.
What You'll Do
- Collaborate with a multidisciplinary team to optimize machine learning models for production use cases, ensuring they are highly efficient and scalable
- Design and build efficient serving infrastructure for machine learning models that supports large-scale deployments across different regions
- Optimize machine learning models in PyTorch or other libraries for real-time serving and production applications
- Lead the effort to transition machine learning models from research and development into production, working closely with researchers and machine learning engineers
- Build and maintain scalable Kubernetes clusters to manage and deploy machine learning models, ensuring reliability and performance
- Implement and monitor logging metrics, diagnose infrastructure issues, and contribute to an on-call schedule to maintain production stability
- Influence the technical design, architecture, and infrastructure decisions to support new and diverse machine learning architectures
- Collaborate with stakeholders to drive forward initiatives related to the serving and optimization of machine learning models at scale
Who You Are
- You have a passion for speech, audio, and/or generative machine learning
- You have world-class expertise in optimizing machine learning models for production use cases, and extensive experience with machine learning frameworks like PyTorch
- You are experienced in building efficient, scalable infrastructure to serve machine learning models, and in managing Kubernetes clusters in multi-region setups
- You have a strong understanding of how to bring machine learning models from research to production and are comfortable working with innovative, cutting-edge architectures
- You are familiar with writing logging metrics and diagnosing production issues, and are willing to take part in an on-call schedule to maintain uptime and performance
- You have a collaborative mindset and enjoy working closely with research scientists, machine learning engineers, and backend engineers to innovate and improve model deployment pipelines
- You thrive in environments that require solving complex infrastructure challenges, including scaling and performance optimization
- Experience with low-level machine learning libraries (e.g., Triton, CUDA) and performance optimization for custom components is a bonus
Where You'll Be
We offer you the flexibility to work where you work best! For this role, you can be located anywhere within the European region as long as we have a work location there. This team operates within the GMT/CET time zone for collaboration. France is excluded due to on-call restrictions.