Bridging the Semantic Gap in Cold-Start Music Recommendation through Multimodal Learning
| Thesis Type | Bachelor |
| Thesis Status | Currently running |
| Student | Tobias Iltchev |
| Thesis Supervisor | |
| Contact |
Music streaming platforms host millions of tracks, yet recommendation systems struggle with newly released songs because these have no interaction history to learn from. This thesis tackles the item cold-start problem by developing a deep multimodal framework that combines low-level audio representations (extracted from spectrograms via convolutional neural networks) with rich metadata such as lyrics, genre tags, and artist biographies to infer a track's relevance without prior listener data.
A key insight driving this work is that cold-start scenarios are rarely truly cold: the vast majority of new releases come from artists with an existing catalog. By reframing the problem as semi-cold recommendation and attending over an artist's collaborative history, the framework transfers knowledge from established entities to newly released items, going well beyond audio analysis alone.
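As a rough illustration of what "attending over an artist's collaborative history" could look like, the following minimal sketch uses a new track's content embedding as a query over the collaborative embeddings of the same artist's existing tracks. It is not the thesis implementation: the function names, the scaled dot-product scoring, and the assumption that content and collaborative embeddings share one vector space (in practice this would likely require learned projections) are all hypothetical.

```python
import numpy as np


def softmax(scores: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over a 1-D array of scores."""
    shifted = scores - np.max(scores)
    exp = np.exp(shifted)
    return exp / exp.sum()


def attend_over_catalog(content_emb: np.ndarray, catalog_embs: np.ndarray):
    """Estimate a collaborative embedding for a new track.

    content_emb:  (d,) content embedding of the new track (hypothetical,
                  e.g. fused audio and metadata features).
    catalog_embs: (n, d) collaborative embeddings of the artist's
                  existing tracks (hypothetical, e.g. from a warm model).

    Assumes both live in the same d-dimensional space.
    Returns the attention-weighted estimate (d,) and the weights (n,).
    """
    d = content_emb.shape[0]
    # Scaled dot-product similarity between the new track and each catalog track.
    scores = catalog_embs @ content_emb / np.sqrt(d)
    weights = softmax(scores)
    # Weighted average of the catalog's collaborative embeddings.
    return weights @ catalog_embs, weights


# Toy example: three catalog tracks in a 4-dimensional space.
catalog = np.eye(4)[:3]
new_track = 3.0 * catalog[1]  # content most similar to the second track
estimate, weights = attend_over_catalog(new_track, catalog)
```

In this toy setup the second catalog track receives the largest attention weight, so the estimate for the new release leans towards that track's collaborative embedding.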
The thesis evaluates whether this multimodal approach can narrow the performance gap between cold and warm items relative to popularity-based baselines, with a particular focus on long-tail artist discovery and more equitable recommendation ecosystems.