TEERS - Tool for the Experimental Evaluation of Recommender Systems
Music recommender systems need to be evaluated on user satisfaction because they constantly interact with users, yet most evaluation approaches use datasets without any ongoing user interaction. Such an approach is useful for comparing recommender algorithms in a laboratory scenario. Unfortunately, a system that performs well in the laboratory, where it is tested on a static dataset, can fail completely in production. This happens when the system cannot handle a constantly changing dataset (e.g., music charts change over time) and therefore fails to satisfy users. As user satisfaction is crucial, especially in the context of music recommender systems, this work presents an approach that gives scientists access to user feedback similar to what large companies such as Amazon, Google, Facebook, and Spotify already have. Feedback is gathered by observing users while they use the tool. This allows scientists to run evaluations on real user feedback and thereby obtain information about user satisfaction.
TEERS uses the publicly available metadata of music tracks provided by Spotify to present additional information about recommended tracks. Furthermore, different recommendation algorithms can be compared by informing the tool which algorithm is in use. Hence, TEERS offers a flexible platform for online evaluations of music recommender systems with good usability, as demonstrated in a case study.
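To illustrate the kind of track metadata such a tool can draw on, the following Python sketch extracts display fields from a Spotify track object. The field names follow the public Spotify Web API track schema; the helper function and the sample payload are illustrative assumptions, not part of TEERS itself:

```python
# Sketch: pick display metadata from a Spotify Web API track object.
# Field names follow Spotify's public track schema; the helper itself
# is a hypothetical illustration, not part of TEERS.

def track_display_info(track: dict) -> dict:
    """Select the fields a recommender UI would typically show."""
    return {
        "title": track["name"],
        "artists": ", ".join(a["name"] for a in track["artists"]),
        "album": track["album"]["name"],
        "duration_s": track["duration_ms"] // 1000,
    }

# Minimal sample payload in the shape returned by
# GET https://api.spotify.com/v1/tracks/{id}
sample_track = {
    "name": "Example Song",
    "artists": [{"name": "Example Artist"}],
    "album": {"name": "Example Album"},
    "duration_ms": 215000,
}

info = track_display_info(sample_track)
```

In a live deployment the payload would come from an authenticated request to the Spotify Web API rather than a hard-coded dictionary.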