Wikipedia-based Multi-Label Classification for Wikidata Property Prediction

Thesis Type Master
Thesis Status
Student Stefan Steinhauser
Thesis Supervisor
Research Field

Wikipedia has become one of the largest online encyclopedias and provides a massive amount of knowledge. Unfortunately, most of this information is stored as natural-language text, which makes it hard for computers to use. For that reason, Wikidata was introduced to store information from various Wikimedia Foundation projects in a structured format in one central place. Wikidata represents this information as triples of subject, property, and object.
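To make the triple format concrete, the following minimal Python sketch stores a few statements about the Wikidata item Q42 (Douglas Adams) and collects the set of properties used for that item. The `Triple` type and the `properties_of` helper are illustrative names, not part of any Wikidata API, and the abstract does not say how the thesis actually builds its training labels; treating the property set of an item as its label set is an assumption made here for illustration.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    """One Wikidata-style statement: subject, property, object (illustrative type)."""
    subject: str
    prop: str
    obj: str


# A few statements about Douglas Adams (Q42), written with Wikidata identifiers.
triples = [
    Triple("Q42", "P31", "Q5"),        # instance of: human
    Triple("Q42", "P21", "Q6581097"),  # sex or gender: male
    Triple("Q42", "P106", "Q36180"),   # occupation: writer
]


def properties_of(subject: str, statements: List[Triple]) -> List[str]:
    """Return the sorted, de-duplicated properties used for one subject."""
    return sorted({t.prop for t in statements if t.subject == subject})


# Under the assumption above, this property set would serve as the
# multi-label target for the text describing the item.
labels = properties_of("Q42", triples)
```

Because one item typically carries several properties at once, a target built this way is a set of labels rather than a single class, which is what makes the prediction task multi-label.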

This work focuses on developing a deep neural network that recommends Wikidata properties from natural-language text, using Wikipedia articles as training data. The resulting recommender system could support users in adding information while they edit a document, or guide them in adding semantic tags while they read one. Finally, the work presents an empirical evaluation of the recommender system and a comparison with other state-of-the-art approaches.
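The multi-label aspect of such a recommender can be illustrated with a short sketch of the decision step alone. It assumes, hypothetically, that a trained network emits one raw score per candidate property; each score is passed through an independent sigmoid, and every property whose probability reaches a threshold is recommended. The function name, the threshold of 0.5, and the example scores are invented for illustration and do not describe the model developed in the thesis.

```python
import math
from typing import Dict, List


def sigmoid(x: float) -> float:
    """Map a raw score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def recommend_properties(scores: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Return the properties whose independent probability reaches the threshold.

    Unlike a softmax, each property is scored independently, so any number
    of properties (including none) can be recommended for one text.
    """
    probs = {prop: sigmoid(score) for prop, score in scores.items()}
    chosen = [prop for prop, p in probs.items() if p >= threshold]
    # Most confident recommendations first.
    return sorted(chosen, key=lambda prop: probs[prop], reverse=True)


# Hypothetical raw scores for one input text (not real model output).
example_scores = {"P31": 3.2, "P106": 0.4, "P19": -1.5, "P569": 1.1}
recommended = recommend_properties(example_scores)
```

With these made-up scores, P31 (instance of), P569 (date of birth), and P106 (occupation) are recommended in that order, while P19 (place of birth) falls below the threshold.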