Evaluation of Text Segmentation Algorithms
Text segmentation is the task of dividing a given text into cohesive sections. The notion of cohesiveness varies and can be based on different properties, for example a shared topic or a shared writing style. If, for example, news articles of different genres (sports, politics, ...) are merged into one document, a text segmentation algorithm should be able to find the boundaries between the individual articles automatically.
The aim of this thesis is, on the one hand, to find existing segmentation algorithms through a literature search and to summarize them. On the other hand, the algorithms should be executed and evaluated. If implementations are available (the programming language does not matter), they can be used; if not, the algorithms should be reimplemented.
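The topic description does not name an evaluation measure. As one possible starting point, the sketch below computes the Pk error (Beeferman et al., 1999), a measure commonly used to compare a predicted segmentation against a reference. The choice of Pk, the function names, and the representation of a segmentation as a list of segment lengths (counted in sentences, for instance) are assumptions made for illustration only.

```python
def segment_ids(lengths):
    """Expand segment lengths, e.g. [2, 3], into per-unit ids [0, 0, 1, 1, 1]."""
    return [seg for seg, n in enumerate(lengths) for _ in range(n)]


def pk(reference, hypothesis, k=None):
    """Pk error between two segmentations given as lists of segment lengths.

    A probe of width k slides over the text. At each position it checks
    whether its two ends fall into the same segment, once for the reference
    and once for the hypothesis. Pk is the fraction of positions where the
    two segmentations disagree (0.0 means no disagreement).
    """
    ref = segment_ids(reference)
    hyp = segment_ids(hypothesis)
    if not ref or len(ref) != len(hyp):
        raise ValueError("segmentations must be non-empty and cover the same units")
    if k is None:
        # Conventional choice: half the average reference segment length.
        k = max(1, round(len(ref) / len(reference) / 2))
    windows = len(ref) - k
    if windows <= 0:
        raise ValueError("window size k must be smaller than the text length")
    errors = sum(
        (ref[i] == ref[i + k]) != (hyp[i] == hyp[i + k])
        for i in range(windows)
    )
    return errors / windows


if __name__ == "__main__":
    # Two merged articles of three sentences each; the hypothesis misses the border.
    print(pk([3, 3], [3, 3], k=2))  # 0.0
    print(pk([3, 3], [6], k=2))     # 0.5
```

A lower value means the hypothesis agrees more closely with the reference. Other measures, such as WindowDiff, could be substituted in the same way, and the thesis may well settle on a different one.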