Intrinsic Plagiarism Detection and Authorship Analysis

Today more and more text documents are made publicly available through large text collections or literary databases. As recent events show, detecting plagiarism in such collections is becoming considerably more important. To address this problem, we propose the Plag-Inn algorithm in several variants, each of which attempts to expose plagiarism in a text document by analyzing the grammar of its authors and finding significant stylistic differences within that single document.

The algorithms have also been adapted to the related tasks of Authorship Attribution, Author Profiling, and Multi-Author Decomposition. Given a previously unseen text document, Authorship Attribution aims to predict its correct author, whereas Author Profiling aims to extract meta information such as the gender or age of the writer. Finally, collaboratively written documents can be automatically decomposed and their parts clustered by distinct writer. All approaches reuse the idea of inspecting the grammatical syntax of sentences.
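
As a rough illustration of the intrinsic idea described above (flagging sentences whose style deviates markedly from the rest of the same document), the following minimal Python sketch may help. It is not the Plag-Inn algorithm: Plag-Inn analyzes grammar trees, whereas this sketch substitutes a few crude surface features (sentence length, comma rate, function-word rates) so that it runs without a parser. The names `FUNCTION_WORDS`, `sentence_features`, `split_sentences`, `suspicious_sentences` and the threshold parameter `k` are all hypothetical and chosen only for this example.

```python
import re
import statistics

# Assumption: a tiny function-word list used as a crude stand-in for the
# grammar-based features of Plag-Inn, which this sketch does NOT implement.
FUNCTION_WORDS = ("the", "of", "and", "to", "in", "that", "which", "is")


def sentence_features(sentence):
    """Map a sentence to a small numeric style vector."""
    words = re.findall(r"[A-Za-z']+", sentence.lower())
    n = max(len(words), 1)
    vector = [len(words), sentence.count(",") / n]
    vector += [words.count(w) / n for w in FUNCTION_WORDS]
    return vector


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def suspicious_sentences(text, k=2.0):
    """Return indices of sentences whose style deviates from the document.

    A sentence is flagged when its distance to the document's mean style
    vector exceeds mean + k * standard deviation of all such distances.
    """
    vectors = [sentence_features(s) for s in split_sentences(text)]
    if len(vectors) < 3:
        return []  # too little text to estimate a document-wide style

    # Standardize each feature so no single dimension dominates the distance.
    dims = list(zip(*vectors))
    means = [statistics.fmean(d) for d in dims]
    stds = [statistics.pstdev(d) or 1.0 for d in dims]  # guard against zero

    # Euclidean distance of every sentence to the mean in standardized space.
    dists = [
        sum(((x - m) / s) ** 2 for x, m, s in zip(v, means, stds)) ** 0.5
        for v in vectors
    ]
    threshold = statistics.fmean(dists) + k * statistics.pstdev(dists)
    return [i for i, d in enumerate(dists) if d > threshold]
```

Calling `suspicious_sentences(text)` returns the zero-based positions of sentences that stand out stylistically. In a grammar-based variant, `sentence_features` would presumably be replaced by features derived from each sentence's parse tree, while the outlier step could stay similar.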

Publications

2014

Michael Tschuggnall and Günther Specht: Automatic Decomposition of Multi-Author Documents Using Grammar Analysis. In Proceedings of the 26th GI-Workshop on Foundations of Databases (Grundlagen von Datenbanken) (GvDB 2014), October 2014, Ritten, Italy, CEUR-WS.org, Volume 1313, pages 17-22, 2014

Michael Tschuggnall and Günther Specht: What Grammar Tells About Gender and Age of Authors. In Proceedings of the 4th International Conference on Advances in Information Mining and Management (IMMM 2014), July 2014, Paris, France, pages 30-35, 2014

Michael Tschuggnall and Günther Specht: Enhancing Authorship Attribution By Utilizing Syntax Tree Profiles. In Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2014), volume 2: Short Papers, April 2014, ACL, Gothenburg, Sweden, pages 195-199, 2014.

2013

Michael Tschuggnall and Günther Specht: Countering Plagiarism by Exposing Irregularities in Authors Grammars. In Proceedings of the European Intelligence and Security Informatics Conference (EISIC 2013), August 12-14, 2013, Uppsala, Sweden, IEEE, pages 15-22, 2013

Michael Tschuggnall and Günther Specht: Using Grammar-Profiles to Intrinsically Expose Plagiarism in Text Documents. In Proceedings of the 18th International Conference of Natural Language Processing and Information Systems (NLDB 2013), Manchester, UK, June 2013, Springer, LNCS Volume 7934, pages 297-302, 2013

Michael Tschuggnall and Günther Specht: Detecting Plagiarism in Text Documents through Grammar-Analysis of Authors. In Proceedings of the 15. GI-Fachtagung Datenbanksysteme für Business, Technologie und Web (BTW 2013), March 11-15, 2013, Magdeburg, LNI, pages 241-259, 2013

Michael Tschuggnall and Günther Specht: Plag-Inn: Uncovering Plagiarism by Examining Author’s Grammar Syntax. In M. Barden, Alexander Ostermann (eds.): Scientific Computing @ uibk, innsbruck university press, pages 151-152, 2013

2012

Michael Tschuggnall and Günther Specht: Plag-Inn: Intrinsic Plagiarism Detection Using Grammar Trees. In Proceedings of the 17th International Conference of Natural Language Processing and Information Systems (NLDB 2012), Groningen, The Netherlands, June 2012, Springer, LNCS Volume 7337, pages 284-289, 2012