Authorship Attribution with Support Vector Machines

Diederich, Joachim, Kindermann, Jörg, Leopold, Edda and Paass, Gerhard (2003) Authorship Attribution with Support Vector Machines. Applied Intelligence, 19 1-2: 109-123. doi:10.1023/A:1023824908771

Author Diederich, Joachim
Kindermann, Jörg
Leopold, Edda
Paass, Gerhard
Title Authorship Attribution with Support Vector Machines
Journal name Applied Intelligence
ISSN 0924-669X
Publication date 2003
Sub-type Article (original research)
DOI 10.1023/A:1023824908771
Volume 19
Issue 1-2
Start page 109
End page 123
Total pages 15
Place of publication United States
Publisher Kluwer Academic Publishers
Language eng
Subject 280205 Text Processing
700103 Information processing services
Abstract In this paper we explore the use of text-mining methods for the identification of the author of a text. We apply the support vector machine (SVM) to this problem because it can cope with half a million inputs, requires no feature selection, and can process the frequency vector of all words of a text. We performed a number of experiments with texts from a German newspaper. With nearly perfect reliability the SVM was able to reject other authors, and it detected the target author in 60–80% of the cases. In a second experiment, we ignored nouns, verbs and adjectives and replaced them with grammatical tags and bigrams. This resulted in slightly reduced performance. Author detection with SVMs on full word forms was remarkably robust even if the author wrote about different topics.
Keyword support vector machines
authorship identification
Q-Index Code CX

Citation counts: Thomson Reuters Web of Science: cited 86 times
Scopus: cited 141 times
Created: Thu, 23 Aug 2007, 14:43:43 EST
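
The approach summarised in the abstract (a linear SVM applied directly to the frequency vector of all word forms, with no feature selection) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy English sentences, the function names, the hyperparameters, and the plain subgradient-descent training loop are hypothetical and are not taken from the paper, which worked with German newspaper texts and does not have its solver described in this record.

```python
from collections import Counter

def term_frequencies(text, vocabulary):
    """Relative frequency of every vocabulary word in the text (no feature selection)."""
    counts = Counter(text.lower().split())
    total = sum(counts.values()) or 1
    return [counts[word] / total for word in vocabulary]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=2000):
    """Batch subgradient descent on the L2-regularised hinge loss (illustrative only)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]   # gradient of the regulariser
        grad_b = 0.0
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1:  # example lies inside the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

# Hypothetical toy corpus: +1 marks the target author, -1 marks other authors.
target_texts = [
    "the market fell sharply and yet the market recovered",
    "and yet the council voted and yet nothing changed",
]
other_texts = [
    "however the team won however the coach resigned",
    "however prices rose however the market held",
]

# The vocabulary keeps every word form seen in training, function words included.
vocabulary = sorted({word for text in target_texts + other_texts for word in text.split()})
X = [term_frequencies(text, vocabulary) for text in target_texts + other_texts]
y = [1] * len(target_texts) + [-1] * len(other_texts)
w, b = train_linear_svm(X, y)

def score(text):
    """Signed distance-like score; larger values point towards the target author."""
    return dot(w, term_frequencies(text, vocabulary)) + b
```

Calling `score` on an unseen text gives a value that can be thresholded at zero to accept or reject the target author. A real experiment would need a far larger corpus, held-out evaluation, and a dedicated SVM solver.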