A Text Categorization Technique based on a Numerical Conversion of a Symbolic Expression and an Onion Layers Algorithm

Marios Poulos, Sozon Papavlasopoulos, Vasilios Chrissikopoulos

Abstract


The dramatic increase in the amount of content available in digital forms gives rise to large-scale digital libraries, targeted at millions of users. As a result, it has become a necessary to categorize large texts (documents). The paper develops a novel method where text categorization is achieved via a reduction in the original data information using numerical conversion of a symbolic expression and an onion layers algorithm. Three different semantic categories were considered and five texts selected from each category for submission to a text categorization procedure using the proposed method. The results and the statistical evaluation of this procedure showed that the proposed method may be characterized as highly accurate for text categorization purposes.

Full Text: HTML