This paper introduces the Text Pre-processor, a tool that integrates several text pre-processing tasks such as tokenization, part-of-speech tagging, and stop-word removal. These tasks are prerequisites for downstream text processing tasks such as sentiment analysis and text summarization. However, no one-stop solution exists for performing multiple text pre-processing tasks. The Text Pre-processor aims to fill this gap. The tool comprises five modules: a text editor, single-file processing, file-to-file processing, multiple-file processing, and splitting and merging of files. Informed by the technology acceptance model, a qualitative user study was conducted to evaluate the efficacy of the tool. Participants generally found the tool efficacious.
Natural language processing, Text processing, Text pre-processing, Text mining tool.
The authors would like to thank Mullick Kabirul Huda, Subhendu Maity, Subhajit Das, and Sk. Abdul Nasim for their help in this project.
