Designing a Python Based Text Pre-Processing Application for Text Classification

Hermawan Arief Putranto, Taufiq Rizaldi, Wahyu Kurnia Dewanto, Wahyu Pebrianto

ICoFA 2019 Conference

Politeknik Negeri Jember

Abstract

Text preprocessing is the first step that documents pass through in natural language processing. It converts human-language text into a machine-readable format for further processing. However, few dedicated text preprocessing applications are available, so each natural language processing study has had to write its own program code for the preprocessing phase. The main focus of this research is to create an integrated text preprocessing application that any researcher who needs it can access. The issues discussed in this study include the design, implementation, testing, and integration of each text preprocessing feature. The integrated preprocessing steps are case folding, tokenizing, stop word removal, and stemming. The tools used in this research are the Python NLTK library and the Django framework.
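The four steps named in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: the abstract states that the application uses NLTK and Django, whereas the code below uses only the Python standard library. The stop-word list, the suffix list, and all function names are hypothetical stand-ins, and the stemmer is crude suffix stripping rather than an established stemming algorithm.

```python
import re

# Hypothetical, deliberately tiny stop-word list; a real application would
# load a full list (for example, from a library such as NLTK).
STOP_WORDS = {"the", "is", "a", "an", "of", "and", "to", "in", "are"}

# Suffixes removed by the toy stemmer, checked in this order.
SUFFIXES = ("ing", "ed", "es", "s")


def case_fold(text):
    """Case folding: convert all characters to lowercase."""
    return text.lower()


def tokenize(text):
    """Tokenizing: split already-lowercased text into alphabetic tokens."""
    return re.findall(r"[a-z]+", text)


def remove_stop_words(tokens):
    """Stop word removal: drop tokens that appear in STOP_WORDS."""
    return [t for t in tokens if t not in STOP_WORDS]


def stem(token):
    """Toy stemming: strip one suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Run the four steps in the order listed in the abstract."""
    tokens = tokenize(case_fold(text))
    tokens = remove_stop_words(tokens)
    return [stem(t) for t in tokens]


if __name__ == "__main__":
    print(preprocess("The farmers are planting seeds in the fields."))
```

Keeping each step as a separate function mirrors the abstract's description of features that are designed and tested individually before being integrated; any single step could be swapped for a library-backed version without changing the others.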

Keywords: Case folding, Natural Language Processing, Stemming, Stopword Removal

Topic: Others (Related to food and agriculture)

Link: https://ifory.id/abstract-plain/7tuXg9rHp6b3

Corresponding Author: Hermawan Arief Putranto
