AASEC 2019 Conference

Index Group Documents Optimization Based on Automatic Clustering using Kmeans Genetic Algorithm: Case Study UMSIDA E-Prints Repository
Rohman Dijaya, Nindia Maftul Shintia Devi, Mohammad Alfan Rosyid

Universitas Muhammadiyah Sidoarjo


Abstract

UMSIDA Eprints is the repository of student and lecturer publications at the Muhammadiyah Sidoarjo University (UMSIDA). The document collection is still unorganized, and search matches only keywords in the title. As the culture of writing and research grows, the number of documents available as literature keeps increasing. Documents in Eprints are currently grouped by subjects that the repository manager provides and that the admin who uploads each document assigns. Documents can instead be grouped automatically by their content using an Information Retrieval (IR) approach. Each document is first tokenized, and the resulting tokens are stemmed to obtain the stem of each word. The stems are then indexed and weighted to produce the sentence indexes. The index values stored in the database serve as the variables, or features, that characterize each document. The indexes of all documents are clustered with the Kmeans Genetic Algorithm (GA-Kmeans), which measures document similarity as the proximity of each document's index to the cluster centroids. Each document is then assigned to the cluster whose centroid is closest.
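
The pipeline described in the abstract (tokenization, stemming, weighted indexing, then GA-Kmeans clustering) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not name the stemmer, the weighting scheme, or the genetic operators, so the toy suffix stripper, the TF-IDF weighting, the sum-of-squared-errors fitness, the one-point crossover, and the mutation rate below are all assumptions, as are every function and parameter name.

```python
import math
import random
import re
from collections import Counter

def tokenize(text):
    # Lowercase and keep alphabetic runs (assumed tokenizer).
    return re.findall(r"[a-z]+", text.lower())

def crude_stem(token):
    # Toy suffix stripper, a stand-in for whatever stemmer the authors used.
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def tfidf_index(docs):
    # Build one weighted vector per document (TF-IDF is an assumption).
    stems = [[crude_stem(t) for t in tokenize(d)] for d in docs]
    vocab = sorted({s for doc in stems for s in doc})
    df = Counter(s for doc in stems for s in set(doc))
    n = len(docs)
    vectors = []
    for doc in stems:
        tf = Counter(doc)
        vectors.append([tf[s] * math.log(n / df[s]) for s in vocab])
    return vocab, vectors

def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def assign(vectors, centroids):
    # Label each document with the index of its closest centroid.
    return [min(range(len(centroids)), key=lambda k: distance(v, centroids[k]))
            for v in vectors]

def sse(vectors, centroids):
    # Fitness (lower is better): squared distance to the assigned centroid.
    labels = assign(vectors, centroids)
    return sum(distance(v, centroids[l]) ** 2 for v, l in zip(vectors, labels))

def kmeans_step(vectors, centroids):
    # One K-means update: move each centroid to the mean of its members.
    labels = assign(vectors, centroids)
    updated = []
    for k, old in enumerate(centroids):
        members = [v for v, l in zip(vectors, labels) if l == k]
        if members:
            updated.append([sum(col) / len(members) for col in zip(*members)])
        else:
            updated.append(list(old))  # keep an empty cluster's centroid
    return updated

def ga_kmeans(vectors, k, pop_size=8, generations=20, seed=0):
    rng = random.Random(seed)
    # A chromosome is a list of k centroids, seeded from random documents.
    population = [[list(v) for v in rng.sample(vectors, k)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population = [kmeans_step(vectors, c) for c in population]
        population.sort(key=lambda c: sse(vectors, c))
        survivors = population[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, k) if k > 1 else 0
            child = [list(c) for c in a[:cut] + b[cut:]]  # one-point crossover
            if rng.random() < 0.2:  # mutation: reset one centroid to a document
                child[rng.randrange(k)] = list(rng.choice(vectors))
            children.append(child)
        population = survivors + children
    best = min(population, key=lambda c: sse(vectors, c))
    return best, assign(vectors, best)

docs = [
    "clustering documents with genetic algorithm",
    "genetic algorithm for clustering documents",
    "teaching methods in primary schools",
    "primary schools and teaching methods",
]
vocab, vectors = tfidf_index(docs)
centroids, labels = ga_kmeans(vectors, k=2)
print(labels)
```

Here the genetic search only explores different starting centroids, while each K-means step refines them locally; the final labels come from the closest-centroid rule described above. On a real repository, the toy stemmer would need to be replaced with one suited to the language of the documents.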

Keywords: Automatic Optimization, Eprints, Genetic Algorithm, Index, Information Retrieval

Topic: Computer Science

Link: https://ifory.id/abstract-plain/vB3XDFuKnErV

Corresponding Author: Rohman Dijaya