INCITEST 2019 Conference

Performance Comparison of K-Means and Jaccard for Document Similarity
Reinhart Simanjuntak

Master of Information System, Faculty of Post Graduate
Universitas Komputer Indonesia
Jln. Dipati Ukur No. 112-116, 40132, Bandung, Jawa Barat,
Indonesia
reinnatan[at]gmail.com


Abstract

Document clustering is a technique for grouping documents into a given number of clusters based on some notion of closeness between them. The rapid growth of the internet produces large volumes of high-dimensional data that must be grouped on some basis to enable efficient processing and organization. Blogs, e-commerce sites, and social networks, for example, use various clustering techniques for this purpose. The algorithms used in this paper are K-Means clustering and the Jaccard algorithm.
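
As an illustration of the "notion of closeness" mentioned above, the Jaccard coefficient of two sets A and B is |A ∩ B| / |A ∪ B|. The sketch below applies it to two documents treated as sets of lowercase word tokens. The tokenization rule and the function names `tokenize` and `jaccard_similarity` are assumptions made for this example; the abstract does not state how the paper preprocesses documents.

```python
import re


def tokenize(text):
    """Return the set of lowercase alphanumeric tokens in a document.

    Assumption: a simple regex split; the paper's preprocessing is not specified.
    """
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard_similarity(doc_a, doc_b):
    """Jaccard coefficient |A & B| / |A | B| over the two token sets."""
    a, b = tokenize(doc_a), tokenize(doc_b)
    if not a and not b:
        # Convention chosen here: two empty documents count as identical.
        return 1.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    # Shared tokens: {"the", "cat"}; union has 4 tokens.
    print(jaccard_similarity("the cat sat", "the cat ran"))
```

A score of 1.0 means the two token sets are identical and 0.0 means they share no tokens. A K-Means approach would instead need a numeric vector per document, such as term counts, which this sketch does not cover.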

Keywords: Document Similarity, Plagiarism, Clustering, K-Means, Jaccard

Topic: Informatic and Information System

Link: https://ifory.id/abstract-plain/pMj4YuJxBV2C

Corresponding Author: Reinhart Natanael Simanjuntak