The Implementation of Web Scraping and Weighting Term TF-IDF Method in Developing Job Search Information Systems
Pulut Suryati, Deborah Kurniawati, Edy Prayitno, Beny Fajar Riyanto
STMIK AKAKOM Yogyakarta
Abstract
Many websites specialize in providing job vacancy information. The vacancies published on each website are independent and have no relationship with those on other websites. As a result, users (job seekers) have to search for the desired job on every job vacancy website separately. This research combines job information from three websites into one integrated source. The information is collected with a web scraping method, and terms are weighted with Term Frequency - Inverse Document Frequency (TF-IDF). After the information from the three websites is combined, the words in each job title are indexed and stored in a database. TF is used to determine the weight of a word in a job title and its match with the search keyword, while IDF indicates how common or rare a term or keyword is across the indexed job titles. The result of this study is a prototype information system that displays job information from several website sources in an integrated manner, according to the keywords entered. Job seekers can therefore retrieve the job information they are searching for from all sources at once.
Keywords: job vacancies, scraping, TF-IDF
Topic: Informatics and Information Systems
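
The weighting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which TF-IDF variant is used, so the sketch assumes raw term counts for TF and log(N / df) for IDF. The function names and the sample job titles are hypothetical, and the scraping step is replaced by a hard-coded list that stands in for titles already collected from the source websites.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and split on whitespace; no stemming or stop words.
    return text.lower().split()


def build_index(titles):
    # One term-count table (TF) per job title.
    docs = [Counter(tokenize(title)) for title in titles]
    n_docs = len(docs)
    # Document frequency: in how many titles each term occurs.
    df = Counter()
    for doc in docs:
        df.update(doc.keys())
    # Assumed IDF variant: log(N / df). A term found in every title gets 0.
    idf = {term: math.log(n_docs / count) for term, count in df.items()}
    return docs, idf


def search(query, titles):
    # Score each title by the summed TF * IDF of the query terms it contains.
    docs, idf = build_index(titles)
    terms = tokenize(query)
    scored = []
    for title, tf in zip(titles, docs):
        score = sum(tf[term] * idf.get(term, 0.0) for term in terms)
        if score > 0:
            scored.append((score, title))
    # Highest score first; titles without any query term are left out.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [title for _, title in scored]


# Hypothetical sample data standing in for scraped job titles.
sample_titles = [
    "Java Developer",
    "Senior Java Programmer",
    "Marketing Staff",
    "Web Developer",
]
results = search("java developer", sample_titles)
print(results)
```

In this sketch a title that contains both query words ranks above titles that contain only one, and titles that share no word with the query are not returned.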