Distributed and Collaborative Web Change Detection System
Computer Science and Information Systems, Volume 12 (2015) no. 1.

See the article record provided by the source Computer Science and Information Systems website

Search engines use crawlers to traverse the Web, download web pages, and build their indexes. Keeping these indexes up to date is essential to the quality of search results. However, changes in web pages are unpredictable, so detecting a change as soon as possible after it happens, and at minimal computational cost, is a major challenge. In this article we present the Web Change Detection system, which in the best case is capable of detecting a change to a web page almost in real time. In the worst case it requires, on average, 12 minutes to detect a change on a web site with low PageRank and about one minute on a web site with high PageRank. By comparison, current search engines require more than a day, on average, to detect a modification in a web page (in both cases).
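To put the quoted latencies side by side, the short sketch below computes how many times faster the worst-case figures are than the search-engine baseline. It uses only the numbers in the abstract; treating "more than a day" as exactly 24 hours is an assumption, so the ratios are rough lower bounds rather than measured results, and the names `SEARCH_ENGINE_MIN`, `WCD_WORST_CASE_MIN`, and `speedup_lower_bounds` are illustrative.

```python
# Detection latencies quoted in the abstract, expressed in minutes.
# Assumption: "more than a day" is floored to exactly one day, so every
# ratio computed here is a lower bound, not a measured value.
SEARCH_ENGINE_MIN = 24 * 60

# Worst-case average latencies reported for the Web Change Detection system.
WCD_WORST_CASE_MIN = {
    "low PageRank": 12,   # "on average, 12 minutes"
    "high PageRank": 1,   # "about one minute"
}


def speedup_lower_bounds(baseline_min, latencies_min):
    """Return baseline / latency for each site class."""
    return {site: baseline_min / latency for site, latency in latencies_min.items()}


if __name__ == "__main__":
    ratios = speedup_lower_bounds(SEARCH_ENGINE_MIN, WCD_WORST_CASE_MIN)
    for site, ratio in ratios.items():
        print(f"{site}: at least {ratio:.0f}x faster")
```

Under that assumption, the worst-case figures correspond to detection at least 120 times faster on low PageRank sites and at least 1440 times faster on high PageRank sites.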
Keywords: Content refresh, Incremental crawling, Crawling systems, Search engines
@article{CSIS_2015_12_1_a5,
     author = {V{\'\i}ctor M. Prieto and Manuel {\'A}lvarez and V{\'\i}ctor Carneiro and Fidel Cacheda},
     title = {Distributed and {Collaborative} {Web} {Change} {Detection} {System}},
     journal = {Computer Science and Information Systems},
     publisher = {mathdoc},
     volume = {12},
     number = {1},
     year = {2015},
     url = {http://geodesic.mathdoc.fr/item/CSIS_2015_12_1_a5/}
}
Víctor M. Prieto; Manuel Álvarez; Víctor Carneiro; Fidel Cacheda. Distributed and Collaborative Web Change Detection System. Computer Science and Information Systems, Volume 12 (2015) no. 1. http://geodesic.mathdoc.fr/item/CSIS_2015_12_1_a5/