Algorithm of Web Page Similarity Comparison Based on Visual Block
Computer Science and Information Systems, Volume 16 (2019) no. 3.

See the article record from the source Computer Science and Information Systems website

Phishing pages often deceive users because their layout closely resembles that of the genuine pages, and they cause considerable losses to society. Consequently, detecting phishing sites has become an urgent task. By studying phishing web pages that use web page screenshots, we find that such pages rely on numerous screenshots to closely resemble the genuine page and to evade detection based on text and structure similarity. This study introduces a new similarity matching algorithm based on visual blocks. First, the RenderLayer tree of the web page is obtained to extract the visual blocks. Second, an algorithm is designed to tidy up the jumbled visual blocks, which includes deleting small visual blocks and merging overlapping ones. Finally, the similarity between the two web pages is assessed. The proposed algorithm sets different thresholds to achieve the optimal miss and false alarm rates.
Keywords: phishing, similarity comparison, visual block, web rendering
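The three steps named in the abstract (extract visual blocks, clean them up, compare two pages against a threshold) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the article: blocks are modeled as plain rectangles rather than RenderLayer nodes, the names Block, clean_blocks, page_similarity and looks_like_copy are hypothetical, the min_area and threshold values are arbitrary, and the average best intersection-over-union score is a stand-in for the similarity measure the authors actually define.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """A visual block as an axis-aligned rectangle (hypothetical model)."""
    x: float
    y: float
    w: float
    h: float

    @property
    def area(self) -> float:
        return self.w * self.h


def intersection_area(a: Block, b: Block) -> float:
    """Area shared by two blocks, or 0.0 if they do not overlap."""
    dx = min(a.x + a.w, b.x + b.w) - max(a.x, b.x)
    dy = min(a.y + a.h, b.y + b.h) - max(a.y, b.y)
    return dx * dy if dx > 0 and dy > 0 else 0.0


def iou(a: Block, b: Block) -> float:
    """Intersection over union of two blocks."""
    inter = intersection_area(a, b)
    union = a.area + b.area - inter
    return inter / union if union > 0 else 0.0


def bounding_box(a: Block, b: Block) -> Block:
    """Smallest rectangle that contains both blocks."""
    x0, y0 = min(a.x, b.x), min(a.y, b.y)
    x1 = max(a.x + a.w, b.x + b.w)
    y1 = max(a.y + a.h, b.y + b.h)
    return Block(x0, y0, x1 - x0, y1 - y0)


def clean_blocks(blocks, min_area=100.0):
    """Drop small blocks, then merge overlapping ones until none overlap.

    min_area is an assumed cutoff, not a value from the article.
    """
    kept = [b for b in blocks if b.area >= min_area]
    merged = True
    while merged:
        merged = False
        for i in range(len(kept)):
            for j in range(i + 1, len(kept)):
                if intersection_area(kept[i], kept[j]) > 0:
                    box = bounding_box(kept[i], kept[j])
                    rest = [b for k, b in enumerate(kept) if k not in (i, j)]
                    kept = rest + [box]
                    merged = True
                    break
            if merged:
                break
    return kept


def page_similarity(blocks_a, blocks_b) -> float:
    """Symmetric average of each block's best IoU on the other page.

    This is a stand-in measure chosen for the sketch.
    """
    if not blocks_a or not blocks_b:
        return 0.0
    a_to_b = sum(max(iou(a, b) for b in blocks_b) for a in blocks_a) / len(blocks_a)
    b_to_a = sum(max(iou(b, a) for a in blocks_a) for b in blocks_b) / len(blocks_b)
    return (a_to_b + b_to_a) / 2


def looks_like_copy(blocks_a, blocks_b, threshold=0.8) -> bool:
    """Flag two pages as visually similar; threshold is an assumed value."""
    return page_similarity(clean_blocks(blocks_a), clean_blocks(blocks_b)) >= threshold
```

In this sketch, raising the threshold would flag fewer pages (more misses, fewer false alarms) and lowering it would do the opposite, which is the trade-off the abstract refers to when it mentions setting different thresholds.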
@article{CSIS_2019_16_3_a7,
     author = {Xingchen Li and Weizhe Zhang and Desheng Wang and Bin Zhang and Hui He},
     title = {Algorithm of {Web} {Page} {Similarity} {Comparison} {Based} on {Visual} {Block}},
     journal = {Computer Science and Information Systems},
     publisher = {mathdoc},
     volume = {16},
     number = {3},
     year = {2019},
     url = {http://geodesic.mathdoc.fr/item/CSIS_2019_16_3_a7/}
}
Xingchen Li; Weizhe Zhang; Desheng Wang; Bin Zhang; Hui He. Algorithm of Web Page Similarity Comparison Based on Visual Block. Computer Science and Information Systems, Volume 16 (2019) no. 3. http://geodesic.mathdoc.fr/item/CSIS_2019_16_3_a7/