Proximity full-text search with response time guarantee by means of three component keys
Vestnik Ûžno-Uralʹskogo gosudarstvennogo universiteta. Seriâ Vyčislitelʹnaâ matematika i informatika, Volume 7 (2018) no. 1, pp. 60-77

See the article record from the source Math-Net.Ru

Searches for phrases and word sets in large text arrays by means of additional indexes are considered. A search result is a list of documents that contain the specified words. A document in which the query words occur near each other is considered more relevant. Such an approach requires storing one posting per occurrence of every word in a document. Some search systems use a list of stop words and exclude all information about stop words from the index, thus reducing search quality. In our paper we store information about all words to ensure search quality and build additional indexes for the most frequently used words. Use of the additional indexes may reduce the query processing time by an order of magnitude or more in comparison with standard indexes. A new index based on three-component keys is described. A new search algorithm is provided and the results of search experiments are given. The experiments show a 90-fold improvement in search time for a class of queries containing the most frequently used words, in comparison with a standard inverted file.
Keywords: full-text search, search engines, inverted files, additional indexes, proximity search.
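
The idea in the abstract can be illustrated with a small sketch. The abstract does not give the layout of the three-component keys, so everything below is an assumption made for illustration: the names `build_indexes`, `search_three_key`, `search_ordinary`, the parameter `max_distance`, and the choice to key the additional index by a sorted triple of frequent words that occur within `max_distance` positions of each other. It is not the author's implementation; it only shows why a lookup by a three-word key can avoid scanning the long posting lists of frequent words.

```python
from collections import defaultdict
from itertools import combinations, product


def tokenize(text):
    # Assumption: whitespace tokenization is enough for the illustration.
    return text.lower().split()


def build_indexes(docs, frequent_words, max_distance=5):
    """Build an ordinary positional index and a hypothetical three-key index.

    ordinary:  word -> list of (doc_id, position), one posting per occurrence
    three_key: sorted (w1, w2, w3) of frequent words -> set of doc_ids in which
               three occurrences of those words span at most max_distance
    """
    ordinary = defaultdict(list)
    three_key = defaultdict(set)
    for doc_id, text in enumerate(docs):
        words = tokenize(text)
        for pos, word in enumerate(words):
            ordinary[word].append((doc_id, pos))
        # Only occurrences of frequent words take part in the additional index.
        frequent = [(pos, w) for pos, w in enumerate(words) if w in frequent_words]
        for i, (start, first) in enumerate(frequent):
            nearby = [w for pos, w in frequent[i + 1:] if pos - start <= max_distance]
            for second, third in combinations(nearby, 2):
                three_key[tuple(sorted((first, second, third)))].add(doc_id)
    return ordinary, three_key


def search_three_key(three_key, query_words):
    """Answer a three-word proximity query with a single key lookup."""
    return sorted(three_key.get(tuple(sorted(query_words)), ()))


def search_ordinary(ordinary, query_words, max_distance=5):
    """Baseline: read every posting of every query word, then check proximity."""
    per_doc = defaultdict(lambda: defaultdict(list))
    for word in query_words:
        for doc_id, pos in ordinary.get(word, ()):
            per_doc[doc_id][word].append(pos)
    hits = []
    for doc_id, positions in per_doc.items():
        if len(positions) < len(set(query_words)):
            continue  # some query word is missing from this document
        if any(max(c) - min(c) <= max_distance for c in product(*positions.values())):
            hits.append(doc_id)
    return sorted(hits)


# Toy data, invented for the example.
docs = [
    "to be or not to be that is the question",
    "who are you and what do you want to be",
    "the quick brown fox",
]
frequent_words = {"to", "be", "or", "not", "the", "who", "are", "you",
                  "is", "that", "and", "what", "do"}
ordinary, three_key = build_indexes(docs, frequent_words)
print(search_three_key(three_key, ["to", "be", "not"]))
print(search_ordinary(ordinary, ["to", "be", "not"]))
```

Both searches return the same documents for a query of three distinct frequent words. The baseline has to read the full posting lists of all three words, while the three-key lookup reads one short list, at the cost of a larger index built in advance.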
@article{VYURV_2018_7_1_a4,
     author = {A. B. Veretennikov},
     title = {Proximity full-text search with response time guarantee by means of three component keys},
     journal = {Vestnik \^U\v{z}no-Uralʹskogo gosudarstvennogo universiteta. Seri\^a Vy\v{c}islitelʹna\^a matematika i informatika},
     pages = {60--77},
     publisher = {mathdoc},
     volume = {7},
     number = {1},
     year = {2018},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/VYURV_2018_7_1_a4/}
}
A. B. Veretennikov. Proximity full-text search with response time guarantee by means of three component keys. Vestnik Ûžno-Uralʹskogo gosudarstvennogo universiteta. Seriâ Vyčislitelʹnaâ matematika i informatika, Volume 7 (2018) no. 1, pp. 60-77. http://geodesic.mathdoc.fr/item/VYURV_2018_7_1_a4/