An efficient algorithm for three-component key index construction
    
    
  
  
  
      
      
      
        
Vestnik Udmurtskogo universiteta. Matematika, mehanika, kompʹûternye nauki, Volume 29 (2019) no. 1, pp. 117-132
    
  
  
  
  
  
    
      
      
        
      
      
      
See the article record from the source Math-Net.Ru
            
We consider proximity full-text search in large text collections. A search query consists of several words, and the search result is a list of documents containing these words. In a modern search system, documents in which the query words occur near each other are considered more relevant than other documents. To support this, the index must store a record for each occurrence of each word in each indexed document. The query search time is then proportional to the number of occurrences of the queried words in the indexed documents. Consequently, search systems commonly evaluate queries containing frequently occurring words much more slowly than queries containing less frequent, ordinary words. For each word in the text, we use additional indexes to store information about nearby words at distances from the given word of less than or equal to $MaxDistance$, which is a parameter. This parameter can take a value of 5, 7, or even more. Three-component key indexes can be created for faster query execution. Previously, we presented experimental results showing that, when queries contain very frequently occurring words, the average query execution time with three-component key indexes is 94.7 times less than with ordinary inverted indexes. In the current work, we describe a new algorithm for building three-component key indexes and prove its correctness. We present experimental results on index creation depending on the value of $MaxDistance$.
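To make the idea of a three-component key concrete, the following is a minimal, naive sketch and not the algorithm described in the article. It assumes that a key is a triple of words in which the second and third words both occur within `max_distance` positions of the first, and that a posting records the document identifier, the position of the first word, and the two relative offsets. The function name, the posting layout, and the choice to index every word (rather than only selected frequent words) are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_three_component_index(docs, max_distance):
    """Naive illustration of a three-component key index.

    docs: dict mapping a document id to its list of words.
    Returns a dict mapping (w1, w2, w3) to a list of postings
    (doc_id, position_of_w1, offset_of_w2, offset_of_w3).
    The posting layout is hypothetical, not taken from the article.
    """
    index = defaultdict(list)
    for doc_id, words in docs.items():
        for p, first in enumerate(words):
            # Positions within max_distance of p, excluding p itself.
            lo = max(0, p - max_distance)
            hi = min(len(words), p + max_distance + 1)
            neighbours = [q for q in range(lo, hi) if q != p]
            # Every pair of nearby positions yields one key; the second and
            # third components are taken in text order (q1 < q2).
            for q1, q2 in combinations(neighbours, 2):
                key = (first, words[q1], words[q2])
                index[key].append((doc_id, p, q1 - p, q2 - p))
    return index


docs = {1: ["to", "be", "or", "not", "to", "be"]}
index = build_three_component_index(docs, max_distance=2)
print(index[("to", "be", "or")])
```

Even in this toy form, the number of postings per word position grows roughly quadratically with `max_distance`, which suggests why the cost of index creation is studied as a function of $MaxDistance$.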
			
            
            
            
          
        
      
                  
                    
                    
                    
                    
                    
                      
Keywords: 
full-text search, search engines, inverted files, additional indexes, proximity search, three-component key indexes.
                    
                  
                
                
                @article{VUU_2019_29_1_a10,
     author = {A. B. Veretennikov},
     title = {An efficient algorithm for three-component key index construction},
     journal = {Vestnik Udmurtskogo universiteta. Matematika, mehanika, kompʹ\^uternye nauki},
     pages = {117--132},
     publisher = {mathdoc},
     volume = {29},
     number = {1},
     year = {2019},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/VUU_2019_29_1_a10/}
}
                      
                      
TY - JOUR
AU - A. B. Veretennikov
TI - An efficient algorithm for three-component key index construction
JO - Vestnik Udmurtskogo universiteta. Matematika, mehanika, kompʹûternye nauki
PY - 2019
SP - 117
EP - 132
VL - 29
IS - 1
PB - mathdoc
UR - http://geodesic.mathdoc.fr/item/VUU_2019_29_1_a10/
LA - ru
ID - VUU_2019_29_1_a10
ER -
%0 Journal Article
%A A. B. Veretennikov
%T An efficient algorithm for three-component key index construction
%J Vestnik Udmurtskogo universiteta. Matematika, mehanika, kompʹûternye nauki
%D 2019
%P 117-132
%V 29
%N 1
%I mathdoc
%U http://geodesic.mathdoc.fr/item/VUU_2019_29_1_a10/
%G ru
%F VUU_2019_29_1_a10
A. B. Veretennikov. An efficient algorithm for three-component key index construction. Vestnik Udmurtskogo universiteta. Matematika, mehanika, kompʹûternye nauki, Volume 29 (2019) no. 1, pp. 117-132. http://geodesic.mathdoc.fr/item/VUU_2019_29_1_a10/
