Tuning the Zhu-Takaoka string matching algorithm and experimental results
Kybernetika, Tome 38 (2002) no. 1, pp. 67-80 Cet article a éte moissonné depuis la source Czech Digital Mathematics Library

Voir la notice de l'article

In this paper we present experimental results for string matching algorithms which have a competitive theoretical worst case run time complexity. Of these algorithms a few are already famous for their speed in practice, such as the Boyer–Moore and its derivatives. We chose to evaluate the algorithms by counting the number of comparisons made and by timing how long they took to complete a given search. Using the experimental results we were able to introduce a new string matching algorithm and compared it with the existing algorithms by experimentation. These experimental results clearly show that the new algorithm is more efficient than the existing algorithms for our chosen data sets. Using the chosen data sets over 1,500,000 separate tests were conducted to determine the most efficient algorithm.
In this paper we present experimental results for string matching algorithms which have a competitive theoretical worst case run time complexity. Of these algorithms a few are already famous for their speed in practice, such as the Boyer–Moore and its derivatives. We chose to evaluate the algorithms by counting the number of comparisons made and by timing how long they took to complete a given search. Using the experimental results we were able to introduce a new string matching algorithm and compared it with the existing algorithms by experimentation. These experimental results clearly show that the new algorithm is more efficient than the existing algorithms for our chosen data sets. Using the chosen data sets over 1,500,000 separate tests were conducted to determine the most efficient algorithm.
Classification : 68P10, 68Q25, 68W32, 68W40
Keywords: string matching algorithm; efficiency
@article{KYB_2002_38_1_a3,
     author = {Berry, Thomas and Ravindran, Somasundaram},
     title = {Tuning the {Zhu-Takaoka} string matching algorithm and experimental results},
     journal = {Kybernetika},
     pages = {67--80},
     year = {2002},
     volume = {38},
     number = {1},
     mrnumber = {1899847},
     zbl = {1265.68362},
     language = {en},
     url = {http://geodesic.mathdoc.fr/item/KYB_2002_38_1_a3/}
}
TY  - JOUR
AU  - Berry, Thomas
AU  - Ravindran, Somasundaram
TI  - Tuning the Zhu-Takaoka string matching algorithm and experimental results
JO  - Kybernetika
PY  - 2002
SP  - 67
EP  - 80
VL  - 38
IS  - 1
UR  - http://geodesic.mathdoc.fr/item/KYB_2002_38_1_a3/
LA  - en
ID  - KYB_2002_38_1_a3
ER  - 
%0 Journal Article
%A Berry, Thomas
%A Ravindran, Somasundaram
%T Tuning the Zhu-Takaoka string matching algorithm and experimental results
%J Kybernetika
%D 2002
%P 67-80
%V 38
%N 1
%U http://geodesic.mathdoc.fr/item/KYB_2002_38_1_a3/
%G en
%F KYB_2002_38_1_a3
Berry, Thomas; Ravindran, Somasundaram. Tuning the Zhu-Takaoka string matching algorithm and experimental results. Kybernetika, Tome 38 (2002) no. 1, pp. 67-80. http://geodesic.mathdoc.fr/item/KYB_2002_38_1_a3/

[1] Apostolico A., Giancarlo R.: The Boyer–Moore–Galil string strategies revisited. SIAM J. Comput. 15 (1986), 1, 98–105 | DOI | MR

[2] Baeza–Yates R. A.: Improved string searching. Software – Practice and Experience 19 (1989), 3, 257–271 | DOI

[3] Boyer R. S., Moore J. S.: A fast string searching algorithm. Comm. ACM 23 (1977), 5, 1075–1091 | Zbl

[4] Charras C., Lecroq T.: Exact string matching, available at: http:--www</b> dir.univ-rouen.fr/ lecroq/string.ps (1998)

[5] Charras C., Lecroq T.: Exact string matching animation in JAVA available at: http:--www</b> dir.univ-rouen.fr/ charras/string/ (1998)

[6] Crochemore M., Czumaj A., Gasieniec L., Jarominek L., Lecroq T., Plandowski, W., Rytter W.: Speeding up two string matching algorithms. Algorithmica 12 (1994), 4, 247–267 | DOI | MR | Zbl

[7] Crochemore M., Lecroq T.: Tight bounds on the complexity of the Apostolico–Giancarlo algorithm. Inform. Process. Lett. 63 (1997), 4, 195–203 | DOI | MR

[8] Crochemore M., Rytter W.: Text Algorithms. Oxford University Press 1994 | MR | Zbl

[9] Hancart C.: Analyse exacte et en moyenne d’algorithmes de recherche d’un motif dans un texte. These de doctorat de l’Universite de Paris 7, 1993

[10] Horspool R. N.: Practical fast searching in strings. Software – Practice and Experience 10 (1980), 6, 501–506 | DOI

[11] Hume A., Sunday D.: Fast string searching. Software – Practice and Experience 21 (1991), 11, 1221–1248 | DOI

[12] Knuth D. E., Jr. J. H. Morris, Pratt V. R.: Fast pattern matching in strings. SIAM J. of Comput. 6 (1980), 1, 323–350 | MR

[13] Liu Z., Du X., Ishi N.: An improved adaptive string searching algorithm. Software – Practice and Experience 28 (1998), 2, 191–198 | DOI

[14] Melichar B.: Approximate string matching by finite automata. In: Computer Analysis of Images and Patterns (Lecture Notes in Computer Scince 970), Springer–Verlag, Berlin 1995, pp. 342–349

[15] Raita T.: Tuning the Boyer–Moore–Horspool string searching algorithm. Software – Practice and Experience 22 (1992), 10, 879–884 | DOI

[16] Smith P. D.: Experiments with a very fast substring search algorithm. Software – Practice and Experience 21 (1991), 10, 1065–1074 | DOI

[17] Smit G. V. de: A comparison of three string matching algorithms. Software – Practice and Experience 12 (1982), 57–66 | DOI | Zbl

[18] Sunday D. M.: A very fast substring search algorithm. Comm. ACM 33 (1990), 8, 132–142

[19] Rytter W.: A correct preprocessing algorithm for Boyer–Moore string searching. SIAM J. Comput. 9 (1980), 509–512 | DOI | MR | Zbl

[20] Zhu R. F., Takaoka T.: On improving the average case of the Boyer–Moore string matching algorithm. J. Inform. Process. 10 (1987), 3, 173–177 | Zbl