The structure-from-motion reconstruction pipeline – a survey with focus on short image sequences
Kybernetika, Vol. 46 (2010), no. 5, pp. 926-937. This article was harvested from the Czech Digital Mathematics Library.

The problem addressed in this paper is the reconstruction of an object in the form of a realistically textured 3D model from images taken with an uncalibrated camera. We focus especially on reconstructions from short image sequences. By describing an easy-to-use system that accomplishes this quickly and reliably, we survey all steps of the reconstruction pipeline. Developing a coherent reconstruction system requires integrating a number of different techniques, such as feature detection, algorithms of the RANSAC family, and methods for auto-calibration. We describe and review recent developments in distinct strands of these techniques. While developing our system, it became necessary to improve several steps of the state-of-the-art reconstruction pipeline. Two of these innovations are introduced in detail in this paper: an advanced SIFT-based feature detector and a two-stage RANSAC process that facilitates a faster selection of relevant object points. In addition, we give a recommendation regarding auto-calibration for short image sequences.
Classification: 68U05, 68U10
Keywords: structure from motion; feature detection; RANSAC; auto-calibration
@article{KYB_2010_46_5_a7,
     author = {H\"aming, Klaus and Peters, Gabriele},
     title = {The structure-from-motion reconstruction pipeline {\textendash} a survey with focus on short image sequences},
     journal = {Kybernetika},
     pages = {926--937},
     year = {2010},
     volume = {46},
     number = {5},
     mrnumber = {2778920},
     zbl = {1211.94006},
     language = {en},
     url = {http://geodesic.mathdoc.fr/item/KYB_2010_46_5_a7/}
}
Häming, Klaus; Peters, Gabriele. The structure-from-motion reconstruction pipeline – a survey with focus on short image sequences. Kybernetika, Vol. 46 (2010), no. 5, pp. 926-937. http://geodesic.mathdoc.fr/item/KYB_2010_46_5_a7/
