Brief description

We have proposed a method for tracking multiple skin-colored objects in images acquired by a possibly moving camera. The proposed method encompasses a collection of techniques that enable the modeling and detection of skin-colored objects as well as their temporal association in image sequences. Skin-colored objects are detected with a Bayesian classifier which is bootstrapped with a small set of training data. Then, an on-line iterative training procedure is employed to refine the classifier using additional training images. On-line adaptation of skin-color probabilities enables the classifier to cope with illumination changes. Tracking over time is realized through a novel technique which can handle multiple skin-colored objects. Such objects may move in complex trajectories and occlude each other in the field of view of a possibly moving camera. Moreover, the number of tracked objects may vary over time. A prototype implementation of the developed system operates on 320x240 live video in real time (30 Hz) on a conventional Pentium 4 processor.
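As a rough illustration of the detection step, the sketch below estimates the probability that a pixel is skin from color histograms using Bayes' rule, P(s|c) = P(c|s)P(s)/P(c), and shows one plausible way to blend an offline estimate with one gathered from recent frames. It is a minimal sketch and not the tracker's actual code: the names `SkinColorModel`, `quantize` and `blend_probabilities`, the RGB color space, the 8-bin quantization and the weight `gamma` are all assumptions made for the example.

```python
from collections import Counter


def quantize(pixel, bins=8):
    """Map an (r, g, b) pixel with 0-255 channels to a coarse histogram bin."""
    step = 256 // bins
    return tuple(channel // step for channel in pixel)


class SkinColorModel:
    """Histogram-based estimate of P(skin | color) via Bayes' rule (hypothetical)."""

    def __init__(self, bins=8):
        self.bins = bins
        self.skin_counts = Counter()  # color-bin counts over skin pixels only
        self.all_counts = Counter()   # color-bin counts over all training pixels
        self.n_skin = 0
        self.n_all = 0

    def train(self, labeled_pixels):
        """Accumulate counts from an iterable of ((r, g, b), is_skin) pairs."""
        for pixel, is_skin in labeled_pixels:
            key = quantize(pixel, self.bins)
            self.all_counts[key] += 1
            self.n_all += 1
            if is_skin:
                self.skin_counts[key] += 1
                self.n_skin += 1

    def p_skin_given_color(self, pixel):
        """Return P(s|c) = P(c|s) P(s) / P(c), or 0.0 for colors never seen."""
        key = quantize(pixel, self.bins)
        if self.all_counts[key] == 0 or self.n_skin == 0:
            return 0.0
        p_c_given_s = self.skin_counts[key] / self.n_skin
        p_s = self.n_skin / self.n_all
        p_c = self.all_counts[key] / self.n_all
        return p_c_given_s * p_s / p_c


def blend_probabilities(p_offline, p_recent, gamma=0.8):
    """Weighted mix of the offline estimate and one from recent frames.

    gamma is an assumed sensitivity parameter, not a value from the source.
    """
    return gamma * p_offline + (1.0 - gamma) * p_recent


model = SkinColorModel()
model.train([
    ((200, 150, 130), True),
    ((205, 155, 135), True),
    ((200, 150, 130), False),
    ((10, 10, 200), False),
])
p_skin = model.p_skin_given_color((201, 151, 131))
```

Calling `train` again with further labeled pixels refines the same histograms, which loosely mirrors the iterative training described above. Thresholding `p_skin_given_color` per pixel would then yield candidate skin regions for the tracker.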

The proposed 2D tracker has formed a basic building block for tracking multiple skin-colored regions in 3D. More specifically, we have developed a method that reports the 3D position of all skin-colored regions in the field of view of a potentially moving stereoscopic camera system. The prototype implementation of the 3D version of the tracker also operates at 30 fps.
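To give a sense of how a 3D position can be obtained from a region tracked in both views, the sketch below triangulates a matched centroid under simplifying assumptions: a rectified stereo pair, a pinhole camera model, and known calibration. The function name `triangulate_rectified` and all parameter names and values are hypothetical, and this is not necessarily how the 3D tracker computes positions.

```python
def triangulate_rectified(x_left, x_right, y, focal_px, baseline_m, cx, cy):
    """Recover (X, Y, Z) in meters for a point seen in a rectified stereo pair.

    x_left, x_right: column of the matched point in the left and right image.
    y: row of the point (identical in both images after rectification).
    focal_px: focal length in pixels; baseline_m: camera separation in meters.
    cx, cy: principal point in pixels. All of these are assumed to be known.
    """
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    z = focal_px * baseline_m / disparity   # depth from disparity
    x = (x_left - cx) * z / focal_px        # back-project the column
    y_3d = (y - cy) * z / focal_px          # back-project the row
    return (x, y_3d, z)


# Hypothetical centroid of a hand region matched in both 320x240 images.
centroid_3d = triangulate_rectified(
    x_left=180.0, x_right=160.0, y=120.0,
    focal_px=400.0, baseline_m=0.1, cx=160.0, cy=120.0,
)
```

Applying the same computation to the matched centroid in every frame would give a 3D trajectory over time.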

On top of this functionality, the tracker is able to deliver 3D contours of all skin-colored regions; this is performed at a rate of 22 fps.

An important aspect of the proposed tracker is that it can be trained on any desired color distribution, which can subsequently be tracked efficiently and robustly, with high tolerance to illumination changes.

Due to their robustness and efficiency, the proposed trackers have already been used as important building blocks in a number of diverse applications. More specifically, the 2D tracker has been employed for:

  • Tracking the hands of a person for human-computer interaction. Simple gesture recognition techniques applied to the output of the skin-colored regions tracker have resulted in a system that controls the mouse of a computer based on the visual interpretation of hand gestures. These gesture recognition techniques are based on finger detection in skin-colored regions corresponding to human hands. The developed demonstrator has been successfully employed in real-world situations where a human controls the computer during MS PowerPoint presentations.
  • Tracking color blobs in vision-based robot navigation experiments. The tracker has been trained on various (non-skin) color distributions to support angle-based robot navigation.

Moreover, the 3D tracker has been employed as a basic building block in the framework of a cognitive vision system developed within the EU-IST ActIPret project, whose goal is the automatic interpretation of the activities of people handling tools. In fact, the ActIPret project financially supported this research on 2D and 3D tracking. Although the ActIPret project has been successfully completed, work is still ongoing on several aspects of 2D and 3D tracking of skin-colored regions.

Sample results

Download video: A video showing the performance of the tracker in varying illumination conditions

2D tracking of human face and hands.

Finger detection on the tracked skin-colored regions.

Download video: A video of a detected and tracked hand

The estimated 3D trajectory of the centroid of the hand appearing in the video at the left. The developed hand tracker can also detect and reconstruct the 3D contour of the hand.


Download video: A video showing (among other things) the performance of the 3D tracker in the context of the ActIPret demonstrator


Relevant publications

  • A.A. Argyros, M.I.A. Lourakis, “Binocular Hand Tracking and Reconstruction Based on 2D Shape Matching”, in proceedings of the International Conference on Pattern Recognition 2006 (ICPR’06), Hong Kong, China, 20-24 August 2006.
  • A.A. Argyros, M.I.A. Lourakis, “Vision-based Interpretation of Hand Gestures for Remote Control of a Computer Mouse”, in proceedings of the HCI’06 workshop (in conjunction with ECCV’06), LNCS 3979, Springer Verlag, pp. 40-51, Graz, Austria, May 13th, 2006. Recipient of the “Best Paper Award”.
  • A.A. Argyros, M.I.A. Lourakis, “Tracking Skin-colored Objects in Real-time”, invited contribution to the “Cutting Edge Robotics” book, ISBN 3-86611-038-3, Advanced Robotic Systems International, 2005.
  • A.A. Argyros, M.I.A. Lourakis, “Real time Tracking of Multiple Skin-Colored Objects with a Possibly Moving Camera”, in proceedings of the European Conference on Computer Vision (ECCV’04), Springer-Verlag, vol. 3, pp. 368-379, May 11-14, 2004, Prague, Czech Republic.
  • A.A. Argyros, M.I.A. Lourakis, “Tracking Multiple Colored Blobs With a Moving Camera”, in proceedings of the Computer Vision and Pattern Recognition Conference (CVPR’05), vol. 2, no. 2, p. 1178, San Diego, USA, June 20-26, 2005.
  • A.A. Argyros, M.I.A. Lourakis, “3D Tracking of Skin-Colored Regions by a Moving Stereoscopic Observer”, Applied Optics, Information Processing Journal, Special Issue on Target Detection, Vol. 43, No 2, pp. 366-378, January 2004.
  • K. Sage, J. Howell, H. Buxton, A.A. Argyros, “Learning Temporal Structure for Task-based Control”, Image and Vision Computing Journal (IVC), special issue on Cognitive Systems, conditionally accepted, under revision.
  • S.O. Orphanoudakis, A.A. Argyros, M. Vincze, “Towards a Cognitive Vision Methodology: Understanding and Interpreting Activities of Experts”, ERCIM News, No 53, Special Issue on “Cognitive Systems”, April 2003.

The electronic versions of the above publications can be downloaded from my publications page.