Scalable 3D Tracking of Multiple Interacting Objects

Brief description

We consider the problem of tracking multiple interacting objects in 3D from RGBD input, following a hypothesize-and-test approach. Because the objects interact, they are expected to occlude each other in the field of view of the camera observing them. A naive approach is to employ a Set of Independent Trackers (SIT), assigning one tracker to each object. This approach scales well with the number of objects but fails as occlusions become stronger, because each tracker considers its object in isolation. The solution representing the current state of the art employs a single Joint Tracker (JT) that accounts for all objects simultaneously. This directly resolves ambiguities due to occlusions, but its computational complexity grows geometrically with the number of tracked objects. We propose a middle ground, an Ensemble of Collaborative Trackers (ECT), that combines the best traits of both to deliver a practical and accurate solution to the multi-object 3D tracking problem. We present quantitative and qualitative experiments with several synthetic and real-world sequences of diverse complexity. The experiments demonstrate that ECT tracks far more complex scenes than JT, at a computational cost only slightly larger than that of SIT.
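The trade-off between a joint search and per-object searches can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the publications: the scene is a 1D depth image, each object has a single tracked parameter (its center), hypotheses are tested by exhaustive search over a small window, and the collaboration is modelled by letting each tracker render the other objects at their most recently broadcast estimates. The names `render`, `score`, `joint_tracker` and `collaborative_trackers` are hypothetical.

```python
from itertools import product

WIDTH = 24          # pixels in the toy 1D depth image
BACKGROUND = 10.0   # depth where no object is seen

# Hypothetical objects: (half_width, depth). Only the center is tracked.
OBJECTS = [(3, 2.0), (5, 5.0)]


def render(centers):
    """Render a 1D depth image; nearer objects occlude farther ones."""
    image = [BACKGROUND] * WIDTH
    for center, (half_width, depth) in zip(centers, OBJECTS):
        lo, hi = max(0, center - half_width), min(WIDTH, center + half_width + 1)
        for x in range(lo, hi):
            image[x] = min(image[x], depth)
    return image


def score(centers, observation):
    """Sum of absolute depth differences between hypothesis and observation."""
    return sum(abs(r - o) for r, o in zip(render(centers), observation))


def window(center, radius=3):
    """Candidate centers around the previous estimate."""
    return range(center - radius, center + radius + 1)


def joint_tracker(previous, observation):
    """JT-style: one search over the joint parameter space of all objects."""
    candidates = list(product(*(window(c) for c in previous)))
    best = min(candidates, key=lambda h: score(h, observation))
    return list(best), len(candidates)


def collaborative_trackers(previous, observation, rounds=2):
    """ECT-style (assumed): one search per object, others held at broadcast estimates."""
    estimate, evaluations = list(previous), 0
    for _ in range(rounds):
        updated = list(estimate)
        for i in range(len(estimate)):
            candidates = list(window(estimate[i]))
            updated[i] = min(
                candidates,
                key=lambda c: score(estimate[:i] + [c] + estimate[i + 1:], observation),
            )
            evaluations += len(candidates)
        estimate = updated  # all trackers broadcast their results together
    return estimate, evaluations


observation = render([10, 12])   # ground truth: the front object partly hides the rear one
previous = [9, 11]               # estimates carried over from the previous frame
jt_estimate, jt_evals = joint_tracker(previous, observation)
ect_estimate, ect_evals = collaborative_trackers(previous, observation)
print(jt_estimate, jt_evals)
print(ect_estimate, ect_evals)
```

In this toy setting the joint search tests one hypothesis per combination of candidates, so its cost is the window size raised to the number of objects, while the per-object searches cost the window size times the number of objects per round. Each per-object hypothesis still renders the whole scene, which is what lets it account for occlusion.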

ECT enables accurate 3D tracking of complex scenarios, such as bimanual manipulation of several rigid objects, on commodity hardware. One limitation of this approach is that it treats tracking as a search problem whose dimensionality increases with the number of objects in the scene. This typically limits the number of tracked objects, the processing framerate, or both. In subsequent work (BMVC 2015) we presented a method that uses simple low-level motion cues to dynamically assign computational resources to the parts of the scene where they are actually required. In a series of experiments, we showed that this simple idea improves tracking performance dramatically, at the cost of only a minor degradation in tracking accuracy.
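The idea of steering computation with motion cues can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the BMVC 2015 paper: the cue is taken to be the fraction of depth pixels that changed inside each object's image region, and a fixed budget of hypothesis evaluations is split in proportion to that cue, with a small per-object minimum. The functions `motion_score` and `allocate_budget`, the threshold and the region format are all hypothetical.

```python
def motion_score(prev_frame, curr_frame, region, threshold=0.05):
    """Fraction of pixels in region whose depth changed by more than threshold.

    region is (x0, y0, x1, y1) with exclusive upper bounds (assumed format).
    """
    x0, y0, x1, y1 = region
    changed = total = 0
    for y in range(y0, y1):
        for x in range(x0, x1):
            total += 1
            if abs(curr_frame[y][x] - prev_frame[y][x]) > threshold:
                changed += 1
    return changed / total if total else 0.0


def allocate_budget(scores, total_budget, minimum=4):
    """Split total_budget evaluations across objects in proportion to motion.

    Assumes at least one object; every object keeps a small minimum budget.
    """
    n = len(scores)
    spare = total_budget - minimum * n
    if spare < 0:
        raise ValueError("total_budget is too small for the per-object minimum")
    total_motion = sum(scores)
    if total_motion == 0:
        shares = [spare // n] * n          # nothing moved: split evenly
    else:
        shares = [int(spare * s / total_motion) for s in scores]
    budgets = [minimum + share for share in shares]
    # Hand any rounding remainder to the object that moved the most.
    budgets[scores.index(max(scores))] += total_budget - sum(budgets)
    return budgets


# Two toy 4x8 depth frames: only the right half of the image changes.
prev_frame = [[1.0] * 8 for _ in range(4)]
curr_frame = [[1.0] * 4 + [1.2] * 4 for _ in range(4)]
regions = [(0, 0, 4, 4), (4, 0, 8, 4)]     # one image region per object

scores = [motion_score(prev_frame, curr_frame, r) for r in regions]
budgets = allocate_budget(scores, total_budget=64)
print(scores, budgets)
```

Keeping a non-zero minimum per object is a design choice in this sketch: an object that appears static still gets a few evaluations, so a wrong cue does not freeze its estimate entirely.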

Sample results

A video with 3D tracking experiments.

Boosting the performance of ECT by employing low level cues.


Relevant publications

  • N. Kyriazis, A.A. Argyros, “Scalable 3D Tracking of Multiple Interacting Objects”, IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2014), Columbus, Ohio, USA, 24-27 June, 2014.
  • A. Qammaz, N. Kyriazis, A.A. Argyros, “Boosting the Performance of Model-based 3D Tracking by Employing Low Level Motion Cues”, British Machine Vision Conference (BMVC 2015), Swansea, UK, Sep. 7-10, 2015.
  • N. Kyriazis, A.A. Argyros, “3D tracking of hands interacting with several objects”, IEEE International Conference on Computer Vision Workshops (OUI 2015 - ICCVW 2015), IEEE, Santiago, Chile, November 2015.
  • N. Kyriazis, I. Oikonomidis, P. Panteleris, D. Michel, A. Qammaz, A. Makris, K. Tzevanidis, P. Douvantzis, K. Roditakis, A.A. Argyros, “A generative approach to tracking hands and their interaction with objects”, Man-Machine Interactions 4 - International Conference on Man-Machine Interactions (ICMMI 2015), Springer, pp. 19-28, Kocierz, Poland, October 2015.

The electronic versions of the above publications can be downloaded from my publications page.