Occlusion-tolerant and personalized 3D human pose estimation in RGB images

Brief description

We introduce a real-time method that estimates the 3D human pose directly in the popular BVH format, given estimates of the 2D body joints in RGB images. Our contributions include: (a) A novel and compact 2D pose representation. (b) A human body orientation classifier and an ensemble of orientation-tuned neural networks that regress the 3D human pose while also allowing the body to be decomposed into an upper and a lower kinematic hierarchy. This permits the recovery of the human pose even in the case of significant occlusions. (c) An efficient Inverse Kinematics solver that refines the neural-network-based solution, providing 3D human pose estimates that are consistent with the limb sizes of a target person (if known). Together, these yield a 33% accuracy improvement on the H3.6M dataset compared to the baseline MocapNET method while maintaining real-time performance (70 fps in CPU-only execution).
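To make the notion of limb-size consistency in (c) more concrete, the following is a minimal Python sketch. It is not the Inverse Kinematics solver of the method: it only rescales each bone of an estimated skeleton to a known target length while keeping the bone's direction, which is a much simpler operation. The toy skeleton, the joint names, and the function `rescale_bones` are hypothetical and chosen purely for illustration.

```python
import math

# Hypothetical toy skeleton: joint -> parent (None marks the root).
PARENTS = {"hip": None, "knee": "hip", "ankle": "knee"}


def rescale_bones(joints, parents, target_lengths):
    """Return new joint positions whose bones keep their estimated
    direction but take the lengths given in target_lengths.

    joints:         dict joint -> (x, y, z) estimated 3D position
    parents:        dict joint -> parent joint name, or None for the root
    target_lengths: dict joint -> desired length of the bone parent -> joint
                    (bones without an entry keep their estimated length)
    """
    fixed = {}

    def place(name):
        # Memoised so every joint is placed exactly once, parents first.
        if name in fixed:
            return fixed[name]
        parent = parents[name]
        if parent is None:
            fixed[name] = tuple(joints[name])  # the root stays where it is
            return fixed[name]
        parent_new = place(parent)
        parent_old = joints[parent]
        # Direction of the estimated bone, taken from the original estimate.
        delta = [c - p for c, p in zip(joints[name], parent_old)]
        norm = math.sqrt(sum(v * v for v in delta))
        length = target_lengths.get(name, norm)
        if norm == 0.0:
            fixed[name] = parent_new  # degenerate bone: collapse onto parent
        else:
            fixed[name] = tuple(
                p + v / norm * length for p, v in zip(parent_new, delta)
            )
        return fixed[name]

    for name in joints:
        place(name)
    return fixed


# Usage: an estimated leg whose bones are too long for the target person.
estimated = {
    "hip": (0.0, 0.0, 0.0),
    "knee": (0.0, -2.0, 0.0),
    "ankle": (0.0, -4.0, 0.0),
}
personalized = rescale_bones(estimated, PARENTS, {"knee": 1.0, "ankle": 1.5})
```

A rescaling of this kind changes joint positions without checking them against the 2D observations. An Inverse Kinematics refinement, as described above, instead adjusts the pose of a skeleton with the target person's limb sizes.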

Sample results

Video with description and experimental results

Main web page

Check the GitHub page of Ammar Qammaz.


  • Ammar Qammaz, Antonis A. Argyros
  • We gratefully acknowledge the support of NVIDIA Corporation with the donation of a Quadro P6000 GPU used for the execution of this research. This work was partially supported by EU H2020 project Co4Robots (Grant No 731869).

Relevant publications

  • A. Qammaz and A.A. Argyros, "Occlusion-tolerant and personalized 3D human pose estimation in RGB images", In IEEE International Conference on Pattern Recognition (ICPR 2020), Milan, Italy, January, 2021.
  • A. Qammaz and A.A. Argyros, "MocapNET: Ensemble of SNN Encoders for 3D Human Pose Estimation in RGB Images", In British Machine Vision Conference (BMVC 2019), Cardiff, UK, September, 2019.

The electronic versions of the above publications can be downloaded from my publications page.