
3D Pose Reconstruction

Monocular Image 3D Human Pose Estimation under Self-Occlusion



Framework for 3D pose reconstruction from a single image

Outline of our processing pipeline (from left to right): Starting with an input image, 2D part detectors and self-occlusion reasoning are applied. Next, multiple synthetic views are generated from the initial view. Then, structure from motion is used to enforce kinematic constraints and reduce the ambiguity. Finally, orientation constraints from the synthetic views are enforced on the initial input to generate the 3D pose.

We propose an automatic approach for reconstructing the 3D human pose from a single image. Human body articulation, hallucinated parts and cluttered backgrounds lead to ambiguity during pose inference, which makes the problem non-trivial. Researchers have explored various methods based on motion and shading to reduce this ambiguity and reconstruct the 3D pose. The key idea of our algorithm is to impose both kinematic and orientation constraints. The former are imposed by projecting a 3D model onto the input image and pruning the parts that are incompatible with anthropomorphic constraints. The latter are applied by creating synthetic views, obtained by regressing the input view to multiple oriented views. After applying the constraints, the 3D model is projected onto the initial and synthetic views, which further reduces the ambiguity. Finally, we transfer the direction of the unambiguous parts from the synthetic views to the initial view, which yields the 3D pose. The approach is evaluated quantitatively on the HumanEva-I dataset and qualitatively on unconstrained images from the Image Parse dataset. The results demonstrate the robustness of the proposed approach in accurately reconstructing the 3D pose from a single image.
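As a rough illustration of the kinematic constraint and the depth ambiguity described above, the following Python sketch assumes a scaled orthographic camera and a known 3D length for each limb. It is not the implementation from the paper. The limb lengths, the `scale` parameter, the `toward_camera` cue and all function names are hypothetical and chosen only for illustration. Under these assumptions, a detected 2D limb that is longer than the projected 3D limb can be pruned, and each surviving limb still admits two mirror-image depth offsets until an orientation cue picks one.

```python
import math

# Hypothetical 3D limb lengths in arbitrary units (illustrative values only).
LIMB_LENGTHS = {"upper_arm": 0.30, "forearm": 0.25}


def is_kinematically_plausible(p_parent, p_child, limb_length, scale, tol=0.05):
    """Check whether a 2D limb could be the projection of a limb of known 3D length.

    Assumes scaled orthographic projection: a limb of 3D length L projects to an
    image length of at most scale * L, so longer detections are rejected.
    """
    image_length = math.dist(p_parent, p_child)
    return image_length <= scale * limb_length * (1.0 + tol)


def relative_depth_candidates(p_parent, p_child, limb_length, scale):
    """Return the two possible depth offsets (child minus parent) for one limb.

    The foreshortening fixes only the magnitude of the offset; its sign is the
    ambiguity that additional constraints have to resolve.
    """
    planar = math.dist(p_parent, p_child) / scale
    dz = math.sqrt(max(limb_length ** 2 - planar ** 2, 0.0))
    return (dz, -dz)


def resolve_depth(candidates, toward_camera):
    """Pick one candidate from a binary orientation cue (hypothetical input).

    Convention assumed here: depth grows away from the camera, so a limb that
    points toward the camera takes the negative offset.
    """
    dz_away, dz_toward = candidates
    return dz_toward if toward_camera else dz_away


if __name__ == "__main__":
    shoulder, elbow = (0.0, 0.0), (18.0, 0.0)
    scale = 100.0  # hypothetical pixels per model unit
    length = LIMB_LENGTHS["upper_arm"]
    if is_kinematically_plausible(shoulder, elbow, length, scale):
        candidates = relative_depth_candidates(shoulder, elbow, length, scale)
        print("depth candidates:", candidates)
        print("chosen:", resolve_depth(candidates, toward_camera=True))
```

In this toy setting the orientation cue is simply given as an argument; in the approach described above, the corresponding information would come from the unambiguous parts in the synthetic views.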


Monocular Image 3D Human Pose Estimation under Self-Occlusion
Ibrahim Radwan, Abhinav Dhall, and Roland Goecke
Proc. of the International Conference on Computer Vision (ICCV), Dec. 2013.



Quantitative evaluation on the HumanEva-I dataset
Visual comparison between our method (right) and [18] (middle)
Visual comparison of the proposed method without occlusion handling (top) and with occlusion handling (bottom)


More qualitative results by applying the proposed framework on images from the Image Parse dataset.

Each pair of columns shows a 2D image followed by its reconstructed 3D pose.