POSE ESTIMATION USING COMPUTER VISION

Have you ever imagined playing an adventure game not with a keyboard or mouse, but with your own body? By running, you make the game character run; by jumping, you make it jump. Interesting, right?

But can a computer detect human actions? If yes, then how?

Let's dive deep into a very interesting topic:

Pose estimation using computer vision.

Pose estimation is a computer vision technique that predicts and tracks the location of a person or object. This is done by looking at a combination of the pose and the orientation of a given person/object. We can also think of pose estimation as the problem of determining the position and orientation of a camera relative to a given person or object.

This is typically done by identifying, locating, and tracking a number of keypoints on a given object or person. For objects, this could be corners or other significant features. And for humans, these keypoints represent major joints like an elbow or knee.

The goal of our machine learning models is to track these keypoints in images and videos.

Categories of pose estimation

When working with people, these keypoints represent major joints like elbows, knees, wrists, etc. This is referred to as human pose estimation.

Humans fall into a particular category of objects that are flexible. By bending our arms or legs, keypoints will be in different positions relative to others. Most inanimate objects are rigid. For instance, the corners of a brick are always the same distance apart regardless of the brick’s orientation. Predicting the position of these objects is known as rigid pose estimation.

There’s also a key distinction to be made between 2D and 3D pose estimation.

2D pose estimation simply estimates the location of keypoints in 2D space relative to an image or video frame. The model estimates an X and Y coordinate for each keypoint.
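As a tiny illustration, a 2D pose can be stored as a mapping from joint names to normalized (x, y) coordinates; the joint names and coordinate values below are invented for the example:

```python
# A hypothetical 2D pose: joint names mapped to normalized (x, y)
# coordinates in [0, 1], relative to the image frame.
pose_2d = {
    "left_shoulder": (0.42, 0.30),
    "left_elbow": (0.38, 0.45),
    "left_wrist": (0.35, 0.60),
}

def pixel_coords(keypoint, width, height):
    """Convert a normalized (x, y) keypoint to pixel coordinates."""
    x, y = keypoint
    return (x * width, y * height)

# Locate the elbow in a 640x480 frame.
print(pixel_coords(pose_2d["left_elbow"], 640, 480))
```

Real models output one such (x, y) pair, often with a confidence score, for every keypoint they are trained on.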


3D pose estimation works to transform an object in a 2D image into a 3D object by adding a z-dimension to the prediction.

3D pose estimation allows us to predict the actual spatial positioning of a depicted person or object. As you might expect, 3D pose estimation is a more challenging problem than its 2D counterpart.
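One way to see what the added z-dimension buys you: given a 2D keypoint and an estimated depth, the pinhole camera model can back-project it into 3D camera space. This is a minimal sketch; the intrinsic parameters (fx, fy, cx, cy) are assumed values, not from any particular camera:

```python
def backproject(u, v, z, fx, fy, cx, cy):
    """Back-project a pixel keypoint (u, v) with depth z into 3D
    camera-space coordinates using the pinhole camera model."""
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return (x, y, z)

# A keypoint at the image center lies on the optical axis, so its
# 3D position is (0, 0, depth).
print(backproject(320, 240, 2.0, fx=500.0, fy=500.0, cx=320.0, cy=240.0))
```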


Single-pose estimation approaches detect and track one person or object, while multi-pose estimation approaches detect and track multiple people or objects.

In addition to tracking human movement and activity, pose estimation opens up applications in a range of areas, such as:

Augmented reality

Animation

Gaming

Robotics

All approaches for pose estimation can be grouped into bottom-up and top-down methods.

  • Bottom-up methods estimate each body joint first and then group them to form a unique pose. Bottom-up methods were pioneered with DeepCut.
  • Top-down methods run a person detector first and estimate body joints within the detected bounding boxes.
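The two control flows can be sketched with stand-in components; the detector and estimator below are placeholders, not real models:

```python
def detect_people(image):
    """Stand-in person detector: returns bounding boxes (x, y, w, h)."""
    return [(10, 10, 50, 100), (80, 20, 50, 100)]

def estimate_pose_in_box(image, box):
    """Stand-in single-person pose estimator for one detected box."""
    x, y, w, h = box
    return {"head": (x + w // 2, y + 10)}  # one placeholder keypoint

def top_down(image):
    """Top-down: detect people first, then estimate a pose per box."""
    return [estimate_pose_in_box(image, box) for box in detect_people(image)]

def detect_all_joints(image):
    """Stand-in joint detector: every joint in the image, ungrouped."""
    return [("head", (35, 20)), ("head", (105, 30))]

def bottom_up(image):
    """Bottom-up: detect all joints first, then group them into
    per-person poses (here trivially: one person per 'head' joint)."""
    return [{name: xy} for name, xy in detect_all_joints(image)]

print(len(top_down(None)), len(bottom_up(None)))  # both find 2 people
```

The trade-off: top-down cost grows with the number of people (one pose pass per box), while bottom-up runs the joint detector once but must solve a harder grouping problem.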


Human Body Modelling

In human pose estimation, the location of human body parts is used to build a human body representation (such as a body skeleton pose) from visual input data. Therefore, human body modeling is an important aspect of human pose estimation. It is used to represent features and key points extracted from visual input data. Typically, a model-based approach is used to describe and infer human body poses and render 2D or 3D poses.


There are three types of models for human body modeling:

Kinematic Model

Planar Model

Volumetric Model
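A kinematic model, for instance, can be sketched as a graph: joints are nodes and bones are edges. The joint set below follows a common COCO-style convention, but the exact list varies between datasets and models:

```python
# Kinematic skeleton as a graph: joints (nodes) and bones (edges).
BONES = [
    ("left_shoulder", "right_shoulder"),
    ("left_shoulder", "left_elbow"), ("left_elbow", "left_wrist"),
    ("right_shoulder", "right_elbow"), ("right_elbow", "right_wrist"),
    ("left_shoulder", "left_hip"), ("right_shoulder", "right_hip"),
    ("left_hip", "right_hip"),
    ("left_hip", "left_knee"), ("left_knee", "left_ankle"),
    ("right_hip", "right_knee"), ("right_knee", "right_ankle"),
]

def neighbours(joint):
    """Joints directly connected to `joint` by a bone."""
    return ([b for a, b in BONES if a == joint]
            + [a for a, b in BONES if b == joint])

print(neighbours("left_shoulder"))
```

Planar models instead represent body parts as 2D shapes, and volumetric models use 3D meshes or geometric primitives.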

Working of Pose Estimation

Pose estimation utilizes pose and orientation to predict and track the location of a person or object. Accordingly, pose estimation allows programs to estimate spatial positions ("poses") of a body in an image or video. In general, most pose estimators are two-step frameworks that detect human bounding boxes and then estimate the pose within each box.

Pose estimation operates by finding key points of a person or object. Taking a person, for example, the key points would be joints like the elbow, knees, wrists, etc. There are two types of pose estimation: multi-pose and single pose.

Single pose estimation is used to estimate the poses of a single object in a given scene.

Multi-pose estimation is used when detecting poses for multiple objects.
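Many keypoint estimators find these joints by predicting one heatmap per joint and taking its peak as the keypoint location (the decoding scheme used by methods like HRNet). A dependency-free sketch, with a toy heatmap in place of a network's output:

```python
def decode_heatmap(heatmap):
    """Return the (row, col) of the highest-scoring cell in a 2D
    heatmap, i.e. the predicted location for one joint."""
    best, best_rc = float("-inf"), (0, 0)
    for r, row in enumerate(heatmap):
        for c, score in enumerate(row):
            if score > best:
                best, best_rc = score, (r, c)
    return best_rc

# Toy 3x3 heatmap for one joint; the peak sits at row 1, col 2.
hm = [
    [0.1, 0.2, 0.1],
    [0.0, 0.3, 0.9],
    [0.1, 0.2, 0.1],
]
print(decode_heatmap(hm))  # → (1, 2)
```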

Most Popular Pose Estimation Methods

High-Resolution Net (HRNet)

OpenPose

DeepCut

Regional Multi-Person Pose Estimation (AlphaPose)

DeepPose

PoseNet

DensePose

