Count Push-Ups Using Python and Your Computer’s Camera

Many projects can help you strengthen your skills in computer vision and Python. One of them is a simple push-up counter, which you can write in a single Python file.

The program will take a video input or a real-time input from a camera, perform human pose estimation on the input, and count the number of push-ups the person is doing. To perform human pose estimation, the program will use the MediaPipe human pose estimation model.

What Is the MediaPipe Human Pose Estimation Model?

It is a model developed by Google that tracks thirty-three landmarks on the human body. It also predicts a full-body segmentation, which it represents as a two-class segmentation. The following image shows all the landmarks that the model is capable of identifying. Each landmark is marked with a numbered point, and lines connect neighboring landmarks.

Your push-up counter program will utilize the positions of the shoulders and the elbows. In the above image, the shoulder landmarks are 11 and 12 while the elbow landmarks are 13 and 14.
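
As a small illustration of those four IDs, the sketch below gives them readable names. The enum name UpperBodyLandmark is a hypothetical helper invented for this example; only the ID values come from the landmark numbering described above.

```python
from enum import IntEnum


class UpperBodyLandmark(IntEnum):
    """Hypothetical helper: names for the four landmark IDs the counter uses."""
    LEFT_SHOULDER = 11
    RIGHT_SHOULDER = 12
    LEFT_ELBOW = 13
    RIGHT_ELBOW = 14


# IntEnum members behave like integers, so they can index a list of 33 landmarks.
landmarks = [f"landmark {i}" for i in range(33)]
print(landmarks[UpperBodyLandmark.LEFT_ELBOW])  # prints "landmark 13"
```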

Setting Up Your Environment

You should already be familiar with the basics of Python. Open a Python IDE and create a new Python file. Then install the three packages this project needs by running pip install opencv-python mediapipe imutils in the terminal.

You will use OpenCV-Python to take the video input in your program and process it. This library gives your program computer vision capabilities.

You will use MediaPipe to perform human pose estimation on the input.

You will use imutils to resize the video input to your desired width.
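
If you want to confirm the installation before going further, a check along these lines may help. It is only a sketch: the function name missing_packages and the REQUIRED mapping are made up for this example, and it relies solely on the standard library.

```python
import importlib.util

# Maps each import name to the pip package that provides it.
REQUIRED = {"cv2": "opencv-python", "mediapipe": "mediapipe", "imutils": "imutils"}


def missing_packages(required=REQUIRED):
    """Return the pip names of the packages whose modules cannot be found."""
    return [
        pip_name
        for module_name, pip_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]


missing = missing_packages()
if missing:
    print("Missing packages. Install them with: pip install " + " ".join(missing))
else:
    print("All three packages are installed.")
```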

Imports and MediaPipe Initializations

Import the three libraries that you previously installed in your environment. This will make it possible to use their functionality in the project.

Then create three references to MediaPipe's solution modules. You will use mp.solutions.drawing_utils to draw the various landmarks on the input, mp.solutions.drawing_styles to change the styles in which the landmarks are drawn, and mp.solutions.pose, which is the model you will use to identify these landmarks.
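
A minimal sketch of this step might look like the following. In a single-file script the imports would normally sit at the top of the file; here they are wrapped in a hypothetical helper, load_libraries, so that the sketch can be loaded even when the packages are not installed yet. The variable names are illustrative.

```python
def load_libraries():
    """Import the three libraries and return them with the MediaPipe helpers."""
    # Imported inside the function so this file still loads if a package is missing.
    import cv2
    import imutils
    import mediapipe as mp

    mp_draw = mp.solutions.drawing_utils          # draws landmarks and connections
    mp_draw_styles = mp.solutions.drawing_styles  # controls how the drawings look
    mp_pose = mp.solutions.pose                   # the pose estimation model
    return cv2, imutils, mp_draw, mp_draw_styles, mp_pose
```

Calling load_libraries() once at the start of the program gives you every object the later steps rely on.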

A person doing push-ups, facing the camera.

Performing the Human Pose Estimation

Detecting the pose of a human is the process of identifying their body orientation by identifying and classifying their joints.

Declaring Your Variables

Declare the variables you will use to store the number of push-ups, the position of the shoulders and elbows, and the video input.

Initialize the position variable to None. The program will update it depending on the position of the elbows and shoulders.
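
The declarations could be sketched as shown here. The names count, position, and open_video are assumptions made for this example. OpenCV's VideoCapture accepts either a camera index (0 is usually the default camera) or a path to a video file.

```python
count = 0        # number of completed push-ups so far
position = None  # becomes "down" or "up" once the program has seen the pose


def open_video(source=0):
    """Open a camera (integer index) or a video file (path) as the input."""
    # Imported inside the function so this file still loads without OpenCV.
    import cv2

    return cv2.VideoCapture(source)
```

For example, open_video(0) would read from the default camera, while open_video("pushups.mp4") would read from a file with that hypothetical name.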

Calling the MediaPipe Pose Estimation Model

Call the MediaPipe pose estimation model which will detect the human pose in the input.

The detection confidence and tracking confidence values are thresholds between 0 and 1. They set the minimum confidence the model needs before it treats a detection or a tracked pose as valid. A value of 0.7 means a 70% confidence threshold rather than 70% accuracy. You can change it to your desired level.
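
As a rough sketch, creating the model with both thresholds set to 0.7 could look like this. The wrapper function create_pose_model is hypothetical; min_detection_confidence and min_tracking_confidence are the keyword arguments of MediaPipe's legacy mp.solutions.pose.Pose class, which newer MediaPipe releases may organize differently.

```python
def create_pose_model(min_detection_confidence=0.7, min_tracking_confidence=0.7):
    """Create a MediaPipe Pose object with the given confidence thresholds."""
    # Imported inside the function so this file still loads without MediaPipe.
    import mediapipe as mp

    return mp.solutions.pose.Pose(
        min_detection_confidence=min_detection_confidence,
        min_tracking_confidence=min_tracking_confidence,
    )
```

The returned object can be used in a with statement, for example with create_pose_model() as pose:, so that MediaPipe releases its resources when the block ends.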

Taking and Preprocessing the Input

Take the input that you will later pass to the pose estimation model. Resize the video input to your desired width using the imutils library. Convert the input from BGR to RGB, because OpenCV reads frames in BGR order while MediaPipe expects RGB input. Finally, pass the converted input to the human pose estimation model to identify the landmarks.

After processing the input, you have identified the landmarks on the input.
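
One way to sketch these steps for a single frame is shown below. The function name read_and_detect and the default width of 500 pixels are assumptions for illustration. The cap argument is expected to be an OpenCV VideoCapture object and pose a MediaPipe Pose object.

```python
def read_and_detect(cap, pose, width=500):
    """Read one frame, preprocess it, and run pose estimation on it.

    Returns (rgb_image, results), or (None, None) when no frame is available.
    """
    # Imported inside the function so this file still loads without the packages.
    import cv2
    import imutils

    success, frame = cap.read()
    if not success:
        return None, None  # end of the video, or the camera returned nothing

    frame = imutils.resize(frame, width=width)    # keeps the aspect ratio
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # OpenCV frames arrive as BGR
    results = pose.process(rgb)                   # landmarks, if a person is found
    return rgb, results
```

In the main loop you would call this once per iteration and stop when it returns (None, None).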

Drawing the Identified Landmarks on the Input

Create an empty list that will store the coordinates of each landmark. Use the draw_landmarks function to draw a dot on each landmark and the connections between them. Using a for loop, iterate over the landmarks and store the ID and coordinates of each landmark in the list you created. MediaPipe reports coordinates as fractions of the frame size, so use the image.shape attribute to get the height and width of the video input and convert the coordinates to pixel positions.
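
The drawing and the list building might be sketched as follows. The function name collect_landmarks and the [id, x, y] entry layout are assumptions for this example. The results argument is what the pose model's process method returns, and image is the frame as a NumPy array.

```python
def collect_landmarks(image, results):
    """Draw the detected landmarks on image and return them as [id, x, y] entries."""
    # Imported inside the function so this file still loads without MediaPipe.
    import mediapipe as mp

    landmark_list = []
    if results.pose_landmarks:
        # Draw a dot on every landmark and a line for every connection.
        mp.solutions.drawing_utils.draw_landmarks(
            image, results.pose_landmarks, mp.solutions.pose.POSE_CONNECTIONS
        )
        height, width, _ = image.shape
        for landmark_id, landmark in enumerate(results.pose_landmarks.landmark):
            # landmark.x and landmark.y are fractions of the frame width and height.
            x, y = int(landmark.x * width), int(landmark.y * height)
            landmark_list.append([landmark_id, x, y])
    return landmark_list
```

The list stays empty when no person is detected, so later steps can check its length before reading any landmark.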

The ID is the number given to a specific landmark by the MediaPipe pose estimation model. Having identified the pose of the human in the input, you need to count the number of push-ups they are doing, if any.

Counting the Number of Push-Ups

Create a condition that compares the position of the shoulders with the position of the elbows. When the shoulders of the person in the input are higher than the elbows, the person is up. When the shoulders are lower than the elbows, the person is down. You check this by comparing the y-coordinates of the shoulder landmarks with those of the elbow landmarks, which you look up in the list by their IDs. Image y-coordinates grow downward, so a higher point has a smaller y value.

For a person to complete one full push-up, they must assume a down position and then get back to the up position. After a complete push-up, the program can update the count by one.
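
The counting rule can be sketched as a small function that is called once per frame. The function name update_count and the [id, x, y] entry layout are assumptions for this example. A real program would probably also want a pixel margin so that small jitters around elbow height do not flip the state, which this sketch leaves out.

```python
LEFT_SHOULDER, RIGHT_SHOULDER = 11, 12
LEFT_ELBOW, RIGHT_ELBOW = 13, 14


def update_count(landmarks, position, count):
    """Return the new (position, count) after looking at one frame.

    landmarks is a list of [id, x, y] entries indexed by landmark ID.
    Image y grows downward, so a larger y means lower in the frame.
    """
    left_gap = landmarks[LEFT_SHOULDER][2] - landmarks[LEFT_ELBOW][2]
    right_gap = landmarks[RIGHT_SHOULDER][2] - landmarks[RIGHT_ELBOW][2]

    if left_gap > 0 and right_gap > 0:
        # Both shoulders are below their elbows: the person is down.
        position = "down"
    elif left_gap < 0 and right_gap < 0 and position == "down":
        # Back above the elbows after being down: one full push-up.
        position = "up"
        count += 1
    return position, count
```

Because the count only increases on a transition from down to up, staying in the up position for many frames does not add extra push-ups.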

Displaying the Output

You need to display the number of push-ups the program has counted. Print the value of the count on the terminal each time the user completes a push-up. Finally, display the video of the person doing push-ups with the landmarks drawn on their body.
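
A possible sketch of the display step is shown below. The function name show_frame, the window title, and the choice of the q key to quit are assumptions. Since the frame was converted to RGB for MediaPipe, this sketch converts it back to BGR before showing it so that the colors look normal. That conversion is an addition of this example and not something the steps above require.

```python
def show_frame(image, window_name="Push-up counter"):
    """Show one RGB frame in a window; return False once the user presses 'q'."""
    # Imported inside the function so this file still loads without OpenCV.
    import cv2

    # cv2.imshow expects BGR, so undo the earlier BGR-to-RGB conversion.
    cv2.imshow(window_name, cv2.cvtColor(image, cv2.COLOR_RGB2BGR))
    key = cv2.waitKey(1) & 0xFF  # wait 1 ms so the window can refresh
    return key != ord("q")
```

In the main loop, calling print(count) right after the count increases produces the terminal update, and the loop can stop as soon as show_frame returns False.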

The output should look something like this:

You should see the count update on the terminal each time the person in the output completes a push-up.

Strengthen Your Computer Vision Skills

Computer vision is broad. A push-up counter is one of the many projects you can use to put your computer vision skills into practice. The best way to strengthen these skills is by building more projects that involve computer vision.

The more projects you build, the more you learn!
