In order for a machine to interact with objects it needs to be able to see and identify them. One of the easiest ways for a computer to identify an object is with that objects color. In this tutorial I will walk you through how to do very basic object tracking using a kinect (v1) and processing. By the end of this post you should be able to threshold an image based on color, apply a very basic filter to the image, and track objects of specific colors in real time. Let’s get started!
Getting the Kinect working
This tutorial is done using the Kinect model number xxxx. I also saw that Daniel Shiffman and some other pretty bright people put together a library for using the Kinect with Processing. I wanted to give it a go so the language used here is Processing. For a good intro to the Kinect and processing see Daniel Shiffman’s post on getting the kinect up and running. Here we will focus on doing some other cool stuff with it. First off let’s load in the libraries
The first two allow us to interface with the Kinect and the last one is an excellent library for working with images in processing. Now that we have our libraries let’s initialize the Kinect and get some video going.
With the above code working you should see something like this:
Now that we have the basics, let’s get started with the fun stuff.
If we want to track an object by color the simplest way to do this is try and identify all the regions of an image where the color in question is strong. We do this with thresholding. Thresholding is one of the simplest methods of image segmentation where all pixels with an intensity above a certain threshold are replaced with a white pixel and all other pixels are replaced with a black one. In this tutorial I will be tracking a bright red object. In order to threshold based on the red channel I first need to be able to calculate how “red” a pixel is. We can do this with the below function:
Each color is calculated using bit shifts for speed. The red channel is found and then the strength of the remaining channels is averaged and subtracted from the red. This removes how much red is naturally in the image and gives us a better view of the true red of an object. Now we can threshold the image by looking at each pixel individually.
Now with the thresholded image we can now kind of extract the red object from the background!
This works well but as you can see there is a bit of noise in the image. In order to detect objects in the image it is helpful to eliminate noise. We can do this by using a simple median filter the median filter steps through each pixel in the image and assigns it the median value of the surrounding pixels. In processing this looks like.
As you can see this reduces the noise in the image and we are no longer picking up on my face as much!
Putting it all together
We are finally ready to detect our objects to do that I use the blobDetection library. The below code is taken from one of v3ga’s examples but let’s walk through what’s happening. First we need to create a blobDetector and get all of the blobs in the image.
With this last piece we can now track the object!
All of the code for this project can be found here