Tracking Objects: Acquiring and Analyzing Image Sequences in MATLAB
By Dan Lee, MathWorks and Steve Eddins, MathWorks
Four-dimensional arrays are about to become a lot more common in MATLAB®. With the new Image Acquisition Toolbox, you can easily stream images from your frame grabbers and scientific cameras directly into MATLAB, often as an array with four dimensions: height, width, color, and time.
In this article, we’ll demonstrate how to get video image sequences into MATLAB and illustrate basic object tracking techniques using the Image Processing Toolbox.
Image Acquisition
A typical image acquisition session includes these steps:
Step | Toolbox Functions | Input
---|---|---
Connect to the device. | videoinput | adaptor name, device ID, video format (optional)
Configure acquisition properties and preview the results. | set, get, inspect, preview | video input object, property names, and settings
Acquire and process data. | start, getdata, getsnapshot | video input object, number of frames
Disconnect from the device and free resources. | delete, clear | video input object
First, we connect to a Windows video device using the videoinput function. (Use imaqhwinfo to determine your device’s identifier number and supported video formats.)
vid = videoinput('winvideo', 1, 'RGB24_352x288')
Next, we specify that we want to acquire 50 frames at 3 frames per second.
set(vid, 'FramesPerTrigger', 50)
set(getselectedsource(vid), 'FrameRate', 3)
Now we start acquiring images.
start(vid)
By default, acquisition begins immediately. Since acquisition occurs in the background, the MATLAB command line stays free, and we could begin processing frames while the acquisition is still running. In this example, the processing does not need to start before the acquisition is done, so we’ll use the wait function to block until the acquisition stops. The getdata function then transfers the acquired images into the MATLAB workspace.
wait(vid)
[f, t] = getdata(vid);
Variable f is a four-dimensional array of size 288-by-352-by-3-by-50. It represents an image sequence with 288 rows, 352 columns, 3 color components, and 50 frames. The vector t contains the time stamps for each frame. We display the tenth frame using the Image Processing Toolbox function imview. The frame shows a ball, attached by a string to the ceiling, swinging over the state of Alabama.
imview(f(:,:,:,10))
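To make the four-dimensional layout concrete, here is a minimal sketch that uses a small synthetic array in place of camera data. The names fDemo and tDemo, and the sizes used, are assumptions chosen for illustration; they only stand in for the f and t returned by getdata.

```matlab
% Synthetic stand-in for the output of getdata: 4 rows, 6 columns,
% 3 color components, 5 frames (sizes chosen only for illustration).
fDemo = zeros(4, 6, 3, 5, 'uint8');
tDemo = (0:4)' / 3;                  % hypothetical time stamps at 3 frames per second

frame2 = fDemo(:, :, :, 2);          % one RGB frame: 4-by-6-by-3
red    = squeeze(fDemo(:, :, 1, :)); % red component of every frame: 4-by-6-by-5
dt     = diff(tDemo);                % time elapsed between consecutive frames

disp(size(frame2))
disp(size(red))
disp(mean(dt))
```

Indexing the fourth dimension selects a frame, indexing the third selects a color component, and diff applied to the time stamps gives a quick check of the frame rate that was actually achieved.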
Before continuing, we disconnect from the camera to enable other applications to use it, and clear vid from the workspace.
delete(vid)
clear vid
Object Tracking
Now that we have the image sequence in MATLAB, we’ll explore two simple techniques for tracking the ball: frame differencing and background subtraction. We’ll use functions in the Image Processing Toolbox.
Frame Differencing
The absolute difference between successive frames can be used to divide an image frame into changed and unchanged regions. Since only the ball moves, we expect the changed region to be associated only with the ball, or possibly with its shadow.
To begin, we convert each frame to grayscale using rgb2gray. Running the loop “backwards,” from numframes down to 1, is a common MATLAB programming trick to ensure that g is initialized to its final size the first time through the loop.
numframes = size(f, 4);
for k = numframes:-1:1
    g(:, :, k) = rgb2gray(f(:, :, :, k));
end
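The effect of the backwards loop can be checked on a small array. The sketch below is only an illustration: the variable a and the sizes are made up, and it assumes a does not already exist in the workspace.

```matlab
% Assumes the variable a does not exist yet; clear it to be safe.
clear a
n = 5;
for k = n:-1:1
    a(:, :, k) = k * ones(2, 3);   % first pass (k = n) creates a 2-by-3-by-n array
    if k == n
        sizeAfterFirstPass = size(a);
    end
end
disp(sizeAfterFirstPass)
```

Because the first assignment targets the last page, MATLAB allocates the whole array at once and fills the remaining pages with zeros until the loop overwrites them. Calling zeros with the final size before a forward loop would be an equivalent, more explicit alternative.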
Next, we compute frame differences using imabsdiff.
for k = numframes-1:-1:1
    d(:, :, k) = imabsdiff(g(:, :, k), g(:, :, k+1));
end
imview(d(:, :, 1), [])
The two bright spots correspond to the ball locations in frames 1 and 2. The dim spots are the ball’s shadow. The function graythresh computes a threshold that divides an image into background and foreground pixels. Since graythresh returns a normalized value in the range [0,1], we must scale it to fit our data range, [0,255].
thresh = graythresh(d)
bw = (d >= thresh * 255);
imview(bw(:, :, 1))
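The differencing and thresholding steps can be tried without a camera. The following sketch works on a synthetic three-frame sequence and makes two assumptions so that it runs without the toolbox: the absolute difference is written with max and min as a stand-in for imabsdiff, and a fixed, hypothetical level of 0.5 stands in for the value that graythresh would return. All names ending in Demo are invented for this illustration.

```matlab
% Synthetic grayscale sequence: dark background with one bright pixel
% ("ball") that moves one column to the right in each frame.
gDemo = 20 * ones(4, 6, 3, 'uint8');
for k = 1:3
    gDemo(2, k + 1, k) = 220;
end

% Plain subtraction of uint8 values saturates at 0, so it loses the
% location where the later frame is brighter than the earlier one.
plainDiff = gDemo(:, :, 1) - gDemo(:, :, 2);

% Absolute difference written with max and min, standing in for imabsdiff.
dDemo = max(gDemo(:, :, 1:2), gDemo(:, :, 2:3)) - ...
        min(gDemo(:, :, 1:2), gDemo(:, :, 2:3));

% Hypothetical normalized threshold in [0,1], standing in for graythresh.
level  = 0.5;
bwDemo = dDemo >= level * 255;      % scale the level to the uint8 range

disp(nnz(plainDiff))                % changed pixels found by plain subtraction
disp(nnz(bwDemo(:, :, 1)))          % changed pixels found in the first difference image
```

The comparison shows why an absolute difference is used for unsigned integer images: plain subtraction reports only the old ball position, while the absolute difference reports both the old and the new one.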
The resulting binary image, bw(:, :, 1), has a small extra spot that should be removed. The technique we’ll use, area opening, removes objects in a binary image that are too small. The call
bw2 = bwareaopen(bw, 20, 8);
removes all objects containing fewer than 20 pixels. The third argument, 8, tells bwareaopen to assume that pixels are connected only to their immediate 8 neighbors in each frame. bwareaopen will then treat bw as a sequence of two-dimensional images rather than one three-dimensional image.
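The role of the connectivity argument can be seen on a tiny stack of two frames. This is a sketch under the assumption that the Image Processing Toolbox is installed; the array stackDemo and the size limit of 2 pixels are invented for the illustration.

```matlab
% Requires the Image Processing Toolbox.
% Two frames, each with a single foreground pixel at the same location.
stackDemo = false(5, 5, 2);
stackDemo(3, 3, :) = true;

perFrame = bwareaopen(stackDemo, 2, 8);    % 2-D connectivity: each frame handled separately
volume   = bwareaopen(stackDemo, 2, 26);   % 3-D connectivity: pixels linked across frames

fprintf('Pixels kept with 8-connectivity:  %d\n', nnz(perFrame));
fprintf('Pixels kept with 26-connectivity: %d\n', nnz(volume));
```

With 8-connectivity, each frame should contain one isolated single-pixel object, which is smaller than the limit and is removed. With 26-connectivity, the two pixels should join into one two-pixel object across frames and survive, which is the behavior we want to avoid when the third dimension is time.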
Finally, we label each individual object (using bwlabel) and compute its corresponding center of mass (using regionprops).
s = regionprops(bwlabel(bw2(:,:,1)), 'centroid');
c = [s.Centroid]

c =
  226.8231   53.1538  260.3750   43.3167
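To help read the output above, note that regionprops reports each centroid as an [x y] pair, with the horizontal (column) coordinate first, so c holds two pairs back to back. The sketch below computes a centroid by hand for a made-up mask and reshapes the printed values into one row per object. The names mask, cDemo, and cPairs are assumptions for this illustration only.

```matlab
% Small binary mask with one 2-by-3 object (rows 2-3, columns 4-6).
mask = false(5, 8);
mask(2:3, 4:6) = true;

[rows, cols] = find(mask);          % coordinates of the object's pixels
cDemo = [mean(cols), mean(rows)];   % [x y]: horizontal coordinate first

% The four numbers printed in the article, reshaped to one [x y] row per object.
cPairs = reshape([226.8231 53.1538 260.3750 43.3167], 2, [])';

disp(cDemo)
disp(cPairs)
```

Read this way, the two objects in the first difference image sit at roughly (227, 53) and (260, 43), the ball positions in frames 1 and 2.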
Background Subtraction
Another approach to tracking the ball is to estimate the background image and subtract it from each frame. Our approach here is to find the pixel-wise maximum among several neighboring frames. That’s exactly what morphological dilation does, if you use a structuring element oriented along the frame dimension.
background = imdilate(g, ones(1, 1, 5));
imview(background(:,:,1))
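To see what dilation along the frame dimension computes, the sketch below writes the same idea as an explicit sliding maximum over up to five neighboring frames, using only core MATLAB functions. The synthetic sequence gDemo, the output bgDemo, and the choice to clip the window at the first and last frames are assumptions for this illustration, not the toolbox implementation.

```matlab
% Synthetic sequence: bright background (200) with one dark pixel
% ("ball", 50) that moves one column to the right in each frame.
n = 6;
gDemo = 200 * ones(3, 8, n, 'uint8');
for k = 1:n
    gDemo(2, k + 1, k) = 50;
end

% Pixel-wise maximum over a window of up to 5 frames centered on frame k.
% The window is clipped at the ends of the sequence.
bgDemo = zeros(size(gDemo), 'uint8');
for k = 1:n
    lo = max(1, k - 2);
    hi = min(n, k + 2);
    bgDemo(:, :, k) = max(gDemo(:, :, lo:hi), [], 3);
end

disp(all(bgDemo(:) == 200))   % true when the ball has been removed from every frame
```

A pixel-wise maximum recovers the background only where the moving object is darker than the scene behind it and does not cover the same pixel in every frame of the window. For a bright object on a dark scene, a pixel-wise minimum would play the same role.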
Next, we compute the absolute difference between each frame and its corresponding background estimate. Since the array of grayscale frames, g, and the array of background images, background, have the same size, we don’t need a loop.
d = imabsdiff(g, background);
thresh = graythresh(d);
bw = (d >= thresh * 255);
Now we want to compute the location of the ball in each frame. As before, some frames contain small extra spots, most of which result from the ball’s shadow. We solve this problem by assuming that the ball is the largest object in each frame.
centroids = zeros(numframes, 2);
for k = 1:numframes
    L = bwlabel(bw(:, :, k));
    s = regionprops(L, 'area', 'centroid');
    area_vector = [s.Area];
    [tmp, idx] = max(area_vector);
    centroids(k, :) = s(idx(1)).Centroid;
end
Visualization
To finish this example, let’s visualize the ball’s motion by plotting the centroid locations as a function of time:
subplot(2, 1, 1)
plot(t, centroids(:,1)), ylabel('x')
subplot(2, 1, 2)
plot(t, centroids(:, 2)), ylabel('y')
xlabel('time (s)')
This article should get you started with mixing MATLAB, cameras, four-dimensional arrays, and a little image processing. If you want to experiment with this data, download the Gravity Measurement Case Study from MATLAB Central.
Published 2003