The premise of this project was an extended exercise in machine vision, inspired by and derived from concepts for simple motion detection algorithms presented in the Processing text by Ben Fry and Casey Reas. In particular, I focused on frame differencing and brightness tracking to build an interface that tracks the coordinates of a single gesture and uses them as control input. Before I go further into the mechanism of the full project, some background on frame differencing and brightness tracking is in order.
Frame differencing takes two consecutive frames as input. The frames are subtracted pixel by pixel, and the amount of change is measured as the sum of the remaining pixels that are not black. That sum can then be compared against a threshold to decide how much motion has occurred. This method is very effective at simply sensing that motion happened, but on its own it reveals nothing more sophisticated than the fact that motion occurred.
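The idea can be sketched in a few lines. This is not the project's actual Processing code, just an illustrative Python version where frames are plain 2-D lists of 0-255 brightness values; the function name and threshold parameter are my own.

```python
def motion_amount(prev_frame, curr_frame, threshold=0):
    """Sum the per-pixel brightness differences that exceed `threshold`.

    A larger return value means more of the image changed between frames.
    """
    total = 0
    for prev_row, curr_row in zip(prev_frame, curr_frame):
        for p, c in zip(prev_row, curr_row):
            diff = abs(c - p)
            if diff > threshold:
                total += diff
    return total

# A 2x2 frame in which one pixel brightened by 100:
prev = [[10, 10], [10, 10]]
curr = [[110, 10], [10, 10]]
print(motion_amount(prev, curr))  # 100
print(motion_amount(prev, prev))  # 0 -- identical frames, no motion
```

Comparing the returned sum against a cutoff ("is it above 500?") is all it takes to turn this into a motion detector.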
The added sophistication comes from brightness tracking. Once the frames have been subtracted, the image is much cleaner: most of the color has been removed, reducing the palette to a gray scale in which brighter pixels mark larger changes. A simple search over the pixel values of the difference image finds the brightest point, and its location is stored. That location shifts when motion occurs, so the motion can be tracked as a pair of coordinates, and from successive coordinates a speed and direction, or simply a direction, can be derived.
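Continuing the sketch above, the search for the point of greatest change is a single pass over the difference image. Again this is an illustrative Python version, not the project's code; the function name is hypothetical.

```python
def brightest_change(prev_frame, curr_frame):
    """Return the (x, y) location of the largest per-pixel change
    between two grayscale frames (2-D lists of 0-255 values)."""
    best_diff, best_xy = -1, (0, 0)
    for y, (prev_row, curr_row) in enumerate(zip(prev_frame, curr_frame)):
        for x, (p, c) in enumerate(zip(prev_row, curr_row)):
            diff = abs(c - p)
            if diff > best_diff:
                best_diff, best_xy = diff, (x, y)
    return best_xy

# The pixel at column 1, row 1 changed the most:
prev = [[0, 0, 0], [0, 0, 0]]
curr = [[0, 5, 0], [0, 50, 0]]
print(brightest_change(prev, curr))  # (1, 1)
```

Calling this once per frame yields a stream of coordinates; subtracting consecutive coordinates gives the velocity of the gesture.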
As a final implementation, I designed a simple application that maps the tracking data to a pitch and a modulation amount, which are then passed to a program written in Pure Data that generates a ring-modulated tone; depending on how you move, the result is best described as R2D2's distant, significantly less intelligent cousin. More specifically, the user's X-axis motion is passed to Pure Data as the modulation amount, and the Y-axis motion determines the base pitch. Images of the patch have been included with this post, and the Processing sketch is linked (though even if you have a webcam, the applet is NOT signed and will not run as an applet).
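The coordinate-to-sound mapping amounts to two linear remaps, in the spirit of Processing's map() function. The frame size and output ranges below are assumptions for illustration, not the values from the actual patch.

```python
def scale(value, in_min, in_max, out_min, out_max):
    """Linearly remap `value` from one range to another,
    like Processing's map() function."""
    t = (value - in_min) / (in_max - in_min)
    return out_min + t * (out_max - out_min)

# Assume a 640x480 camera frame with the gesture tracked at (320, 120).
x, y = 320, 120
mod_amount = scale(x, 0, 640, 0.0, 1.0)      # X -> modulation depth
pitch_hz = scale(y, 0, 480, 880.0, 110.0)    # Y -> base pitch (top = higher)
print(mod_amount, pitch_hz)  # 0.5 687.5
```

The two resulting numbers would then be sent to the Pure Data patch, which applies the modulation amount to the ring modulator and the pitch to its oscillator.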