Assignment 2: A Remotely Good Idea

In this project, you're going to make use of a local webcam and a Canon PowerShot used remotely via the FlashAir SD card. That is known as tethered remote control... and it's called that even when there is no physical wire tethering the camera to a computer. Because this is the first time this course has used these SD cards, we're keeping the project simple... but it's still pretty cool. You'll be combining the images from two cameras to make a time-lapse movie consisting of at least 180 frames.

Your Project

Making a time-lapse movie doesn't sound too hard, does it? Well, it sort of is. You see, all of the better codecs involve not just compressing one frame at a time, but interpolating between frames. To do that, one needs to have key frames both before and after the frame currently being compressed. Not a big deal, but that does tend to mean you need more memory than the PowerShots make available. There's also the little issue that most cameras don't allow you to run programs inside them at all.

Your project here is simply going to involve using a Linux PC to command a camera to take a series of images which the computer will upload and straightforwardly convert into a time-lapse movie using standard software tools and a command line that I'll give you. The cute thing is that you're not going to get all the images from the same camera. One will be a UVC webcam using fswebcam. That's the one the basic images in your time-lapse will come from. However, you're going to be treating those images a little differently depending on what the second camera has seen: a Canon PowerShot A4000/ELPH115 programmed under CHDK will be capturing images during the same interval (but not at the same rate) as the webcam, and you'll be intelligently combining the images based on what -- I should say who -- the PowerShot saw.

Basically, you are to write a program/script to run under Linux that will capture frames at regularly-spaced intervals from two cameras simultaneously, but not at the same rate. The capture from the webcam should be at a rate of approximately one frame/second (1 FPS), so your 180-frame movie will be summarizing 3 minutes of action. However, the PowerShot images should be captured at a rate of one frame every 15 seconds. A little math reveals that's just 12 images. I don't care if you individually trigger and transfer the images from the PowerShot or if you simply tell the PowerShot when to start and then transfer all the files from it after the complete sequence is captured... but, either way, you need to have the triggering and image fetching all done by a script running on the PC interacting with a CHDK script via 802.11 WiFi communication with the FlashAir card.
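
The arithmetic above can be sketched as a small timetable. This is only an illustration, assuming both cameras start at time zero; the function and constant names are made up for this example and are not part of any provided code.

```python
# Sketch of the capture timetable: 180 webcam frames spaced 1 s apart,
# PowerShot shots spaced 15 s apart. All names here are illustrative.
WEBCAM_PERIOD_S = 1
POWERSHOT_PERIOD_S = 15
TOTAL_FRAMES = 180


def capture_schedule(total_frames=TOTAL_FRAMES,
                     webcam_period=WEBCAM_PERIOD_S,
                     powershot_period=POWERSHOT_PERIOD_S):
    """Return (webcam_times, powershot_times) as second offsets from the start."""
    duration = total_frames * webcam_period
    webcam_times = [i * webcam_period for i in range(total_frames)]
    # Assumption: the first PowerShot shot happens at t=0, then every 15 s.
    powershot_times = list(range(0, duration, powershot_period))
    return webcam_times, powershot_times


if __name__ == "__main__":
    webcam, powershot = capture_schedule()
    print(len(webcam), "webcam frames,", len(powershot), "PowerShot shots")
```

A real script would probably sleep until each offset relative to a start time taken from time.monotonic(), rather than sleeping a fixed second per frame, so that capture latency doesn't accumulate.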

One of the little surprises is that most modern cameras actually do face recognition. It's mostly used to determine focus points, but your PowerShot cameras also record how many faces they see, and at what image coordinates, as part of each JPEG captured. Each time the latest PowerShot image contains a detected face, you'll extract a close-up of the face from that image and include it as an inset in each of the 15 images taken with the webcam during the same interval. That would actually be somewhat difficult if there weren't already software that essentially can do it for you....
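
One way to decide which webcam frames get an inset is plain integer division. The sketch below assumes PowerShot shot k covers webcam frames 15k through 15k+14, which is one reasonable reading of "the same interval"; the function name and arguments are hypothetical.

```python
def inset_for_frame(frame_index, shots_with_faces, frames_per_shot=15):
    """Return the PowerShot shot index whose face goes on this webcam frame.

    shots_with_faces is the set of PowerShot shot indices (0-based) for which
    a face was extracted. Returns None when the matching shot had no face.
    """
    # Assumption: shot k pairs with webcam frames k*15 .. k*15+14.
    shot_index = frame_index // frames_per_shot
    return shot_index if shot_index in shots_with_faces else None


if __name__ == "__main__":
    faces = {0, 3}
    for frame in (0, 14, 15, 45):
        print(frame, "->", inset_for_frame(frame, faces))
```

If you trigger the PowerShot at the end of each interval instead of the start, the pairing shifts by one shot, so this is a design decision worth recording.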

You should scale the image resolution to a manageable size for video, i.e., 640x480 instead of 16MP. That's all most webcams deliver anyway. Do that scaling in-camera on the PowerShot to save on file size and to speed up file transfer via WiFi. Of course, any inset image must be small enough to fit within the webcam's image to be used as a video frame, but that shouldn't be a problem. If you wish, you may impose an arbitrary limit on inset image size, such as 320x240 pixels maximum -- one quarter of the frame. In any case, once the 180-second capture interval is over and your script has collected all the images, your program is to combine them into a movie in any standard format. There is a lot of open choice here. The goal is to make sure you understand the basics of tethered camera control; I'm not asking for anything fancy.
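
If you do impose the 320x240 limit, the inset size can be computed as below. This is a minimal sketch that keeps the face's aspect ratio and never enlarges a small face; the helper name is hypothetical.

```python
def fit_within(width, height, max_width=320, max_height=240):
    """Scale (width, height) down to fit the limit, keeping the aspect ratio.

    Images already within the limit are returned unchanged (no enlarging).
    """
    scale = min(max_width / width, max_height / height, 1.0)
    # Guard against a dimension rounding down to zero for extreme shapes.
    return max(1, round(width * scale)), max(1, round(height * scale))


if __name__ == "__main__":
    print(fit_within(640, 480))
    print(fit_within(100, 80))
```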

Stuff To Know About

There are a few tricks here. So, let's talk about what you need to know....

Scripting On A PC

You don't need to write any C code for this project (unless you want to), but you do need to write a script that will invoke all the other programs as needed. You can use Python, Bourne shell, etc. for that -- it's up to you.

Talking To FlashAir

It's not too bad (now that I've found a way that works). Read my page on use of FlashAir. In addition, here is a CHDK Lua script, wishoot.lua, to take a shot:

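--[[
@title WiFi Shoot
@chdk_version 1.4
]]

-- Read the starting value of the command counter written by the PC
wifile = io.open("A/X/CMD", "r")
num = 0 + wifile:read()
wifile:close()

repeat
	print("Was:", num)
	-- Poll the command file until the PC changes the counter
	repeat
		sleep(1000)	-- assumed polling delay
		wifile = io.open("A/X/CMD", "r")
		shot = 0 + wifile:read()
		wifile:close()
		print("Is:", shot)
	until (shot ~= num)

	num = shot
	-- Trigger the shot (assumed sequence: half-press, wait until ready, full-press)
	press("shoot_half")
	repeat sleep(50) until get_shooting() == true
	us = tv96_to_usec( get_tv96() )
	iso = get_iso_real()
	press("shoot_full")
	release("shoot_full")
	repeat sleep(50) until get_shooting() == false
	release("shoot_half")

	-- Report the new file name and exposure settings back to the PC
	ex = get_exp_count()
	jpeg = string.format('%s/IMG_%04d.JPG', get_image_dir(), ex%10000)
	wifile = io.open("A/X/RESP", "w")
	wifile:write(jpeg, " is exposure ", num, ": ", us, "us @ ISO ", iso, "\n")
	wifile:close()
until (false)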

And here is a little shell script to talk to it:

expr `cat CMD` + 1 >CMD
echo "Taking shot" `cat CMD`
curl -s --form "file=@CMD" >/dev/null
sleep 10
curl -s -o RESP
cat RESP

Extraction Of Faces Using EXIF

As noted above, even lowly little cameras like your Canon PowerShots actually can recognize faces (if you enable face detection). Most digital cameras can, and they record the face information in the EXIF data. The bad news is that each camera brand (and in some cases, different models within a brand) uses different EXIF fields to store the relevant info. Fortunately for us, there is a wonderful little piece of free software that knows how to read that info for us....

The real key is a free program called ExifTool that was written by Phil Harvey. It knows how to extract a mind-bogglingly diverse set of field values from an image file. It is actually structured as both a stand-alone program and a Perl library. He also wrote a little example program that uses the library to extract face data and create a copy of each image with the faces boxed. I've slightly modified that program for you so that it simply creates a new image file with just the extracted face image: running it on file.jpg will create a copy of the face extracted from that file, saved under the same name in the subdirectory called tmp. If no face was recognized and tagged in the EXIF, it doesn't make a file. Note that the ExifTool library and ImageMagick (described below) must be installed for these Perl scripts to work.
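
Since the extractor only writes a file into tmp when a face was tagged, your script can tell which PowerShot images produced a face just by checking for files. Here is a minimal sketch of that check; the function name is hypothetical, and it assumes the extractor has already been run on every image.

```python
import os


def extracted_faces(image_names, tmp_dir="tmp"):
    """Map each image name to its face file in tmp_dir.

    Images for which no face file exists (no face was tagged) are left out.
    """
    faces = {}
    for name in image_names:
        # The face file is expected under the same base name inside tmp_dir.
        candidate = os.path.join(tmp_dir, os.path.basename(name))
        if os.path.isfile(candidate):
            faces[name] = candidate
    return faces


if __name__ == "__main__":
    print(extracted_faces(["IMG_0001.JPG", "IMG_0002.JPG"]))
```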

fswebcam

fswebcam is a very nice tool for grabbing images from a webcam. I'd suggest you invoke it once per image you wish to grab... which doesn't work so well with some other webcam software because some cameras don't keep their exposure adjusted while not being explicitly accessed. This software allows you to specify that a short sequence of frames be requested before taking one for real, thus allowing the camera exposure to correct. There is a man page documenting the options; -S (or --skip) is the option that skips frames before capture, whereas lowercase -s sets a camera control.
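
A once-per-image invocation might be wrapped like this. It is a sketch, not a required interface: the device path, the skip count of 10, and the function names are assumptions, while the options used (-d, -r, -S for frame skipping, and --no-banner) are the ones documented in the fswebcam man page.

```python
import subprocess


def fswebcam_command(output_path, skip_frames=10, resolution="640x480",
                     device="/dev/video0"):
    """Build the argument list for grabbing one frame with fswebcam."""
    return [
        "fswebcam",
        "-d", device,            # which webcam to read (assumed path)
        "-r", resolution,        # requested capture resolution
        "-S", str(skip_frames),  # frames to discard so exposure can settle
        "--no-banner",           # don't draw the timestamp banner
        output_path,
    ]


def grab_frame(output_path):
    """Run fswebcam once; raises CalledProcessError if it fails."""
    subprocess.run(fswebcam_command(output_path), check=True)


if __name__ == "__main__":
    print(" ".join(fswebcam_command("frame_0000.jpg")))
```

Keep in mind that skipped frames take time to read, so a large skip count can make it hard to hold the 1 FPS pace.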

ImageMagick

ImageMagick is a powerful library for image processing -- and a suite of command-line tools that serve as a demonstration of the library and make ImageMagick very easy to use in scripts. The functionality of the tools overlaps considerably, and one can do most things in convert. In particular, a command like:

convert -size 640x480 xc:black \
	image1.jpg -geometry 640x480 -composite \
	image2.jpg -geometry 100x80+529+0 -composite \
	image3.jpg

will create a black canvas of 640x480 pixels, scale image1.jpg to fit it, scale image2.jpg to 100x80 and put it in the upper right corner, and then write the resulting composite image to image3.jpg.
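
From a script, the same kind of convert command can be assembled per frame. The sketch below is one way to do it under a few assumptions: the function name is hypothetical, and the x offset is computed as frame width minus inset width, which places a full-width inset flush against the right edge rather than at the hand-picked +529. Because -geometry keeps the aspect ratio, an inset that comes out narrower than requested will sit slightly left of the edge.

```python
def composite_command(base_image, inset_image, output_image,
                      frame_size=(640, 480), inset_size=(100, 80)):
    """Build a convert argument list that overlays an inset at the top right."""
    frame_w, frame_h = frame_size
    inset_w, inset_h = inset_size
    x_offset = frame_w - inset_w  # flush with the right edge of the frame
    return [
        "convert", "-size", f"{frame_w}x{frame_h}", "xc:black",
        base_image, "-geometry", f"{frame_w}x{frame_h}", "-composite",
        inset_image, "-geometry", f"{inset_w}x{inset_h}+{x_offset}+0",
        "-composite",
        output_image,
    ]


if __name__ == "__main__":
    print(" ".join(composite_command("image1.jpg", "image2.jpg", "image3.jpg")))
```

The list can be handed to subprocess.run(..., check=True) directly, which avoids any shell quoting problems with file names.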

mencoder

mencoder is a video encoder co-developed with mplayer. Command lines look very scary, but if you want to do it and it involves video encoding, odds are mencoder can help. For example:

echo "Mencoder Pass 1..."
mencoder mf://*.jpg -mf w=320:h=240:fps=15:type=jpg -vf scale=320:240 -ovc lavc \
	-lavcopts vcodec=mpeg4:keyint=120:vbitrate=1600:vpass=1 -o /dev/null
echo "Mencoder Pass 2..."
mencoder mf://*.jpg -mf w=320:h=240:fps=15:type=jpg -vf scale=320:240 -ovc lavc \
	-lavcopts vcodec=mpeg4:keyint=120:vbitrate=1600:vpass=2 -o /dev/null 2>/dev/null \
	| tail -1
echo "Mencoder Pass 3..."
mencoder mf://*.jpg -mf w=320:h=240:fps=15:type=jpg -vf scale=320:240 -ovc lavc \
	-lavcopts vcodec=mpeg4:keyint=120:vbitrate=1600:vpass=3 -o movie.mpg 2>/dev/null \
	| tail -1

That sequence of commands will produce a silent video running at 15 FPS, which is 15X the rate the images were captured. That means each extracted face inset will stay up for 1 second as 15 webcam frames display "under" it.
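
Two small details follow from that. The mf://*.jpg wildcard picks up frames by file name, so, assuming the names are taken in lexical order, zero-padded frame numbers keep frame 10 from sorting ahead of frame 2. And the playback length is just frame count divided by frame rate. The helper names below are hypothetical.

```python
def frame_name(index, prefix="frame_", digits=4):
    """Return a zero-padded file name so lexical order equals capture order."""
    return f"{prefix}{index:0{digits}d}.jpg"


def playback_seconds(frame_count, fps=15):
    """Length of the finished movie in seconds."""
    return frame_count / fps


if __name__ == "__main__":
    print(frame_name(7))
    print(playback_seconds(180), "seconds for the whole movie")
    print(playback_seconds(15), "second per face inset")
```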

Due Dates, Submission Procedure, & Such

You will be submitting source code for your tethered control program or script, the CHDK Lua script, a make file (which does whatever is necessary), and a simple "implementor's notes" document. For this project, you definitely have design decisions to talk about in those notes -- such as how you coordinated the captures between the webcam and the PowerShot.

For full consideration, your project should be submitted no later than the final exam timeslot. Submit it as a .tar or .tgz file.
