Saturday, September 28, 2013

How Fast is Your Heart Rate?

Recently, the video from the Legislative Yuan goes viral. Mr. Chao-Hao Liu on the right, as a legislator and a former judge, clearly points out several serious law issues in Mr. Shih-Ming Huang on the left. The background of the story can be found here

It's really hard to see the reaction of Mr. Shih-Ming Huang from the video because there are no obvious facial expression changes throughout the video. Thus, I am interested in estimating his heart rate in the video using several computer vision and signal processing tools.

Here is the resultant video with estimated heart rate. We can now see how the heart rate of Mr. Shih-Ming Huang changes when Mr. Chao-Hao Liu raises various questions. 


Non-contact physiological signal monitoring using video processing has been an hot topic recently.
Two examples:

  • The Cardiocam in the Affective Computing Group in MIT Media Lab [Link]
  • Euliean video magnification in MIT CSAIL [Link]

Inspired by these awesome works, I implemented a simplified version for heart rate monitoring in video. Here we describe the method step-by-step.

Face detection and tracking

We use the off-the-shelf Viola-Jones face detection and KLT tracking to detect and track Mr. Shih-Ming Huang's face throughout the whole video. This provides reference frames for further analysis. Below we show part of the video sequence with face tracking on the left and the rectified face region on the right.

Temporal Processing

To extract the heart rate, we seek the color variations in the face region. For each frame of the rectified face, we perform  averaging in each of the R, G, B channels. Here is an example of the values of averaged R, G, B channels over 300 frames (~10 seconds for a 30 frame per second video).

From the variation in the time, it's difficult to perceive the color variation due to heart beats. We therefore perform temporal filtering to focus on the signals in certain frequency ranges.

Here we show the Fourier transform of the green channel. From the magnitude spectrum of the signal (in log scale), we can see that the energy of the DC component dominates over other frequency components.
As we know a normal human heart rate is around 60 - 90 beat per minutes (i.e., 1 Hz - 1.5 Hz), we apply a temporal bandpass filter to filter out the rest unwanted frequencies. Here we show a bandpass filter with range 1-2 Hz (60 BPM - 120 BPM).

After temporal filtering, we can convert the filtered signal back to temporal. Here is the filtered R, G, B signals. Now we can see the color variations due to heart beats. 

From these three filtered signals, we perform the Principal component analysis to compute the first principal component that captures the largest variation in the temporally filtered R, G, B signals. We can then detect the peak in this principal component to estimate how many times the heart beats in this observation time window. For example, we show the peak detection result (marked in red) in the first principal component. In this case, there are 16 beats in a 300 frames (~10 seconds). We then get an heart rate estimate: (16/10)*60 = 96 Beat Per Minute (BPM) for this time window.

Using the sliding window approach, we can estimate the subject's heart rate variation over time. Specifically, we use a 30-seconds time windows with 1 second increment. Here is the plot for all the heart rate measurement in the video. It's interesting to the variations of the heart rate. For example, Mr. Mr. Shih-Ming Huang is nervous (HR~110 bpm) at the very beginning. He gradually calms down for the first 30 seconds and seems to get nervous when Mr. Chao-Hao Liu starts to raise questions.
Matlab code available [Link]

Saturday, April 27, 2013

My Second Marathon!

Today is the annual Christie Clinic Illinois Marathon. The race starts near Assembly Hall, runs through campus, loops through Urbana, back through campus, out into Champaign, and finishes on the 50-yard line of Memorial Stadium (route). It is a good chance to visit both Urbana and Champaign in the same time. 

I may not be a fast runner, but I am definitely a persistent runner! After 5 hours, I earned my second Marathon medal! Exhausted, but very happy :D

Thanks everyone for cheering me up! H

Thursday, April 25, 2013

Miss Daegu 2013 Contestants Face Morphing

Recently, there is a debate on plastic surgery from the post on Reddit titled "Korea's plastic surgery mayhem is finally converging on the same face. Here are the miss Korea 2013 contestants.

To better see the similarity among them, I first took the images from one Japanese blog. After simple normalization and registration, I can get an aligned, animated GIF looping through the 20 contestants as shown below.


These images remind me of a course project in computational photography I did two years ago. I wonder how the "averaged" face of these 20 contestants look like. Thus, I morph all the face images into a mean face shape, then average over the picture values using the code I developed in the course project. Here is the result.

Using the shape of the face images, I can generate a short movie of face morphing from contestant number 1 to 20. We can now see how one face is smoothly transformed into another.

Another application is to synthesize faces with new shapes. For example, I can morph each face a bit toward the mean shape. This results in a stabilized version of animated GIF looping from contestant 1 to 20 as shown below. 

So far, the above visualization figures are qualitative, we may be able to get more insights by quantitative analysis. 

First, we construct the eigenspaces from these contestants faces. However, the standard Principle Component Analysis cannot be directly applied due to the slight pose variation and occlusion by the hair. Instead, we perform a robust version of PCA, e.g., [Robust PCALow-Rank Matrix Recovery], to factor the low rank part and the sparse errors. Specifically, I use the implementation provided in [Low-Rank Matrix Recovery].
[The Augmented Lagrange Multiplier Method for Exact Recovery of Corrupted Low-Rank Matrices, Z. Lin, M. Chen, L. Wu, and Y. Ma (UIUC Technical Report UILU-ENG-09-2215, November 2009)]
Here are three sample results.

Second, by performing singular value decomposition, we get the eigenfaces and the corresponding eigenvalues. Here we can see that the eigenvalues vanish after 7, suggesting that the rank of the image data is 6.

Here we visualize the largest 6 eigenfaces. These eigenfaces encodes the variation within the 20 contestants.

Third, by projecting faces onto the eigenspace, we can analyze how these faces are distributed in the eigenspace. We show here the coefficients of the 20 contestants. Note that most of the energies are concentrated on first two eigenvalues.
Thus, we plot the coefficients corresponding to the first two eigenvectors. We can now see how similar the appearance of these contestants are.

 Fourth, we can try to construct the affinity matrix which encodes the pairwise similarity. (blue indicates similar and red indicates dissimilar).

This affinity matrix facilitates many interesting applications. For example, by summing up over columns or rows, we get a measure of how a contestant is different from the rest. In the plot below, x-axis denotes the contestant index (from 1- 20). The y-axis indicates how distinct the contestant is. This measure can also be interpreted as the salient object among the 20 contestants using the surround center information divergence.
(If you are interested in saliency, please see our paper on ICPR or slide.)

From the plot above, we can see that contestant 1, 2 and 6 are more distinct (or more salient) than the rest of the contestants. On the other hand, contestant 7, 12, and 15 are more common (i.e., closed to the averaged face) than others. 

Salient contestants (1, 2, 6)

It is interesting that one actually used these most salient contestants (1, 2, 6) to support their opinions on how they are dissimilar [link]. (I am not sure whether they did the same eigen analysis or not.)

Common contestants (7, 12, 15)

Here are the pictures of all contestants.

Another application of using the affinity matrix is to come up with a displaying order with maximized or minimized similarity from one to another. You may use one of them to support your opinion on whether they are similar or not.

Consider a complete graph of 20 vertices with edges encode the pairwise similarity, the problem of finding a displaying order is the problem of finding a (weighted) Hamiltonian circuit in an undirected graph [see Hamiltonian path problem]. As the problem is known to be NP-hard, we could use a greedy heuristic to find approximated solutions. 

The heuristic algorithm works as follows. Start from a random contestant, choose the next contestant (that has not been selected before) with maximum or minimum similarity with the previous contestant. Repeat the process until we find the ordering.

Here is the result of ordering with maximized pairwise similarity. Note that the transition between contestants are more smooth.

Here is the result of minimized pairwise similarity. They look more dissimilar in this displaying order. 

  • If you would like to try it out on your own, you can download my code (in Matlab) here (12.7 MB)
  • You can also just download the cropped, aligned, and mean faces here (12.6 MB)
Have fun! :D

Further readings:

  • Average faces
    • Face of tomorrow project by Mike Mike. [link]
    • Face Research has a online demo for you to make an average face [link]
  • Face morphing [wiki]
  • Principle component analysis
  • Low-Rank Matrix Recovery and Completion via Convex Optimization [link]
  • Saliency
    • Salience (neuroscience) [wiki]
  • Hamiltonian path/circuit [wiki]

Updates (May 4, 2013)

1. The title of the original post was wrong. They are in fact contestants for Miss Daegu. I had corrected the title. Sorry for the confusion.

2. The poll results are out and is available on the official website

The top candidates are contestant number 5, 1, 15, 20, and 19.  We may be able to "guess" how the quantitative analysis of similarity is related to the result.

The first thing I observed is that one of the most salient (No. 1) and most common contestants (No. 15) won the 2nd and the 3rd places, respectively. This result seems to coincide with the psychological study on averaged faces. 

Thomas R. Alley and Michael R. Cunningham, Psychological Science, 1991

We then plot the coefficients corresponding to the first two eigenvectors for top 5 contestants (marked in red). It seems that the selection process somehow avoided choosing both candidates with similar appearances. 

Update (July 20, 2013)

One of the most salient contestant (number one), Yoo Ye-bin, crowned the Miss Korea 2013.

Wednesday, April 24, 2013

Transformation Guided Image Completion

I just presented our work on image completion in International Conference on Computational Photography at Boston. This is my intern project done in Microsoft Research Redmond last summer. It's a fun project and I learned a lot from my two awesome mentors Johannes and Sing Bing. If you are interested, please check out the slide below and visit our project page.

Tuesday, April 16, 2013

The Best Course Syllabus

I received a syllabus promoting the course Machine Learning for Signal Processing next Fall (taught by Prof. Paris Smaragdis). The design of the poster is very creative. This is certainly the best syllabus I have ever seen!

Thursday, March 21, 2013

ModernCV LaTeX template

Recently, my home department requested all graduate students to provide a self-evaluation and an updated curriculum vitae. I then found a very nice and clean LaTeX template called ModernCV (Link). Here is what the CV looks like.

I then spent some time filling out my information in this format. During this process, I am inspired from many people with awesome CV (e.g., Jianxiong Xiao). Finally, I am satisfied with the result. If you would like to modify your CV from mine, you can download the Tex source file here (Link). 

Tuesday, March 5, 2013

Raskar's Idea Hexagon

Ramesh Raskar, professor in MIT Media Lab, presented in TED this inspiring talk on "How to Invent?" Along with many concrete examples, he described how we can invent new technologies using a principled way.

I found this idea very interesting. Thus, a few years ago, I tried to summarize some of the works in my research field using this approach. Check out my slides.

Face inversion impairs your holistic perception

It is well known that our ability to recognize a face or an object will significantly degrade when inverted images are presented. This video is an excellent demonstration of this effect.  You simply cannot recognize what the painting is about until he revealed the upright one.