## Friday, April 23, 2010

### How to Find Research Topics (4): Use Effective Tools to Find Suitable Problems - Dimensionality Reduction

(How to come up with new research ideas?)

Jia-Bin Huang

jbhuang0604@gmail.com

Latest update: April 12, 2010

If the only tool you have is a hammer, you tend to see every problem as a nail. - Abraham Maslow

Dimensionality Reduction

1.    Introduction

In many scientific problems we acquire and work with high-dimensional data to describe an object. For example, a 100x100 grayscale image can be represented as a 10,000-dimensional vector; large-scale climate data and human genome data are similar. High dimensionality greatly increases computational complexity, and from a statistical point of view the number of samples needed to analyze the data correctly grows exponentially with the dimension.

However, although the data are represented in a high-dimensional space, they often lie in a meaningful lower-dimensional space, perhaps a linear subspace or a nonlinear manifold. Dimensionality reduction therefore lets us (1) reduce computational complexity, (2) obtain a more intrinsic representation of the data, and (3) easily visualize high-dimensional data. A dimensionality reduction method analyzes the data to find an embedding that maps it from the original high-dimensional space to a low-dimensional one.

2.    Dimensionality Reduction Methods

2.1  Linear, global structure preserved

i.     Principal component analysis (PCA) 
PCA minimizes reconstruction error or maximizes the data variance in the low-dimensional representation of data.
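As an illustrative sketch (not the formulation of any particular paper), PCA can be written in a few lines of NumPy by eigendecomposing the sample covariance matrix; the function name `pca` and the synthetic data below are my own:

```python
import numpy as np

def pca(X, k):
    """Project X (n samples x d features) onto its top-k principal components."""
    Xc = X - X.mean(axis=0)               # center the data
    C = Xc.T @ Xc / (len(X) - 1)          # sample covariance matrix
    vals, vecs = np.linalg.eigh(C)        # eigh returns ascending eigenvalues
    W = vecs[:, ::-1][:, :k]              # top-k eigenvectors (largest variance)
    return Xc @ W                         # low-dimensional representation

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10)) * np.arange(1, 11)  # features with different scales
Y = pca(X, 2)
```

By construction, the first output coordinate captures at least as much variance as the second, and the projected data is zero-mean.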

ii.     Linear discriminant analysis (LDA) 
LDA maximizes the linear class separability in the low-dimensional representation of the data.
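A minimal Fisher-criterion sketch with NumPy, assuming the within-class scatter matrix is invertible; the function name `lda` and the two-class synthetic data are my own:

```python
import numpy as np

def lda(X, y, k):
    """Fisher LDA: directions maximizing between-class vs. within-class scatter."""
    d = X.shape[1]
    overall_mean = X.mean(axis=0)
    Sw = np.zeros((d, d))                 # within-class scatter
    Sb = np.zeros((d, d))                 # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        Sb += len(Xc) * np.outer(mc - overall_mean, mc - overall_mean)
    # solve the generalized eigenproblem Sb v = lambda Sw v
    vals, vecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(vals.real)[::-1][:k]
    return X @ vecs[:, order].real

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (100, 5)),
               rng.normal(3.0, 1.0, (100, 5))])
y = np.repeat([0, 1], 100)
Z = lda(X, y, 1)                          # project onto the single Fisher direction
```

With two well-separated classes, the projected class means should be far apart relative to the projected class spreads.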

iii.     Independent component analysis (ICA) 
ICA finds a low-dimensional representation of the data whose components are maximally statistically independent.

iv.     Factor analysis (FA) 
FA represents the original high-dimensional data with low-dimensional common factors plus specific factors.

2.2 Linear, local structure preserved

i.     Locality Preserving Projections (LPP) 
LPP is a linear approximation of Laplacian eigenmaps (LE); it preserves the neighborhood structure of the high-dimensional data.

ii.     Neighborhood Preserving Embedding (NPE) 
NPE is a linear approximation of Locally Linear Embedding (LLE); it preserves the local neighborhood structure of the high-dimensional data manifold.

2.3 Non-linear, global structure preserved

i.     Multidimensional scaling (MDS) 
MDS is a set of nonlinear techniques that maps the high-dimensional data to a low-dimensional representation while preserving the pairwise distances between the data points as much as possible.
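Classical (metric) MDS has a closed-form spectral solution: double-center the squared distance matrix and take the top eigenvectors. A NumPy sketch (the function name `classical_mds` and the sanity-check data are my own):

```python
import numpy as np

def classical_mds(D, k):
    """Classical (metric) MDS: embed points so pairwise distances match D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    B = -0.5 * J @ (D ** 2) @ J           # double-centered (Gram) matrix
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:k]    # top-k eigenpairs
    return vecs[:, order] * np.sqrt(np.maximum(vals[order], 0.0))

# sanity check: MDS recovers a 2-D point set (up to rotation/translation)
rng = np.random.default_rng(0)
P = rng.normal(size=(30, 2))
D = np.sqrt(((P[:, None] - P[None]) ** 2).sum(-1))
Y = classical_mds(D, 2)
```

When the input distances really are Euclidean distances of a k-dimensional point set, the embedding reproduces them exactly.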

ii.     Isometric feature mapping (ISOMAP) 
ISOMAP applies MDS to geodesic distances computed on a weighted neighborhood graph.

iii.    Kernel PCA 
Kernel PCA extends PCA using kernel tricks [20-22] to handle nonlinear data (without explicit nonlinear mapping of data).
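The kernel trick here means we never compute the nonlinear mapping explicitly; we only eigendecompose the centered kernel matrix. A sketch with an RBF kernel (the function name, `gamma` value, and test data are my own):

```python
import numpy as np

def kernel_pca(X, k, gamma=1.0):
    """Kernel PCA with an RBF kernel: PCA in feature space via the kernel trick."""
    sq = (X ** 2).sum(axis=1)
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2 * X @ X.T))  # RBF kernel
    n = len(X)
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K @ J                        # center the kernel matrix in feature space
    vals, vecs = np.linalg.eigh(Kc)
    order = np.argsort(vals)[::-1][:k]
    # projections of the training points onto the top-k kernel principal components
    return vecs[:, order] * np.sqrt(np.maximum(vals[order], 0.0))

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
Z = kernel_pca(X, 2, gamma=0.5)
```

Setting the kernel to the plain inner product recovers ordinary PCA, which is a useful way to check such an implementation.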

iv.    General Discriminant Analysis (GDA) 
GDA maximizes the Fisher criterion used in LDA in the high-dimensional feature space induced by a kernel function.

v.     Diffusion maps  (DM) 
DM embeds high-dimensional data into a low-dimensional space via the eigenvectors of a suitably defined random walk on the dataset.

2.4 Non-linear, local structure preserved

i.      Locally Linear Embedding (LLE) [12-13]
LLE constructs a K-NN graph, computes weights that reconstruct each data point as a linear combination of its neighbors (i.e., assumes the manifold is locally linear), and then finds a low-dimensional embedding that preserves these reconstruction weights as well as possible.

ii.     Laplacian Eigenmaps (LE) 
LE constructs a K-NN graph, computes edge weights that minimize the norm of the gradient in a least-squares sense, and finds a low-dimensional embedding that minimizes the weighted distances between each data point and its k nearest neighbors.
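A NumPy-only sketch using the symmetric normalized graph Laplacian with 0/1 edge weights; this is a common simplification of the original method, which uses heat-kernel weights and solves the generalized eigenproblem L y = lambda D y. The function name and parameters are my own:

```python
import numpy as np

def laplacian_eigenmaps(X, k, n_neighbors=8):
    """LE sketch via the symmetric normalized graph Laplacian (simplified)."""
    n = len(X)
    D2 = ((X[:, None] - X[None]) ** 2).sum(-1)
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(D2[i])[1:n_neighbors + 1]  # K nearest, excluding self
        W[i, nbrs] = 1.0                             # 0/1 weights (heat kernel also common)
    W = np.maximum(W, W.T)                           # symmetrize the K-NN graph
    deg = W.sum(axis=1)                              # node degrees (always > 0 here)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    L_sym = np.eye(n) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    vals, vecs = np.linalg.eigh(L_sym)
    return vecs[:, 1:k + 1]                          # skip the trivial bottom eigenvector

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 3))
Y = laplacian_eigenmaps(X, 2)
```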

iii.    Hessian Eigenmaps (HE) 
Under the constraint that the low-dimensional representation is locally isometric, HE minimizes the curviness of the high-dimensional manifold when constructing the low-dimensional representation of the data.

iv.    Local Tangent Space Alignment (LTSA) 
Based on the assumption that the manifold is locally linear, LTSA simultaneously searches for the low-dimensional representation of the data and the linear mappings from the low-dimensional data points to the local tangent space of the high-dimensional manifold.

v.     Semidefinite embedding (SDE) or maximum variance unfolding (MVU) 
SDE builds a K-NN graph, computes an inner-product matrix that maximizes the pairwise distances between points that are not connected while preserving the distances between connected points, and then applies MDS to obtain the low-dimensional representation of the data.

vi.     t-distributed Stochastic Neighbor Embedding (t-SNE) 
t-SNE converts the pairwise distances between data points into probabilities of selecting pairs of points and matches these probabilities in the low-dimensional space, achieving particularly good visualization results.

Note 1: Many of the methods introduced above can be classified as spectral methods. A spectral method involves computing eigenvectors/eigenvalues of specially constructed matrices and uses them to analyze the structure of the high-dimensional space (e.g., PCA, LDA, metric MDS, ISOMAP, LLE, Kernel PCA, GDA, SDE, LE, HE).

Note 3: Although nonlinear methods can learn more complex manifold structures, on real-world problems they do not necessarily perform better than linear methods.

 Geometric Methods and Manifold Learning, Partha Niyogi and Mikhail Belkin
 Introduction to Feature Selection, Isabelle Guyon

(If the concepts of linear algebra are not very clear to you, this online course is highly recommended.)

 Applied and Computational Harmonic Analysis, 2006
 Proceedings of the National Academy of Sciences, 2003
 Principal Manifolds and Nonlinear Dimension Reduction via Tangent Space Alignment, SIAM Journal of Scientific Computing, 2004
 Dimensionality Reduction: A Comparative Survey, Neural Computation, 1998

3.    Applications

3.1 Subspace as constraints: Structure from motion, Optical flow, Layer extraction, Face alignment

3.2  Face Recognition


 On Optimizing Subspaces for Face Recognition, ICCV 2009
 PCA versus LDA, PAMI 2001

3.3 Motion segmentation

 A General Framework for Motion Segmentation: Independent, Articulated, Rigid, Non-rigid, Degenerate and Non-degenerate, ECCV 2006

3.4  Lighting

3.5 Visual tracking