How to find a research topic?
(How to come up with new research ideas?)
Jia-Bin Huang
jbhuang0604@gmail.com
Latest update: April 12, 2010
If the only tool you have is a hammer, you tend to see every problem as a nail. - Abraham Maslow
Part 4: Use effective tools to find suitable problems (neXt = T(X), T: X -> Y)
Dimensionality Reduction
1. Introduction
In many scientific problems, we often acquire and work with high-dimensional data to describe an object. For example, a 100x100 grayscale image can be represented as a 10,000-dimensional vector, and massive climate records or human gene data are similarly high-dimensional. High dimensionality greatly increases computational complexity; moreover, from a statistical point of view, the number of samples needed to analyze data correctly in a high-dimensional space grows exponentially with the dimension (the curse of dimensionality).
Although the data are represented in a high-dimensional space, they often lie on a meaningful lower-dimensional structure, perhaps a linear subspace or a nonlinear manifold. Dimensionality reduction therefore lets us (1) reduce computational complexity, (2) obtain a more intrinsically meaningful representation of the data, and (3) visualize high-dimensional data easily. A dimensionality reduction method analyzes the data to find an embedding that maps them from the original high-dimensional space to a low-dimensional space.
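To make the mapping concrete, here is a minimal sketch (assuming numpy and scikit-learn, with random data standing in for real images) that flattens 100x100 "images" into 10,000-dimensional vectors and embeds them into a 50-dimensional space with PCA:

# A minimal sketch of the embedding idea; the data are random and purely illustrative.
import numpy as np
from sklearn.decomposition import PCA

# 200 "images" of size 100x100, each flattened into a 10,000-dimensional vector.
X = np.random.rand(200, 100 * 100)

# Map the data from the 10,000-dim space to a 50-dim space.
pca = PCA(n_components=50)
X_low = pca.fit_transform(X)           # shape: (200, 50)
X_back = pca.inverse_transform(X_low)  # approximate reconstruction, (200, 10000)
print(X_low.shape, X_back.shape)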
2. Dimensionality Reduction Methods
There are many dimensionality reduction methods. Based on two properties, they can be divided into four broad categories: the first property is linearity (linear vs. nonlinear), and the second is whether the global or the local structure of the data is preserved during the reduction (locality: global vs. local):
The first two (linear) categories analyze the distribution of the data to compute a projection matrix that achieves the dimensionality reduction; different objectives lead to different methods.
Figure 1. Taxonomy of dimensionality reduction methods [19]
2.1 Linear, global structure preserved
i. Principal component analysis (PCA) [1]
PCA minimizes the reconstruction error or, equivalently, maximizes the variance of the data in the low-dimensional representation (see the sketch after this list).
ii. Linear discriminant analysis (LDA) [2]
LDA maximizes the linear class separability in the low-dimensional representation of the data.
iii. Independent component analysis (ICA) [3]
ICA ensures that the components of the low-dimensional representation of the data are maximally statistically independent.
iv. Factor analysis (FA) [4]
FA represents the original high-dimensional data with a small number of common factors plus specific factors.
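As a concrete example of this family, here is a minimal PCA sketch via the eigendecomposition of the sample covariance matrix; it assumes only numpy, and the data are random and purely illustrative. The other linear methods above follow the same pattern of computing a projection matrix, just with different objectives:

# Minimal PCA: project an n x D data matrix onto its top-d principal components.
import numpy as np

def pca(X, d):
    X_centered = X - X.mean(axis=0)
    cov = np.cov(X_centered, rowvar=False)   # D x D sample covariance
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    W = eigvecs[:, ::-1][:, :d]              # top-d directions (D x d projection matrix)
    return X_centered @ W, W                 # low-dimensional data and projection

X = np.random.rand(500, 30)
X_low, W = pca(X, d=2)
print(X_low.shape)   # (500, 2)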
2.2 Linear, local structure preserved
i. Locality Preserving Projections (LPP) [5]
LPP is a linear approximation of Laplacian Eigenmaps (LE) and preserves the neighborhood structure of the high-dimensional data (see the sketch after this list).
ii. Neighborhood Preserving Embedding (NPE) [6]
NPE is a linear approximation of Locally Linear Embedding (LLE) and preserves the local neighborhood structure of the high-dimensional data manifold.
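Below is a minimal LPP sketch, assuming numpy, scipy, and scikit-learn; the k-NN graph with heat-kernel weights and the small ridge term are common but illustrative choices rather than the exact recipe in [5]:

# Minimal Locality Preserving Projections: data are random and purely illustrative.
import numpy as np
from scipy.linalg import eigh
from sklearn.neighbors import kneighbors_graph

def lpp(X, d, k=10, t=1.0):
    n, D = X.shape
    # Symmetric k-NN graph with heat-kernel weights exp(-||xi - xj||^2 / t).
    G = kneighbors_graph(X, n_neighbors=k, mode='distance').toarray()
    W = np.where(G > 0, np.exp(-G**2 / t), 0.0)
    W = np.maximum(W, W.T)
    Dg = np.diag(W.sum(axis=1))
    L = Dg - W                                   # graph Laplacian
    # Generalized eigenproblem: X^T L X a = lambda X^T D X a (smallest eigenvalues).
    A = X.T @ L @ X
    B = X.T @ Dg @ X + 1e-6 * np.eye(D)          # small ridge for numerical stability
    eigvals, eigvecs = eigh(A, B)
    P = eigvecs[:, :d]                           # D x d projection matrix
    return X @ P, P

X = np.random.rand(300, 20)
X_low, P = lpp(X, d=2)
print(X_low.shape)   # (300, 2)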
The third and fourth categories are nonlinear manifold learning methods, which began to flourish after the two papers published in Science in 2000 (ISOMAP and LLE).
2.3 Non-linear, global structure preserved
i. Multidimensional scaling (MDS) [7]
MDS is a set of nonlinear techniques that map the high-dimensional data to a low-dimensional representation while preserving the pairwise distances between the data points as much as possible.
ii. Isomap (ISOMAP) [8]
ISOMAP combines geodesic distances computed on a weighted neighborhood graph with MDS (see the sketch after this list).
iii. Kernel principal component analysis (Kernel PCA) [9]
Kernel PCA extends PCA with the kernel trick [20-22] to handle nonlinear data, without explicitly computing the nonlinear mapping of the data.
iv. General Discriminant Analysis (GDA) [10]
GDA maximizes the Fisher criterion used in LDA in the high-dimensional feature space induced by a kernel function.
v. Diffusion maps (DM) [11]
DM embeds high-dimensional data into a low-dimensional space via the eigenvectors of a suitably defined random walk on the given dataset.
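Here is a minimal Isomap sketch (k-NN graph -> geodesic distances -> classical MDS), assuming numpy, scipy, and scikit-learn; the swiss-roll data and the neighborhood size are illustrative, not the settings used in [8]:

# Minimal Isomap: geodesic distances on a k-NN graph followed by classical MDS.
import numpy as np
from scipy.sparse.csgraph import shortest_path
from sklearn.neighbors import kneighbors_graph
from sklearn.datasets import make_swiss_roll

def isomap(X, d, k=10):
    # 1) Weighted k-NN graph with Euclidean edge lengths.
    G = kneighbors_graph(X, n_neighbors=k, mode='distance')
    # 2) Geodesic distances = shortest paths on the graph (assumes the graph is connected).
    Dg = shortest_path(G, method='D', directed=False)
    # 3) Classical MDS on the geodesic distance matrix (double centering).
    n = Dg.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (Dg ** 2) @ J
    eigvals, eigvecs = np.linalg.eigh(B)
    idx = np.argsort(eigvals)[::-1][:d]
    return eigvecs[:, idx] * np.sqrt(np.maximum(eigvals[idx], 0))

X, _ = make_swiss_roll(n_samples=800, random_state=0)
X_low = isomap(X, d=2)
print(X_low.shape)   # (800, 2)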
2.4 Non-linear, local structure preserved
i. Locally Linear Embedding (LLE) [12-13]
LLE constructs a K-NN graph, computes weights that reconstruct each data point as a linear combination of its neighbors (i.e., the manifold is assumed to be locally linear), and then seeks a low-dimensional embedding that preserves these reconstruction weights as faithfully as possible (see the sketch after this list).
ii. Laplacian Eigenmaps (LE) [14]
LE constructs a K-NN graph with edge weights W and finds a low-dimensional embedding that keeps each data point close to its k nearest neighbors, which amounts to computing the bottom eigenvectors of the graph Laplacian.
iii. Hessian Eigenmaps (HE) [15]
Under the constraint that the low-dimensional representation is locally isometric, HE minimizes the "curviness" of the high-dimensional manifold when constructing the low-dimensional representation of the data.
iv. Local Tangent Space Alignment (LTSA) [16]
Based on the assumption that the manifold is locally linear, LTSA simultaneously searches for the low-dimensional representation of the data and for linear mappings from the low-dimensional data points to the local tangent space of the high-dimensional manifold.
v. Semidefinite embedding (SDE) or maximum variance unfolding (MVU) [17]
SDE builds a K-NN graph, computes an inner-product matrix that maximizes the pairwise distances between points that are not connected in the graph (while preserving the distances between connected points), and then applies MDS to obtain the low-dimensional representation of the data.
vi. Visualizing Data using t-SNE (t-SNE) [18]
t-SNE converts the pairwise distances between data points into probabilities of selecting pairs of points and matches these probabilities in the low-dimensional space, achieving particularly good visualization results.
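Below is a minimal LLE sketch, assuming numpy and scikit-learn for the neighbor search; the regularization term and parameters are illustrative choices rather than those in [12-13]:

# Minimal Locally Linear Embedding: data are random and purely illustrative.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def lle(X, d, k=10, reg=1e-3):
    n = X.shape[0]
    nbrs = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nbrs.kneighbors(X)
    idx = idx[:, 1:]                      # drop each point itself
    W = np.zeros((n, n))
    for i in range(n):
        # Solve for weights that reconstruct x_i from its neighbors.
        Z = X[idx[i]] - X[i]              # k x D local differences
        C = Z @ Z.T                       # local Gram matrix
        C += reg * np.trace(C) * np.eye(k)
        w = np.linalg.solve(C, np.ones(k))
        W[i, idx[i]] = w / w.sum()        # weights sum to one
    # Embedding: bottom eigenvectors of M = (I - W)^T (I - W), skipping the constant one.
    M = (np.eye(n) - W).T @ (np.eye(n) - W)
    eigvals, eigvecs = np.linalg.eigh(M)
    return eigvecs[:, 1:d + 1]

X = np.random.rand(400, 10)
Y = lle(X, d=2)
print(Y.shape)   # (400, 2)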
Note 1: Many of the methods introduced above can actually be classified as spectral methods. A spectral method involves computing the eigenvectors/eigenvalues of specially constructed matrices and uses them to analyze the structure of the high-dimensional data (e.g., PCA, LDA, metric MDS, ISOMAP, LLE, Kernel PCA, GDA, SDE, LE, HE, etc.).
The spectral clustering introduced in the next section also belongs to this class; it simply applies the same machinery to clustering the data.
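As an illustration of this spectral pattern (build a specially constructed matrix, then take its eigenvectors), here is a minimal Laplacian Eigenmaps sketch; it assumes numpy, scipy, and scikit-learn, and the heat-kernel weights and parameters are illustrative choices:

# Minimal Laplacian Eigenmaps: bottom eigenvectors of the graph Laplacian.
import numpy as np
from scipy.linalg import eigh
from sklearn.neighbors import kneighbors_graph

def laplacian_eigenmaps(X, d, k=10, t=1.0):
    G = kneighbors_graph(X, n_neighbors=k, mode='distance').toarray()
    W = np.where(G > 0, np.exp(-G**2 / t), 0.0)   # heat-kernel weights
    W = np.maximum(W, W.T)                        # symmetrize the graph
    D = np.diag(W.sum(axis=1))
    L = D - W                                     # graph Laplacian (the "special matrix")
    # Generalized eigenproblem L y = lambda D y; skip the trivial constant eigenvector.
    eigvals, eigvecs = eigh(L, D)
    return eigvecs[:, 1:d + 1]

X = np.random.rand(300, 15)
Y = laplacian_eigenmaps(X, d=2)
print(Y.shape)   # (300, 2)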
Tutorials and courses:
Geometric Methods and Manifold Learning
Partha Niyogi, Mikhail Belkin
Introduction to feature selection
Isabelle Guyon
Introduction to kernel methods
Bernhard Schölkopf
MIT 18.06 Linear Algebra Spring 2005
(If the concepts of linear algebra are not entirely clear to you, this online course is highly recommended.)
[1] Analysis of a complex of statistical variables into principal components, Journal of educational psychology 1933
[2] The use of multiple measurements in taxonomic problems, Annals of Eugenics 1936
[3] Independent Component Analysis: Algorithms and Applications, Neural Networks 2000
[5] Locality preserving projections, NIPS 2003
[6] Neighborhood Preserving Embedding, ICCV 2005
[8] A global geometric framework for nonlinear dimensionality reduction, Science 2000
[9] Kernel principal component analysis, ICANN 1997
[10] Generalized discriminant analysis using a kernel approach, Neural Computation 2000
[11] Diffusion maps, spectral clustering and reaction coordinates of dynamical systems, Applied and Computational Harmonic Analysis 2006
[12] Nonlinear dimensionality reduction by locally linear embedding, Science 2000
[14] Laplacian eigenmaps and spectral techniques for embedding and clustering, NIPS 2002
[15] Hessian Eigenmaps: new locally linear embedding techniques for high-dimensional data, Proceedings of the National Academy of Sciences 2003
[16] Principal Manifolds and Nonlinear Dimension Reduction via Tangent Space Alignment, SIAM Journal on Scientific Computing 2004
[19] Dimensionality Reduction: A Comparative Survey
[20] Learning with kernels, 2002
[21] Nonlinear component analysis as a kernel eigenvalue problem, Neural computation 1998
[22] An introduction to kernel-based learning algorithms, TNN 2001
3. Applications
In almost every field of scientific research (and even parts of the social sciences), dimensionality reduction is an indispensable tool. Here we discuss how linear subspace and nonlinear manifold learning methods are applied to various problems in computer vision.
3.1 Subspace as constraints: Structure from motion, Optical flow, Layer extraction, Face alignment
For some problems, we can establish in advance that the unknowns to be estimated (e.g., optical flow, the structure of a scene) lie in a linear subspace. Adding this subspace constraint to the estimation then yields more accurate results. This technique has been applied to estimating 3D structure [1], 2D optical flow [2], planar homographies [3], scene layer extraction [4], and face alignment [5] (a small rank-constraint sketch follows the references below).
[3] Multi-view subspace constraints on homographies, ICCV 1999
[4] A subspace approach to layer extraction, CVPR 2001
[5] Face Alignment through Subspace Constrained Mean-shifts, ICCV 2009
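As a toy illustration of a subspace constraint (not the specific formulations in [1-5]), the sketch below projects a noisy measurement matrix onto the nearest rank-4 matrix via truncated SVD, which is how a known low rank can be enforced during estimation; the rank and the synthetic data are purely illustrative:

# Enforce a low-rank (subspace) constraint via truncated SVD. Assumes numpy only.
import numpy as np

def enforce_rank(M, r):
    """Return the best rank-r approximation of M in the Frobenius norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]

# Synthetic measurements that truly lie in a 4-dimensional subspace, plus noise.
rng = np.random.default_rng(0)
M_true = rng.standard_normal((40, 4)) @ rng.standard_normal((4, 100))
M_noisy = M_true + 0.1 * rng.standard_normal(M_true.shape)

M_clean = enforce_rank(M_noisy, r=4)
print(np.linalg.norm(M_clean - M_true) < np.linalg.norm(M_noisy - M_true))  # typically True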
3.2 Face Recognition
Face recognition is probably the problem to which linear subspaces have been applied the most, because the variation of a face image under different lighting conditions can often be represented by a low-dimensional linear subspace [1]. Various dimensionality reduction methods, such as PCA [2], LDA [3], LPP [4], ICA [5], and subspaces obtained through optimization [6], have been proposed to improve recognition rates. The conventional wisdom has been that discriminative methods (e.g., LDA) are more effective than generative methods (e.g., PCA), but in practice this does not hold in every situation [8]. Beyond linear methods, the idea of nonlinear manifold learning has also been brought in to handle nonlinear variations [7] (an eigenfaces-style sketch follows the references below).
[1] Lambertian reflectance and linear subspaces, PAMI 2003
[2] Face recognition using eigenfaces, CVPR 1991
[4] Face recognition using laplacianfaces, PAMI 2005
[5] Recognizing faces with PCA and ICA, CVIU 2003
[6] On Optimizing Subspaces for Face Recognition, ICCV 2009
[8] PCA versus LDA, PAMI 2001
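Below is a minimal eigenfaces-style sketch (PCA followed by a nearest-neighbor classifier), assuming scikit-learn; fetch_olivetti_faces downloads a small face dataset on first use, and the number of components is an illustrative choice, not the setting in [2]:

# Minimal eigenfaces-style pipeline: PCA subspace + nearest-neighbor classification.
from sklearn.datasets import fetch_olivetti_faces
from sklearn.model_selection import train_test_split
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier

faces = fetch_olivetti_faces()                 # 400 images of 40 subjects, 64x64 pixels
X_train, X_test, y_train, y_test = train_test_split(
    faces.data, faces.target, test_size=0.25, stratify=faces.target, random_state=0)

pca = PCA(n_components=50, whiten=True).fit(X_train)   # the "eigenfaces" subspace
clf = KNeighborsClassifier(n_neighbors=1).fit(pca.transform(X_train), y_train)
print("accuracy:", clf.score(pca.transform(X_test), y_test))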
3.3 Motion segmentation
In a single scene there are often several motions occurring at the same time, and the motion segmentation problem is to separate these different motions. Subspace methods can model each motion very effectively, so the original motion segmentation problem becomes a subspace separation problem [1-4].
[2] A General Framework for Motion Segmentation: Independent, Articulated, Rigid, Non-rigid, Degenerate and Non-degenerate, ECCV 2006
[3] Motion Segmentation via Robust Subspace Separation in the Presence of Outlying, Incomplete, or Corrupted Trajectories, CVPR 2008
3.4 Lighting
Changes in lighting drastically alter the appearance of the photographed object, which makes it difficult for computer vision algorithms to capture the intrinsic appearance of the object. However, the variation caused by lighting is believed to be well represented by a linear subspace, so linear subspace methods can capture the effects of all lighting conditions, including unseen ones [1-4].
[2] From Few to Many: Illumination Cone Models for Face Recognition Under Variable Lighting and Pose, PAMI 2001
[4] Lambertian reflectance and linear subspaces, PAMI 2003
3.5 Visual tracking
Since the appearance of an object can be represented compactly by a subspace, the subspace idea can also be applied to object tracking. During tracking, however, the appearance of the object changes over time due to lighting, viewpoint, occlusion, and so on, so subspace-based trackers must consider how to update the subspace incrementally [1-2] (a small incremental-subspace sketch follows the reference below).
[1] Incremental learning for robust visual tracking, IJCV 2008
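As a rough stand-in for the incremental update schemes in [1-2], here is a minimal sketch using scikit-learn's IncrementalPCA to refresh an appearance subspace as new observations arrive; the streaming "frames" are random and purely illustrative:

# Incrementally update a low-dimensional appearance subspace from streaming data.
import numpy as np
from sklearn.decomposition import IncrementalPCA

ipca = IncrementalPCA(n_components=16)

# Pretend we receive object appearance patches (32x32, flattened) in batches over time
# and refresh the appearance subspace after each batch.
for t in range(10):
    batch = np.random.rand(64, 32 * 32)        # 64 new observations at time t
    ipca.partial_fit(batch)                    # incremental subspace update

new_frame = np.random.rand(1, 32 * 32)
code = ipca.transform(new_frame)               # low-dimensional appearance code
print(code.shape)                              # (1, 16)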
Beyond the applications above, subspace methods can also be applied to object recognition, action recognition, and many problems in computer graphics.