Robust PCA 学习笔记(一)

2014-11-24 08:08:55 · 作者: · 浏览: 109
Robust PCA
Rachel Zhang
1. RPCA Brief Introduction
1. Why use Robust PCA
Solve the problem withspike noise with high magnitude instead of Gaussian distributed noise.
2. Main Problem
Given C = A*+B*, where A*is a sparse spike noise matrix and B* is a Low-rank matrix, aiming at recoveringB*.
B*= UΣV’, in which U∈Rn*k ,Σ∈Rk*k ,V∈Rn*k
3. Difference from PCA
Both PCA and Robust PCAaims at Matrix decomposition, However,
In PCA, M = L0+N0, L0:low rank matrix ; N0: small idd Gaussian noise matrix,it seeks the best rank-k estimation of L0 by minimizing ||M-L||2 subjectto rank(L)<=k. This problem can be solved by SVD.
In RPCA, M = L0+S0, L0:low rank matrix ; S0: a sparse spikes noise matrix, we’lltry to give the solution in the following sections.
2. Conditionsfor correct decomposition
4. Ill-posed problem:
Suppose sparse matrix A* and B*=eiejTare the solution of this decomposition problem.
1) With the assumption that B* is not only low rank but alsosparse, another valid sparse-plus low-rank decomposition might be A1= A*+ eiejT andB1 = 0, Therefore, we need an appropriatenotion of low-rank that ensures that B* is not too sparse. Conditionswill be imposed later that require the space spanned by singular vectors U andV (i.e., the row and column spaces of B*) to be “incoherent” with the standardbasis.
2) Similarly, with the assumption that A* is sparse as well aslow-rank (e.g. the first column of A* is non-zero, while all other columns are0, then A* has rank 1 and sparse). Another valid decomposition might be A2=0,B2 = A*+B* (Here rank(B2) <= rank(B*) + 1). Thus weneed the limitation that sparse matrix should beassumed not to be low rank. I.e., assume each row/column does not havetoo many non-zero entries (don’t exist dense row/column), to avoid such issues.
5. Conditions for exact recovery / decomposition:
If A* and B* are drawnfrom these classes, then we have exact recovery with high probability [1].
1) For low rank matrix L---Random orthogonal model [Candes andRecht 2008]:
A rank-k matrix B* with SVD B*=UΣV’ is constructed in this way: Thesingular vectors U,V∈Rn*k are drawnuniformly at random from the collection of rank-k partial isometries(等距算子) inRn*k. The singular vectors in U and V need not be mutuallyindependent. No restriction is placed on the singular values.
2) For sparse matrix S---Random sparsity model:
The matrix A* such that support(A*) is chosen uniformlyat random from the collection of all support sets of size m. There is noassumption made about the values of A* at locations specified by support(A*).
[Support(M)]: thelocations of the non-zero entries in M
Latest [2] improved on the conditions andyields the ‘best’ condition.
3. Recovery Algorithms
6. Formulization
For decomposition D = A+E,in which A is low rank and error E is sparse.
1) Intuitively propose
minrank(A)+γ||E||0, (1)
However, it is non-convex thus intractable (both of these 2are NP-hard to approximate).
2) Relax L0-norm to L1-norm and replace rank with nuclear norm
min||A||* + λ||E||1,
where||A||* =Σiσi(A) (2)
This is convex, i.e., exist a uniquely minimizer.
Reason: This relaxation ismotivated by observing that ||A||* + λ||E||1 is theconvex envelop (凸包) of rank(A)+γ||E||0 over the set of (A,E) suchthat max(||A||2,2,||E||1, ∞)≤1.
Moreover, there might be circumstancesunder which (2) perfectly recovers low-rank matrix A0.[3] shows itis indeed true under surprising broad conditions.
7. Approach RPCA Optimization Algorithm
We approach in twodifferent ways. 1st approach, use afirst-order method to solve the primal problem directly. (E.g. ProximalG