
Randomized SVD

Randomized SVD is a really cute technique for computing an approximate SVD of a large matrix. It proceeds as follows. Let A be the large m x n matrix, and let Q be an m x k matrix (k << n) with orthonormal columns, so that Q^T Q = I. If the columns of Q approximately span the range of A, then A can be approximated as

A ≈ Q Q^T A.

The first step is to compute the SVD of B = Q^T A. Since B is only k x n, it is much smaller than A, and its SVD can be computed using standard methods: B = S Σ V^T. Plugging this back in gives

A ≈ Q S Σ V^T,

which is the (approximate) SVD of A with U = Q S.

The challenge here is constructing the matrix Q efficiently. This is done as follows:

1. Draw a Gaussian random matrix Ω of size n x k.
2. Compute the m x k matrix Y = A Ω.
3. Compute the (economy) QR decomposition of Y to get the orthonormal matrix Q.

The MATLAB code is as follows:

O = randn(n, k);                 % Gaussian test matrix Omega
Y = A * O;                       % sample the range of A
[Q, ~] = qr(Y, 0);               % economy QR: Q is m x k with orthonormal columns
B = Q' * A;                      % small k x n matrix
[S, Sigma, V] = svd(B, 'econ');  % SVD of the small matrix
U = Q * S;                       % left singular vectors of A
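To make this concrete, here is a minimal sketch of the same steps wrapped into a function, plus a quick check on a synthetic low-rank matrix. The function name randomized_svd and the test sizes are my own illustration, not part of the original post:

function [U, Sigma, V] = randomized_svd(A, k)
    % Approximate rank-k SVD of A using the randomized scheme above.
    [~, n] = size(A);
    O = randn(n, k);                 % Gaussian test matrix
    Y = A * O;                       % sample the range of A
    [Q, ~] = qr(Y, 0);               % orthonormal basis for the sampled range
    B = Q' * A;                      % small k x n matrix
    [S, Sigma, V] = svd(B, 'econ');
    U = Q * S;
end

% Example: a 2000 x 500 matrix of exact rank 20.
A = randn(2000, 20) * randn(20, 500);
[U, Sigma, V] = randomized_svd(A, 20);
relative_error = norm(A - U * Sigma * V', 'fro') / norm(A, 'fro')

Since this test matrix has exact rank 20, the relative error should come out near machine precision; for general matrices the result is only an approximation to the best rank-k SVD.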

Random Projections

Random projections are one of the most beautiful ideas I came across recently while looking for something that works well for dimensionality reduction. The technique is very simple yet works beautifully. Given a data matrix X of size d x N (N points in d dimensions), reduce it by multiplying with a random matrix R of size k x d:

Y = R X,  so that Y is k x N.

The core idea behind the entire field is the Johnson-Lindenstrauss lemma, which says that points in a high dimensional space can be mapped into a much lower dimensional space in a way that approximately preserves pairwise distances, and hence the original structure, with high probability. The matrix R is the random projection that maps the data into a k dimensional subspace. The initial result showed that R can be generated as a Gaussian random matrix consisting of k orthogonal random vectors. Later, it was shown that the Gaussian distribution can be replaced with much simpler distributions, such as entries that are +1 or -1 each with probability 1/2, or even entries that are 0, +1 or -1 with probability 2/3, 1/6 and 1/6 respectively. This translates the matrix multiplication into simple additions and subtractions, which makes the projection very cheap to compute.
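Here is a minimal MATLAB sketch of both variants. The concrete sizes, and the 1/sqrt(k) and sqrt(3) scalings (the usual normalization so that squared distances are preserved in expectation), are my assumptions rather than details stated above:

d = 1000; N = 500; k = 50;
X = randn(d, N);                 % original data: N points in d dimensions

% Gaussian random projection.
R = randn(k, d) / sqrt(k);       % scaling is an assumed convention, not from the post
Y = R * X;                       % projected data: N points in k dimensions

% Sparse variant: entries 0, +1, -1 with probability 2/3, 1/6, 1/6.
U = rand(k, d);
R2 = zeros(k, d);
R2(U < 1/6) = 1;
R2(U > 5/6) = -1;
R2 = sqrt(3 / k) * R2;           % assumed scaling for distance preservation
Y2 = R2 * X;

% Pairwise distance between the first two points, before and after projection.
norm(X(:, 1) - X(:, 2)), norm(Y(:, 1) - Y(:, 2)), norm(Y2(:, 1) - Y2(:, 2))

The two projected distances should land close to the original one, which is the Johnson-Lindenstrauss guarantee in action.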