Kernel Methods
One of the popular methods for handling non-linear data with linear models is the kernel method. A kernel function implicitly maps the input data into a higher-dimensional space. Note that the model remains linear in its parameters; it is only the data that is mapped to a higher-dimensional space.
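As a quick illustration (a Python sketch with made-up numbers, not from the original example), the degree-2 polynomial kernel k(x, z) = (x·z)^2 evaluated in the input space equals an ordinary dot product after mapping each 2-D point through the explicit feature map phi(v) = (v1^2, sqrt(2)·v1·v2, v2^2):

```python
import numpy as np

def phi(v):
    # Explicit feature map for the degree-2 polynomial kernel on 2-D input
    return np.array([v[0]**2, np.sqrt(2) * v[0] * v[1], v[1]**2])

x = np.array([1.0, 2.0])
z = np.array([3.0, 0.5])

k_direct = (x @ z) ** 2        # kernel evaluated in the input space
k_mapped = phi(x) @ phi(z)     # dot product in the mapped feature space

print(np.isclose(k_direct, k_mapped))  # the two agree
```

The model never needs to form phi explicitly; the kernel evaluates the higher-dimensional inner product directly, which is what makes the trick cheap.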
Let's consider the example of linear regression. The least squares method fits a linear function to the given data by minimizing the sum of squared errors over all points. It has a closed-form solution given by:
w = inv(X'*X)*X'*y

This solution works perfectly well when the relationship in the data is linear, as shown by the following example.
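For readers without MATLAB, the same normal-equations solution can be sketched in Python/NumPy (the toy data here is made up for illustration):

```python
import numpy as np

# Toy linear data: y = 1 + 2x, with a bias column in the design matrix
x = np.linspace(0, 10, 20)
X = np.column_stack([np.ones_like(x), x])   # design matrix with intercept
y = 1.0 + 2.0 * x

# Closed-form least squares: w = (X'X)^{-1} X'y
# (np.linalg.solve is preferred over an explicit inverse for numerical stability)
w = np.linalg.solve(X.T @ X, X.T @ y)

print(w)  # recovers the true coefficients [1., 2.]
```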
However, consider the following example in which the relationship in the data is non-linear. The linear model fits a straight line, which fails to capture the real relationship in the data.
The least squares model can be reformulated in a dual representation, leading to a kernel form of the problem (see Bishop, Chapter 6, for the complete derivation):
y(xn) = k(xn, x)' * inv(K + lambda*I) * t

We can write the MATLAB code as follows:
sig = 0.0001;

% Build the RBF kernel matrix over the training points
K = zeros(n,n);
for i = 1:n
    K(i,:) = exp(-0.5*sig*(x(i) - x).^2)';
end

% Compute the dual coefficients
lambda = 1;
alpha = (K + lambda*eye(n))\y;

% Predict on a grid over the range of x
xr = [min(x) max(x)];
xp = xr(1):1:xr(2);   % points at which to predict
m = length(xp);
yp = zeros(m,1);
for i = 1:m
    % kernel between xp(i) and the training points (same form as in training)
    ki = exp(-0.5*sig*(xp(i) - x).^2)';
    yp(i) = ki*alpha;
end
plot(xp, yp, 'g', 'LineWidth', 2);

Finally, the figure below shows both the linear regression and the kernel regression. As you can see, the kernel regression is able to fit a non-linear function to the data.
The Nadaraya-Watson model is another popular kernel regression method. Its main idea is to predict the value at a point as a similarity-weighted average over its neighbourhood. It is defined as follows:
hi = K(x, xi)' * y / sum(K(x, xi))

Here, K(x, xi) represents the similarity of the point xi to all the points x. Effectively, we predict y at a new point by taking a weighted sum of the y values in the neighbourhood, giving higher weights to points that are more similar to xi. In this sense it resembles k-nearest neighbours (KNN). Using this model, we generate the following figure:
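A minimal sketch of this estimator in Python/NumPy (the Gaussian kernel and the bandwidth sig are assumed choices for illustration, not taken from the original post):

```python
import numpy as np

def nadaraya_watson(x_train, y_train, x_query, sig=1.0):
    # Gaussian kernel weights between the query point and every training point
    w = np.exp(-0.5 * (x_query - x_train) ** 2 / sig**2)
    # Prediction is the similarity-weighted average of the training targets
    return np.sum(w * y_train) / np.sum(w)

# Toy data: noiseless sine samples
x = np.linspace(0, 2 * np.pi, 50)
y = np.sin(x)

pred = nadaraya_watson(x, y, x_query=np.pi / 2, sig=0.3)
```

Near the peak of the sine the smoothed prediction sits just below 1, since the weighted average pulls in the lower neighbouring values.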