Note: answers are bolded
In the AdaBoost algorithm, if the final hypothesis makes no mistakes on the training data, which of the following is correct?
- The individual weak learners also make zero error on the training data.
- Additional rounds of training always lead to worse performance on unseen data.
- **Additional rounds of training can help reduce the errors made on unseen data.**
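A small illustration of this point, on invented 1-D toy data with decision stumps as the weak learners: the boosted ensemble reaches zero training error even though every individual stump still misclassifies some points, which is why the first option is wrong and the last is right. This is a minimal sketch, not a production AdaBoost.

```python
import numpy as np

def stump_predict(X, feat, thresh, sign):
    # decision stump: sign * (+1 if x[feat] <= thresh else -1)
    return sign * np.where(X[:, feat] <= thresh, 1, -1)

def best_stump(X, y, w):
    # exhaustive search for the stump with minimum weighted error
    best = None
    for feat in range(X.shape[1]):
        for thresh in np.unique(X[:, feat]):
            for sign in (1, -1):
                err = w[stump_predict(X, feat, thresh, sign) != y].sum()
                if best is None or err < best[0]:
                    best = (err, feat, thresh, sign)
    return best

def adaboost(X, y, rounds):
    n = len(y)
    w = np.full(n, 1.0 / n)                  # uniform initial weights
    ensemble = []
    for _ in range(rounds):
        err, feat, thresh, sign = best_stump(X, y, w)
        alpha = 0.5 * np.log((1 - err) / max(err, 1e-12))
        pred = stump_predict(X, feat, thresh, sign)
        w *= np.exp(-alpha * y * pred)       # up-weight the mistakes
        w /= w.sum()
        ensemble.append((alpha, feat, thresh, sign))
    return ensemble

def ensemble_score(ensemble, X):
    return sum(a * stump_predict(X, f, t, s) for a, f, t, s in ensemble)

# 1-D toy data with alternating labels: no single stump is perfect
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1, -1, 1, -1])
ens = adaboost(X, y, rounds=30)
F = ensemble_score(ens, X)
print("ensemble training mistakes:", int((np.sign(F) != y).sum()))   # -> 0
print("mistakes of each weak learner:",
      [int((stump_predict(X, f, t, s) != y).sum()) for _, f, t, s in ens[:3]])
```

The final hypothesis is perfect on the training set, yet each stump errs on at least one point; meanwhile extra rounds keep increasing the margins of the combined score F, which is the standard explanation for improved performance on unseen data.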
True or False: You are given a set of examples that are linearly inseparable over an original feature set X. The classifier trained on this set of examples using a blown-up feature space Φ(X) always performs worse than the one trained using the kernel-based method. **False** — the two approaches learn the same hypothesis; the kernel trick changes the cost of the computation, not the resulting classifier.
Given a pre-existing kernel k(x, x'), which of the following is not guaranteed to be a valid kernel?
- k'(x,x') = exp(c*k(x,x')), where c is a constant
- k'(x,x') = log(x)k(x,x')log(x')
- k'(x,x') = (k(x,x'))^2
- **k'(x,x') = k(x,x') + xᵀAx', where A is an upper triangular matrix**
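A quick numerical sanity check (a sketch using made-up sample points): a valid kernel's Gram matrix must be symmetric positive semi-definite. With an upper triangular A, the term xᵀAx' need not even be symmetric, so the sum is not guaranteed to be a kernel, whereas squaring a valid kernel preserves PSD-ness (it is the Hadamard product of a PSD matrix with itself).

```python
import numpy as np

rng = np.random.default_rng(0)
Xs = rng.normal(size=(6, 2))                 # a handful of sample points

def gram(k, Xs):
    n = len(Xs)
    return np.array([[k(Xs[i], Xs[j]) for j in range(n)] for i in range(n)])

base = lambda x, z: x @ z                    # linear kernel, known to be valid

# candidate: k(x,x') + x^T A x' with A upper triangular (not symmetric PSD)
A = np.array([[0.0, 1.0], [0.0, 0.0]])
cand = lambda x, z: base(x, z) + x @ A @ z

G = gram(cand, Xs)
print("candidate Gram symmetric:", np.allclose(G, G.T))      # False -> not a kernel

# contrast: squaring a valid kernel keeps the Gram matrix PSD
G2 = gram(lambda x, z: base(x, z) ** 2, Xs)
print("min eigenvalue of (k)^2 Gram:", np.linalg.eigvalsh(G2).min())
```

Checking symmetry and the smallest eigenvalue of the Gram matrix on sample points cannot prove a candidate is a kernel, but a single failure like the one above is enough to show it is not guaranteed to be one.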
What does the generalization ability (i.e., the mistake bound) of using a kernel method for the Perceptron depend on?
- The size of the original feature space
- **The size of the corresponding blown-up feature space**
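The dual form of the algorithm makes the point concrete (a sketch on the classic XOR labeling, data invented for illustration): the kernel perceptron never materializes the blown-up feature vector, yet its mistake bound (R/γ)² is still governed by the radius R and margin γ measured in that blown-up space. For XOR with the quadratic kernel (1 + x·z)², R² = 9 and γ² = 2, so the bound allows at most 4 mistakes.

```python
import numpy as np

def poly2(x, z):
    return (1.0 + x @ z) ** 2        # implicit 6-dimensional blown-up space

def kernel_perceptron(X, y, kernel, epochs=10):
    # dual form: the weight vector exists only implicitly, as a sum of
    # kernel evaluations against previously misclassified examples
    alpha = np.zeros(len(y))         # per-example mistake counts
    mistakes = 0
    for _ in range(epochs):
        for i in range(len(y)):
            score = sum(alpha[j] * y[j] * kernel(X[j], X[i]) for j in range(len(y)))
            if y[i] * score <= 0:
                alpha[i] += 1
                mistakes += 1
    return alpha, mistakes

# XOR: linearly inseparable in the original 2-d space, but separable
# in the blown-up space via the x1*x2 coordinate
X = np.array([[1.0, 1.0], [-1.0, -1.0], [1.0, -1.0], [-1.0, 1.0]])
y = np.array([1, 1, -1, -1])
alpha, mistakes = kernel_perceptron(X, y, poly2)
scores = np.array([sum(alpha[j] * y[j] * poly2(X[j], X[i]) for j in range(len(y)))
                   for i in range(len(y))])
print("total mistakes:", mistakes)   # within the (R/gamma)^2 = 4.5 bound
print("all training points correct:", bool(np.all(y * scores > 0)))
```

Kernelizing saves the cost of computing Φ(x) explicitly, but the mistake bound is still determined by the geometry of the blown-up space, not by the original feature set.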
Which of the following classes of functions will show a significant improvement in accuracy upon using the Kernel Perceptron with a polynomial kernel instead of the regular Perceptron algorithm?
- l-of-m-of-n class of functions
- **Class of functions where only the positive examples are enclosed by an ellipse**
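A sketch of why, on toy data invented for illustration: with positives enclosed by a circle (a special case of an ellipse) and negatives outside it, no linear separator exists, so the regular perceptron keeps making mistakes no matter how long it runs; the degree-2 polynomial kernel makes the data separable in the blown-up space and the dual perceptron converges. (An l-of-m-of-n function, by contrast, is a linear threshold function, so the regular perceptron already handles it.)

```python
import numpy as np

# positives inside a circle around the origin, negatives outside it
X = np.array([[0.0, 0.0], [0.5, 0.0], [0.0, 0.5], [-0.5, 0.0], [0.0, -0.5],
              [2.0, 0.0], [0.0, 2.0], [-2.0, 0.0], [0.0, -2.0], [1.5, 1.5]])
y = np.array([1, 1, 1, 1, 1, -1, -1, -1, -1, -1])

def final_epoch_mistakes(X, y, kernel, epochs=200):
    # dual-form perceptron; returns the number of mistakes in the last epoch
    alpha = np.zeros(len(y))
    for _ in range(epochs):
        last = 0
        for i in range(len(y)):
            score = sum(alpha[j] * y[j] * kernel(X[j], X[i]) for j in range(len(y)))
            if y[i] * score <= 0:
                alpha[i] += 1
                last += 1
    return last

linear = lambda x, z: x @ z
poly2 = lambda x, z: (1.0 + x @ z) ** 2

print("linear kernel, final-epoch mistakes:", final_epoch_mistakes(X, y, linear))  # never reaches 0
print("poly-2 kernel, final-epoch mistakes:", final_epoch_mistakes(X, y, poly2))   # converges to 0
```

With the linear kernel the point at the origin alone guarantees a mistake in every epoch (its score is always 0), while in the blown-up space the surface 1 − x₁² − x₂² = 0 separates the classes with a healthy margin, so the kernel perceptron stops updating.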