Kernel methods are widely employed in machine learning: by implicitly mapping data into high-dimensional spaces, they enable the discovery of complex patterns that may be difficult to capture in the original feature space.
Although many classification and regression problems can be successfully tackled with a single kernel, real-world datasets often exhibit diverse structures, and employing several kernel types, one for each notion of similarity we aim to take into account, may be necessary.
This is where multi-kernel learning (MKL) comes into play.
This paper revisits multi-kernel classification with a specific focus on kernel selection, in light of recent developments in stochastic variational inference (SVI). In the framework of kernelized logistic regression, we consider positive semi-definite linear combinations of kernels and treat the kernel weights as random variables. Suitable choices of prior distributions naturally give rise to a Lasso penalty, while the power of SVI allows us to estimate the model and variational parameters in a fully differentiable setting and to build confidence intervals for the kernel weights. Numerical examples illustrate our approach.
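The abstract leaves the model implicit; a plausible formulation (our own notation: base kernels K_m, weights w_m, prior rate lambda, not necessarily the authors' exact construction) is

```latex
f(x) \;=\; \sum_{i=1}^{n} \alpha_i \, K_w(x_i, x) + b,
\qquad
K_w \;=\; \sum_{m=1}^{M} w_m K_m, \quad w_m \ge 0,
\qquad
y_i \mid f \;\sim\; \mathrm{Bernoulli}\bigl(\sigma(f(x_i))\bigr).
```

Since each K_m is positive semi-definite and the weights are non-negative, the combination K_w is positive semi-definite as well. Under an Exponential (one-sided Laplace) prior p(w_m) proportional to exp(-lambda * w_m), the term -E_q[log p(w)] in the negative ELBO equals lambda * sum_m E_q[w_m] up to a constant, which is exactly a Lasso penalty on the kernel weights.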
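As a minimal, self-contained sketch of how SVI could fit such a model (not the authors' implementation: the log-normal variational posterior, the Exponential prior rate `lam`, the toy data, and all variable names are our assumptions), consider:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Toy data: labels depend nonlinearly on the inputs.
n, lam = 80, 1.0                        # sample size, prior rate (Lasso strength)
X = torch.randn(n, 2)
y = (X[:, 0] * X[:, 1] > 0).float()     # labels in {0, 1}

def rbf(X, gamma):
    """RBF kernel matrix with bandwidth gamma (assumed base kernel)."""
    return torch.exp(-gamma * torch.cdist(X, X) ** 2)

kernels = torch.stack([rbf(X, 0.5), rbf(X, 5.0)])  # (M, n, n), each PSD
M = kernels.shape[0]

# Variational parameters: log-normal posterior on each weight w_m > 0,
# point estimates for the dual coefficients alpha and the bias b.
mu = torch.zeros(M, requires_grad=True)            # mean of log w
rho = torch.full((M,), -2.0, requires_grad=True)   # softplus(rho) = std of log w
alpha = torch.zeros(n, requires_grad=True)
b = torch.zeros(1, requires_grad=True)

opt = torch.optim.Adam([mu, rho, alpha, b], lr=0.05)
for step in range(2000):
    opt.zero_grad()
    sigma = F.softplus(rho)
    eps = torch.randn(M)
    w = torch.exp(mu + sigma * eps)                # reparameterized sample, w > 0
    K = torch.einsum('m,mij->ij', w, kernels)      # PSD combination of kernels
    logits = K @ alpha + b
    nll = F.binary_cross_entropy_with_logits(logits, y, reduction='sum')
    lasso = lam * w.sum()                          # -log p(w) under Exponential prior
    entropy = (mu + torch.log(sigma)).sum()        # log-normal entropy, up to const.
    loss = nll + lasso - entropy                   # negative ELBO (one MC sample)
    loss.backward()
    opt.step()

# Approximate 95% intervals for the weights from the log-normal quantiles.
sigma = F.softplus(rho)
print("posterior medians:", torch.exp(mu).detach())
print("95% intervals:", torch.stack([torch.exp(mu - 1.96 * sigma),
                                     torch.exp(mu + 1.96 * sigma)]).detach())
```

The reparameterized sample `w = exp(mu + sigma * eps)` keeps the weights positive while letting gradients flow through `mu` and `rho`, which is what makes the whole pipeline differentiable; the quantiles of the fitted log-normal then yield the interval estimates for the kernel weights.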