Orthogonally Decoupled Variational Gaussian Processes


NeurIPS 2018, December 2-8, Montreal, Canada

Authors: Hugh Salimbeni (PROWLER.io and Imperial College London), Ching-An Cheng (Georgia Institute of Technology), Byron Boots (Georgia Institute of Technology), Marc Deisenroth (PROWLER.io and Imperial College London)

Abstract: Gaussian processes provide a powerful non-parametric framework for reasoning over functions. Despite their appealing theory, their superlinear computational and memory complexities have presented a long-standing challenge. State-of-the-art sparse variational inference methods trade modeling accuracy against complexity. However, their complexities still scale superlinearly in the number of basis functions, so they can learn from large datasets only when a small model is used. Recently, a decoupled approach was proposed to remove the unnecessary coupling between the complexities of modeling the mean and the covariance functions. It achieves linear complexity in the number of mean parameters, so an expressive posterior mean function can be modeled. While promising, this approach suffers from optimization difficulties due to ill-conditioning and non-convexity. In this work, we propose an alternative decoupled parametrization. It adopts an orthogonal basis in the mean function to model the residues that cannot be learned by the standard coupled approach. Our method therefore extends, rather than replaces, the coupled approach, and achieves strictly better performance. This construction admits a straightforward natural-gradient update rule, so the structure of the information manifold that is lost during decoupling can be leveraged to speed up learning. Empirically, our algorithm demonstrates significantly faster convergence in multiple experiments.
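To make the parametrization concrete, below is a minimal NumPy sketch of the orthogonally decoupled predictive equations described above. It assumes an unwhitened sparse variational GP whose standard ("beta") inducing points carry a variational mean m_beta and covariance S_beta, plus an extra mean-only ("gamma") basis whose contribution is restricted to the component orthogonal to the beta basis. All names, the RBF kernel, and the random usage values are illustrative assumptions, not the paper's code.

import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    # Squared-exponential kernel matrix between row-wise inputs A and B.
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return variance * np.exp(-0.5 * sq / lengthscale**2)

def orth_decoupled_posterior(X, Z_beta, Z_gamma, m_beta, S_beta, a_gamma, kern=rbf_kernel):
    # Posterior mean and marginal variance at test inputs X.
    Kbb = kern(Z_beta, Z_beta) + 1e-8 * np.eye(len(Z_beta))  # jitter for stability
    Kxb = kern(X, Z_beta)
    Kxg = kern(X, Z_gamma)
    Kbg = kern(Z_beta, Z_gamma)

    A = np.linalg.solve(Kbb, Kxb.T)      # K_bb^{-1} K_bx, shape (M_beta, N)

    # Orthogonal residue basis: k_gamma(x) minus its projection onto the beta basis.
    resid = Kxg - A.T @ Kbg              # K_xg - K_xb K_bb^{-1} K_bg

    # Mean: the gamma basis adds flexibility on top of the standard coupled mean.
    mean = resid @ a_gamma + A.T @ m_beta

    # Marginal variance: identical to the standard coupled posterior,
    # since the gamma basis does not enter the covariance.
    Kxx_diag = np.diag(kern(X, X))
    var = Kxx_diag - np.sum(Kxb.T * A, axis=0) + np.sum(A * (S_beta @ A), axis=0)
    return mean, var

# Illustrative usage with random basis locations and variational parameters.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, (5, 1))
Z_beta = rng.uniform(-3, 3, (10, 1))
Z_gamma = rng.uniform(-3, 3, (50, 1))    # many more mean-only basis points
m_beta = rng.normal(size=10)
S_beta = 0.5 * np.eye(10)
a_gamma = 0.1 * rng.normal(size=50)
mu, var = orth_decoupled_posterior(X, Z_beta, Z_gamma, m_beta, S_beta, a_gamma)

With a_gamma set to zero, the expressions reduce to the standard coupled (sparse variational GP) posterior, which is the sense in which the orthogonal basis extends, rather than replaces, the coupled approach.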

Tags: Probabilistic Modelling, NeurIPS, Gaussian Processes


