Introducing our AISTATS papers

An examination of our work at the intersection of AI and statistics

Our machine learning models aid fast decision-making, providing white-box solutions with quantified uncertainty. In other words, these tools not only help enterprises understand and fast-track their AI-enabled decision-making processes but also tell them how confident we are in each decision.

These tools are under constant development by our research teams, who have had four papers accepted at this year’s Artificial Intelligence and Statistics (AISTATS) conference.

The conference will take place in Palermo, Italy in early June 2020. It’s an interdisciplinary gathering focused on the intersection of AI and statistics, an area in which our research teams are making a significant contribution. Let’s look at each of the four papers in more detail, explaining their findings and their impact on the wider AI community.

Bayesian Image Classification with Deep Convolutional Gaussian Processes

Vincent Dutordoir, Mark van der Wilk, Artem Artemev, James Hensman

There is a lot of focus on Bayesian deep learning at the moment, with many researchers tackling the problem by building on top of neural networks and making the inference look more Bayesian. We take a different strategy and start from a Gaussian process, a well-understood Bayesian method, which allows us to classify images with both accuracy and calibrated uncertainty.

In this paper, we use a fully Bayesian model, built from Deep Convolutional Gaussian processes, to perform image classification. Using this approach, we achieve state-of-the-art performance in terms of both the accuracy of our predictions and their uncertainty quantification, features that other Bayesian models do not guarantee. This work is essential in certain tasks, especially when decision-making is linked to the outcomes of the prediction model. Senior machine learning researcher Vincent Dutordoir explains: “On certain image classification tasks, our Deep Convolutional Gaussian process models are on-par with neural networks in terms of accuracy, but come with all the benefits from a Bayesian method: automatic training, out-of-sample robustness, uncertainty quantification, and so on.”

AISTATS blog body graphic 1

Our approach quantifies the uncertainty when classifying these digits. While the neural network is certain we are looking at a 1 in the above image (third column from the right), our model quantifies its uncertainty, accounting for the possibility of a 7. The digit is, in fact, a 7, a possibility the neural network had dismissed.
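
The difference between the two behaviours can be illustrated numerically. The NumPy sketch below (our own toy illustration with made-up logits, not the paper's model) contrasts a point estimate, which pushes the mean latent values straight through a softmax, with a Bayesian model average, which samples from a posterior over the latent function and averages the resulting class probabilities. The averaged prediction has higher entropy, i.e. it admits more doubt about the ambiguous digit:

```python
import numpy as np

def softmax(logits):
    # numerically stable softmax over the last axis
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def entropy(p):
    return -(p * np.log(p)).sum()

rng = np.random.default_rng(0)

# Hypothetical posterior over the latent function values for one test image:
# the mean slightly favours class "1" over class "7", but with high variance.
mean_logits = np.array([2.0, 1.5])   # classes: [1, 7]
std_logits = np.array([1.5, 1.5])

# Point estimate: plug the mean logits straight into the softmax
# (effectively what a deterministic network does).
p_point = softmax(mean_logits)

# Bayesian model average: sample latent logits from the (approximate)
# posterior and average the resulting class probabilities.
samples = rng.normal(mean_logits, std_logits, size=(10_000, 2))
p_bayes = softmax(samples).mean(axis=0)

print("point estimate:", p_point, "entropy:", entropy(p_point))
print("model average: ", p_bayes, "entropy:", entropy(p_bayes))
```

The averaged probabilities still favour the same class, but less confidently, which is exactly the behaviour shown in the figure above.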

Uncertainty in Neural Networks: Approximately Bayesian Ensembling

Tim Pearce, Felix Leibfried, Alexandra Brintrup, Mohamed Zaki, Andy Neely

Scaling Bayesian inference to high-dimensional input spaces is a hot, yet currently unresolved, research topic. While neural network techniques do scale, their uncertainty predictions are either unreliable or non-Bayesian. Our approach achieves both scalability and principled uncertainty while also providing a straightforward implementation that does not require any additional computational overhead.

This paper proposes a modification to neural networks (whose uncertainty predictions are unprincipled) to encourage uncertainty estimates to follow Bayes’ rule. In other words, the unprincipled uncertainty estimates associated with neural networks are transformed into approximately Bayesian uncertainty estimates.
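
The core idea is to regularise each ensemble member towards its own random "anchor" drawn from the prior, rather than towards zero as in standard weight decay. The NumPy sketch below (our own minimal illustration, not the paper's code) applies this to a toy Bayesian linear regression, where each member's fit has a closed form. The scheme is approximately Bayesian: the ensemble mean matches the analytic posterior mean, while the ensemble spread only approximates the posterior covariance.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy linear-Gaussian model: y = X w + noise, with prior w ~ N(0, sigma_p^2 I).
n, d = 50, 3
sigma_n, sigma_p = 0.1, 1.0
w_true = rng.normal(0.0, sigma_p, size=d)
X = rng.normal(size=(n, d))
y = X @ w_true + rng.normal(0.0, sigma_n, size=n)

A = X.T @ X / sigma_n**2 + np.eye(d) / sigma_p**2  # posterior precision

def anchored_member(anchor):
    """Minimise ||y - Xw||^2 / sigma_n^2 + ||w - anchor||^2 / sigma_p^2.

    Each ensemble member is pulled towards its own anchor drawn from the
    prior, instead of towards zero. For this linear model the minimiser
    is available in closed form.
    """
    b = X.T @ y / sigma_n**2 + anchor / sigma_p**2
    return np.linalg.solve(A, b)

# Draw one anchor per member from the prior and fit the whole ensemble.
anchors = rng.normal(0.0, sigma_p, size=(2000, d))
ensemble = np.array([anchored_member(a) for a in anchors])

# The ensemble mean should agree with the analytic posterior mean.
posterior_mean = np.linalg.solve(A, X.T @ y / sigma_n**2)
print("ensemble mean: ", ensemble.mean(axis=0))
print("posterior mean:", posterior_mean)
```

In a real neural network there is no closed form, so each member is simply trained by gradient descent on the same anchored loss; the implementation cost over a plain ensemble is negligible, which is the paper's practical appeal.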

Senior machine learning researcher Felix Leibfried explains: “A scalable, yet easy-to-implement, method for uncertainty quantification is of importance for many real-world applications that need to behave sensibly when facing novel inputs that are different from those faced during training.”

AISTATS blog body graphic 2

Predictive distributions produced by various inference methods with ReLU activation functions in single-layer neural networks on a toy regression task. Our method (right) obtains results close to the ground truth (left) when compared to other scalable techniques such as variational inference on network weights and Monte Carlo dropout.

Doubly Sparse Variational Gaussian Processes

Vincent Adam, Stefanos Eleftheriadis, Artem Artemev, Nicolas Durrande, James Hensman

Gaussian Processes (GPs) are widely recognised for their versatility. In this paper, we demonstrate how to combine two previously separate approaches to scaling inference and learning in GPs: the sparse variational approach and state-space model-based approaches. We demonstrate the efficiency of this combination by applying our method to large datasets.

As a result, our approach is faster than classic sparse GP and state-space model approaches and can be used for a greater range of applications, including gradient-based learning using valid objectives. We can also use the compositionality of variational inference to build more complex models with ease. 
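
To give a flavour of the sparse variational half of this combination, the NumPy sketch below computes the predictive mean of a standard sparse variational GP regression, in which a handful of inducing inputs summarise a much larger dataset. This is our own toy illustration of the classic sparse ingredient only; it does not cover the state-space half or the doubly sparse method itself.

```python
import numpy as np

def rbf(a, b, lengthscale=1.0, variance=1.0):
    # squared-exponential kernel for 1-D inputs
    return variance * np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / lengthscale**2)

rng = np.random.default_rng(2)

# Toy 1-D regression: 500 noisy observations of a sine wave.
X = rng.uniform(0.0, 10.0, size=500)
y = np.sin(X) + rng.normal(0.0, 0.1, size=500)
sigma2 = 0.1**2  # noise variance, assumed known for this sketch

# 20 inducing inputs stand in for the 500 data points.
Z = np.linspace(0.0, 10.0, 20)
Kzz = rbf(Z, Z) + 1e-6 * np.eye(len(Z))  # jitter for numerical stability
Kzx = rbf(Z, X)

# Predictive mean under the optimal variational distribution
# (Titsias-style sparse variational GP regression):
#   mean(Xs) = Ksz (sigma2 * Kzz + Kzx Kzx^T)^{-1} Kzx y
Xs = np.linspace(0.0, 10.0, 50)
Ksz = rbf(Xs, Z)
mean = Ksz @ np.linalg.solve(sigma2 * Kzz + Kzx @ Kzx.T, Kzx @ y)

print("max abs error vs sin:", np.max(np.abs(mean - np.sin(Xs))))
```

The linear solve involves only a 20×20 matrix, which is what makes the sparse approach scale; the paper's contribution is to combine this with state-space structure so that the remaining cost in the number of data points also drops.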

Senior machine learning researcher Vincent Adam explains: “This method is elegant, defining a task and formal objective clearly. The efficiency of the algorithm relies on statistical properties of the model and the quality of the approximation is easily understood and controlled. GPs are beautiful objects that allow us to quantify uncertainty in interpretable models (unlike the main trend in deep learning). Developing better approximations and faster models is an exciting area to work in and is readily useful in the real world.”

AISTATS blog body graphic 3

Predictions of our model (blue) given data points (orange), supported by a much smaller number of inducing states that summarize the function's location and curvature (black curves).

Enriched Mixtures of Gaussian Process Experts

Charles W.L. Gadd, Sara Wade, Alexis Boukouvalas

In this paper, we present a novel approach to scaling up Gaussian process models to more complex problems. This is an ongoing area of research, and innovation here allows for the wider application of Gaussian process models.

We use a divide-and-conquer approach to break the problem into a set of simpler sub-problems, each handled by its own Gaussian process. When we put everything back together, this creates a novel mixture model that can scale to higher-dimensional problems than previous approaches.
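
As a deliberately simplistic illustration of the divide-and-conquer idea, the NumPy sketch below hard-partitions a non-stationary toy dataset at a known threshold and fits an independent exact GP to each part. The paper instead learns the partition probabilistically within an enriched mixture, but the payoff is the same: no single stationary kernel has to explain both regimes at once.

```python
import numpy as np

def rbf(a, b, lengthscale=0.3):
    # squared-exponential kernel for 1-D inputs
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / lengthscale**2)

def gp_predict(Xtr, ytr, Xte, noise_var=0.01):
    # posterior mean of exact GP regression with an RBF kernel
    K = rbf(Xtr, Xtr) + noise_var * np.eye(len(Xtr))
    return rbf(Xte, Xtr) @ np.linalg.solve(K, ytr)

rng = np.random.default_rng(3)

# Non-stationary toy data: fast oscillation on the left, slow trend on the right.
X = rng.uniform(0.0, 10.0, size=200)
y = np.where(X < 5.0, np.sin(3.0 * X), 0.2 * X) + rng.normal(0.0, 0.05, size=200)

# Divide: hard-assign points to two experts by input region (a known
# threshold here; the paper learns the partition instead).
Xs = np.linspace(0.2, 9.8, 100)
pred = np.empty_like(Xs)
for lo, hi in [(0.0, 5.0), (5.0, 10.0)]:
    train = (X >= lo) & (X < hi)
    test = (Xs >= lo) & (Xs < hi)
    # Conquer: each expert is an independent GP fitted only to its region.
    pred[test] = gp_predict(X[train], y[train], Xs[test])

truth = np.where(Xs < 5.0, np.sin(3.0 * Xs), 0.2 * Xs)
print("max abs error:", np.max(np.abs(pred - truth)))
```

A single GP with one stationary lengthscale would have to compromise between the two regimes; letting each expert pick its own behaviour is what makes the mixture both parsimonious and flexible.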

Our head of the ML engineering line, Alexis Boukouvalas, explains: “Modelling complex non-stationary problems in high dimensions has proven challenging using Gaussian processes. This paper proposes a divide-and-conquer approach that allows us to learn a parsimonious yet flexible and powerful model that can predict well.”


The graphs below show a naive implementation of the mixture-of-experts model (top) and our approach (bottom). The colours refer to the different experts used to explain the data: in the top plot, many experts are needed and the partitioning is not interpretable, while in the bottom plot our approach partitions the data using only two experts in a highly interpretable and meaningful fashion.

We’re not afraid to tackle the problems other machine learning methods find difficult. If you’d like to find out more about joining our research teams, please click here.

Join us to make AI that will change the world

join our team