List of research papers

  • Many-Body Coarse-Grained Interactions Using Gaussian Approximation Potentials

    Presented 20 February 2018, University of Manchester

    Authors: S.T. John, Gábor Csányi (University of Cambridge)

    The Journal of Physical Chemistry B (J. Phys. Chem. B 121, 48, 10934–10949)

    Abstract: We introduce a computational framework that is able to describe general many-body coarse-grained (CG) interactions of molecules and use it to model the free energy surface of molecular liquids as a cluster expansion in terms of monomer, dimer, and trimer terms. The contributions to the free energy due to these terms are inferred from all-atom molecular dynamics (MD) data using Gaussian Approximation Potentials, a type of machine-learning model that employs Gaussian process regression. The resulting CG model is much more accurate than those possible using pair potentials. Though slower than the latter, our model can still be faster than all-atom simulations for solvent-free CG models commonly used in biomolecular simulations.

    Gaussian Processes, High-Dimensional Representation, Probabilistic Modelling
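
    The workhorse here is Gaussian process regression: each CG free-energy term is the posterior mean of a GP conditioned on all-atom MD data. As a generic, self-contained sketch of GP regression with an RBF kernel (the toy 1-D data, kernel, and hyperparameters are illustrative stand-ins, not the paper's GAP descriptors), in plain NumPy:

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=1.0, variance=1.0):
    # Squared-exponential kernel k(x, x') = s^2 exp(-(x - x')^2 / (2 l^2))
    d2 = (X1[:, None] - X2[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def gp_posterior(X_train, y_train, X_test, noise=1e-2):
    # Standard GP regression equations via a Cholesky factorisation.
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_s = rbf_kernel(X_train, X_test)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.diag(rbf_kernel(X_test, X_test)) - np.sum(v**2, axis=0)
    return mean, var

# Toy "free-energy curve" inferred from noisy observations.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, 30)
y = np.sin(X) + 0.05 * rng.standard_normal(30)
Xs = np.linspace(-3, 3, 50)
mu, var = gp_posterior(X, y, Xs)
```

    The predictive variance returned alongside the mean is what makes GP-based models probabilistic rather than mere interpolators.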

  • DeepCoder: Semi-parametric Variational Autoencoders for Automatic Facial Action Coding

    October 2017

    The IEEE International Conference on Computer Vision (ICCV)

    Authors: Dieu Linh Tran (Imperial College London), Robert Walecki (Imperial College London), Ognjen Rudovic (MIT), Stefanos Eleftheriadis, Björn Schuller (Imperial College London), Maja Pantic (Imperial College London).

    Abstract: The human face exhibits an inherent hierarchy in its representations (i.e., holistic facial expressions can be encoded via a set of facial action units (AUs) and their intensity). Variational (deep) auto-encoders (VAEs) have shown great results in unsupervised extraction of hierarchical latent representations from large amounts of image data, while being robust to noise and other undesired artifacts. Potentially, this makes VAEs a suitable approach for learning facial features for AU intensity estimation. Yet, most existing VAE-based methods apply classifiers learned separately from the encoded features. By contrast, non-parametric (probabilistic) approaches, such as Gaussian Processes (GPs), typically outperform their parametric counterparts, but cannot deal easily with large amounts of data. To this end, we propose a novel VAE semi-parametric modeling framework, named DeepCoder, which combines the modeling power of parametric (convolutional) and non-parametric (ordinal GPs) VAEs, for joint learning of (1) latent representations at multiple levels in a task hierarchy, and (2) classification of multiple ordinal outputs. We show on benchmark datasets for AU intensity estimation that the proposed DeepCoder outperforms the state-of-the-art approaches, and related VAEs and deep learning models.

    Deep Learning, Gaussian Processes, Probabilistic Modelling

  • Gaussian Process Domain Experts for Modeling of Facial Affect

    Published 28 June 2017

    IEEE Transactions on Image Processing

    Authors: Stefanos Eleftheriadis, Ognjen Rudovic (MIT Media Lab), Marc Peter Deisenroth (Imperial College London), Maja Pantic (Imperial College London).

    Abstract: Most existing models for facial behavior analysis rely on generic classifiers, which fail to generalize well to previously unseen data. This is because of inherent differences between source (training) and target (test) data, mainly caused by variation in subjects’ facial morphology, camera views, and so on. All of these account for different contexts in which target and source data are recorded, and thus, may adversely affect the performance of the models learned solely from source data. In this paper, we exploit the notion of domain adaptation and propose a data efficient approach to adapt already learned classifiers to new unseen contexts. Specifically, we build upon the probabilistic framework of Gaussian processes (GPs), and introduce domain-specific GP experts (e.g., for each subject). The model adaptation is facilitated in a probabilistic fashion, by conditioning the target expert on the predictions from multiple source experts. We further exploit the predictive variance of each expert to define an optimal weighting during inference. We evaluate the proposed model on three publicly available data sets for multi-class (MultiPIE) and multi-label (DISFA, FERA2015) facial expression analysis by performing adaptation of two contextual factors: “where” (view) and “who” (subject). In our experiments, the proposed approach consistently outperforms: 1) both source and target classifiers, while using a small number of target examples during the adaptation and 2) related state-of-the-art approaches for supervised domain adaptation.

    Data Efficiency, Domain Adaptation, Gaussian Processes, Probabilistic Modelling
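
    The key inference step the abstract describes is weighting each GP expert by its predictive variance. A minimal sketch of precision-weighted fusion of Gaussian expert predictions, in the spirit of a product of experts (the two-expert toy numbers are ours, not the paper's model):

```python
import numpy as np

def combine_experts(means, variances):
    """Fuse per-expert Gaussian predictions by precision weighting, as in a
    product of experts: confident (low-variance) experts dominate."""
    means = np.asarray(means, dtype=float)
    precisions = 1.0 / np.asarray(variances, dtype=float)
    fused_var = 1.0 / precisions.sum(axis=0)
    fused_mean = fused_var * (precisions * means).sum(axis=0)
    return fused_mean, fused_var

# Two hypothetical experts predicting the same test point:
m, v = combine_experts(means=[[0.0], [1.0]], variances=[[1.0], [0.25]])
# The low-variance expert (mean 1.0) pulls the fused estimate towards itself.
```

    The fused variance is always smaller than any single expert's, reflecting the extra evidence gained by combining them.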

  • Variational Fourier Features for Gaussian Processes

    Posted 04/01/2018 (No publication date yet)

    Journal of Machine Learning Research (JMLR)

    (not yet available online)

    Authors: James Hensman, Nicolas Durrande, Arno Solin (Aalto University)

    Abstract: This work brings together two powerful concepts in Gaussian processes: the variational approach to sparse approximation and the spectral representation of Gaussian processes. This gives rise to an approximation that inherits the benefits of the variational approach but with the representational power and computational scalability of spectral representations. The work hinges on a key result that there exist spectral features related to a finite domain of the Gaussian process which exhibit almost-independent covariances. We derive these expressions for Matérn kernels in one dimension, and generalize to more dimensions using kernels with specific structures. Under the assumption of additive Gaussian noise, our method requires only a single pass through the data set, making for very fast and accurate computation. We fit a model to 4 million training points in just a few minutes on a standard laptop. With non-conjugate likelihoods, our MCMC scheme reduces the cost of computation from O(NM²) (for a sparse Gaussian process) to O(NM) per iteration, where N is the number of data points and M is the number of features.

    Fourier Features, Gaussian Processes, Probabilistic Modelling, Variational Inference
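
    The paper derives variational Fourier features for Matérn kernels on a finite domain; as a loose illustration of the broader idea of spectral feature approximations, here is the classic random Fourier features construction of Rahimi and Recht for an RBF kernel. This is a related technique, not the paper's method:

```python
import numpy as np

def random_fourier_features(X, n_features=20000, lengthscale=1.0, seed=0):
    # Rahimi & Recht (2007): z(x) = sqrt(2/M) cos(W x + b), with
    # W sampled from the RBF kernel's spectral density N(0, 1/l^2).
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.standard_normal((d, n_features)) / lengthscale
    b = rng.uniform(0, 2 * np.pi, n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

X = np.random.default_rng(1).standard_normal((5, 2))
Z = random_fourier_features(X)
approx = Z @ Z.T   # approximates k(x, x') = exp(-|x - x'|^2 / (2 l^2))
exact = np.exp(-0.5 * ((X[:, None] - X[None, :]) ** 2).sum(-1))
```

    The inner product of the features approximates the kernel, so any linear model on Z behaves like a kernel machine at O(NM) cost.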

  • Distributed Lifelong Reinforcement Learning with Sub-Linear Regret

    17-19 December 2018, Miami Beach

    (to appear – not yet available online)

    IEEE Conference on Decision and Control (CDC)

    Authors: Julia El-Zini, Rasul Tutunov, Haitham Bou-Ammar, and Ali Jadbabaie

    Abstract: In this paper, we propose a distributed second-order method for lifelong reinforcement learning (LRL). Upon observing a new task, our algorithm scales state-of-the-art LRL by approximating the Newton direction up to any arbitrary precision ε > 0, while guaranteeing accurate solutions. We analyze the theoretical properties of this new method and derive, for the first time to the best of our knowledge, sublinear regret in this setting.

    Distributed Optimisation, Learning to Learn, Lifelong Learning, Online Learning, Reinforcement Learning

  • Scalable Lifelong Reinforcement Learning

    December 2017

    Pattern Recognition, Volume 72, Dec. 2017

    Authors: Yusen Zhan, Haitham Bou-Ammar, and Matthew E. Taylor

    Abstract: Lifelong reinforcement learning provides a successful framework for agents to learn multiple consecutive tasks sequentially. Current methods, however, suffer from scalability issues when the agent has to solve a large number of tasks. In this paper, we remedy this drawback and propose a novel scalable technique for lifelong reinforcement learning. We derive an algorithm which assumes the availability of multiple processing units and computes shared repositories and local policies using only local information exchange. We then show that our method achieves a linear convergence rate, improving on current lifelong policy search methods. Finally, we evaluate our technique on a set of benchmark dynamical systems and demonstrate learning speed-ups and reduced running times.

    Data Efficiency, Distributed Optimisation, Learning to Learn, Lifelong Learning, Multitask Learning, Online Learning, Reinforcement Learning, Scalability

  • Correctness-by-Learning of Infinite-State Component-Based Systems

    10-13 October 2017, Braga, Portugal

    International Conference on Formal Aspects of Component Software (FACS)

    Authors: Haitham Bou-Ammar, Mohamad Jaber, and Mohammad Nassar

    Abstract: We introduce a novel framework for runtime enforcement of safe executions in component-based systems with multi-party interactions modelled using BIP. Our technique frames runtime enforcement as a sequential decision-making problem and presents two alternatives for learning optimal strategies that ensure fairness between correct traces. We target both finite and infinite state-spaces. In the finite case, we guarantee that the system avoids bad states by casting the learning process as one of determining a fixed-point solution that converges to the optimal strategy. Though successful, this technique fails to generalize to the infinite case due to the need for building a dictionary, which quantifies the performance of each state-interaction pair. As such, we further contribute by generalizing our framework to support the infinite setting. Here, we adapt ideas from function approximators and machine learning to encode each state-interaction pair's performance. In essence, we autonomously learn to abstract similarly performing states into a relevant continuous space through the use of deep learning. We assess our method empirically by presenting a fully implemented tool called RERL. In particular, we use RERL to: 1) enforce deadlock freedom on a dining philosophers benchmark, and 2) allow pairwise synchronized robots to autonomously achieve consensus within a cooperative multi-agent setting.

    Autonomous Formal Verification Methods, Reinforcement Learning, Software Engineering
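
    In the finite case, the abstract casts enforcement as finding a fixed point over state-interaction values. A toy sketch of that idea: tabular value iteration on a hypothetical four-state system in which the learned strategy steers away from a bad (deadlock) state. The states, rewards, and discount are ours, not RERL's:

```python
import numpy as np

# Toy deterministic system: 4 states, state 3 is a "bad" (deadlock) state.
# Actions: 0 = advance, 1 = stay. Reward penalises entering the bad state.
n_states, n_actions, gamma = 4, 2, 0.9

def step(s, a):
    return min(s + 1, n_states - 1) if a == 0 else s

def reward(s, a):
    return -10.0 if step(s, a) == 3 else 1.0

Q = np.zeros((n_states, n_actions))
for _ in range(200):                     # fixed-point (value) iteration
    Q_new = np.zeros_like(Q)
    for s in range(n_states):
        for a in range(n_actions):
            Q_new[s, a] = reward(s, a) + gamma * Q[step(s, a)].max()
    Q = Q_new

policy = Q.argmax(axis=1)                # learned enforcement strategy
# In state 2 (one step away from deadlock) the policy chooses to stay.
```

    Value iteration is a contraction, so repeating the update converges to the unique fixed point; the infinite-state setting in the paper replaces the table with a learned function approximator.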

  • Non-convex Policy Search Using Variational Inequalities

    October 2017

    Neural Computation, Volume 29, Issue 10, MIT Press, Oct. 2017

    Authors: Yusen Zhan, Haitham Bou-Ammar, and Matthew E. Taylor

    Abstract: Policy search is a class of reinforcement learning algorithms for finding optimal policies in control problems with limited feedback. These methods have been shown to be successful in high-dimensional problems such as robotics control. Though successful, current methods can lead to unsafe policy parameters that potentially could damage hardware units. Motivated by such constraints, we propose projection-based methods for safe policies.

    These methods, however, can handle only convex policy constraints. In this letter, we propose the first safe policy search reinforcement learner capable of operating under nonconvex policy constraints. This is achieved by observing, for the first time, a connection between nonconvex variational inequalities and policy search problems. We provide two algorithms, Mann and two-step iteration, to solve the above problems and prove convergence in the nonconvex stochastic setting. Finally, we demonstrate the performance of the algorithms on six benchmark dynamical systems and show that our new method is capable of outperforming previous methods under a variety of settings.

    Non-Convexity, Optimisation, Reinforcement Learning
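
    The first algorithm named in the abstract, Mann iteration, averages the current iterate with its image under the map. A minimal sketch on a toy nonexpansive map where plain fixed-point iteration would oscillate (the map and step sizes are illustrative, not the paper's policy-search operator):

```python
import numpy as np

def mann_iteration(T, x0, steps=500):
    """Mann iteration x_{k+1} = (1 - a_k) x_k + a_k T(x_k) for a
    nonexpansive map T; the averaging step guarantees convergence where
    plain fixed-point iteration x_{k+1} = T(x_k) may oscillate."""
    x = np.asarray(x0, dtype=float)
    for k in range(1, steps + 1):
        a = 1.0 / (k + 1)                # diminishing step sizes
        x = (1 - a) * x + a * T(x)
    return x

# Toy nonexpansive map: reflection about 1.0; plain iteration from 5.0
# bounces between 5.0 and -3.0 forever, while Mann iteration settles
# on the fixed point x* = 1.0.
T = lambda x: 2.0 - x
x_star = mann_iteration(T, x0=np.array([5.0]))
```

    The same averaging principle underlies the paper's convergence guarantees in the nonconvex stochastic setting.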

  • An Information-Theoretic On-Line Update Principle for Perception-Action Coupling

    24-28 September 2017, Vancouver, Canada

    (not yet available online)

    IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

    Authors: Zhen Peng, Tim Genewein, Felix Leibfried, and Daniel Braun

    Abstract: Inspired by findings of sensorimotor coupling in humans and animals, there has recently been a growing interest in the interaction between action and perception in robotic systems. Here we consider perception and action as two serial information channels with limited information-processing capacity. Following this line of work, we formulate a constrained optimization problem that maximizes utility under limited information-processing capacity in the two channels. As a solution, we obtain an optimal perceptual channel and an optimal action channel that are coupled such that perceptual information is optimized with respect to downstream processing in the action module. The main novelty of this study is that we propose an online optimization procedure to find bounded-optimal perception and action channels in parameterized serial perception-action systems. In particular, we implement the perceptual channel as a multi-layer neural network and the action channel as a multinomial distribution. We illustrate our method in a NAO robot simulator with a simplified cup lifting task.

    Bounded-Rationality, Information Theory, Reinforcement Learning
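
    The constrained objective described above (maximize utility subject to limited channel capacity) has the well-known softmax solution p(a|s) ∝ p(a) exp(β U(s, a)), which can be found by a Blahut-Arimoto-style alternation. A toy sketch with a two-state, two-action utility table; the table and β values are ours, and the paper instead parameterizes the channels as a neural network and a multinomial:

```python
import numpy as np

def bounded_rational_policy(U, beta, iters=100):
    """Blahut-Arimoto-style iteration for the information-constrained
    decision rule p*(a|s) ∝ p(a) exp(beta * U[s, a]), where the prior
    p(a) is the policy's marginal over (uniform) states. Small beta means
    cheap, near-uniform behaviour; large beta means near-greedy behaviour."""
    n_states, n_actions = U.shape
    p_a = np.full(n_actions, 1.0 / n_actions)   # marginal action prior
    for _ in range(iters):
        policy = p_a * np.exp(beta * U)         # unnormalised p(a|s)
        policy /= policy.sum(axis=1, keepdims=True)
        p_a = policy.mean(axis=0)               # update the marginal
    return policy

U = np.array([[1.0, 0.0],                       # toy utility table
              [0.0, 1.0]])
greedy = bounded_rational_policy(U, beta=50.0)  # high capacity: decisive
cheap = bounded_rational_policy(U, beta=0.1)    # low capacity: near-uniform
```

    The inverse-temperature β plays the role of the Lagrange multiplier on the information-processing constraint.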

  • A Deep Learning Approach for Joint Video Frame and Reward Prediction in Atari Games

    6-11 August 2017, Sydney, Australia

    Workshop on Principled Approaches to Deep Learning at the International Conference on Machine Learning (ICML)

    Authors: Felix Leibfried, Nate Kushman, and Katja Hofmann

    Abstract: Reinforcement learning is concerned with identifying reward-maximizing behaviour policies in environments that are initially unknown. State-of-the-art reinforcement learning approaches, such as deep Q-networks, are model-free and learn to act effectively across a wide range of environments such as Atari games, but require huge amounts of data. Model-based techniques are more data-efficient but need to acquire explicit knowledge about the environment. In this paper, we take a step towards using model-based techniques in environments with a high-dimensional visual state space by demonstrating that it is possible to learn system dynamics and the reward structure jointly. Our contribution is to extend a recently developed deep neural network for video frame prediction in Atari games to enable reward prediction as well. To this end, we phrase a joint optimization problem for minimizing both video frame and reward reconstruction loss, and adapt network parameters accordingly. Empirical evaluations on five Atari games demonstrate accurate cumulative reward prediction of up to 200 frames. We consider these results as opening up important directions for model-based reinforcement learning in complex, initially unknown environments.

    ATARI Domain, Deep Learning, Reinforcement Learning
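
    The joint optimization phrased in the abstract, minimizing video-frame and reward reconstruction loss together, can be illustrated by a toy composite objective: pixel MSE plus a weighted cross-entropy over discretized reward classes. The weighting `lam` and all shapes are hypothetical, not the paper's architecture or hyperparameters:

```python
import numpy as np

def joint_loss(frame_pred, frame_true, reward_logits, reward_true, lam=1.0):
    """Illustrative composite objective: frame reconstruction error plus a
    weighted reward-classification term, so one network can be trained to
    learn dynamics and reward structure jointly."""
    frame_loss = np.mean((frame_pred - frame_true) ** 2)
    # Softmax cross-entropy over discretised reward classes.
    z = reward_logits - reward_logits.max(axis=1, keepdims=True)
    log_p = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    reward_loss = -np.mean(log_p[np.arange(len(reward_true)), reward_true])
    return frame_loss + lam * reward_loss

# Perfect frame reconstruction, confident but imperfect reward prediction:
frames = np.zeros((2, 4, 4))
logits = np.array([[2.0, 0.0, 0.0],
                   [0.0, 2.0, 0.0]])
loss = joint_loss(frames, frames, logits, np.array([0, 1]))
```

    In training, the gradient of this single scalar flows into both the frame decoder and the reward head.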

  • Efficiently detecting switches against non-stationary opponents

    July 2017

    Autonomous Agents and Multi-Agent Systems, 31: 767 (July 2017)

    Authors: Pablo Hernandez-Leal, Yusen Zhan, Matthew E. Taylor, L. Enrique Sucar, and Enrique Munoz de Cote

    Abstract: Interactions in multiagent systems are generally more complicated than single-agent ones. Game theory provides solutions on how to act in multiagent scenarios; however, it assumes that all agents will act rationally. Moreover, some works also assume the opponent will use a stationary strategy. These assumptions usually do not hold in real-world scenarios where agents have limited capacities and may deviate from a perfectly rational response. Our goal is still to act optimally in these cases by learning the appropriate response, without any prior policies on how to act. Thus, we focus on the setting in which another agent in the environment uses different stationary strategies over time. This turns the problem into one of learning in a non-stationary environment, which poses a challenge for most learning algorithms. This paper introduces DriftER, an algorithm that (1) learns a model of the opponent, (2) uses that model to obtain an optimal policy, and then (3) determines when it must re-learn due to an opponent strategy change. We provide theoretical results showing that DriftER is guaranteed to detect switches with high probability. We also provide empirical results showing that our approach outperforms state-of-the-art algorithms, first in normal-form games such as the prisoner’s dilemma and then in a more realistic scenario, the Power TAC simulator.

    Game Theory, Learning, Multiagent Systems, Non-Stationary Environments, Repeated Games, Switching Strategies
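
    DriftER's third component decides when to re-learn by monitoring the quality of the opponent model's predictions. A toy sketch of that idea: flag a switch when rolling prediction accuracy drops below a threshold (the window and threshold are illustrative, not DriftER's actual statistics):

```python
import numpy as np

def detect_switch(correct, window=10, threshold=0.6):
    """Toy drift detector in the spirit of tracking an opponent model's
    prediction quality: flag a switch at the first step where the model's
    rolling accuracy over `window` steps falls below `threshold`."""
    correct = np.asarray(correct, dtype=float)
    for t in range(window, len(correct) + 1):
        if correct[t - window:t].mean() < threshold:
            return t                     # first step at which drift is flagged
    return None

# Opponent plays one strategy (model predicts well), then switches at step 20;
# the monitor flags the change a few steps later, once accuracy has decayed.
trace = [1] * 20 + [0] * 10
t = detect_switch(trace)
```

    The window length trades off detection delay against false alarms, the same tension the paper's high-probability guarantees address.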

  • Differential evolution strategies for large-scale energy resource management in smart grids

    15-19 July 2017

    Proceedings of the Genetic and Evolutionary Computation Conference (GECCO)

    Authors: Fernando Lezama, Enrique Sucar, Joao Soares, Zita Vale, Enrique Munoz de Cote

    Abstract: Smart Grid (SG) technologies are leading the modifications of power grids worldwide. The Energy Resource Management (ERM) in SGs is a highly complex problem that needs to be efficiently addressed to maximize income while minimizing operational costs. Due to the nature of the problem, which includes mixed-integer variables and non-linear constraints, Evolutionary Algorithms (EA) are considered a good tool to find optimal and near-optimal solutions to large-scale problems. In this paper, we analyze the application of Differential Evolution (DE) to solve the large-scale ERM problem in SGs through extensive experimentation on a case study using a 33-Bus power network with high penetration of Distributed Energy Resources (DER) and Electric Vehicles (EVs), as well as advanced features such as energy stock exchanges and Demand Response (DR) programs. We analyze the impact of DE parameter setting on four state-of-the-art DE strategies. Moreover, DE strategies are compared with other well-known EAs and a deterministic approach based on MINLP. Results suggest that, even though DE strategies are very sensitive to their parameter settings, they can find better solutions than other EAs, and near-optimal solutions in acceptable times compared with an MINLP approach.

    Differential Evolution, Evolutionary Algorithms, Multiagent Systems
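
    For readers unfamiliar with DE: it mutates candidates with scaled difference vectors, recombines them with their parents, and keeps the better of parent and trial. A didactic DE/rand/1/bin sketch on a sphere test function; the population size, F, and CR below are generic defaults, not the tuned settings the paper studies:

```python
import numpy as np

def differential_evolution(f, bounds, pop_size=20, F=0.8, CR=0.9,
                           generations=200, seed=0):
    """Minimal DE/rand/1/bin: mutation with a scaled difference vector,
    binomial crossover, and greedy one-to-one selection."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds, dtype=float).T
    d = len(lo)
    pop = lo + rng.random((pop_size, d)) * (hi - lo)
    fit = np.array([f(x) for x in pop])
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct candidates other than i.
            a, b, c = pop[rng.choice([j for j in range(pop_size) if j != i],
                                     3, replace=False)]
            mutant = np.clip(a + F * (b - c), lo, hi)
            cross = rng.random(d) < CR
            cross[rng.integers(d)] = True        # ensure at least one gene
            trial = np.where(cross, mutant, pop[i])
            ft = f(trial)
            if ft <= fit[i]:                     # greedy selection
                pop[i], fit[i] = trial, ft
    return pop[fit.argmin()], fit.min()

sphere = lambda x: float(np.sum(x ** 2))
x_best, f_best = differential_evolution(sphere, bounds=[(-5, 5)] * 3)
```

    The sensitivity the abstract mentions is exactly the dependence of this loop on F (mutation scale) and CR (crossover rate).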

  • Distributed Newton for Network Flow Optimisation

    (to appear – not yet available online)

    SIAM Journal on Optimization

    Authors: Rasul Tutunov, Haitham Bou-Ammar, and Ali Jadbabaie

    Abstract: In this paper, we propose a new distributed second-order method for network flow optimization. Our algorithm exploits the symmetry and diagonal dominance property of the dual Hessian to determine the Newton direction in a distributed fashion up to any precision ε > 0. This is achieved by first introducing a novel distributed solver for systems of equations described by symmetric diagonally dominant matrices based on Chebyshev polynomials. Our solver is then used to compute the Newton direction, leading to a novel algorithm exhibiting similar phases of convergence to the exact (i.e., centralized) Newton method. We rigorously analyze the theoretical properties of both the solver and the distributed Newton method. Finally, we provide empirical validation for the proposed technique on a variety of network topologies.

    Data Efficiency, Graph Theory, Network Flow, Optimisation
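
    The paper's solver computes the Newton direction by solving a symmetric diagonally dominant (SDD) system using only local matrix-vector products. As a much simpler stand-in for its Chebyshev-polynomial solver, here is Jacobi iteration on a small SDD system; the matrix is a toy example, not a dual Hessian from the paper:

```python
import numpy as np

def jacobi_sdd_solve(A, b, iters=200):
    """Solve A x = b for a symmetric diagonally dominant A by Jacobi
    iteration: x <- D^{-1} (b - R x), with D the diagonal and R = A - D.
    Each update needs only a matrix-vector product with local rows, the
    property that distributed SDD solvers exploit."""
    D = np.diag(A)
    R = A - np.diag(D)
    x = np.zeros_like(b, dtype=float)
    for _ in range(iters):
        x = (b - R @ x) / D
    return x

# SDD system resembling a graph Laplacian plus regularisation.
A = np.array([[ 3.0, -1.0, -1.0],
              [-1.0,  3.0, -1.0],
              [-1.0, -1.0,  3.0]])
b = np.array([1.0, 0.0, -1.0])
x = jacobi_sdd_solve(A, b)
```

    Jacobi converges only linearly; the Chebyshev acceleration in the paper achieves the same fixed point in far fewer local communication rounds.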

Quick Links

  • AAAI
  • Abstraction
  • ATARI Domain
  • Autonomous Formal Verification Methods
  • Bayesian Optimisation
  • Bounded-Rationality
  • Data Efficiency
  • Deep Learning
  • Differential Evolution
  • Distributed Optimisation
  • Game Theory
  • Gaussian Processes
  • Graph Theory
  • High-Dimensional Representation
  • Information Theory
  • Learning to Learn
  • Lifelong Learning
  • Model Learning
  • Multiagent Systems
  • Multitask Learning
  • Network Flow
  • NIPS
  • Non-Convexity
  • Non-Stationary Environments
  • Online Learning
  • Optimisation
  • Probabilistic Modelling
  • Reinforcement Learning
  • Repeated Games
  • Software Engineering

Join our team