The course aims to provide the student with a self-contained introduction to the main ideas of probabilistic machine learning, with particular emphasis on using statistics as a modelling tool. The outcome of the course will be theoretical and practical familiarity with the most widely used Bayesian machine learning algorithms and their implementation.
This is a rough summary of the Bayesian Machine Learning course. The main reference is D. Barber's book, Bayesian Reasoning and Machine Learning; chapter and section numbers in brackets refer to Barber's book unless explicitly stated otherwise. The book is available online from http://web4.cs.ucl.ac.uk/staff/D.Barber/pmwiki/pmwiki.php?n=Brml.HomePage.
- Lecture 1: Statistical basics. Probability refresher, probability distributions, entropy and KL divergence (Ch 1; 8.2, 8.3). Multivariate Gaussian (8.4). Estimators and maximum likelihood (8.6 and 8.7.3; see the maximum-likelihood sketch below). Supervised and unsupervised learning (13.1).
- Lecture 2: Linear models. Regression with additive noise and logistic regression from a probabilistic perspective: maximum likelihood and least squares (18.1 and 17.4.1; see the least-squares sketch below). Duality and kernels (17.3).
- Lecture 3: Bayesian regression models and Gaussian Processes. Bayesian models and hyperparameters (18.1.1, 18.1.2). Gaussian Process regression (19.1-19.4; see the GP regression sketch below, and also Rasmussen and Williams, Gaussian Processes for Machine Learning, MIT Press, 2006, Ch 2, available for download at http://www.gaussianprocess.org/gpml/).
- Lecture 4: Active learning and Bayesian optimisation. Active learning: basic concepts and types of active learning (B. Settles, Active Learning Literature Survey, sections 2 and 3, available from http://burrsettles.com/pub/settles.activelearning.pdf). Bayesian optimisation and the GP-UCB algorithm (Brochu et al., http://arxiv.org/abs/1012.2599; see the GP-UCB sketch below).
- Lecture 5: Latent variables and mixture models. Latent variables and the EM algorithm (11.1 and 11.2.1; see the EM sketch below). Gaussian mixture models and mixture of experts (20.3, 20.4).
- Lecture 6: Graphical models. Belief networks and Markov networks (3.3 and 4.2). Factor graphs (4.4). Exact inference in trees: message passing and belief propagation (5.1 and 28.7.1; see the message-passing sketch below).
- Lecture 7: Approximate inference in graphical models. Variational inference: Gaussian and mean-field approximations (28.3, 28.4). Sampling methods and Gibbs sampling (27.4 and 27.3; see the Gibbs sampling sketch below).
- Plenary talk
- Assessed lab session (4 hrs max). Choice 1: GP regression and Bayesian optimisation; Choice 2: Bayesian Gaussian mixture models.
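
Maximum-likelihood sketch (Lecture 1). A minimal illustration of the estimator material (8.6, 8.7.3), assuming i.i.d. data from a univariate Gaussian: the ML estimates are the sample mean and the sample variance with an N (not N - 1) denominator. The "true" parameters below are arbitrary choices used only to generate synthetic data.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic data with assumed true mean 2.0 and std 1.5 (illustration only).
x = rng.normal(loc=2.0, scale=1.5, size=1000)

mu_ml = x.mean()                     # ML estimate of the mean
var_ml = ((x - mu_ml) ** 2).mean()   # ML estimate of the variance (1/N, biased)

print(f"mu_ml = {mu_ml:.3f} (true 2.0), var_ml = {var_ml:.3f} (true 2.25)")
```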
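
Least-squares sketch (Lecture 2). Under the additive-noise model y = Xw + eps with Gaussian eps, maximum likelihood for w coincides with least squares (18.1). The sketch below fits a line to synthetic data with NumPy; the design matrix, true weights, and noise level are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 200
X = np.column_stack([np.ones(N), rng.uniform(-1.0, 1.0, N)])  # bias + one feature
w_true = np.array([0.5, -2.0])                                # assumed truth
y = X @ w_true + rng.normal(scale=0.1, size=N)                # additive Gaussian noise

# Least-squares solution, which is the ML estimate under Gaussian noise.
w_ml, *_ = np.linalg.lstsq(X, y, rcond=None)
print("w_ml =", w_ml)  # close to [0.5, -2.0]
```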
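
GP regression sketch (Lecture 3). A minimal version of the standard GP posterior computation (Rasmussen and Williams, Ch 2, Algorithm 2.1) with a squared-exponential kernel. The lengthscale, signal variance, and noise level are assumed values rather than learned hyperparameters.

```python
import numpy as np

def rbf(A, B, lengthscale=0.5, variance=1.0):
    # Squared-exponential (RBF) kernel for 1-D inputs.
    d2 = (A[:, None] - B[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

rng = np.random.default_rng(2)
X = rng.uniform(0.0, 5.0, 20)                   # training inputs
y = np.sin(X) + rng.normal(scale=0.1, size=20)  # noisy observations
Xs = np.linspace(0.0, 5.0, 100)                 # test inputs
noise = 0.1**2                                  # assumed noise variance

K = rbf(X, X) + noise * np.eye(len(X))
L = np.linalg.cholesky(K)
alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))

Ks = rbf(X, Xs)
mean = Ks.T @ alpha                                  # posterior mean at Xs
v = np.linalg.solve(L, Ks)
var = np.diag(rbf(Xs, Xs)) - np.sum(v * v, axis=0)   # posterior variance at Xs
```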
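
GP-UCB sketch (Lecture 4). The acquisition step of GP-UCB (Brochu et al.): evaluate next the candidate that maximises posterior mean plus a scaled posterior standard deviation. The posterior values and the exploration weight beta below are made-up numbers; in a real loop they would come from a GP fit (as in the previous sketch) and from the GP-UCB schedule.

```python
import numpy as np

def gp_ucb_next(mean, std, beta=4.0):
    # Upper confidence bound: trade off exploitation (mean) against
    # exploration (std); beta is an assumed, fixed exploration weight.
    return int(np.argmax(mean + np.sqrt(beta) * std))

# Toy usage with a made-up GP posterior over 5 candidate inputs.
mean = np.array([0.1, 0.4, 0.3, 0.0, 0.2])
std = np.array([0.05, 0.05, 0.30, 0.50, 0.10])
print("next query index:", gp_ucb_next(mean, std))  # picks the uncertain point 3
```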
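
EM sketch (Lecture 5). EM for a two-component, one-dimensional Gaussian mixture (11.2.1, 20.3): the E-step computes responsibilities, the M-step performs responsibility-weighted ML updates. The initialisation and the fixed iteration count are arbitrary choices for the sketch.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
# Synthetic data from two assumed components (illustration only).
x = np.concatenate([rng.normal(-2.0, 0.5, 150), rng.normal(3.0, 1.0, 100)])

pi = np.array([0.5, 0.5])       # mixing weights
mu = np.array([-1.0, 1.0])      # component means (arbitrary init)
sigma = np.array([1.0, 1.0])    # component stds (arbitrary init)

for _ in range(50):  # fixed iteration count instead of a convergence test
    # E-step: responsibilities r[n, k] proportional to pi_k N(x_n | mu_k, sigma_k)
    r = pi * norm.pdf(x[:, None], mu, sigma)
    r /= r.sum(axis=1, keepdims=True)
    # M-step: responsibility-weighted ML updates
    Nk = r.sum(axis=0)
    pi = Nk / len(x)
    mu = (r * x[:, None]).sum(axis=0) / Nk
    sigma = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / Nk)

print("weights:", pi, "means:", mu, "stds:", sigma)
```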
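
Message-passing sketch (Lecture 6). Sum-product message passing (5.1) on a three-variable chain x1 - x2 - x3 of binary variables: the marginal of x2 is proportional to the product of the messages arriving from its two neighbours. The potential tables are made-up numbers chosen only to make the computation concrete.

```python
import numpy as np

phi1 = np.array([0.6, 0.4])       # unary potential on x1
phi12 = np.array([[0.9, 0.1],     # pairwise potential phi(x1, x2)
                  [0.2, 0.8]])
phi23 = np.array([[0.7, 0.3],     # pairwise potential phi(x2, x3)
                  [0.4, 0.6]])

# Message from x1 into x2: sum over x1 of phi1(x1) * phi12(x1, x2)
m_1_to_2 = phi1 @ phi12
# Message from x3 into x2: sum over x3 of phi23(x2, x3) (x3 is a leaf)
m_3_to_2 = phi23.sum(axis=1)

# Marginal of x2: product of incoming messages, normalised.
marg_x2 = m_1_to_2 * m_3_to_2
marg_x2 /= marg_x2.sum()
print("p(x2) =", marg_x2)
```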
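
Gibbs sampling sketch (Lecture 7). Gibbs sampling (cf. 27.3) from a bivariate Gaussian with unit variances and correlation rho: both full conditionals are themselves Gaussian, so each sweep simply resamples one coordinate given the other. The value of rho and the chain length are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(4)
rho, n_samples = 0.8, 5000
x, y = 0.0, 0.0  # arbitrary starting point
samples = np.empty((n_samples, 2))

for i in range(n_samples):
    # Full conditionals: x | y ~ N(rho*y, 1 - rho^2), and symmetrically for y.
    x = rng.normal(rho * y, np.sqrt(1 - rho**2))
    y = rng.normal(rho * x, np.sqrt(1 - rho**2))
    samples[i] = (x, y)

print("empirical correlation:", np.corrcoef(samples.T)[0, 1])  # close to 0.8
```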