Course Teacher – Alessio Micheli
Seminar Classes on Bayesian Learning
Teacher – Davide Bacciu
Aims – Introduction to Bayesian Learning: maximum likelihood hypothesis, MAP and Bayesian hypotheses. Representing (conditional) independence between random variables: Bayesian networks and plate notation. Parameter learning in Bayesian networks: ML and Expectation Maximization (EM). Application examples.
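As a quick illustration of the difference between the maximum likelihood and MAP hypotheses mentioned in the aims, here is a minimal Matlab/Octave sketch; the coin-flip data and the Beta(2, 2) prior are illustrative assumptions, not part of the official course material:

```matlab
% Illustrative coin-flip example: ML vs. MAP estimation of the heads
% probability theta. Data and prior are made up for illustration.
flips = [1 1 0 1 0 1 1 0 1 1];    % 1 = heads, 0 = tails
N = numel(flips);
h = sum(flips);                   % number of heads observed

% Maximum likelihood: the theta maximizing P(data | theta) is h / N
theta_ML = h / N;

% MAP with a Beta(a, b) prior: the posterior is Beta(h + a, N - h + b),
% whose mode is (h + a - 1) / (N + a + b - 2)
a = 2; b = 2;                     % illustrative prior pseudo-counts
theta_MAP = (h + a - 1) / (N + a + b - 2);

fprintf('ML estimate:  %.3f\n', theta_ML);   % 0.700
fprintf('MAP estimate: %.3f\n', theta_MAP);  % 0.667, pulled toward the prior mode 0.5
```

The prior acts as pseudo-counts: as N grows, the MAP estimate converges to the ML estimate.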
Students’ Office Hours – Tue. 14-16 (email contact for confirmation)
[AIMA] Russell, S. and Norvig, P. Artificial Intelligence: A Modern Approach. Prentice Hall Series in Artificial Intelligence, 2003.
Chapter 20 “Statistical Learning Methods” – Available online here
[MML] Mitchell, T. Machine Learning. McGraw Hill, 1997.
Chapter 6 “Bayesian Learning”
A freely available online book that can be used both as a reference course book and to deepen the course contents is
David Barber, Bayesian Reasoning and Machine Learning, Cambridge University Press, 2012.
The chapters of interest for these seminars are Chapters 1 to 5 and 8 to 12.
All the official material for the seminars (including slides) can be found on the ML course website on Moodle. This page complements that information: for each class, a selection of additional readings is provided alongside the references to the course books.
David Barber’s book is distributed with Matlab code (the BRML toolbox) showing examples of Bayesian learning. An archive of the most recent software distribution can be downloaded here. It should also run seamlessly in Octave, an open-source counterpart of the Matlab environment.
Further Matlab demos are provided as additional material in the class calendar section.
| Lecture | Topic | Course book | Additional materials and further readings |
|---|---|---|---|
| 1 | Introduction to Bayesian Learning | [AIMA] Sect. 20.1; [MML] Sect. 6.1-6.3, 6.5-6.9; [BRML] Chapt. 1-3 | Software: the [BRML toolbox] functions demoBurglar.m and demoChestClinic.m show demos of probabilistic inference on Bayesian networks (cf. Example 3.1 and Fig. 3.15 in [BRML]) |
| 2 | Parameter Learning in Bayesian Networks: Learning with Complete Data | [AIMA] Sect. 20.2-20.3; [MML] Sect. 6.4, 6.5, 6.10, 6.12; [BRML] Sect. 8.8, 9.1-9.3, Chapt. 10 | Further readings: Generative vs Discriminative (Pernkopf and Bilmes, 2005); tutorial on maximum likelihood estimation with Matlab code (Myung, 2003); a step-by-step derivation of the NB prior learning rule can be found here; for those interested in learning more about Lagrange multipliers, here is a good technical source (see also the counting sketch after the table) |
| 3 | Parameter Learning in Bayesian Networks: the EM Algorithm | [MML] Sect. 6.11; [BRML] Chapt. 11 | Further readings: the EM algorithm (Bilmes, 1998); a comparative analysis of structure learning algorithms (Tsamardinos et al., 2006). Software: [demoEM] a graphical demo showing EM on a Mixture of Gaussians (see also the EM sketch after the table); [BoW Demo] tutorial code with a bag-of-words application to images using Naive Bayes and Probabilistic Latent Semantic Analysis (PLSA) |
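As a companion to the Lecture 2 material on learning with complete data, the following Matlab/Octave sketch estimates a conditional probability table by relative-frequency counting, which is the ML solution; the binary variables and observations are illustrative, not taken from the course material or the BRML toolbox:

```matlab
% ML parameter learning with complete data: estimate P(X | Pa) by
% normalized counts. X and Pa below are illustrative binary variables.
X  = [1 2 2 1 2 2 2 1];           % observed child states (1 or 2)
Pa = [1 1 1 2 2 2 2 2];           % observed parent states (1 or 2)

counts = zeros(2, 2);             % counts(x, pa)
for n = 1:numel(X)
    counts(X(n), Pa(n)) = counts(X(n), Pa(n)) + 1;
end

% Normalize each parent-state column: column pa then holds the ML
% estimate of P(X | Pa = pa)
CPT = counts ./ sum(counts, 1);
disp(CPT)
```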
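In the spirit of the [demoEM] example from Lecture 3, this minimal Matlab/Octave sketch runs EM on a two-component one-dimensional Gaussian mixture; the synthetic data, the initialization, and the fixed iteration count are illustrative assumptions, not the official demo:

```matlab
% Minimal EM sketch for a two-component 1-D Gaussian mixture.
% Data, initialization, and iteration count are illustrative.
x = [randn(1, 100) - 2, 0.5 * randn(1, 100) + 3];  % synthetic data

w  = [0.5; 0.5];          % mixing weights
mu = [min(x); max(x)];    % component means
s2 = [1; 1];              % component variances

for it = 1:50
    % E-step: responsibilities r(k, n) = P(component k | x_n)
    r = zeros(2, numel(x));
    for k = 1:2
        r(k, :) = w(k) * exp(-(x - mu(k)).^2 / (2 * s2(k))) ...
                  / sqrt(2 * pi * s2(k));
    end
    r = r ./ sum(r, 1);

    % M-step: re-estimate parameters from the weighted counts
    Nk = sum(r, 2);                       % effective counts per component
    w  = Nk / numel(x);
    mu = (r * x') ./ Nk;
    s2 = (r * (x.^2)') ./ Nk - mu.^2;     % E[x^2] - mean^2 per component
end
fprintf('means: %.2f %.2f, variances: %.2f %.2f\n', mu(1), mu(2), s2(1), s2(2));
```

Each iteration provably does not decrease the data log-likelihood, which is the defining property of EM.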
References:
Pernkopf, F. and Bilmes, J. Discriminative versus generative parameter and structure learning of Bayesian network classifiers. Proceedings of the 22nd International Conference on Machine Learning, ACM, 2005.
Myung, I.J. Tutorial on maximum likelihood estimation. Journal of Mathematical Psychology, Vol. 47, No. 1, pp. 90-100, 2003.
Bilmes, J.A. A gentle tutorial of the EM algorithm and its application to parameter estimation for Gaussian mixture and hidden Markov models. Technical Report, 1998.
Tsamardinos, I., Brown, L.E. and Aliferis, C.F. The max-min hill-climbing Bayesian network structure learning algorithm. Machine Learning, 65(1): 31-78, 2006.
Barber, D. Bayesian Reasoning and Machine Learning. Cambridge University Press, 2012.