Machine Learning

Academic Year 2020/21 (First Semester)

Course Teacher – Alessio Micheli

Seminar Classes on Bayesian Learning

Seminar Teacher – Davide Bacciu

Aims – Introduction to Bayesian Learning: maximum likelihood hypothesis, MAP and Bayesian hypotheses. Representing (conditional) independence between random variables: Bayesian networks and plate notation. Parameter learning in Bayesian networks: ML and Expectation Maximization (EM). Application examples.

Students’ Office Hours – Tue. 14-16 (contact by email for confirmation)

Course Book

[AIMA] Russell, S. and Norvig, P., Artificial Intelligence: A Modern Approach. Prentice Hall Series in Artificial Intelligence, 2003.
Chapter 20 “Statistical Learning Methods” – Available online here.

[MML] Mitchell, T., Machine Learning. McGraw Hill, 1997.
Chapter 6 “Bayesian Learning”

A freely available online book that can be used both as a reference course book and to deepen the course contents is

David Barber, Bayesian Reasoning and Machine Learning, Cambridge University Press, 2012.

The chapters of interest for these seminars are Chapters 1 to 5 and 8 to 12.

All the official material for the seminars (including slides) can be found on the ML course website on Moodle. This page complements that information: for each class, a selection of additional readings is provided alongside the references to the course books.

Software

David Barber’s book is distributed with Matlab code (the BRML toolbox) showing examples of Bayesian learning. An archive of the most recent software distribution can be downloaded here. It should also run seamlessly in Octave, an open-source port of the Matlab environment.

An excellent Matlab package that allows you to rapidly build Bayesian models and networks is the Bayes Net Toolbox (BNT) by Kevin Murphy.

Further Matlab demos are provided as additional material in the lecture calendar section below.

Lecture calendar

Lecture 1 – Introduction to Bayesian Learning

Course book
[AIMA] Sect. 20.1
[MML] Sect. 6.1-6.3, Sect. 6.5-6.9

Further readings
[5] Chapt. 1-3

Software
[BRML toolbox] The functions demoBurglar.m and demoChestClinic.m show demos of probabilistic inference on Bayesian networks (cf. Example 3.1 and Fig. 3.15 in [5]).
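For a flavour of what such demos compute, here is a minimal Matlab/Octave sketch of exact inference by enumeration on the classic burglar-alarm network of [AIMA] (Burglary and Earthquake are parents of Alarm). It is an illustration of the principle, not the BRML toolbox code; the CPT values below are the textbook ones.

    % Exact inference by enumeration on the [AIMA] burglar-alarm network:
    % compute P(Burglary = true | Alarm = true), marginalising out Earthquake.
    priB = [0.001, 0.999];              % P(Burglary = true / false)
    priE = [0.002, 0.998];              % P(Earthquake = true / false)
    pA   = [0.95, 0.94; 0.29, 0.001];   % P(Alarm = true | B, E); rows: B, cols: E
    post = zeros(2, 1);                 % unnormalised P(B | A = true)
    for b = 1:2
      for e = 1:2
        post(b) = post(b) + priB(b) * priE(e) * pA(b, e);
      end
    end
    post = post / sum(post);            % normalise over the query variable B
    fprintf('P(Burglary = true | Alarm = true) = %.4f\n', post(1));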
Lecture 2 – Parameter Learning in Bayesian Networks: Learning with Complete Data

Course book
[AIMA] Sect. 20.2-20.3
[MML] Sect. 6.4, Sect. 6.5, 6.10, 6.12

Further readings
[1] Generative vs. discriminative
[2] Tutorial on maximum likelihood estimation, with Matlab code
[5] Sect. 8.8, 9.1-9.3, Chapt. 10
A step-by-step derivation of the NB prior learning rule can be found here. For those interested in learning more about Lagrange multipliers, here is a good technical source.
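As a pointer to what that derivation works out (notation assumed here: N_c is the number of training examples of class c, and theta_c the corresponding class prior), maximising the log-likelihood of the class labels under the constraint that the priors sum to one yields the familiar relative-frequency estimate via a single Lagrange multiplier:

    \max_{\theta} \sum_{c} N_c \log \theta_c
    \quad \text{subject to} \quad \sum_{c} \theta_c = 1 ;
    \qquad
    \frac{\partial}{\partial \theta_c} \Big[ \sum_{c'} N_{c'} \log \theta_{c'}
      + \lambda \Big( 1 - \sum_{c'} \theta_{c'} \Big) \Big]
      = \frac{N_c}{\theta_c} - \lambda = 0
    \quad \Rightarrow \quad
    \theta_c = \frac{N_c}{\sum_{c'} N_{c'}} .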

Software
[demoCoin] A quick demo comparing maximum likelihood and MAP estimation in the classical coin-toss scenario.
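The following minimal sketch compares the two estimates for the bias of a coin; the data and the Beta prior are illustrative assumptions, not the demoCoin code. With few tosses the prior pulls the MAP estimate towards 0.5; as the sample grows the two estimates converge.

    % ML vs. MAP estimation of a coin's heads probability theta.
    n  = 10; nH = 8;                  % assumed data: 8 heads in 10 tosses
    a  = 5;  b  = 5;                  % assumed Beta(a, b) prior favouring a fair coin
    theta_ML  = nH / n;                           % maximiser of the likelihood
    theta_MAP = (nH + a - 1) / (n + a + b - 2);   % mode of the Beta posterior
    fprintf('ML:  %.3f\nMAP: %.3f\n', theta_ML, theta_MAP);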

[demoNB] A demo showing Naive Bayes learning on the 20 Newsgroups dataset.
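In the same spirit, this sketch trains a Bernoulli Naive Bayes classifier with Laplace-smoothed maximum likelihood estimates and classifies a new document; the toy binary word-occurrence data and all values are assumptions, not the demoNB code.

    % Bernoulli Naive Bayes on toy binary word-occurrence data.
    X = [1 1 0; 1 0 0; 0 1 1; 0 0 1];   % documents (rows) x vocabulary words (cols)
    y = [1 1 2 2];                      % class label of each document
    prior = zeros(1, 2); pw = zeros(2, size(X, 2));
    for c = 1:2
      Xc = X(y == c, :);
      prior(c) = size(Xc, 1) / size(X, 1);              % ML class prior
      pw(c, :) = (sum(Xc, 1) + 1) / (size(Xc, 1) + 2);  % smoothed P(word | class)
    end
    xnew = [1 0 0];                     % a new document to classify
    logp = zeros(1, 2);
    for c = 1:2
      logp(c) = log(prior(c)) + sum(xnew .* log(pw(c, :)) ...
                                    + (1 - xnew) .* log(1 - pw(c, :)));
    end
    [~, chat] = max(logp);              % MAP class under the NB model
    fprintf('predicted class: %d\n', chat);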

Lecture 3 – Parameter Learning in Bayesian Networks: the EM Algorithm

Course book
[MML] Sect. 6.11

Further readings
[3] EM algorithm
[5] Chapt. 11
[4] Comparative analysis of structure learning algorithms

Software
[demoEM] A graphical demo showing EM on a mixture of Gaussians.

[BoW Demo] Tutorial code applying a bag-of-words model to images, using Naive Bayes and Probabilistic Latent Semantic Analysis (PLSA).
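To make the E- and M-steps concrete, here is a minimal sketch of EM for a two-component, one-dimensional Gaussian mixture. The synthetic data, initialisation, and fixed iteration count are assumptions for illustration; demoEM works on richer examples.

    % EM for a two-component 1-D Gaussian mixture (illustrative sketch).
    x = [randn(1, 100) - 2, randn(1, 100) + 2];   % synthetic sample, two clusters
    mu = [-1, 1]; s2 = [1, 1]; pri = [0.5, 0.5];  % assumed initialisation
    for it = 1:50
      % E-step: responsibilities r(k, i) = P(component k | x_i)
      r = zeros(2, numel(x));
      for k = 1:2
        r(k, :) = pri(k) * exp(-(x - mu(k)).^2 / (2 * s2(k))) / sqrt(2 * pi * s2(k));
      end
      r = r ./ sum(r, 1);               % normalise columns (Matlab R2016b+ or Octave)
      % M-step: re-estimate mixing weights, means, and variances
      Nk  = sum(r, 2)';
      pri = Nk / numel(x);
      for k = 1:2
        mu(k) = sum(r(k, :) .* x) / Nk(k);
        s2(k) = sum(r(k, :) .* (x - mu(k)).^2) / Nk(k);
      end
    end
    fprintf('means: %.2f %.2f   variances: %.2f %.2f\n', mu, s2);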

Bibliography

[1] Pernkopf, F. and Bilmes, J., Discriminative versus generative parameter and structure learning of Bayesian network classifiers. Proceedings of the 22nd International Conference on Machine Learning, ACM, 2005.

[2] Myung, I. J., Tutorial on maximum likelihood estimation. Journal of Mathematical Psychology, Vol. 47, No. 1, 2003, pp. 90-100.

[3] Bilmes, J. A., A gentle tutorial of the EM algorithm and its application to parameter estimation for Gaussian mixture and hidden Markov models. Technical Report, 1998.

[4] Tsamardinos, I., Brown, L. E. and Aliferis, C. F., The max-min hill-climbing Bayesian network structure learning algorithm. Machine Learning, 65(1): 31-78, 2006.

[5] Barber, D., Bayesian Reasoning and Machine Learning. Cambridge University Press, 2012.