Advanced topics in artificial intelligence I (CK0146)
Pattern recognition (TIP8311)

The course surveys selected topics in machine learning and pattern recognition, or was it advanced topics in artificial intelligence? To make it simpler for everybody, we shall call it data science.

The course deals with some of the central principles of data science, including probabilistic modelling, density estimation and generalised linear models for regression and classification:
  1. Introductory refresher: Probability theory, decision theory and information theory;
  2. Probability distributions: Binary and multinomial variables, the Gaussian, the exponential family, non-parametric distributions;
  3. Linear models for regression: Linear basis function models, Bayesian linear regression, evidence approximation;
  4. Linear models for classification: Discriminant functions, probabilistic generative models, probabilistic discriminative models, Bayesian logistic regression.
Stuff from the past: Previous versions of the course are available (internal links may need adjustment) for 2015.2

Instructor : Francesco Corona (FC), francesco döt corona ät ufc döt br
Teaching assistants : José Florencio de Queiroz Neto (F), jfqn ät ask döt him and Alisson Sampaio de Carvalho Alencar (A), asca ät ask döt him

Physical location : Tuesdays and Thursdays 10:00-12:00. Bloco 951, Sala 2.
Internet location : Here! Or, here (CK0146) and here (TIP8311) for mumbo jumbo related to administration.

Evaluation : Approx. half a dozen theoretical and practical problem sets will be assigned as homework. Partial evaluations (APs) will consist of exercises that are randomly drawn from the aforementioned sets. The APs must be worked out in the classroom, individually.

>>>>>> Wanna get candy? Participate in this survey and spread the link (UFC under- or post-grad peeps ONLY) <<<<<<

Go to:   Lectures and schedule | Problem sets | Supplementary material | As it pops out |

Lectures and schedule

We meet on Tuesday AUG 16 at 10:15am (give or take 5), to briefly introduce each other and discuss some practicalities.

  1. About this course

    A About this course (FC)
    • About the type of machine learning, pattern recognition, and advanced topics in artificial intelligence that we study

  2. Introductory refresher

    A Probability theory (FC)
    • Slides (SEP 01, SEP 05, SEP 08 and SEP 13). Last updated on SEP 09.
    • Definitions and rules, densities, expectations and covariances
    • Bayesian probabilities, the univariate Gaussian distribution
    B Decision theory (FC)
    • Minimisation of the misclassification rate and minimisation of the expected loss
    • The reject option
    • Inference and decision
    • Loss functions for regression
    C Information theory (FC)
    • Slides (SEP 20). Last updated on SEP 20.
    • Definitions, relative entropy and mutual information

    Exercises (SEP 22 and SEP 27, F and A). Hand-in by OCT 09 (was 02) at 23:59:59 Fortaleza time
    Results in [0,10]
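To make the information-theory definitions concrete, here is a minimal NumPy sketch that computes marginal entropies and the mutual information of two binary variables. The joint distribution and all names are made up for illustration; this is not course code.

```python
import numpy as np

# Hypothetical joint distribution p(x, y) over two binary variables,
# chosen only to illustrate the definitions.
p_xy = np.array([[0.3, 0.2],
                 [0.1, 0.4]])

p_x = p_xy.sum(axis=1)  # marginal p(x)
p_y = p_xy.sum(axis=0)  # marginal p(y)

def entropy(p):
    """Shannon entropy H(p) in bits, skipping zero-probability states."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# Mutual information via I(x; y) = H(x) + H(y) - H(x, y)
mi = entropy(p_x) + entropy(p_y) - entropy(p_xy.ravel())
print(entropy(p_x), mi)  # H(x) = 1 bit, I(x;y) ≈ 0.1245 bits
```

Note that I(x; y) is also the KL divergence between p(x, y) and p(x)p(y), so it is zero exactly when the two variables are independent.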

  3. Probability distributions

    A Binary and multinomial variables (FC)
    • Bernoulli, binomial and beta distributions
    • Multinomial and Dirichlet distributions
    B The Gaussian distribution (FC)
    • Slides (OCT 04, OCT 06, OCT 11, OCT 13, OCT 18, OCT 20 and OCT 25). Last updated on OCT 13.
    • Conditional and marginal Gaussians, Bayes' rule for Gaussians, maximum likelihood, sequential estimation, Bayesian inference, Student's t-distribution, mixtures of Gaussians (naive MLE, with latent variables and EM, k-means)
    C The exponential family (FC)
    • Maximum likelihood and sufficient statistics
    • Conjugate and non-informative priors
    D Non-parametric distributions (FC)
    • Slides (OCT 27). Last updated on NOV 02.
    • Kernel and nearest-neighbour density estimators

    Exercises (NOV 01 and NOV 03, F and A). Hand-in by NOV 23 (was 16) at 23:59:59 Fortaleza time
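As a taster for the non-parametric part, here is a minimal Parzen-window (Gaussian-kernel) density estimator in NumPy. The function name, bandwidth and toy data are illustrative assumptions, not course code.

```python
import numpy as np

def parzen_gaussian(x_query, data, h):
    """Kernel density estimate p(x) = (1/N) sum_n N(x | x_n, h^2)."""
    diffs = (np.asarray(x_query)[:, None] - np.asarray(data)[None, :]) / h
    kernels = np.exp(-0.5 * diffs ** 2) / (h * np.sqrt(2.0 * np.pi))
    return kernels.mean(axis=1)

# Toy data from a standard Gaussian; the bandwidth h is chosen by eye.
rng = np.random.default_rng(0)
data = rng.normal(size=500)
grid = np.linspace(-4.0, 4.0, 81)
p_hat = parzen_gaussian(grid, data, h=0.3)

dx = grid[1] - grid[0]
print(p_hat.sum() * dx)  # Riemann sum over the grid: close to 1
```

The bandwidth h plays the usual smoothing role: too small gives a spiky estimate, too large washes out structure; the nearest-neighbour estimator covered in class adapts this width to the local density instead.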

  4. Generalised linear models for regression

    A Linear basis function models (FC)
    • Slides (NOV 08, NOV 10 and NOV 17).
    • Maximum likelihood and least-squares, geometry of the least-squares
    • Sequential learning
    • Regularised least-squares
    • Multiple outputs
    B Bias-variance decomposition (FC)
    • Bias-variance decomposition
    C Bayesian linear regression (FC)
    • Slides (NOV 22 and NOV 24).
    • Parameter distribution and predictive distribution
    • The equivalent kernel and Gaussian processes for regression
    D Bayesian model comparison (FC)
    • Bayesian model comparison

    Exercises (DEC 01, F and A). Hand-in by DEC 13 (was DEC 11) at 23:59:59 Fortaleza time
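The parameter-distribution update from the Bayesian linear regression lectures can be sketched in a few lines of NumPy. The helper name, the prior precision alpha and the noise precision beta below are illustrative choices, not the values used in class.

```python
import numpy as np

def posterior_weights(Phi, t, alpha, beta):
    """Posterior N(w | m_N, S_N) for a linear basis-function model with
    prior p(w) = N(0, alpha^{-1} I) and noise precision beta:
        S_N^{-1} = alpha I + beta Phi^T Phi,   m_N = beta S_N Phi^T t
    """
    S_N_inv = alpha * np.eye(Phi.shape[1]) + beta * Phi.T @ Phi
    S_N = np.linalg.inv(S_N_inv)
    m_N = beta * S_N @ Phi.T @ t
    return m_N, S_N

# Toy data t = -0.3 + 0.5 x + Gaussian noise; bias plus identity basis.
rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, size=50)
t = -0.3 + 0.5 * x + rng.normal(scale=0.1, size=50)
Phi = np.column_stack([np.ones_like(x), x])

m_N, S_N = posterior_weights(Phi, t, alpha=2.0, beta=100.0)
print(m_N)  # posterior mean close to the true weights (-0.3, 0.5)
```

With alpha -> 0 the posterior mean reduces to the maximum-likelihood (least-squares) solution, and for finite alpha it coincides with regularised least-squares with regulariser alpha/beta, which ties this block back to topics 4A and 4B.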

  5. Generalised linear models for classification

    A Discriminant functions (FC)
    • Two- and multi-class classification
    • Least-squares for classification
    • Fisher's linear discriminant
    B Probabilistic generative models (FC)
    • Continuous inputs, maximum likelihood solution
    C Probabilistic discriminative models (FC)
    • Logistic regression and iterative re-weighted least-squares
    • Probit regression
    • Canonical link functions
    D Laplace approximation (FC)
    • Model comparison and BIC
    E Bayesian logistic regression (FC)
    • Laplace approximation, predictive distribution
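The iterative re-weighted least-squares (IRLS) fit of logistic regression, from topic 5C, can be sketched as below. This is a hedged NumPy illustration with made-up names and toy data; no regularisation is added, so nearly separable data can make the Hessian ill-conditioned.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def irls_logistic(Phi, t, n_iter=20):
    """Fit logistic regression by Newton-Raphson (IRLS):
        w <- w - (Phi^T R Phi)^{-1} Phi^T (y - t),
    with y = sigmoid(Phi w) and R = diag(y (1 - y)).
    """
    w = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        y = sigmoid(Phi @ w)
        R = y * (1.0 - y)                # weights of the re-weighted LS problem
        H = Phi.T @ (Phi * R[:, None])   # Hessian Phi^T R Phi
        w = w - np.linalg.solve(H, Phi.T @ (y - t))
    return w

# Toy 1-D problem: the label is the sign of x plus noise, so classes overlap.
rng = np.random.default_rng(2)
x = rng.normal(size=100)
t = (x + rng.normal(scale=0.5, size=100) > 0).astype(float)
Phi = np.column_stack([np.ones(100), x])
w = irls_logistic(Phi, t)
print(w)  # positive slope, intercept near zero
```

Each Newton step is exactly a weighted least-squares solve, which is why the algorithm carries the name it does; the same Hessian reappears in the Laplace approximation of topic 5E.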

Problem sets

As problem-set questions are drawn from books, papers and webpages, we expect you not to copy, refer to, or look at existing solutions when preparing your answers. We expect you to want to learn, not to google for answers: if you do use other material, it must be acknowledged clearly with a citation on the submitted solution.

The purpose of problem sets is to help you think about the material, not just to give us the right answers.

Homeworks must be done individually: each of you must hand in your own answers and, when requested, write your own code. It is acceptable, however, to collaborate in figuring out answers. We assume you take responsibility for making sure you personally understand the solution to any work arising from collaboration (and you must indicate on each homework with whom you collaborated).

To typeset assignments, students and teaching assistants are encouraged to use this LaTeX template: Source (PDF).

Assignments must be returned before the deadline via SIGAA (if you're in it, you'll get notified of the opening of a new task) or, if you're not in SIGAA, via email to one of the responsible teaching assistants - Delays will be penalised.



Course slides will suffice. The material in the slides can be complemented using the following textbooks (list not exhaustive):
  1. Machine Learning: A Probabilistic Perspective, by Kevin Murphy;
  2. The Elements of Statistical Learning (Book website), by Trevor Hastie, Robert Tibshirani and Jerome Friedman;
  3. Bayesian Reasoning and Machine Learning (Book website), by David Barber.
Copies of these books are floating around.

>>>>>> Course material is prone to a typo or two - Please inbox FC to report <<<<<<


Read me or watch me