CS6375: Machine Learning

3 Credit Course, JO 3.516, 2018

Fall 2018

This is a previous offering of this class. See the teaching page for current courses.

Course Overview

Class Hours: Tu/Th 4:00–5:15pm
Class Room: JO 3.516

Instructor: Gautam Kunapuli
Office: ECSS 2.717
Email: gautam-dot-kunapuli-@-utdallas.edu
Office Hours: Wed 12pm-1pm, and by appointment.

Teaching Assistant: Siwen Yan
Email: sxy170011-@-utdallas.edu
Office Hours: Fri 9:30am-11:30am, ECSS 2.104 A1

Course Description

The main aim of the course is to provide an introduction to, and a hands-on understanding of, a broad variety of machine-learning algorithms and their use in real applications. In addition to delving into the underlying mathematical and algorithmic details of many learning methods, we will also explore the practical aspects of applying machine learning to real-world data through programming assignments.

Prerequisites

The mandatory prerequisite is CS5343: Algorithm Analysis and Data Structures.

In addition, many concepts in this class require a comfortable grasp of basic probability theory, linear algebra, multivariate calculus and optimization. Garrett Thomas' Mathematics for Machine Learning is a superb review of the essential mathematical background: you can read it here.

The programming assignments will require coding in Python.
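To give a rough sense of the style of Python involved, here is a minimal sketch of ordinary least-squares linear regression (the first topic on the schedule) fit with numpy. The use of numpy and the toy data are illustrative assumptions only; each assignment will specify its own requirements.

    # A minimal sketch (not an assignment): ordinary least-squares linear regression in numpy.
    import numpy as np

    # Toy synthetic data: y = 3x + 2 plus a little Gaussian noise.
    rng = np.random.default_rng(0)
    X = rng.uniform(-1.0, 1.0, size=(100, 1))
    y = 3.0 * X[:, 0] + 2.0 + 0.1 * rng.standard_normal(100)

    # Append a bias column and solve the least-squares problem min_w ||Xb w - y||^2.
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    w, *_ = np.linalg.lstsq(Xb, y, rcond=None)

    print("slope, intercept:", w)                      # should be close to [3, 2]
    print("training MSE:", np.mean((Xb @ w - y) ** 2))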

Textbooks and Course Materials

There is no required textbook for this class. However, the following textbooks are useful references for various topics we will cover in this course:

  • Pattern Recognition and Machine Learning by Christopher M. Bishop; this is a standard textbook and reference for introductory machine learning and covers a large part of our syllabus;
  • Machine Learning: a Probabilistic Perspective by Kevin Murphy; another excellent book and reference, especially for probabilistic graphical models.

The following books are available online, free for personal use. Supplemental reading material will be assigned from these sources as often as possible.

  • The Elements of Statistical Learning: Data Mining, Inference, and Prediction by Trevor Hastie, Robert Tibshirani and Jerome Friedman (available online)
  • Introduction to Data Mining by Pang-Ning Tan, Michael Steinbach, Anuj Karpatne and Vipin Kumar (available online)
  • Bayesian Reasoning and Machine Learning by David Barber (available online)
  • Understanding Machine Learning: From Theory to Algorithms by Shai Shalev-Shwartz and Shai Ben-David (available online) introduces machine learning from a theoretical perspective.
  • Deep Learning by Ian Goodfellow, Yoshua Bengio and Aaron Courville (available online) is an excellent introductory textbook for a wide variety of deep learning methods and applications.
  • Reinforcement Learning: An Introduction by Richard S. Sutton and Andrew G. Barto (available online) is the de facto textbook and reference for reinforcement learning.

Syllabus and Schedule

The topic schedule is subject to change at the instructor’s discretion. Please check this page regularly for lecture slides, additional references and reading materials.

Week | Date | Topic | Readings | Notes
1 | Aug 21 (Tu) | Introduction & Linear Regression | Bishop, Ch. 1 |
  | Aug 23 (Th) | Linear Regression (continued) | Andrew Ng's Lecture Notes, Part I; Shalev-Shwartz & Ben-David, Ch. 9.2; Kilian Weinberger's Lecture Notes (probabilistic view) |
2 | Aug 28 (Tu) | Perceptron | Shalev-Shwartz & Ben-David, Ch. 9.1; Kilian Weinberger's Lecture Notes |
  | Aug 30 (Th) | Decision Trees | Sriraam Natarajan's Primer on Uncertainty; Mitchell, Ch. 3; Kilian Weinberger's Lecture Notes | HW1 out
3 | Sep 04 (Tu) | Decision Trees (continued) | |
  | Sep 06 (Th) | Nearest Neighbor Methods | Bishop, Ch. 14.4; Daumé III, Ch. 3 |
4 | Sep 11 (Tu) | Support Vector Machines | Andrew Ng's Lecture Notes; Bishop, Ch. 7; Barber, Ch. 17.5; Shalev-Shwartz & Ben-David, Ch. 15 | HW1 due
  | Sep 13 (Th) | Support Vector Machines (continued) | | PA1 out
5 | Sep 18 (Tu) | Machine Learning Theory: VC dimension, PAC bounds, bias-variance tradeoff | Andrew Ng's Lecture Notes; Shalev-Shwartz & Ben-David, Ch. 3-6 |
  | Sep 20 (Th) | Machine Learning Practice: pre-processing, model selection, cross-validation, missing data, evaluation | Kotsiantis et al., 2006 |
6 | Sep 25 (Tu) | Naive Bayes | Mitchell 2nd ed., Ch. 3.1-3.2; Daumé III, Ch. 9 |
  | Sep 27 (Th) | Logistic Regression | Mitchell 2nd ed., Ch. 3.3-3.5; Bishop, Ch. 8.4.1, 9.2, 9.3, 9.4; Andrew Ng's Lecture Notes, Part II; Kilian Weinberger's Lecture Notes |
7 | Oct 02 (Tu) | Logistic Regression (continued) | | PA1 due
  | Oct 04 (Th) | Mid-Term Prep | |
8 | Oct 09 (Tu) | In-class Mid-Term Exam | |
  | Oct 11 (Th) | Ensemble Methods: Bagging | Bishop, Ch. 14; Hastie et al., Ch. 7.1-7.6, 8.7; Visualization of the Bias-Variance Tradeoff |
9 | Oct 16 (Tu) | Ensemble Methods: Boosting | Hastie et al., Ch. 15; Freund and Schapire, 1999 | PA2 out
  | Oct 18 (Th) | Ensemble Methods: Gradient Boosting | Friedman, 1999; Mason et al., 1999; Visualizing Gradient Boosting; Tong He's Presentation on XGBoost |
10 | Oct 23 (Tu) | Principal Components Analysis | Andrew Ng's Lecture Notes, Part II |
  | Oct 25 (Th) | Clustering | Tan et al., Ch. 8 |
11 | Oct 30 (Tu) | Clustering (continued) | | PA2 due; HW2 out
  | Nov 01 (Th) | Neural Networks | Goodfellow et al., Ch. 6 |
12 | Nov 06 (Tu) | Neural Networks (continued) | |
  | Nov 08 (Th) | Convolutional Neural Networks | Goodfellow et al., Ch. 9 |
13 | Nov 13 (Tu) | Recurrent Neural Networks | Goodfellow et al., Ch. 10 | HW2 due
  | Nov 15 (Th) | Practical Deep Learning | Goodfellow et al., Ch. 7, 8 | PA3 out
14 | Nov 20 (Tu) | Fall/Thanksgiving Break | |
  | Nov 22 (Th) | Fall/Thanksgiving Break | |
15 | Nov 27 (Tu) | Reinforcement Learning | Sutton and Barto, Ch. 1, 3; Andrew Ng's Lecture Notes | Slides have been updated.
  | Nov 29 (Th) | Reinforcement Learning (continued) | |
16 | Dec 04 (Tu) | Final Exam Prep | | PA3 due
  | Dec 06 (Th) | | |
  | Dec 13 (Th) | Final Exam (JO 3.516, 5:00pm-7:45pm) | |

Grading

  • 20%, Homework problem sets (2, each 10%)
  • 30%, Programming assignments (3, each 10%)
  • 20%, Mid-term Exam
  • 30%, Final Exam
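
The overall score is the weighted sum of these components. As a quick worked example (the scores below are hypothetical and imply nothing about curving or letter-grade cutoffs), the weighting can be computed as:

    # Hypothetical worked example of the grade weighting above; the scores are made up.
    homeworks = [85, 90]           # 2 homework sets, 10% each
    programming = [80, 95, 88]     # 3 programming assignments, 10% each
    midterm = 78                   # 20%
    final = 84                     # 30%

    overall = (0.10 * sum(homeworks)
               + 0.10 * sum(programming)
               + 0.20 * midterm
               + 0.30 * final)
    print(round(overall, 1))       # 84.6 for these hypothetical scores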

Course Policies

Attendance Policy

Classroom attendance for all lectures is mandatory. Prolonged absence from the lectures may lead to substantial grade penalties:

  • two consecutive absences: no penalty;
  • three consecutive absences: one letter grade drop;
  • four consecutive absences: F grade.

Absence due to emergency or extenuating circumstances can be excused, but proof may be required.

Homework Policy

Homework assignments are due at the start of class on the due date; exceptions require the instructor's permission in advance. Homework and assignment deadlines will not be extended except under extreme university-wide circumstances, such as weather emergencies.

All homework problem sets, programming projects, and take-home exams (if any) must be written up and completed individually. You may discuss, collaborate, brainstorm, and strategize ideas, concepts, and problems with other students. However, all written solutions and code must be your own. Copying another student's work or allowing other students to copy your work is academically dishonest.

Academic Integrity

All students are responsible for adhering to UT Dallas Community Standards and Conduct, particularly regarding Academic Integrity and Academic Dishonesty. Any academic dishonesty, including but not restricted to plagiarism (including from internet sources), collusion, cheating, or fabrication, will result in a zero score on the assignment/project/exam and possible disciplinary action.

Students with Disabilities

UT Dallas is committed to equal access in all endeavors for students with disabilities. The Office of Student AccessAbility (OSA) provides academic accommodations for eligible students with a documented disability. Accommodations for each student are determined by OSA on an individual basis, with input from qualified professionals. Accommodations are intended to level the playing field for students with disabilities, while maintaining the academic integrity and standards set by the University. If you think you qualify for an academic accommodation, please visit OSA to determine eligibility.

If you have already received academic accommodation, please contact me by e-mail to schedule an appointment before classes start, if possible.