This is a research-oriented advanced class that focuses on the latest frontier of computer vision. It covers computer vision algorithms that make sense of photographs, video, and other imagery. Applications include robotics, content creation, entertainment, medical image analysis, smart homes, security, and human-computer interaction (HCI), among many others. Throughout the course, students will consolidate and practice their knowledge and skills through open in-class discussions, and will gain in-depth experience with a particular research topic through a final project.
Students should have taken the following courses or their equivalents: Convex Optimization (381K-18) and Probability & Stochastic Process I (381J).
Prior exposure to any of the following courses is helpful but not required: Digital Video (381K-16), Statistical Machine Learning (381V), Data Mining (381L-10), or Cross-Layer Machine Learning HW/SW Design (382V).
Coding experience with Python is required and assumed. Prior knowledge of C/C++, MATLAB, or TensorFlow is very helpful but not required.
This course does not follow any textbook closely. Among many recommended readings are:
Grading will be based on class participation (10%), three in-class quizzes (10% each, 30% in total), and one final project (60%, split as proposal 15%, presentation 15%, final report 15%, and code review 15%). There will be no final exam.
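As a rough illustration of how these weights add up, the short Python sketch below computes a weighted course score. It assumes that every component is scored on a 0-100 scale and that components are combined linearly with no curve or rounding; the syllabus states neither, and the component names and sample scores are hypothetical.

```python
# Weights from the grading policy above, in percent of the course grade.
# The component names are illustrative labels, not official ones.
WEIGHTS = {
    "participation": 10,
    "quiz_1": 10,
    "quiz_2": 10,
    "quiz_3": 10,
    "project_proposal": 15,
    "project_presentation": 15,
    "project_report": 15,
    "project_code_review": 15,
}


def course_score(scores):
    """Return the weighted course score on a 0-100 scale.

    `scores` maps every component name in WEIGHTS to a score in [0, 100].
    Assumption: a plain weighted sum, with no curve or rounding.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical scores, for illustration only.
example = {
    "participation": 100,
    "quiz_1": 80,
    "quiz_2": 90,
    "quiz_3": 70,
    "project_proposal": 90,
    "project_presentation": 85,
    "project_report": 95,
    "project_code_review": 80,
}

print(course_score(example))  # prints 86.5
```

Under these assumptions the final project dominates the result: the four project components together account for 60 of the 100 points, while each quiz accounts for 10.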
Many materials in this course are adapted from existing teaching and tutorial slides created by colleagues at CMU, UIUC, UC Berkeley, GaTech, UVa, Microsoft, DeepMind, NVIDIA, and more. The instructor is grateful to them for generously sharing those materials publicly.