Spring 24 - Introduction to Computer Vision

Course title
EE 379K: Introduction to Computer Vision

Term
Spring 2024

Meeting times and location
TR 2:00-3:30pm (EER 1.518)

After-class platform
Slack (link sent to registered students)

Video recording
Available on Canvas

Course Description and Prerequisites

Computer vision (CV) is the discipline of “teaching machines how to see”: it makes sense of photographs, video, and other imagery. Applications include analysis of medical images, automated quality inspection, entertainment, vehicle safety, security, and HCI, among many others. This course offers a gentle introduction to computer vision, including image formation, camera imaging geometry, feature detection and matching, stereo, motion estimation and tracking, image classification and scene understanding. Both classical and the latest deep learning approaches will be covered.

Students will consolidate and practice their knowledge and skills through homework and a midterm exam. They will also gain in-depth experience with a particular topic through a final project. There will be no final exam.

Students should have taken the following courses or their equivalents: Algorithms (EE 360C or CS 314/314H), Linear Systems and Signals (EE 313 or BME 343), and Probability and Random Processes (EE 351K or BME 335 or MATH 362K). Solid knowledge of linear algebra will be instrumental to this course.

Coding experience with Python is assumed. Previous knowledge of C/C++, MATLAB, or PyTorch/TensorFlow is very helpful but not necessary.

Instructor Information

Name
Dr. Zhangyang (Atlas) Wang

Telephone number
512-471-1866

Email address

Office hour time
Wednesday 1:00pm - 2:00pm

Office hour location
EER 6.886 (instructor office)

TA Information

TA 1 Name

Email address

Office hour time
Monday 4:00pm - 5:00pm

Office hour location
EER 3.854

TA 2 Name

Email address

Office hour time
Friday 4:00pm - 5:00pm

Office hour location
EER 3.854

Textbook and/or Resource Material

This course does not follow any textbook closely. Among many recommended readings are:

Grading Policies

Grading is based on homework (20%, across 4 assignments), one midterm exam (30%), and one final project (50%: proposal 10% + mid-report 10% + presentation 5% + final report 15% + code review 10%).

  • Bonus (+2%): projects in novel, interdisciplinary domains (for example, 5G/6G telecommunication, brain-computer interfaces, economics and markets, or COVID-19), as judged by the instructor.
  • For late submissions, each additional late day incurs a 10% penalty.
  • Requests for re-grading an assignment must be made in writing within one (1) week of the graded assignment being made available to the class.
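
As an informal illustration of how the weights above combine, the following Python sketch computes a course percentage. It is not an official grade calculator. The weights come from the policy above; everything else is an assumption made for illustration: the function and constant names are hypothetical, the four homework assignments are assumed to count equally, the late penalty is assumed to remove 10% of the earned score per late day (never dropping below zero), and the +2% bonus is assumed to be added to the overall course total.

```python
# Weights taken from the grading policy (as fractions of the course grade).
HOMEWORK_WEIGHT = 0.20
MIDTERM_WEIGHT = 0.30
PROJECT_WEIGHTS = {
    "proposal": 0.10,
    "mid_report": 0.10,
    "presentation": 0.05,
    "final_report": 0.15,
    "code_review": 0.10,
}
LATE_PENALTY_PER_DAY = 0.10  # policy: 10% per late day
NOVELTY_BONUS = 0.02         # policy: +2% for novel, interdisciplinary projects


def apply_late_penalty(score, days_late):
    """ASSUMPTION: each late day removes 10% of the earned score, floored at 0."""
    factor = max(0.0, 1.0 - LATE_PENALTY_PER_DAY * days_late)
    return score * factor


def course_grade(homework_scores, midterm, project_scores, novelty_bonus=False):
    """Combine component scores (fractions in [0, 1]) into a course percentage.

    ASSUMPTIONS: homework assignments are weighted equally, and the bonus
    is added directly to the overall total.
    """
    homework_avg = sum(homework_scores) / len(homework_scores)
    total = HOMEWORK_WEIGHT * homework_avg + MIDTERM_WEIGHT * midterm
    total += sum(PROJECT_WEIGHTS[name] * project_scores[name]
                 for name in PROJECT_WEIGHTS)
    if novelty_bonus:
        total += NOVELTY_BONUS
    return round(100 * total, 2)


if __name__ == "__main__":
    # Hypothetical student: third homework submitted one day late.
    homework = [0.90, 0.80, apply_late_penalty(1.00, days_late=1), 0.85]
    project = {"proposal": 0.9, "mid_report": 0.8, "presentation": 1.0,
               "final_report": 0.85, "code_review": 0.9}
    print(course_grade(homework, midterm=0.75, project_scores=project))
```

The syllabus does not state how the late penalty or the bonus is applied in practice, so treat the two assumptions above as placeholders and confirm the actual rules with the instructor.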

Course Topics

1/16 Tuesday
Class Logistics and Fundamental Vision Theory [Slides 1/16]
(Extended Materials: MIT lecture on "Marr’s Levels of Analysis")


1/18 Thursday
Image Representation (1): From Our Brain to the Digital World

1/23 Tuesday
Image Representation (2): Gaussian and Laplacian Image Pyramids

1/25 Thursday
Image Representation (3): Taking A Frequency Domain View [Slides 1/18 + 1/23 + 1/25]
(Extended Materials: Review of Sampling, Aliasing, and Fourier Analysis Methods)


1/30 Tuesday
Image Filtering (1): Pointwise, Convolution, and Beyond [Slides 1/30]

2/01 Thursday
Image Filtering (2): Edge Detection, from Sobel to Canny [Slides 2/01]

2/06 Tuesday
TA Lecture: Q&A on Course Projects & Cracking the Coding! [Slides 2/06] [Jupyter Notebook]

2/08 Thursday
Cross-Image Matching (1): Detecting Key Points

2/13 Tuesday
Cross-Image Matching (2): Extracting Feature Descriptors from Key Points

2/15 Thursday
Cross-Image Matching (3): Robust Matching of Descriptors [Slides 2/08 + 2/13 + 2/15]
(Extended Materials i: Review of Linear Algebra, especially EVD, SVD and PCA)
(Extended Materials ii: Geometric Interpretation of PCA)


2/20 Tuesday
Mapping 3D World to Image (1): Pinhole and Lens Cameras

2/22 Thursday
Mapping 3D World to Image (2): Developing the Pinhole Camera Model

2/27 Tuesday
Mapping 3D World to Image (3): Geometric Camera Calibration [Slides 2/20 + 2/22 + 2/27]
(Extended Materials i: Solving Least Squares using SVD)
(Extended Materials ii: Geometric Camera Calibration in Action: An OpenCV Example)


2/29 Thursday
Stereo Vision (1): Two-Camera Models, and Triangulation

3/05 Tuesday
Stereo Vision (2): Epipolar Geometry

3/07 Thursday
Stereo Vision (3): Essential and Fundamental Matrices

3/12 Tuesday
- No Class (Spring Break) -

3/14 Thursday
- No Class (Spring Break) -

3/19 Tuesday
Stereo Vision (4): Depth Estimation [Slides 2/29 + 3/05 + 3/07 + 3/19]

3/21 Thursday
Midterm Exam (in class)

3/26 Tuesday
Video and Optical Flow (1)

3/28 Thursday
Video and Optical Flow (2) [Slides 3/26 + 3/28]

4/02 Tuesday
Classical Machine Learning [Slides 4/02]

4/04 Thursday
Image Classification: Bag-of-Words [Slides 4/04]

4/09 Tuesday
Object Detection and Segmentation (1)

4/11 Thursday
Object Detection and Segmentation (2) [Slides 4/09 + 4/11]
(Extended Materials: The Viola-Jones Algorithm Explained in Detail)


4/16 Tuesday
Deep Learning in Computer Vision (1)

4/18 Thursday
Deep Learning in Computer Vision (2)

4/23 Tuesday
Deep Learning in Computer Vision (3)

4/25 Thursday
Deep Learning in Computer Vision (4) [Slides 4/16 + 4/18 + 4/23 + 4/25]

Acknowledgement

Many materials in this course are adapted from existing teaching or tutorial slides created by colleagues at CMU, Stanford, UIUC, UC Berkeley, GaTech, Brown, and more. The instructor owes them many thanks for their generosity in sharing those materials publicly.