BMI / CS 771 Learning Based Methods for Computer Vision

Fall 2022, MW 2:30PM - 3:45PM, 2535 Engineering Hall
Instructor: Yin Li

TA: Abrar Majeedi

Computer Vision, art by kirkh.deviantart.com

Course Description

The course addresses the problems of representation and reasoning for large amounts of visual data, including images and videos, medical imaging data, and their associated tags or text descriptions. We will introduce deep learning in the context of computer vision, and cover topics on visual recognition using deep models, such as image classification, object detection, human pose estimation, action recognition, 3D understanding, and medical image analysis. The course emphasizes the design of vision and learning algorithms and models, as well as their practical implementations.

Discussion Group

We will use Piazza. You are encouraged to post your questions on the discussion board so that others can learn from your questions.

Prerequisites

Students are strongly encouraged to have knowledge of computer vision or machine learning (such as CS 540) or medical image analysis (such as BMI/CS 567). In addition, the following skills are necessary for this class:

Programming: Students should have basic proficiency in programming (Python). Projects are to be completed and graded in Python. The TA will answer questions about Python.
Math: Linear algebra, vector calculus, and probability theory.

Grading

Your final grade will be made up of:

48% 4 homework assignments (mini-projects) that involve programming
40% 1 course project with several milestones
12% Single-page course write-up
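As a rough illustration of how the weights above combine, here is a minimal sketch. It assumes each component is scored on a 0-100 scale; the function name and the example scores are hypothetical, and how the individual homework assignments are combined into one homework score is not specified here.

```python
# Weights (in percent) taken from the grading breakdown above.
WEIGHTS = {"homework": 48, "project": 40, "writeup": 12}


def final_grade(homework: float, project: float, writeup: float) -> float:
    """Weighted final grade, assuming each component score is on a 0-100 scale."""
    scores = {"homework": homework, "project": project, "writeup": writeup}
    # Multiply each score by its percentage weight, then divide by 100.
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


if __name__ == "__main__":
    # Hypothetical component scores, for illustration only.
    print(final_grade(homework=90, project=80, writeup=100))
```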

Most of the assignments and projects are team based. We do not allow late homework assignments or projects. However, you have three "late days" for the whole course. A submission made within the first 24 hours after the due date and time uses one late day, a submission made within 48 hours uses two, and a submission made within 72 hours uses all three.

These late days are intended to cover unexpected clustering of due dates, travel commitments, interviews, hackathons, etc. Don't ask for extensions to due dates because we are already giving you a pool of late days to manage yourself.
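The late-day accounting described above can be sketched as follows. This is an illustrative reading of the policy, not an official grading tool: the function name and the due date in the example are hypothetical, and it assumes that a submission exactly 24 hours late still counts as one late day.

```python
import math
from datetime import datetime, timedelta

LATE_DAY_POOL = 3  # total late days available for the whole course


def late_days_used(due: datetime, submitted: datetime) -> int:
    """Number of late days one submission consumes (0 if on time)."""
    late = submitted - due
    if late <= timedelta(0):
        return 0
    # Every started 24-hour block after the deadline costs one late day.
    return math.ceil(late / timedelta(days=1))


if __name__ == "__main__":
    due = datetime(2022, 9, 28, 23, 59)      # hypothetical due date and time
    submitted = due + timedelta(hours=30)    # 30 hours late
    used = late_days_used(due, submitted)
    print(used, "late day(s) used,", LATE_DAY_POOL - used, "remaining")
```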

Homework Assignments

The course will consist of 4 homework assignments. All assignments except the first one are team based. Teams of 2-3 students are preferred. In your submission, please clearly identify the contribution of all the team members. Please note that members in the same group might not necessarily get the same grade.

Please post all of your questions on Piazza so that others may learn from your questions as well. Do not email the professor or TA directly with homework questions.

All homework assignments are to be submitted by midnight on the due date. All files should be included in a zip file named hwX_yourNetID.zip (where X is the homework number) and uploaded to Canvas. Late submissions should be emailed to the TA, with the instructor cc'd. Please attach the zip file to your email.
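One possible way to package a submission with the required file name is sketched below using only the Python standard library. The helper name, the folder name, and the NetID in the example are hypothetical; any tool that produces a correctly named zip file works equally well.

```python
import zipfile
from pathlib import Path


def make_submission_zip(src_dir, hw_number, net_id, out_dir="."):
    """Zip every file under src_dir into hw<hw_number>_<net_id>.zip."""
    src = Path(src_dir)
    out_path = Path(out_dir) / f"hw{hw_number}_{net_id}.zip"
    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(src.rglob("*")):
            # Skip directories and the archive itself (if it lives inside src_dir).
            if not path.is_file() or path.resolve() == out_path.resolve():
                continue
            # Store paths relative to src_dir so the archive unpacks cleanly.
            zf.write(path, arcname=path.relative_to(src).as_posix())
    return out_path


if __name__ == "__main__":
    # Hypothetical folder and NetID, for illustration only.
    if Path("my_hw1_folder").is_dir():
        print(make_submission_zip("my_hw1_folder", 1, "bbadger"))
```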

All starter code and assignments will be in Python and will use various third-party libraries. We will make an effort to support macOS, Windows, and Linux. The course includes a quick Python tutorial (optional) and assumes you have enough familiarity with procedural and object-oriented programming languages to complete the projects.

Projects

The final project is research-oriented. It can be a pure computer vision project or an application of existing vision methods in the student's own research area. You are expected to implement one (or more) related research papers, or to develop novel ideas of your own and implement them using the techniques discussed in class. Students are encouraged to propose their own project topics. You should work on the project in groups of 2-3. In your submission, please clearly identify the contribution of each group member.

There will be four checkpoints for the final project: a project proposal, an intermediate milestone report, a final project report and a project presentation. The details are listed below.

Project Proposal (5%): This will be a single-page document. You will explain what problem you are trying to solve, why you want to solve it, and what the possible steps toward a solution are.
Project Mid-Term Report (5%): This will be a single-page brief summary of current progress, including your current results, the difficulties that arise during the implementation, and how your proposal may have changed in light of current progress.
Project Final Report (15%): The final report will be a four-page document. You will describe the motivation of the project, the previous literature, your method, and the results. You can reuse materials presented in your proposal / mid-term report. Please include your source code in the submission.
Project Presentation (15%, in class): Each team will be allocated a 15-min slot in class. This slot includes a 12-min presentation and a 3-min Q&A session.

Course Write-up

The course write-up will be a document that captures your reflections on the coursework, e.g., what you have learned and what you found most interesting in the course. The write-up must be completed individually. It can be submitted as a PDF file or a link to a webpage.

Academic Integrity

This course follows the University of Wisconsin-Madison Code of Academic Integrity. Unless specifically authorized by the instructor, all coursework is to be done by the student working alone. Violations of the rules will not be tolerated.

You are permitted and encouraged to discuss ideas with other students. However, you are expected to implement the core components of each assignment / project on your own / with your team. You should not view or edit anyone else's code. You should not post code to Piazza, except for starter code / helper code that isn't related to the core project.

Contact Info and Office Hours

If possible, please use Piazza to ask questions and seek clarifications before emailing the instructor or TA.

Yin: yin[dot]li[at]wisc[dot]edu
Abrar: majeedi[at]wisc[dot]edu

Office Hours

Yin, 1pm - 3pm Thursday (in-person at MSC 6730 or over Zoom)
Abrar, 11am - 1pm Thursday (in-person at MSC 6729 or over Zoom)

Syllabus

Class Date	Topic	Slide	Reading	Assignment
Computer Vision Meets Machine Learning
Wed, Sep 7	Introduction to Visual Recognition	See Canvas		Sign up for Piazza
Mon, Sep 12	Data Driven Paradigm for Computer Vision		Paper 1, 2
Wed, Sep 14	Image Processing using Python (Tutorial led by TA)			Homework 1 out
Deep Learning
Mon, Sep 19	Introduction to Neural Networks		Ch 6, Deep Learning
Wed, Sep 21	Convolutional Neural Networks: Theory		Ch 9, Deep Learning
Mon, Sep 26	Convolutional Neural Networks: Practice		Paper 1, 2, 3
Wed, Sep 28	PyTorch and Cloud Computing (Tutorial led by TA)			Homework 1 due
Mon, Oct 3	Recurrent Neural Networks and Transformers: Part I		Ch 10, Deep Learning
Wed, Oct 5	Recurrent Neural Networks and Transformers: Part II		Paper 1, 2
Mon, Oct 10	Advanced Training		Ch 8, Deep Learning
Visual Recognition
Wed, Oct 12	Image Classification and Adversarial Samples		Paper 1, 2	Project Proposal due
Mon, Oct 17	Object Detection & Instance Segmentation: Part I		Paper 1, 2	Homework 2 out
Wed, Oct 19	Object Detection & Instance Segmentation: Part II		Paper 1, 2
Mon, Oct 24	Semantic Segmentation and Dense Image Labeling		Paper 1, 2
Wed, Oct 26	Human Pose Estimation		Paper 1, 2
Mon, Oct 31	Beyond Classification: Vision & Language		Paper 1, 2, 3
Wed, Nov 2	Action Recognition and Video Understanding: Part I		Paper 1, 2	Homework 2 due
Mon, Nov 7	Action Recognition and Video Understanding: Part II		Paper 1, 2
Wed, Nov 9	3D Scene Understanding: Part I		Paper 1, 2, 3
Mon, Nov 14	3D Scene Understanding: Part II		Paper 1, 2, 3
Wed, Nov 16	Mid-term Project Presentation
Mon, Nov 21	Mid-term Project Presentation			Mid-term report due
Wed, Nov 23	No Class; Happy Thanksgiving!
Mon, Nov 28	Generating Images and Beyond		Paper 1, 2
Wed, Nov 30	Self-supervised Visual Representation Learning		Paper 1, 2	Homework 3 out
Mon, Dec 5	Medical Image Analysis: Part I		Paper
Wed, Dec 7	Medical Image Analysis: Part II		Paper
Project Presentations
Mon, Dec 12	Project Presentations
Wed, Dec 14	Project Presentations and Course Wrap-up			Homework 3 due; Project report due; Course write-up due
	Final Exam Period - not used

Acknowledgments

The materials for this class rely significantly on slides prepared by other instructors. In particular, many slides are modified from those of Abhinav Gupta, Svetlana Lazebnik, and Alexei A. Efros, who in turn use materials from many people. Each slide set contains acknowledgments. Feel free to use these slides for academic or research purposes, but please maintain all acknowledgments.