COMPUTER SCIENCE 426 - Computer Vision

Larry Davis 
A. V. Williams 2113
lsd@umiacs.umd.edu
405-6718
Office hours:  Monday, Wednesday 2:00-3:00
See new project 1 deadline below  


The Computer Vision home page on the Internet provides image archives, papers, and research descriptions.

Teaching Assistant: Yanqing Zeng (yzeng@cs.umd.edu)

Online tutorials

One good set of online tutorial notes contains segments on perspective image formation, motion, and related topics.

Class FAQ.

SYLLABUS

Introduction - January 28; February 2, 4

Binary vision systems - February 4, 9, 16, 18

Grey scale vision systems - February 23; March 2, 4, 9

3-D Vision - March 9, 11, 16, 18; April 6, 8, 13

Motion and Navigation - April 15, 20, 22, 27

Color - April 29; May 4

Projects

  1. First programming project (current)

  2. Second programming project
    1. PowerPoint presentation
    2. Some notes on the project
    3. Final presentation on project


Course requirements

  1. Tests - There will be three exams during the semester; each is worth 20% of your grade
    1. Test 1 is scheduled for February 25 and will cover the material up to February 18
    2. Test 2 is scheduled for April 1 and will cover the material through March 25
    3. Test 3 will be the last day of class and will cover the remaining topics
  2. Projects - There will be two projects worth a total of 40% of your grade.


Things for you to think about for Test 1

  1. Example test
  2. The first exam will cover the material presented in class through the lecture of February 18.
  3. There are many definitions of terms in the notes and book. You should know the definitions of all technical terms introduced in the lectures. They are listed below, with links to the definitions.
  4. Things to make sure you know for test 1:
    1. Applying the lensmaker's equation to determine the distance behind the lens at which a point will be brought into focus
    2. How to prove that normalized central moments are invariant to scale changes of sets of pixels
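The two computations above can be checked numerically. The sketch below (illustrative code, not from the course notes) solves the thin-lens form of the lensmaker's equation for the image distance, and verifies on a synthetic binary pixel set that normalized central moments barely change under an s-fold zoom; the function and variable names are my own.

```python
def focus_distance(f, d_object):
    """Thin-lens relation 1/f = 1/d_object + 1/d_image,
    solved for the image distance behind the lens."""
    return 1.0 / (1.0 / f - 1.0 / d_object)

def central_moment(pixels, p, q):
    """Central moment mu_pq of a set of (x, y) pixel coordinates."""
    n = len(pixels)
    xbar = sum(x for x, _ in pixels) / n
    ybar = sum(y for _, y in pixels) / n
    return sum((x - xbar) ** p * (y - ybar) ** q for x, y in pixels)

def normalized_moment(pixels, p, q):
    # eta_pq = mu_pq / mu_00^((p+q)/2 + 1); for a binary pixel set,
    # mu_00 is simply the number of pixels
    return central_moment(pixels, p, q) / len(pixels) ** ((p + q) / 2 + 1)

def scale_set(pixels, s):
    # replace each pixel with an s x s block: an s-fold zoom of the set
    return [(s * x + dx, s * y + dy)
            for x, y in pixels for dx in range(s) for dy in range(s)]

shape = [(x, y) for x in range(20) for y in range(30)]  # a 20x30 rectangle
zoomed = scale_set(shape, 3)                            # same shape, 3x larger

for p, q in [(2, 0), (0, 2), (2, 2)]:
    print((p, q), normalized_moment(shape, p, q), normalized_moment(zoomed, p, q))
```

The discrete check agrees only up to quantization error, which shrinks as the region grows; the exam question asks for the exact proof, obtained by substituting scaled coordinates into the moment sums.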

Things for you to think about for Test 2

  1. In constructing an image pyramid, we replace each 2x2 nonoverlapping neighborhood of pixels at level i with a single pixel at level (i-1). If the original image is 2^n x 2^n, then how many pixels are there, in total, in an image pyramid? Be able to prove your result using induction.
  2. In coarse-to-fine template matching we correlate a reduced resolution template against a reduced resolution image, and at those locations where the correlation is sufficiently high we correlate the full resolution template against a window of the full resolution image. Suppose that you have a 1000x1000 full resolution image and a 25x25 full resolution template. Brute force template matching requires 1000^2 x 25^2 operations. Suppose that we reduce the image and template to 200x200 and 5x5, respectively, and also suppose that when the correlation of the 5x5 template with the 200x200 image is sufficiently high, we correlate the 25x25 template against the 5x5 window of the full resolution image centered at the high correlation position. At what point does this two-stage matching algorithm become less efficient than the brute force algorithm? That is, how many pixels in the 200x200 image can have above-threshold correlations with the 5x5 reduced template before we do more work using the two-stage method?
  3. Perspective imaging - know how 3-D points project into 2-D images, what vanishing points are and why parallel lines give rise to vanishing points in images, how stereo images are formed, and what conjugate points and lines are.
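The pyramid and coarse-to-fine questions above can be sanity-checked numerically. The sketch below is one possible reading of the operation counts (each above-threshold coarse position is re-examined at a 5x5 block of full-resolution positions, each costing a full 25x25 correlation); it is an illustration, not the official solution, and all function names are my own.

```python
def pyramid_pixels(n):
    """Total pixels in a pyramid built from a 2^n x 2^n image,
    halving the side length at each level down to a single pixel."""
    return sum(4 ** i for i in range(n + 1))

# closed form to aim for in the induction proof: (4^(n+1) - 1) / 3
assert all(pyramid_pixels(n) == (4 ** (n + 1) - 1) // 3 for n in range(10))

def brute_force_ops(image=1000, template=25):
    # one multiply-add per template pixel per image position
    return image ** 2 * template ** 2

def two_stage_ops(hits, image=1000, template=25, factor=5):
    coarse = (image // factor) ** 2 * (template // factor) ** 2
    # each coarse hit expands to factor x factor full-resolution candidate
    # positions, each checked with the full template
    fine = hits * factor ** 2 * template ** 2
    return coarse + fine

# smallest hit count at which two-stage stops being cheaper
k = 0
while two_stage_ops(k) < brute_force_ops():
    k += 1
print(k)
```

Under this reading the breakeven comes at 39,936 coarse hits, i.e. nearly every position in the 200x200 reduced image must fire before the two-stage method does more work than brute force.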

Things for you to think about for Test 3

  1. Example test
  2. Motion - understand the geometry of time-varying images (especially what the focus of expansion is, and how optical flow relates to 3-D motion); algorithms for estimating optical flow, especially the gradient-based technique and the related aperture problem. Understand basic techniques for overcoming the constraints imposed by the aperture problem.
  3. 3-D object recognition - how polyhedral objects are represented, what the problem of pose estimation is, and how it can be solved using n-point perspective models, Hough transforms, and geometric hashing.
  4. Color - how illumination, surface reflectance, and the spectral sensitivity of receptors combine. Basic ideas on how one can recover estimates of illumination functions and reflectance functions from images.
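As a concrete illustration of the gradient-based technique mentioned above, here is a minimal sketch (my own, not from the lectures) that recovers a small synthetic translation by solving the brightness-constancy constraint Ix*u + Iy*v + It = 0 in least squares over a window. The pattern has gradients in both directions, so within the window the aperture problem does not arise; with a pattern varying in only one direction, the 2x2 system below would become singular.

```python
import math

# synthetic brightness pattern with gradients in both x and y
def f(x, y):
    return math.sin(0.3 * x) + math.cos(0.2 * y)

u_true, v_true = 0.4, -0.25            # small sub-pixel translation

def frame1(x, y): return f(x, y)
def frame2(x, y): return f(x - u_true, y - v_true)

# accumulate the least-squares normal equations of
# Ix*u + Iy*v + It = 0 over a 20x20 window
a11 = a12 = a22 = b1 = b2 = 0.0
for x in range(5, 25):
    for y in range(5, 25):
        ix = (frame1(x + 1, y) - frame1(x - 1, y)) / 2   # spatial gradients
        iy = (frame1(x, y + 1) - frame1(x, y - 1)) / 2
        it = frame2(x, y) - frame1(x, y)                 # temporal derivative
        a11 += ix * ix; a12 += ix * iy; a22 += iy * iy
        b1 -= ix * it;  b2 -= iy * it

# solve the 2x2 system for the flow estimate (u, v)
det = a11 * a22 - a12 * a12
u = (a22 * b1 - a12 * b2) / det
v = (a11 * b2 - a12 * b1) / det
print(u, v)
```

The estimate lands close to (0.4, -0.25); the residual error comes from the finite-difference gradients and the first-order Taylor approximation underlying the constraint.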


Terms and definitions

  1. Image formation
    1. refracted ray
    2. Lensmaker's equation
    3. Optical power
    4. Accommodation
    5. Depth of field
    6. Chromatic aberration
  2. Statistical Pattern Recognition
  3. Segmentation by thresholding
    1. pixel
    2. intensity or grey level
    3. noise
    4. blur
    5. binary image
    6. thresholding
    7. histogram
  4. Connected Component Analysis
    1. 4-neighbors
    2. 8-neighbors
    3. 4-adjacent
    4. path
    5. foreground
    6. connected
    7. connected component
    8. background
    9. boundary
    10. interior
  5. Edge and Local Feature Detection
    1. gradient
    2. convolution
    3. separable function
    4. non-maxima suppression
    5. simple point
    6. end point
    7. junction
    8. curvature
  6. Correlation matching
    1. template
    2. pyramid
    3. Sum of squared differences
    4. Hough transform
    5. Distance Transform
    6. Chamfer matching
    7. Hausdorff distance