Instructor:
Dr. George Bebis
- Email:
bebis@cse.unr.edu
- Phone:
784-6463
- Office: 235 SEM
- Office Hours: MW 1:00pm - 2:30pm or by appointment.
Prerequisites
Good background in image processing (CS674), computer vision (CS685), pattern
recognition (CS679), linear algebra, probability, and statistics.
Texts
We will not use any text in this course; all of the material will be drawn
from lecture notes and research papers.
Useful Texts
- Emanuele Trucco, Alessandro Verri, Introductory Techniques for 3-D Computer Vision, Prentice Hall, 1998.
- Forsyth and Ponce, Computer Vision - A modern approach, Prentice Hall, 2002.
- Shapiro and Stockman, Computer Vision, Prentice Hall, 2001.
Computer Vision Resources
Object Recognition Resources
Object Recognition Challenges and Datasets
Segmentation Datasets and Benchmarks
Useful Software
Description and Objectives
Recognizing objects in images remains a challenging task in computer vision,
largely because an object can look very different from different viewing
positions. The most successful approach is "model-based" object recognition,
where the environment is rather constrained and recognition relies upon the
existence of a set of predefined model objects. Given an unknown scene,
recognition involves: (i) identifying a set of features in the unknown scene
that approximately match a set of features from a known view of a model
object, (ii) recovering the geometric transformation that the model object
has undergone (i.e., pose recovery), and (iii) verifying that other features
coincide with predictions. Since there is usually no a priori knowledge of
which model points correspond to which scene points, recognition can be
computationally very expensive, even for a moderate number of models. Our
goal in this course is to study several well-known techniques in object
recognition.
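The three steps above can be illustrated with a small hypothesize-and-verify sketch. It is not taken from the course material. It assumes 2D point features, an affine transformation between model and scene, and a brute-force search over scene triplets; the function names (affine_from_triplet, count_support, recognize) and the tolerance value are hypothetical. The loop over all ordered scene triplets also shows why the search becomes expensive when correspondences are unknown.

```python
from itertools import permutations


def solve3(a, b):
    """Solve the 3x3 system a x = b by Cramer's rule; return None if singular."""
    def det(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

    d = det(a)
    if abs(d) < 1e-9:
        return None
    solution = []
    for k in range(3):
        m = [row[:] for row in a]
        for i in range(3):
            m[i][k] = b[i]  # replace column k with the right-hand side
        solution.append(det(m) / d)
    return solution


def affine_from_triplet(model3, scene3):
    """Step (ii): affine map (x, y) -> (a x + b y + c, d x + e y + f)
    that sends three model points onto three scene points."""
    a = [[x, y, 1.0] for x, y in model3]
    row_x = solve3(a, [p[0] for p in scene3])
    row_y = solve3(a, [p[1] for p in scene3])
    if row_x is None or row_y is None:
        return None  # model triplet is collinear, so no unique affine map
    return row_x, row_y


def apply_affine(t, p):
    (a, b, c), (d, e, f) = t
    return (a * p[0] + b * p[1] + c, d * p[0] + e * p[1] + f)


def count_support(t, model, scene, tol):
    """Step (iii): count model points that land within tol of some scene point.
    Several model points may match the same scene point in this simple sketch."""
    count = 0
    for p in model:
        q = apply_affine(t, p)
        if any((q[0] - s[0]) ** 2 + (q[1] - s[1]) ** 2 <= tol ** 2 for s in scene):
            count += 1
    return count


def recognize(model, scene, tol=0.5):
    """Step (i) by brute force: try every ordered scene triplet as a match
    for the first three model points, and keep the best-supported pose."""
    if len(model) < 3:
        raise ValueError("need at least three model points")
    best_count, best_t = 0, None
    for scene3 in permutations(scene, 3):
        t = affine_from_triplet(model[:3], scene3)
        if t is None:
            continue
        n = count_support(t, model, scene, tol)
        if n > best_count:
            best_count, best_t = n, t
    return best_count, best_t


# Usage: the scene holds the model rotated by 90 degrees and shifted by (5, 3),
# plus two clutter points.
model = [(0, 0), (1, 0), (0, 1), (2, 1), (1, 3)]
scene = [(5, 3), (5, 4), (4, 3), (4, 5), (2, 4), (9, 9), (7, 1)]
support, pose = recognize(model, scene)
print(support, pose)
```

With n scene points this search tests on the order of n^3 hypotheses for a single model basis, and each verification scans the model and scene again.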
This course is primarily intended for highly motivated students interested
in doing research in object recognition and computer vision in general.
It will be essential for students to have a solid understanding of basic
topics in math, such as linear algebra, probability and statistics, and
calculus. It will also be useful to have some knowledge of computer vision,
image processing, and geometry. In general, the more math a student knows,
the easier the course will be.
Topics
- Image Formation and Perspective Projection
- Approximations to Perspective Projection
- Segmentation and Feature Extraction
- 2D Object Recognition Using Geometric Models
- 3D Object Recognition Using Geometric Models
- Object Recognition Using Appearance Models
- Grouping
- Error Analysis
Course Requirements
There will be no exams in this course. Grading will be based on paper
presentations, reports, class participation, and a project. Details are
provided in the course syllabus.
Syllabus
Schedule of Presentations
1/22 Course Objectives and Requirements (Bebis)
1/22 Perspective projection (reading assignment)
1/24, 1/29 Introduction to Object Recognition (Bebis)
1/31 Geometric Hashing (Bebis - based on 2DORGM [1]-[5])
2/5, 2/7 Object Recognition Using Algebraic Functions of Views (Bebis - based on 3DORGM [2]-[4])
2/12 Object Recognition Using Genetic Algorithms (Bebis - based on 3DORGM [5])
2/12 Video Lecture: Color-based object recognition (Jan-Mark Geusebroek, University of Amsterdam, and Frank Seinstra, University of Amsterdam)
2/14 Video Lecture: Building local part models for category-level recognition (Cordelia Schmid, INRIA) (reading material: ORLD [2][15][18]-[20])
2/19 Normalized Cuts and Image Segmentation (Tavakkoli - based on S [1])
2/22 Video Lecture: Trainable visual models for object classification (Andrew Zisserman, University of Oxford) (reading material: ORLD [5] [16] [17] [22])
2/26 Learning to Detect Natural Image Boundaries Using Local Brightness, Color and Texture Cues (King - based on S [2])
2/28 Video Lecture: Trainable visual models for object classification (Andrew Zisserman, University of Oxford) (reading material: ORLD [5] [16] [17] [22])
3/4 Distinctive Image Features from Scale-Invariant Keypoints (Chang - based on ORLD [1])
3/6 Video Lecture: Generative Models for Visual Objects and Object Recognition via Bayesian Inference (Fei-Fei Li, Princeton University) (reading material: GOR [2])
3/11 Video Lecture: Learning and Recognizing Visual Object Categories (Daniel Huttenlocher, Cornell University) (reading material: ORP [8])
3/13 A Component-based Framework for Face Detection and Identification (Ambardekar - based on ORP [2])
3/18 Wide Baseline Stereo Matching based on Local, Affinely Invariant Regions (Tavakkoli - based on ORLD [7])
3/20 Student Proposal Presentations
3/25, 3/27 Spring Break
4/1 Object Recognition Using Local Affine Frames on Maximally Stable Extremal Regions (King - based on ORLD [8])
4/3 Video Lecture: Visual Categorization with Bags of Keypoints (Christopher Dance, XEROX Research Centre Europe) (reading material: ORLD [13])
4/8 Video Google: Efficient Visual Search of Videos (Chang - based on ORLD [5])
4/10 Video Lecture: Learning shared representations for Object Recognition (Antonio Torralba, MIT) (reading material: ORLD [4])
4/15 Video Lecture: Learning Visual Distance Function for Object Identification from one Example (Frederic Jurie, Institut National de Recherche en Informatique et en Automatique) (reading material: ORLD [14])
4/15 Video Lecture: Discriminative Training for Object Recognition Using Image Patches (Thomas Deselaers, Aachen University) (reading material: N/A)
4/17 Interim Reports and Presentations
4/22 Video Lecture: Pascal Challenge 101 Objects (Chris Williams, University of Edinburgh) (reading material: DIOR [1])
4/22 Video Lecture: Overview of the Challenge and Results (Mark Everingham, University of Oxford) (reading material: DIOR [1])
4/24 Scale and Affine Invariant Interest Point Detectors (Ambardekar - based on ORLD [23])
4/29 Video Lecture: Learning issues in image segmentation (Joachim M. Buhmann, Institute of Computational Science) (reading material: )
5/1 Perceptual Grouping of Natural Shapes in Cluttered Backgrounds (Leandro Loss)
5/6 Video Lecture: Learning issues in image segmentation (Joachim M. Buhmann, Institute of Computational Science) (reading material: )
5/9 Final Reports, Presentations and Demos
Video Lectures (VL)
Papers
Review Papers (REV)
1. A. Ashbrook and N. Thacker, "Tutorial: Algorithms for 2D Object Recognition", Tina Memo N. 1996-003, 2006.
2. S. Dickinson, "Object Representation and Recognition", Rutgers University Lectures on Cognitive Science, 1999.
3. J. Mundy, "Object Recognition in the Geometric Era: A Retrospective", J. Ponce et al. (Eds): Toward Category-Level Object Recognition, LNCS 4170, pp. 3-28, 2006.
2D Object Recognition using Geometric Models (2DORGM)
1. Y. Lamdan, J. Schwartz, and H. Wolfson, "Affine invariant model-based object recognition", IEEE Transactions on Robotics and Automation, vol. 6, no. 5, 1990.
2. H. Wolfson and I. Rigoutsos, "Geometric Hashing: An Overview", IEEE Computational Science and Engineering, pp. 10-21, October-December 1997.
3. Y. Lamdan, J. Schwartz, and H. Wolfson, "Affine invariant model-based object recognition", IEEE Transactions on Robotics and Automation, vol. 6, no. 5, 1990.
4. A. Califano and R. Mohan, "Multidimensional Indexing for Recognizing Visual Shapes", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 16, no. 4, pp. 373-392, 1994.
5. G. Bebis, M. Georgiopoulos and N. da Vitoria Lobo, "Using Self-Organizing Maps to Learn Geometric Hashing Functions for Model-Based Object Recognition", IEEE Transactions on Neural Networks, vol. 9, no. 3, pp. 560-570, 1998.
6. R. Basri and D. Jacobs, "Recognition Using Region Correspondences", International Journal of Computer Vision, vol. 25, no. 2, pp. 145-166, 1997.
3D Object Recognition using Geometric Models (3DORGM)
1. D. Huttenlocher and S. Ullman, "Object Recognition Using Alignment", International Conference on Computer Vision, pp. 102-111, 1987.
2. S. Ullman and R. Basri, "Recognition by Linear Combinations of Models", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 13, no. 10, pp. 992-1006, 1991.
3. G. Bebis, M. Georgiopoulos, M. Shah, and N. da Vitoria Lobo, "Indexing Based on Algebraic Functions of Views", Computer Vision and Image Understanding (CVIU), Vol. 72, No. 3, pp. 360-378, 1998.
4. W. Li, G. Bebis, and N. Bourbakis, "Integrating Algebraic Functions of Views with Indexing and Learning for 3D Object Recognition", IEEE Workshop on Learning in Computer Vision and Pattern Recognition (in conjunction with CVPR04), Washington DC, June 28, 2004.
5. G. Bebis, S. Louis, Y. Varol, and A. Yfantis, "Genetic Object Recognition Using Combinations of Views", IEEE Transactions on Evolutionary Computation, vol. 6, no. 2, pp. 132-146, April 2002.
6. D. Jacobs and R. Basri, "3D to 2D Pose Determination with Regions" , International Journal of Computer Vision, vol. 34, no. 2/3, pp. 123-145, 1999.
Pose Clustering (PC)
Segmentation (S)
1. J. Shi and J. Malik, Normalized Cuts and Image Segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8), August, 2000, pp. 888-905.
2. D. Martin, C. Fowlkes, J. Malik, Learning to Detect Natural Image Boundaries Using Local Brightness, Color and Texture Cues, IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(5), 2004, pp. 530-549.
3. D. Comaniciu and P. Meer, Mean Shift: A Robust Approach Toward Feature Space Analysis, IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(5), 2002, pp. 603-619.
4. G. Medioni, C. Tang, and M. Lee, Tensor Voting: Theory and Applications, 2000.
5. L. Loss, G. Bebis, M. Nicolescu, and A. Skurikhin, Perceptual Grouping Based on Iterative Multi-Scale Tensor Voting, 2nd International Symposium on Visual Computing (ISVC06), Lake Tahoe, November 6-8, 2006.
6. V. Roth and T. Lange, Adaptive Feature Selection in Image Segmentation, LNCS 3175, pp. 9-17, 2004.
7. C. Fowlkes, S. Belongie, F. Chung, and J. Malik, Spectral Grouping Using the Nystrom Method, IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(2), 2004, pp. 214-225.
Grouping (G)
1. D. Jacobs, Robust and Efficient Detection of Convex Groups, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, no. 1, pp. 23-37, 1996.
2. S. Mahamud, L. Williams, K. Thornber, and K. Xu, Segmentation of multiple salient closed contours from real images, IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(4), pp. 433-444, 2003.
3. S. Wang et al., Salient Closed Boundary Extraction with Ratio Contour, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 4, pp. 546-561, 2005.
Object Recognition using Local Descriptors (ORLD)
1. D. Lowe, "Distinctive Image Features from Scale-Invariant Keypoints", International Journal of Computer Vision, vol. 60, no. 2, pp. 91-110, 2004. SIFT demo program
2. K. Mikolajczyk and C. Schmid, "A Performance Evaluation of Local Descriptors", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 10, pp. 1615-1630, 2005.
3. A. Ferencz, E. Miller, and J. Malik, "Learning to Locate Informative Features for Visual Identification", draft submitted to a special issue of the International Journal of Computer Vision.
4. A. Torralba, K. Murphy, and W. Freeman, "Shared Features for Multiclass Object Detection", J. Ponce et al. (Eds): Toward Category-Level Object Recognition, LNCS 4170, pp. 345-361, 2006.
5. J. Sivic and A. Zisserman, "Video Google: Efficient Visual Search of Videos", J. Ponce et al. (Eds): Toward Category-Level Object Recognition, LNCS 4170, pp. 127-144, 2006.
6. J. Sivic, F. Schaffalitzky, and A. Zisserman, "Object Level Grouping for Video Shots", International Journal of Computer Vision, vol. 67, no. 2, pp. 189-210, 2006.
7. T. Tuytelaars and L. Van Gool, "Wide Baseline Stereo Matching based on Local, Affinely Invariant Regions", British Machine Vision Conference, 2000.
8. S. Obdrzalek and J. Matas, "Object Recognition Using Local Affine Frames on Maximally Stable Extremal Regions", J. Ponce et al. (Eds): Toward Category-Level Object Recognition, LNCS 4170, pp. 83-104, 2006.
9. K. Murphy et al. "Object Detection and Localization Using Local and Global Features", J. Ponce et al. (Eds): Toward Category-Level Object Recognition, LNCS 4170, pp. 382-400, 2006.
10. I. Ulusoy and C. Bishop, "Comparison of Generative and Discriminative Techniques for Object Detection and Classification", J. Ponce et al. (Eds): Toward Category-Level Object Recognition, LNCS 4170, pp. 173-195, 2006.
11. A. Berg and J. Malik, "Shape Matching and Object Recognition", J. Ponce et al. (Eds): Toward Category-Level Object Recognition, LNCS 4170, pp. 483-507, 2006.
12. F. Schaffalitzky and A. Zisserman, "Multi-view matching for unordered image sets, or How do I organize my holiday snaps?", European Conference on Computer Vision, Denmark, 2002.
13. G. Csurka et al., "Visual Categorization with Bags of Keypoints", European Conference on Computer Vision, Czech Republic, 2004.
14. E. Nowak and F. Jurie, "Learning Visual Similarity Measures for Comparing Never Seen Objects", Computer Vision and Pattern Recognition 2007.
15. C. Schmid and R. Mohr, "Local Greyvalue Invariants for Image Retrieval", IEEE Transactions on Pattern Analysis and Machine Intelligence, 1997.
16. R. Fergus, P. Perona, and A. Zisserman, "Object Class Recognition by Unsupervised Scale-Invariant Learning", CVPR 2003.
17. R. Fergus, P. Perona, and A. Zisserman, "A visual category filter for google images", ECCV 2004.
18. K. Mikolajczyk and C. Schmid, "An Affine Invariant Interest Point Detector", ECCV 2002.
19. K. Mikolajczyk and C. Schmid, "Indexing Based on Scale Invariant Interest Points", ICCV 2001.
20. S. Lazebnik, C. Schmid, J. Ponce, "Semi-Local Affine Parts for Object Recognition", BMVC 2004.
21. J. Ponce, S. Lazebnik, F. Rothganger, and C. Schmid, "Toward True 3D Object Recognition", 2004.
22. M. Weber, M. Welling, P. Perona, "Unsupervised Learning of Models for Recognition", ECCV 2000.
23. K. Mikolajczyk and C. Schmid, "Scale and Affine Invariant Interest Point Detectors", International Journal of Computer Vision, vol. 60, no. 1, pp. 63-86, 2004.
24. K. Mikolajczyk et al., "A Comparison of Affine Region Detectors", International Journal of Computer Vision, vol. 65, no. 1/2, pp. 43-72, 2005.
25. T. Deselaers, D. Keysers, and H. Ney, "Discriminative training for object recognition using image patches", CVPR 2005.
Object Recognition using Parts (ORP)
1. B. Heisele, I. Riskov, and C. Morgenstern, "Components for Object Detection and Identification", J. Ponce et al. (Eds): Toward Category-Level Object Recognition, LNCS 4170, pp. 225-237, 2006.
2. B. Heisele, T. Serre, and T. Poggio, "A Component-based Framework for Face Detection and Identification", International Journal of Computer Vision, vol. 74, no. 2, pp. 167-181, 2007.
3. S. Agarwal, A. Awan, and D. Roth, "Learning to Detect Objects in Images via a Sparse, Part-Based Representation", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26, no. 11, pp. 1475-1490, 2004.
4. S. Belongie, J. Malik, and J. Puzicha, "Shape Matching and Object Recognition Using Shape Contexts", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 4, pp. 509-522, 2002.
5. A. Frome and J. Malik, "Object Recognition Using Locality-Sensitive Hashing of Shape Contexts", in Nearest-Neighbor Methods in Learning and Vision. Ed. Gregory Shakhnarovich, Trevor Darrell, and Piotr Indyk. MIT Press, 2006. pp. 221-247.
6. F. Rothganger et al., "3D Object Modeling and Recognition Using Affine-Invariant Patches and Multi-View Spatial Constraints", Computer Vision and Pattern Recognition, 2003.
7. S. Ullman and B. Epshtein, "Visual Classification by a Hierarchy of Extended Fragments", J. Ponce et al. (Eds): Toward Category-Level Object Recognition, LNCS 4170, pp. 321-344, 2006.
8. P. Felzenszwalb and D. Huttenlocher, "Pictorial Structures for Object Recognition", International Journal of Computer Vision, vol. 61, no. 1, pp. 55-79, 2005.
Object Recognition by Combining Geometric and Appearance Models (ORGAP)
1. D. Crandall, P. Felzenszwalb, and D. Huttenlocher, "Object Recognition by Combining Appearance and Geometry", J. Ponce et al. (Eds): Toward Category-Level Object Recognition, LNCS 4170, pp. 462-482, 2006.
2. A. Opelt, A. Pinz, and A. Zisserman, "Fusing shape and appearance information for object category detection", British Machine Vision Conference, 2006.
Object Categorization (GOR)
1. A. Opelt, A. Pinz, and M. Fussenegger, "Generic Object Recognition with Boosting", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 3, pp. 416-431, 2006.
2. L. Fei-Fei, R. Fergus, and P. Perona, "One-Shot Learning of Object Categories", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 4, pp. 594-611, 2006.
Dataset Issues in Object Recognition (DIOR)
Object Recognition Applications (ORA)
1. M. Merler, C. Galleguillos, and S. Belongie, "Recognizing Groceries in situ Using in vitro Training Data", SLAM 2007.
2. Y. Hirano et al., "Industry and Object Recognition: Applications, Applied Research and Challenges", J. Ponce et al. (Eds): Toward Category-Level Object Recognition, LNCS 4170, pp. 49-64, 2006.
Project Topics
Department of Computer Science & Engineering, University of Nevada, Reno, NV 89557
Page created and maintained by:
Dr. George Bebis
(bebis@cse.unr.edu)