EN161 Image Understanding Projects

View Projects From Image Processing 2002

 

All projects will entail carefully reading and understanding one or two main papers, and reading several supplementary papers, as the foundation for implementing and testing a current method in your chosen topic. You will be expected to discuss the strengths and weaknesses of the method.

 

Contact TA: MingChing Chang


1. Perceptual Grouping

 

Williams Grouping of Edges IJCV 2000

http://www.cs.unm.edu/~williams/williams-ijcv99.pdf

 

We propose a new measure of perceptual saliency and quantitatively compare its ability to detect natural shapes in cluttered backgrounds to five previously proposed measures. As defined in the new measure, the saliency of an edge is the fraction of closed random walks which contain that edge. The transition-probability matrix defining the random walk between edges is based on a distribution of natural shapes modeled by a stochastic motion. Each of the saliency measures in our comparison is a function of a set of affinity values assigned to pairs of edges. Although the authors of each measure define the affinity between a pair of edges somewhat differently, all incorporate the Gestalt principles of good-continuation and proximity in some form. In order to make the comparison meaningful, we use a single definition of affinity and focus instead on the performance of the different functions for combining affinity values. The primary performance criterion is accuracy. We compute false-positive rates in classifying edges as signal or noise for a large set of test figures. In almost every case, the new measure significantly outperforms previous measures.
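As a rough illustration of the closed-random-walk idea in the abstract, the sketch below scores each edge by its share of weighted closed walks of a fixed length, read off the diagonal of a power of an affinity matrix. The function name, the toy matrix `P`, and the walk length are assumptions made for illustration; the paper's affinities come from a stochastic-motion model that is not reproduced here.

```python
import numpy as np

def closed_walk_saliency(P, n_steps=32):
    """Share of weighted closed walks of length n_steps that start and end at each edge.

    P[i, j] is a hypothetical affinity (transition weight) from edge i to edge j.
    This is a proxy for "fraction of closed walks containing the edge", not the
    paper's exact computation.
    """
    P = np.asarray(P, dtype=float)
    Pn = np.linalg.matrix_power(P, n_steps)
    closed = np.diag(Pn)          # weight of walks returning to their start
    return closed / closed.sum()  # normalize so the scores sum to 1

# Toy example: edges 0 -> 1 -> 2 -> 0 form a closed contour, edges 3 and 4 are clutter.
P = np.full((5, 5), 0.05)
P[0, 1] = P[1, 2] = P[2, 0] = 0.9
saliency = closed_walk_saliency(P)
```

In this toy case the three contour edges receive much higher scores than the two clutter edges, which is the behaviour a project implementation would be tested for on real edge maps.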

 


2. Deformable Shapes

 

Papers:

http://www.ai.mit.edu/people/pff/papers/shapes.pdf
http://www.ai.mit.edu/people/pff/papers/pff.pdf

We present a new method for detecting deformable shapes in images. The main difficulty with deformable template models is the very large (or infinite) number of possible non-rigid transformations of the templates. This makes the problem of finding an optimal match of a deformable template to an image incredibly hard. Using a new representation for deformable shapes we show how to efficiently find a global optimal solution to the non-rigid matching problem. Our matching algorithm can minimize a large class of energy functions, making it applicable to a wide range of problems. We present experimental results of detecting shapes in medical and natural images. Because we don’t rely on local search techniques, our method is very robust, yielding good matches even in images with high clutter.

 

Code: /vision/projects/kimia/segmentation/Felzeszwalb

 

Apply it to spline applications on Cleary images (/vision/images/medical/Spline-Images/Cleary-Images).

 


3. Image Reconstruction


Elder's Image Reconstruction (Diffusion Method)

 

See http://www.lems.brown.edu/~tcl/en298_summary.html
which came after the Johannes project.

http://www.lems.brown.edu/vision/courses/computer-vision-1999/projects/image-edit-msj/project/recon.html

 


4. Object Class Recognition

 

Object Class Recognition by Unsupervised Scale-Invariant Learning

R. Fergus, P. Perona, A. Zisserman

http://csdl.computer.org/comp/proceedings/cvpr/2003/1900/02/190020264abs.htm

 

We present a method to learn and recognize object class models from unlabeled and unsegmented cluttered scenes in a scale invariant manner. Objects are modeled as flexible constellations of parts. A probabilistic representation is used for all aspects of the object: shape, appearance, occlusion and relative scale. An entropy-based feature detector is used to select regions and their scale within the image. In learning, the parameters of the scale-invariant object model are estimated. This is done using expectation-maximization in a maximum-likelihood setting. In recognition, this model is used in a Bayesian manner to classify images. The flexible nature of the model is demonstrated by excellent results over a range of datasets including geometrically constrained classes (e.g. faces, cars) and flexible objects (such as animals).

 


5. Tracking Through Tree-Search

 

D. Freedman. Effective tracking through tree-search. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(5):604-615, 2003.

Paper: see http://www.cs.rpi.edu/~freedd/publications.html

 

A new contour tracking algorithm is presented. Tracking is posed as a matching problem between curves constructed out of edges in the image, and some shape space describing the class of objects of interest. The main contributions of the paper are to present an algorithm which solves this problem accurately and efficiently, in a provable manner. In particular, the algorithm’s efficiency derives from a novel tree-search algorithm through the shape space, which allows for much of the shape space to be explored with very little effort. This latter property makes the algorithm effective in highly cluttered scenes, as is demonstrated in an experimental comparison with a condensation tracker.

 


 

6. Shape from Shading Using The Level Set Approach

 

Ronnie Kimmel, Kaleem Siddiqi, Benjamin B. Kimia, and Alfred M. Bruckstein. Shape from shading: Level set propagation and viscosity solutions. IJCV, 16(2), October 1995.

 


 

7. Model-based Reconstruction from CT View Data

 

Despite the significant role of geometry in the image formation process and the need for its recovery, the traditional approaches to computerized tomography construct intensity images from X-ray measurement without an explicit notion of geometry. Ray attenuations are represented as a sinogram parameterized in the viewing angle and distance and reconstructed via a variety of methods. The most popular of these is filtered backprojection which, by an application of the central slice theorem, first filters the measured data for each viewing angle and then cumulatively projects it back into the image. Since the ideal filter cannot be realized, finite energy approximations have been developed. However, this leads to a blurring of the image data. While working directly in the measurement space, whenever possible, would avoid this artifact, the ultimate solution is to introduce geometry directly into the estimation procedure.
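The filter-then-backproject procedure described above can be sketched as follows. This is a minimal illustration, not a production reconstruction: the function names are hypothetical, the filter is a plain unwindowed ramp, and backprojection uses nearest-neighbour lookup on a parallel-beam sinogram with one row per viewing angle.

```python
import numpy as np

def ramp_filter(sinogram):
    """Filter each projection (row) with a ramp |f| in the Fourier domain."""
    n_det = sinogram.shape[1]
    ramp = np.abs(np.fft.fftfreq(n_det))
    spectrum = np.fft.fft(sinogram, axis=1) * ramp
    return np.real(np.fft.ifft(spectrum, axis=1))

def backproject(filtered, angles, size):
    """Accumulate each filtered projection back across a size x size image."""
    n_det = filtered.shape[1]
    centre = (size - 1) / 2.0
    ys, xs = np.mgrid[0:size, 0:size]
    xs = xs - centre
    ys = ys - centre
    image = np.zeros((size, size))
    for row, theta in zip(filtered, angles):
        # Detector coordinate hit by the ray through each pixel at this angle.
        t = xs * np.cos(theta) + ys * np.sin(theta) + (n_det - 1) / 2.0
        idx = np.clip(np.rint(t).astype(int), 0, n_det - 1)
        image += row[idx]
    return image * np.pi / len(angles)

# Toy sinogram of a single point at the image centre: every view sees it
# at the middle detector bin.
size = 33
angles = np.linspace(0.0, np.pi, 60, endpoint=False)
sinogram = np.zeros((len(angles), size))
sinogram[:, size // 2] = 1.0
filtered = ramp_filter(sinogram)
recon = backproject(filtered, angles, size)
peak = np.unravel_index(np.argmax(recon), recon.shape)
```

The reconstruction peaks at the centre pixel, with negative side lobes around it from the ramp filter. Real data would normally use a windowed filter and interpolation between detector bins.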

A second aspect of the traditional approach which needs to be re-examined is the discretization of space into "voxels", for which reconstruction algorithms report an average value. This averaging leads to blurring when multiple structures are sampled by a voxel, e.g., near a boundary, the well-known partial volume effect. Observe that if the voxels were to be reshaped so that voxel boundaries would be coincident with anatomical structure, such a blurring would not occur. However, such a reconfiguration of the voxels requires a priori knowledge of the anatomy, a chicken-and-egg problem! We hypothesize that a simultaneous estimation of underlying geometry and intensity would substantially improve reconstruction results. Specifically, the use of geometric models in the reconstruction process avoids both of the above difficulties. First, the use of models generally reduces the number of parameters to be estimated, thus leveraging the information in the measurement ray. Second, the use of models prevents partial volume effects since "voxel" boundaries are matched with the anatomy.

The use of models, however, has two potential drawbacks. First, since the space of models is to be matched with the underlying normative anatomy, it could be argued that the useful regularization derived from the use of models could also fail to capture pathological anatomies. In our experience with the use of models, deviations from the model have typically led to large errors, which can highlight regions that a radiologist should examine closely. Second, there is a fundamental combinatorial problem in the use of models: voxels are generically placed in a regular rectangular array for all types of images. However, the placement of more sophisticated geometric models faces combinatorial explosion. We plan to bootstrap the simultaneous estimation of combinations of models and their geometric arrangement by resorting to local estimates from the raw measurement data, which restricts the number of possible arrangements. In this regard, the selection of the lung as an initial study has several advantages. First, lung vessel anatomy is complicated only in the connectivity of various segments, while each vessel segment can be approximated by a rather simple cylindrical geometry. Second, we are able to work directly with raw (not reconstructed) data due to the formulation proposed here. Third, one of the background elements, air, stands in sharp contrast (in Hounsfield units) to the remaining tissue. Fourth, the vessel and bronchial trees follow each other closely, thus providing a measure of anatomical validity.


X. Battle, G. Cunningham, and K. Hanson.
Tomographic reconstruction using 3D deformable models. Phys. Med. Biol., 43:983-990, 1998.

J. G. Brankov, Y. Yang, and M. N. Wernick.
Tomographic image reconstruction using content-adaptive mesh modeling. IEEE ICIP, pages 7-10, Oct. 2001.
 


8. Stereo

 

Multi-view Stereo Beyond Lambert

Hailin Jin, Stefano Soatto, Anthony J. Yezzi, CVPR 2003

 

We consider the problem of estimating the shape and radiance of an object from a calibrated set of views under the assumption that the reflectance of the object is non-Lambertian. Unlike traditional stereo, we do not solve the correspondence problem by comparing image-to-image. Instead, we exploit a rank constraint on the radiance tensor field of the surface in space, and use it to define a discrepancy measure between each image and the underlying model. Our approach automatically returns an estimate of the radiance of the scene, along with its shape, represented by a dense surface. The former can be used to generate novel views that capture the non-Lambertian appearance of the scene.

 


9. Normalized Cuts and Image Segmentation

 

http://citeseer.nj.nec.com/shi97normalized.html

 

We propose a novel approach for solving the perceptual grouping problem in vision. Rather than focusing on local features and their consistencies in the image data, our approach aims at extracting the global impression of an image. We treat image segmentation as a graph partitioning problem and propose a novel global criterion, the normalized cut, for segmenting the graph. The normalized cut criterion measures both the total dissimilarity between the different groups as well as the total...

Paper: http://www.cs.berkeley.edu/~malik/papers/SM-ncut.pdf
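For orientation, here is a minimal sketch of a two-way normalized cut on a small affinity matrix, following the relaxed eigenvector formulation in the linked paper: solve (D - W) y = λ D y and split on the eigenvector with the second smallest eigenvalue. The function names and the 1-D toy data are hypothetical, and thresholding at zero is a simplifying assumption; the paper discusses other choices of splitting point.

```python
import numpy as np

def ncut_bipartition(W):
    """Split nodes into two groups using the second generalized eigenvector."""
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)                      # node degrees
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Symmetric normalized Laplacian D^{-1/2} (D - W) D^{-1/2}.
    L_sym = np.eye(len(d)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L_sym)   # eigenvalues in ascending order
    y = d_inv_sqrt * eigvecs[:, 1]             # back to the generalized problem
    return y > 0                               # assumption: split at zero

def ncut_value(W, labels):
    """Ncut(A, B) = cut(A, B) / assoc(A, V) + cut(A, B) / assoc(B, V)."""
    labels = np.asarray(labels, dtype=bool)
    cut = W[labels][:, ~labels].sum()
    return cut / W[labels].sum() + cut / W[~labels].sum()

# Toy data: two well-separated groups of 1-D "pixels" with Gaussian affinity.
x = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])
W = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2)
labels = ncut_bipartition(W)
```

On an image, `W` would be built from pixel similarity (for example intensity and spatial proximity) and would be sparse, so a sparse eigensolver would replace the dense `eigh` call.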

 


10. Image Height Ridge Detection

 

D. Eberly, "Ridges in image and data analysis," in Computational Imaging and Vision. Dordrecht, The Netherlands: Kluwer Academic, 1996, vol. 7.

 

Combinatorial Classification of Pixels for Ridge Extraction in a Gray-scale Fingerprint Image

Paper: http://citeseer.nj.nec.com/553141.html

 

R. Haralick. Ridges and Valleys on Digital Images. Comput. Vis. Graph. Imag. Process., vol. 22, pages 28-38, 1983.

 


11. Shape-Based Compression

 

Progressive Content-Based Shape Compression for Retrieval of Binary Images
Corinne Le Buhan Jordan, Touradj Ebrahimi and Murat Kunt

Computer Vision and Image Understanding
Volume 71, Issue 2

 

This paper deals with content-based compression of binary-shape images. The proposed method is based on a polygonal approximation of the shape contours. A well-known approximation algorithm, from computer vision applications such as shape analysis and boundary pattern matching, is adapted to achieve a progressive representation. The resulting various levels of shape quality are encoded, from a coarse representation for fast browsing up to a lossless representation for final rendering. In order to perform efficient compression of the progressive shape information, discrete geometrical constraints inherent to the image grid quantization are exploited. While the proposed scheme offers a content-based description (shape boundary as opposed to bitmap blocks) together with a quality scalable representation, it remains comparable, in terms of compression efficiency, with state of the art shape coding methods that do not combine such functionalities.
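The abstract does not name the polygonal approximation algorithm it adapts. Purely as a stand-in, the sketch below uses the Ramer-Douglas-Peucker method to show how tightening a tolerance yields progressively finer polygonal approximations of a contour. The function names, the toy contour, and the tolerances are assumptions, and no encoding step is shown.

```python
import math

def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def rdp(points, tol):
    """Ramer-Douglas-Peucker: keep only vertices deviating more than tol."""
    if len(points) < 3:
        return list(points)
    dists = [point_segment_distance(p, points[0], points[-1])
             for p in points[1:-1]]
    k = max(range(len(dists)), key=dists.__getitem__) + 1  # farthest vertex
    if dists[k - 1] <= tol:
        return [points[0], points[-1]]
    # Split at the farthest vertex and approximate both halves.
    return rdp(points[:k + 1], tol)[:-1] + rdp(points[k:], tol)

# Toy open contour: a nearly straight run followed by a right-angle turn.
contour = [(0, 0), (1, 0.1), (2, 0), (3, 0), (3, 1), (3, 2)]
coarse = rdp(contour, tol=0.5)   # quick-browse level
fine = rdp(contour, tol=0.01)    # closer to the original contour
```

In this example the coarse level keeps only the end points and the corner, and the fine level adds the small bump while still containing every coarse vertex, which is the kind of nesting a progressive representation relies on.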

 


12. Texture-Based Segmentation

 

Segmentation of Textured Images

http://www-dbv.informatik.uni-bonn.de/image/segmentation.html

 

The unsupervised segmentation of textured images is a difficult and challenging low level vision problem with important applications in vision-guided autonomous robotics, product quality inspection, medical diagnosis and in the analysis of remotely sensed images. Algorithms for subsequent image processing stages like motion analysis and tracking, stereo vision, object recognition and scene interpretation often rely on a high quality image segmentation.

The segmentation problem can be informally described as the task of partitioning an image into homogeneous regions. For textured images one of the main conceptual difficulties is the definition of a homogeneity measure in mathematical terms. Our approach to unsupervised texture segmentation is based on four cascaded design decisions, concerning the questions of image representation, texture homogeneity, objective functions and optimization procedures.
 

 

Realistic Textures for Virtual Anastylosis
Alexey Zalesny, Dominik Auf der Maur, Rupert Paget, Maarten Vergauwen and Luc Van Gool

http://www.lems.brown.edu/vision/conferences/ACVA03/ACVA03.html

 

See some cool results here:

http://www.vision.ee.ethz.ch/~rpaget/texture.htm

 

 


13. Texture Synthesis

 

See http://www.vision.ee.ethz.ch/~zales/

 

Alexey Zalesny, Vittorio Ferrari, Geert Caenen, and Luc Van Gool, "Parallel Composite Texture Synthesis", Texture 2002 Workshop in conjunction with ECCV 2002, pp. 151-155.


Geert Caenen, Vittorio Ferrari, Alexey Zalesny, and Luc Van Gool, "Analyzing the layout of composite textures", Texture 2002 Workshop in conjunction with ECCV 2002, pp. 15-19.


Alexey Zalesny, Vittorio Ferrari, Geert Caenen, Dominik Auf der Maur, and Luc Van Gool, "Composite Texture Descriptions", ECCV 2002, Vol. 3, pp. 180-194.