EN161 Image Understanding Projects
View Projects From Image Processing 2002
All projects entail carefully reading and understanding 1-2 main papers, plus several supplementary papers, as the foundation for implementing and testing a current method in your chosen topic. You will be expected to discuss the strengths and weaknesses of the method.
Contact TA: MingChing Chang
1. Perceptual Grouping
Williams Grouping of Edges IJCV 2000
http://www.cs.unm.edu/~williams/williams-ijcv99.pdf
We propose a new measure of perceptual saliency and quantitatively compare its ability to detect natural shapes in cluttered backgrounds to five previously proposed measures. As defined in the new measure, the saliency of an edge is the fraction of closed random walks which contain that edge. The transition-probability matrix defining the random walk between edges is based on a distribution of natural shapes modeled by a stochastic motion. Each of the saliency measures in our comparison is a function of a set of affinity values assigned to pairs of edges. Although the authors of each measure define the affinity between a pair of edges somewhat differently, all incorporate the Gestalt principles of good-continuation and proximity in some form. In order to make the comparison meaningful, we use a single definition of affinity and focus instead on the performance of the different functions for combining affinity values. The primary performance criterion is accuracy. We compute false-positive rates in classifying edges as signal or noise for a large set of test figures. In almost every case, the new measure significantly outperforms previous measures.
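The closed-random-walk idea in the abstract above can be tried out on a toy example before reading the full paper. The following is a minimal sketch and not the paper's implementation: it assumes a small hypothetical affinity matrix `P` (entry `P[i][j]` is the affinity for stepping from edge i to edge j), scores each edge by the total weight of closed walks that start and end at it up to a fixed maximum length, and normalizes over all edges. The matrix values, the finite walk length, and the function names are assumptions made for illustration.

```python
def mat_mul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def closed_walk_saliency(P, max_len):
    """Relative weight of closed walks through each edge, lengths 1..max_len.

    (P^n)[i][i] is the total weight of walks of length n that leave edge i
    and return to it, so summing the diagonal over n and normalizing gives
    a finite-length stand-in for "fraction of closed walks containing i".
    """
    n = len(P)
    power = [row[:] for row in P]              # P^1
    totals = [power[i][i] for i in range(n)]
    for _ in range(max_len - 1):
        power = mat_mul(power, P)              # next power of P
        for i in range(n):
            totals[i] += power[i][i]
    norm = sum(totals)
    return [t / norm for t in totals]


# Hypothetical affinities: edges 0, 1, 2 form a smooth closed contour
# (0 -> 1 -> 2 -> 0); edge 3 is clutter with weak links to everything.
P = [
    [0.00, 0.90, 0.00, 0.05],
    [0.00, 0.00, 0.90, 0.05],
    [0.90, 0.00, 0.00, 0.05],
    [0.05, 0.05, 0.05, 0.00],
]

saliency = closed_walk_saliency(P, max_len=12)
for edge, value in enumerate(saliency):
    print(f"edge {edge}: saliency {value:.3f}")
```

In this toy setup the three contour edges share almost all of the saliency and the clutter edge gets very little, which is the qualitative behavior the abstract describes.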
2. Deformable Shapes
Papers:
http://www.ai.mit.edu/people/pff/papers/shapes.pdf
http://www.ai.mit.edu/people/pff/papers/pff.pdf
We present a new method for detecting deformable shapes in images. The main difficulty with deformable template models is the very large (or infinite) number of possible non-rigid transformations of the templates. This makes the problem of finding an optimal match of a deformable template to an image incredibly hard. Using a new representation for deformable shapes we show how to efficiently find a global optimal solution to the non-rigid matching problem. Our matching algorithm can minimize a large class of energy functions, making it applicable to a wide range of problems. We present experimental results of detecting shapes in medical and natural images. Because we don’t rely on local search techniques, our method is very robust, yielding good matches even in images with high clutter.
Code: /vision/projects/kimia/segmentation/Felzeszwalb
Apply it to spline applications on Cleary images (/vision/images/medical/Spline-Images/Cleary-Images).
3. Image Reconstruction
Elder's Image Reconstruction (Diffusion Method)
See
http://www.lems.brown.edu/~tcl/en298_summary.html
which came after the Johannes project.
4. Object Class Recognition
Object Class Recognition by Unsupervised Scale-Invariant Learning
R. Fergus, P. Perona, A. Zisserman
http://csdl.computer.org/comp/proceedings/cvpr/2003/1900/02/190020264abs.htm
We present a method to learn and recognize object class models from unlabeled and unsegmented cluttered scenes in a scale invariant manner. Objects are modeled as flexible constellations of parts. A probabilistic representation is used for all aspects of the object: shape, appearance, occlusion and relative scale. An entropy-based feature detector is used to select regions and their scale within the image. In learning, the parameters of the scale-invariant object model are estimated. This is done using expectation-maximization in a maximum-likelihood setting. In recognition, this model is used in a Bayesian manner to classify images. The flexible nature of the model is demonstrated by excellent results over a range of datasets including geometrically constrained classes (e.g. faces, cars) and flexible objects (such as animals).
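The learning step above relies on expectation-maximization (EM). As a warm-up before tackling the full constellation model, the toy sketch below runs EM on a one-dimensional mixture of two Gaussians. It is not the paper's model: the data, the initial guesses, the iteration count, and the variance floor are all assumptions chosen only to show the alternation between the E-step and the M-step.

```python
import math


def gaussian(x, mean, var):
    """Density of a 1D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_two_gaussians(data, means, variances, weights, iterations=30):
    """Maximum-likelihood fit of a two-component 1D Gaussian mixture by EM."""
    means, variances, weights = list(means), list(variances), list(weights)
    for _ in range(iterations):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            joint = [weights[k] * gaussian(x, means[k], variances[k])
                     for k in range(2)]
            total = sum(joint)
            resp.append([j / total for j in joint])
        # M-step: re-estimate parameters from the weighted data.
        for k in range(2):
            mass = sum(r[k] for r in resp)
            weights[k] = mass / len(data)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / mass
            var = sum(r[k] * (x - means[k]) ** 2
                      for r, x in zip(resp, data)) / mass
            variances[k] = max(var, 1e-3)   # assumed floor to avoid collapse
    return means, variances, weights


# Hypothetical measurements drawn around two cluster centres (1.0 and 5.0).
data = [1.0, 1.2, 0.8, 1.1, 0.9, 5.0, 5.2, 4.8, 5.1, 4.9]
means, variances, weights = em_two_gaussians(
    data, means=[0.0, 6.0], variances=[1.0, 1.0], weights=[0.5, 0.5])
print("means:", [round(m, 3) for m in means])
print("weights:", [round(w, 3) for w in weights])
```

The paper estimates a much richer set of parameters (shape, appearance, occlusion, relative scale), but the same two-step loop is the core of the procedure.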
5. Tracking Through Tree-Search
D. Freedman. Effective tracking through tree-search. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(5):604-615, 2003.
Paper: see http://www.cs.rpi.edu/~freedd/publications.html
A new contour tracking algorithm is presented. Tracking is posed as a matching problem between curves constructed out of edges in the image, and some shape space describing the class of objects of interest. The main contributions of the paper are to present an algorithm which solves this problem accurately and efficiently, in a provable manner. In particular, the algorithm’s efficiency derives from a novel tree-search algorithm through the shape space, which allows for much of the shape space to be explored with very little effort. This latter property makes the algorithm effective in highly cluttered scenes, as is demonstrated in an experimental comparison with a condensation tracker.
6. Shape from Shading Using The Level Set Approach
Ronnie Kimmel, Kaleem Siddiqi, Benjamin B. Kimia, and Alfred M. Bruckstein. Shape from shading: Level set propagation and viscosity solutions. IJCV, 16(2), October 1995.
7. Model-based Reconstruction from CT View Data
Despite the significant role of
geometry in the image formation process and the need for its recovery, the
traditional approaches to computerized
tomography construct intensity images from X-ray measurement without an explicit
notion of geometry. Ray attenuations are represented as a
sinogram parameterized in the viewing angle and distance and reconstructed
via a variety of methods. The most popular of these is filtered
backprojection which, by an application of the central slice theorem, first
filters the measured data for each viewing angle and then cumulatively
projects it back into the image. Since the ideal filter cannot be realized,
finite energy approximations have been developed. However, this leads to a
blurring of the image data. While working directly in the measurement space,
whenever possible, would avoid this artifact, the ultimate solution is to
introduce geometry directly into the estimation procedure.
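For reference, the filtered backprojection pipeline described above (filter each view, then project it back across the image) can be sketched for parallel-beam data. This is a minimal sketch, assuming unit detector spacing, the standard discrete ramp (Ram-Lak) kernel, linear interpolation between detector samples, and a synthetic sinogram of a centered uniform disk of density 1. The function names and the test phantom are illustrative assumptions and not part of any project code.

```python
import math


def ram_lak(n):
    """Discrete ramp-filter kernel for unit detector spacing."""
    if n == 0:
        return 0.25
    if n % 2 == 0:
        return 0.0
    return -1.0 / (math.pi ** 2 * n ** 2)


def filter_projection(projection):
    """Convolve one view with the ramp kernel (direct O(m^2) sum)."""
    m = len(projection)
    return [sum(ram_lak(i - k) * projection[k] for k in range(m))
            for i in range(m)]


def backproject(filtered_views, angles, x, y):
    """Accumulate the filtered views at image point (x, y)."""
    centre = (len(filtered_views[0]) - 1) / 2.0
    total = 0.0
    for view, theta in zip(filtered_views, angles):
        # Detector coordinate hit by the ray through (x, y) at this angle.
        t = x * math.cos(theta) + y * math.sin(theta) + centre
        i = math.floor(t)
        if 0 <= i < len(view) - 1:
            frac = t - i
            total += (1.0 - frac) * view[i] + frac * view[i + 1]
    return total * math.pi / len(angles)


# Synthetic sinogram: a centred disk of radius 10 and density 1 has the
# same line-integral profile 2*sqrt(r^2 - t^2) at every viewing angle.
num_detectors, num_angles, radius = 65, 180, 10.0
angles = [math.pi * a / num_angles for a in range(num_angles)]
offsets = [k - (num_detectors - 1) / 2.0 for k in range(num_detectors)]
disk_view = [2.0 * math.sqrt(max(radius ** 2 - t ** 2, 0.0)) for t in offsets]
sinogram = [disk_view for _ in angles]

filtered = [filter_projection(view) for view in sinogram]
centre_value = backproject(filtered, angles, 0.0, 0.0)    # inside the disk
outside_value = backproject(filtered, angles, 20.0, 0.0)  # well outside it
print(f"inside: {centre_value:.3f}, outside: {outside_value:.3f}")
```

The reconstructed density should come out close to 1 inside the disk and close to 0 outside it. The finite kernel is one concrete form of the finite energy approximation mentioned above.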
A second aspect of the traditional approach which needs to be re-examined is the
discretization of space into "voxels", for which reconstruction
algorithms report an average value. This averaging leads to blurring when
multiple structures are sampled by a voxel, e.g., near a boundary,
the well-known partial volume effect. Observe that if the voxels were to be
reshaped so that voxel boundaries would be coincident with anatomical
structure, such a blurring would not occur. However, such a reconfiguration of
the voxels requires a priori knowledge of the anatomy, a chicken-and-egg
problem! We hypothesize that a simultaneous estimation of underlying geometry
and intensity would substantially improve reconstruction results. Specifically,
the use of geometric models in the reconstruction process avoids both of the
above difficulties. First, the use of models generally reduces the number of
parameters to be estimated, thus leveraging the information in the measurement
ray. Second, the use of models prevents partial volume effects since "voxel"
boundaries are matched with the anatomy.
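The partial volume effect described above comes down to simple averaging, as the small one-dimensional sketch below shows. The intensity values, the sample counts, and the voxel layouts are hypothetical numbers chosen for illustration; they are not measurements.

```python
def voxel_averages(signal, boundaries):
    """Average the fine-scale signal inside each voxel [start, stop)."""
    return [sum(signal[a:b]) / (b - a)
            for a, b in zip(boundaries, boundaries[1:])]


# Hypothetical fine-scale profile: 5 samples of "air" (0) followed by
# 7 samples of "tissue" (100). The true boundary sits at sample index 5.
signal = [0] * 5 + [100] * 7

# Regular voxels of width 4 straddle the boundary and blur it.
regular = voxel_averages(signal, [0, 4, 8, 12])

# Voxels reshaped to coincide with the boundary do not.
aligned = voxel_averages(signal, [0, 5, 12])

print("regular voxels:", regular)   # [0.0, 75.0, 100.0]
print("aligned voxels:", aligned)   # [0.0, 100.0]
```

The middle regular voxel reports 75, a value that exists nowhere in the underlying profile, while the boundary-aligned layout recovers both intensities exactly and with fewer parameters.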
The use of models, however, has two potential drawbacks. First, since the space
of models is to be matched with the underlying normative anatomy, it
could be argued that the useful regularization derived in the use of models can
potentially also miss estimating pathological anatomies. In our
experience with the use of models, deviations from models have typically led to
large errors which can highlight regions which a radiologist should
closely examine. Second, there is a fundamental combinatorial problem in the use
of models: voxels are generically placed in a regular rectangular array for all
types of images. However, the placement of more sophisticated geometric models
faces combinatorial explosion. We plan to bootstrap the simultaneous estimation
of combinations of models and their geometric arrangement by resorting to local
estimates from the raw measurement data, which restricts the number of possible
arrangements. In this regard, the selection of the lung as an initial study has
several advantages. First, lung vessel anatomy is complicated only in the
connectivity of various segments, while each vessel segment can be approximated
by a rather simple cylindrical geometry. Second, we are able to work directly
with raw (not reconstructed) data due to the formulation proposed here. Third,
one of the background elements, air, stands in sharp contrast (in Hounsfield
units) to the remaining tissue. Fourth, the vessel tree and bronchial trees
follow each other closely, thus providing a measure of anatomical validity.
X. Battle, G. Cunningham, and K. Hanson.
Tomographic reconstruction using 3D deformable models. Phys. Med. Biol.,
43:983-990, 1998.
J. G. Brankov, Y. Yang, and M. N. Wernick.
Tomographic image reconstruction using content-adaptive mesh modeling. IEEE
ICIP, pages 7-10, Oct. 2001.
8. Stereo
Multi-view Stereo Beyond Lambert
We consider the problem of estimating the shape and radiance of an object from a calibrated set of views under the assumption that the reflectance of the object is non-Lambertian. Unlike traditional stereo, we do not solve the correspondence problem by comparing image-to-image. Instead, we exploit a rank constraint on the radiance tensor field of the surface in space, and use it to define a discrepancy measure between each image and the underlying model. Our approach automatically returns an estimate of the radiance of the scene, along with its shape, represented by a dense surface. The former can be used to generate novel views that capture the non-Lambertian appearance of the scene.
9. Normalized Cuts and Image Segmentation
http://citeseer.nj.nec.com/shi97normalized.html
We propose a novel approach for solving the perceptual grouping problem in vision. Rather than focusing on local features and their consistencies in the image data, our approach aims at extracting the global impression of an image. We treat image segmentation as a graph partitioning problem and propose a novel global criterion, the normalized cut, for segmenting the graph. The normalized cut criterion measures both the total dissimilarity between the different groups as well as the total...
Paper: http://www.cs.berkeley.edu/~malik/papers/SM-ncut.pdf
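To get a feel for the criterion before implementing the paper's method, the sketch below evaluates the standard normalized cut value, Ncut(A, B) = cut(A, B)/assoc(A, V) + cut(A, B)/assoc(B, V), by brute force on a tiny graph. The paper instead approximates the minimum through a generalized eigenvalue problem. Exhaustive search is swapped in here only because it is easy to check by hand. The weight matrix and function names are hypothetical.

```python
from itertools import combinations


def ncut(W, A):
    """Normalized cut value for splitting the nodes into A and the rest.

    Assumes every node has at least one positive weight, so neither
    association term is zero.
    """
    n = len(W)
    A = set(A)
    B = set(range(n)) - A
    cut = sum(W[i][j] for i in A for j in B)
    assoc_a = sum(W[i][j] for i in A for j in range(n))
    assoc_b = sum(W[i][j] for i in B for j in range(n))
    return cut / assoc_a + cut / assoc_b


def best_bipartition(W):
    """Exhaustively search all bipartitions (only feasible for tiny graphs)."""
    n = len(W)
    best_value, best_part = None, None
    for size in range(1, n // 2 + 1):
        for A in combinations(range(n), size):
            value = ncut(W, A)
            if best_value is None or value < best_value:
                best_value, best_part = value, set(A)
    return best_value, best_part


# Hypothetical symmetric similarity matrix: nodes 0-2 and nodes 3-5 are
# two tight groups (weight 1.0) joined by one weak link (2-3, weight 0.1).
W = [[0.0] * 6 for _ in range(6)]
for group in ([0, 1, 2], [3, 4, 5]):
    for i in group:
        for j in group:
            if i != j:
                W[i][j] = 1.0
W[2][3] = W[3][2] = 0.1

value, part = best_bipartition(W)
print(f"best partition: {sorted(part)}, Ncut = {value:.4f}")
```

Cutting off a single node also has a small cut relative to the whole graph, but its Ncut is large because that node's association is small. This bias against isolated pieces is what the normalization is for.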
10. Image Height Ridge Detection
D. Eberly, "Ridges in image and data analysis," in Computational Imaging and Vision. Dordrecht, The Netherlands: Kluwer Academic, 1996, vol. 7.
Combinatorial Classification of Pixels for Ridge Extraction in a Gray-scale Fingerprint Image
Paper: http://citeseer.nj.nec.com/553141.html
R. Haralick. Ridges and Valleys on Digital Images. Comput. Vis. Graph. Imag. Process., vol. 22, pages 28-38, 1983.
11. Shape-Based Compression
Progressive Content-Based Shape Compression for Retrieval of Binary Images
Corinne Le Buhan Jordan, Touradj Ebrahimi and Murat Kunt
Computer Vision and Image Understanding
Volume 71, Issue 2
This paper deals with content-based compression of binary-shape images. The proposed method is based on a polygonal approximation of the shape contours. A well-known approximation algorithm, from computer vision applications such as shape analysis and boundary pattern matching, is adapted to achieve a progressive representation. The resulting various levels of shape quality are encoded, from a coarse representation for fast browsing up to a lossless representation for final rendering. In order to perform efficient compression of the progressive shape information, discrete geometrical constraints inherent to the image grid quantization are exploited. While the proposed scheme offers a content-based description (shape boundary as opposed to bitmap blocks) together with a quality scalable representation, it remains comparable, in terms of compression efficiency, with state of the art shape coding methods that do not combine such functionalities.
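The abstract does not name the polygonal approximation algorithm it adapts. As a stand-in, the sketch below uses the Ramer-Douglas-Peucker algorithm to show how lowering a distance tolerance yields progressively finer polygons from the same contour. The contour, the tolerance values, and the function names are assumptions for illustration, and the paper's encoding of the levels is not reproduced.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Ramer-Douglas-Peucker simplification of an open polyline."""
    if len(points) < 3:
        return list(points)
    # Find the interior point farthest from the chord joining the ends.
    index, dmax = max(
        ((i, point_segment_distance(points[i], points[0], points[-1]))
         for i in range(1, len(points) - 1)),
        key=lambda pair: pair[1],
    )
    if dmax <= tolerance:
        return [points[0], points[-1]]
    left = simplify(points[:index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right          # drop the duplicated split point


# Hypothetical grid-aligned contour fragment with two corners.
contour = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0),
           (4, 1), (4, 2), (4, 3), (5, 3), (6, 3)]

# Coarse-to-fine levels: each smaller tolerance keeps more vertices.
levels = [simplify(contour, tol) for tol in (5.0, 1.7, 0.01)]
for level in levels:
    print(len(level), "vertices:", level)
```

The coarsest level keeps only the two end points, the middle level adds one corner, and the finest level keeps both corners, which is enough to redraw this contour exactly on the pixel grid.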
12. Texture-Based Segmentation
Segmentation of Textured Images
http://www-dbv.informatik.uni-bonn.de/image/segmentation.html
The unsupervised segmentation of
textured images is a difficult and challenging low level vision problem with
important applications in vision-guided autonomous robotics, product quality
inspection, medical diagnosis and in the analysis of remotely sensed images.
Algorithms for subsequent image processing stages like motion analysis and
tracking, stereo vision, object recognition and scene interpretation often rely
on a high quality image segmentation.
The segmentation problem can be informally described as the task of partitioning
an image into homogeneous regions. For textured images one of the main
conceptual difficulties is the definition of a homogeneity measure in
mathematical terms. Our approach to unsupervised texture segmentation is
based on four cascaded design decisions, concerning the questions of image
representation, texture homogeneity, objective functions and optimization
procedures.
Realistic Textures for Virtual Anastylosis
Alexey Zalesny, Dominik Auf der Maur, Rupert Paget, Maarten Vergauwen and Luc Van Gool
http://www.lems.brown.edu/vision/conferences/ACVA03/ACVA03.html
See some cool results here:
http://www.vision.ee.ethz.ch/~rpaget/texture.htm
13. Texture Synthesis
See http://www.vision.ee.ethz.ch/~zales/
Alexey Zalesny, Vittorio Ferrari, Geert Caenen, and Luc Van Gool, "Parallel Composite Texture Synthesis", Texture 2002 Workshop in conjunction with ECCV 2002, pp. 151-155.
Geert Caenen, Vittorio Ferrari, Alexey Zalesny, and Luc Van Gool,
"Analyzing the layout of composite textures", Texture 2002 Workshop in
conjunction with ECCV 2002, pp. 15-19.
Alexey Zalesny, Vittorio Ferrari, Geert Caenen, Dominik Auf der Maur, and Luc
Van Gool,
"Composite Texture Descriptions", ECCV 2002, Vol. 3, pp. 180-194.