On a Transformation-Based Approach to
Recognition and Categorization of Shape
Benjamin B. Kimia
Laboratory for Engineering Man/Machine
Systems (LEMS)
* Outline *
** Outline **
*
Organization of the Shape Space *
-
Uni-dimensional sequences of shapes arise from stretching, bending, protruding, etc.
-
Slightly more complex collections
defies organization into a small number of dimensions
-
Shape Space has infinite dimensions
of variations
-
Goal: Gain an understanding of this structure, its
neighborhood relations, distances and similarities.
-
Approach:
-
Relate a shape to another by paths connecting them (thus, morphing one
to another)
-
Distance between two shapes is the cost of the optimal -- least action
-- path between them
-
Paths are sequences of continuous deformations punctuated by sudden
changes in the representation (transitions)
-
Only paths going through simpler shapes need to be considered
* Shock-Based Representation of Shapes and
Images *
1. Skeletal Representation of Shapes:
-
Focus of a large effort in computer vision
-
Evidence from psychophysics/neurophysiology for central representations
of shape
-
Medial Axis (Blum)
-
Cordal Axes (Brady)
-
PISA (Leyton)
-
Full Symmetry Set (Giblin, Levine, Pizer, ....)
-
Shock Set (Kimia et al.)
2. Shock-Based Representation
of Shapes:
3. Shock-Based
Representation of Images:
-
Figure-Ground Segmentation of Shape is DIFFICULT!
-
Segmentation need not precede recognition
-
Extract skeletons directly from grey-scale images:
-
CORE (Pizer)
-
Symmetry Operators (Levine)
-
Full Symmetry Set (Wright et al)
-
Our approach (Huseyin Tek):
-
Form shocks from the propagation of orientation elements (edges)
-
Requires a network for propagation
of waves on a discrete grid (ISMM98)
-
Label shocks into three classes depending on whether the colliding
waves are rarefaction waves
-
Regular: Indicate True
Symmetries
-
Semi-degenerate: Indicate
Parts, Gaps
-
Degenerate: Indicate Parts,
Gaps
-
Examples:
-
Regular Shocks denote true symmetries, other require shock
transformations
-
Main point: recognition can precede segmentation in this
approach
* Recognition of Shape via Matching Shock Graphs: Graduated
Assignment *
* Previous work
-
Yuille's FORMS (Buther Shop)
-
Zucker et al.: eigen-decomposition for shock graph matching
-
Zhu: Stochastic Skeletons
-
Geiger: Dynamic programming approach
* Shock Graph Matching by Graduated Assignment (Daniel Sharvit:
SPIE97, CVPR98, JVIS98)
-
Adapted from Gold and Rangarajan's graduated assignment graph matching
-
Consider all permutations of node assignments
-
Pick the optimal correspondence by optimizing an energy functional
-
Energy represents goodness of match
-
Node similarity
-
Link similarity
-
Quadratic functional
-
Move from current match matrix to the optimal by a form of gradient
descent
-
Non-binary (fuzzy) match matrix
-
Graduated non-convexity to avoid local minima and converge to a
permutation
-
Update using gradient descent: equivalent to solving an assignment
problem
-
Examples and Results
-
Maintaining coherence of shape in the matching process
-
Quadratic energy functional: only pairwise consistency
-
Maximum clique approaches: treat topological coherence, but not
link/node similarity
-
Implicit representation of deformations
-
Missing/extra nodes: Slack variable
-
Missing/extra links: Noise
* Transformation-based Approach to Matching *

-
Characterizing Deformation Paths based on changes in the graph
representation
-
Continuous:
-
Fixed graph structure
-
Continuous changes in graph link and node attributes
-
Stretch/Shrink
-
Bend/Unbend
-
Protrude/Indent
-
Sudden Changes: transitions
-
Sudden changes in the discrete graph representation
-
Changes in symmetry
-
Compose parts/partition
-
Close gaps / break outline

-
The sudden changes break the shape space into regions
-
Need to zip-up at transitions
-
Main Point: Use these transitions as coordinates
in the shape space
* Shock Transitions *
-
Classify shock transitions (collaboration with Peter Giblin)
-
A1A3
-
A1^2: infinite velocity, zero acceleration
-
A1^3 plus infinite velocity
-
A1_4
* Edit Distance Graph Matching *
-
Use of Edit Distance for String Matching
-
Edit Operations
-
Delete: TELEVIISION --> TELEVISION
-
Insert: TELEVSION --> TELEVISION
-
relabel: TELEVUSION --> TELEVISION
-
Cost associated with each edit operation
-
Dynamic Programming approach to finding the least action (optimal)
path
-
Adaptation of edit distance algorithm to unrooted planar embedded trees
(Collaboration with Philip Klein)
-
Edit Operations
-
Splice (prune) A1A3
-
Merge1
-
Merge2 and Merge 3
-
Contract A1^4 Transition

-
Results
-
Asymmetric Reference
-
Tilted rectangle --> a vertical one
-
protruded skewed rectangle--> unprotruded skewed rectangle
-
skewed rectangle --> rectangle
-
rectangle --> square
-
Matching two square-like object requires a path going through a square
-
Main point:
-
shapes at transition become a "traffic hub"
-
transition shapes with higher degrees of degeneracy attract more
"traffic"
-
formation of prototypes from transition shapes?
-
who has ever seen a circle?
* Image Indexing: Need for Categorization *
Successful Image Retrieval requires Efficient Access
-
A large number of objects introduces organizational complexities
-
Research Questions (work with Ana Popescu)
-
How can we form categories/clusters from metric data, d(x,y)?
-
How can formed categories be represented?
-
prototypes?
-
examples?
-
examples with outliers?
-
How can hierarchies of categories be formed for efficient indexing and
search?
-
Clustering Techniques:
-
Nearest neighbor search (KD trees, VP trees, etc.)
-
Minimal Spanning Trees
-
Physical Analogy: Granular Magnetic Model
-
Currently, we are constructing a very large database of shapes
* Open Questions: Need for psychophysics of shape similarity *
-
What is the perceptual cost of
-
stretching/shrinking?
-
bending/unbending?
-
protruding/indenting?
-
composition/partitioning?
-
Are theses dimensions separable?
How should dimensions be combined?
-
How are shape categories formed?
-
Do shape transitions are more prototypical?