On a Transformation-Based Approach to
Recognition and Categorization of Shape
Benjamin B. Kimia
Laboratory for Engineering Man/Machine
Systems (LEMS)
* Outline *
** Outline **
*
Organization of the Shape Space *
-
Uni-dimensional sequences of shapes arise from stretching, bending, protruding, etc.
-
Slightly more complex collections
defies organization into a small number of dimensions
-
Shape Space has infinite dimensions
of variations
-
Goal: Gain an understanding of this structure, its
neighborhood relations, distances and similarities.
-
Approch:
-
Relate a shape to another by paths connecting them (thus, morphing one
to another)
-
Distance between two shapes is the cost of the optimal -- least action
-- path between them
-
Paths are sequences of continuous deformations punctuated by suddens
changes in the representation (transitions)
-
Only paths going through simpler shapes need to be considered
* Shock-Based Representation of Shapes and
Images *
1. Skeletal Representation of Shapes:
-
Focus of a large effort in computer vision
-
Evidence from psychophyscis/neurophysiology for central representations
of shape
-
Medial Axis (Blum)
-
Cordal Axes (Brady)
-
PISA (Leyton)
-
Full Symmetry Set (Giblin, Levine, Pizer, ....)
-
Shock Set (Kimia et al)
2. Shock-Based Representation
of Shapes:
3. Shock-Based Representation
of Images:
-
Figure-Ground Segmentation of Shape is DIFFICULT!
-
Segmentation need not preceed recognition
-
Extract skeletons directly from grey-scale images:
-
CORE (Pizer)
-
Symmtery Operators (Levine)
-
Full Symmetry Set (Wright et al)
-
Our approach (Huseyin Tek):
-
Form shocks from the propagation of orientation elements (edges)
-
Requires a network for propagation
of waves on a discrete grid (ISMM98)
-
Label shocks into three classes depending on whether the colliding
waves are rarefaction waves
-
Regular: IndicateTrue
Symmetries
-
Semi-degenerate: Indicate
Parts, Gaps
-
Degenerate: Indicate Parts,
Gaps
-
Examples:
-
Regular Shocks denote true symmtries, other require shock
transformations
-
Main point: recognition can preceed segmentation in this
approah
* Recognition of Shape via Matching Shock Graphs: Graduated
Assignment *
* Previous work
-
Yuille's FORMS (Buther Shop)
-
Zucker et al: eigendecomposition for shock graph matching
-
Zhu: Stochastic Skeletons
-
Geiger: Dynamic programming approach
* Shock Graph Matching by Graduated Assignment (Daniel Sharvit:
SPIE97, CVPR98, JVIS98)
-
Adapted from Gold and Rangarajan's graduated assiignment graph matching
-
Consider all permutations of node assignments
-
Pick the optimal correspondence by optimizing an energy funcational
-
Energy represents goodness of match
-
Node similarity
-
Link similarity
-
Quadratic functional
-
Move from current match matrix to the optimal by a form of gradient
descent
-
Non-binary (fuzzy) match matrix
-
Graduated non-convenxity to avoid local minima and converge to a
permutation
-
Update using gradient descent: equivalent to solving an assignment
problem
-
Examples and Results
-
Maintaining coherence of shape in the matching process
-
Quadratic energy functional: only pairwise consistency
-
Maximum clique approaches: treat toplogical coherence, but not
link/node similarity
-
Implicit representation of deformations
-
Missing/extra nodes: Slack variable
-
Missing/extra links: Noise
* Tranformation-based Approach to Matching *

-
Characterizing Deformation Paths based on changes in the graph
representation
-
Continuous:
-
Fixed graph structure
-
Continuous changesin graph link and node attributes
-
Stretch/Shrink
-
Bend/Unbend
-
Protrude/Indent
-
Sudden Changes: transitions
-
Sudden changes in the discrete graph representation
-
Changes in symmetry
-
Compose parts/partition
-
Close gaps / break outline

-
The sudden changes break the shape space into regions
-
Need to zip-up at transitions
-
Main Point: Use these transitions as coordinates
in the shape space
* Shock Transitions *
-
Classify shock transitions (collaboration with Peter Giblin)
-
A1A3
-
A1^2: infinite velocity, zero accelerartion
-
A1^3 plus infinite velocity
-
A1_4
* Edit Distance Graph Matching *
-
Use of Edit Distance for String Matching
-
Edit Opertions
-
Delete: TELEVIISION --> TELEVISION
-
Insert: TELEVSION --> TELEVISION
-
relabel: TELEVUSION --> TELEVISION
-
Cost associated with each edit operation
-
Dynamic Programming approach to finding the least action (optimal)
path
-
Adapation of edit distance algorithm to unrooted planar embedded trees
(Collaboration with Philip Klein)
-
Edit Operations
-
Splice (prune) A1A3
-
Merge1
-
Merge2 and Merge 3
-
Contract A1^4 Transition

-
Results
-
Aymmetric Reference
-
Tilted rectangle --> a vertical one
-
protruded skewed rectangle--> unprotuded skewed rectangle
-
skewed rectangle --> rectangle
-
rectangle --> square
-
Matching two square-like object requires a path going through a square
-
Main point:
-
shapes at transition become a "traffic hub"
-
transition shapes with higher degrees of degeneracy attract more
"traffic"
-
formation of proptotypes from transition shapes?
-
who has ever seen a circle?
* Image Indexing: Need for Categorization *
Successful Image Retrieval requires Efficient Access
-
A large number of objects introduces organizational complexities
-
Research Questions (work with Ana Popescu)
-
How can we form categories/clusters from metric data, d(x,y)?
-
How can formed categories be represented?
-
prototypes?
-
examplars?
-
examplars with outliers?
-
How can hierarchies of categories be formed for efficient indexing and
search?
-
Clustering Techniques:
-
Nearest neighbor search (KD trees, VP trees, etc.)
-
Minimal Spanning Trees
-
Physical Analogy: Granular Magnetic Model
-
Currently, we are constructing a very large database of shapes
* Open Questions: Need for psychophysics of shape similarity *
-
What is the perceptual cost of
-
strecthing/shrinking?
-
bending/unbending?
-
protruding/indenting?
-
compositing/partitioning?
-
Are theses dimensions separable?
How should dimensions be combined?
-
How are shape categories formed?
-
Do shape transitions are more prototypical?