Abstract
The recognition of objects from their projected two-dimensional shapes
is a challenging problem owing to the spectrum of possible variations reflected
in the image domain, e.g., those caused by movement of parts, changes
in viewing geometry, occlusion, etc. This motivates a need for quantitative
as well as qualitative descriptions of shape in terms of structural relations
between components; the latter remain largely invariant under the above
changes. In this paper we confront the theoretical and practical difficulties
of computing such a representation, which is based on the shocks, or
singularities, that arise as a shape is deformed. The computation is organized
in two stages. First, we develop subpixel local detectors that find shocks
and classify them into four types. Second, we show
that shock patterns are not arbitrary, but obey the rules of a grammar
which limits the possible shock combinations. In addition, shock patterns
satisfy specific topological and geometric constraints. We develop this
shock grammar and exploit the topological and geometric constraints to
enforce global consistency: shock hypotheses that violate the grammar
or are topologically or geometrically invalid are pruned, and survivors
are organized into higher level structures. The result is a computational
method for the detection, classification, and grouping of shocks. This
leads to a description of shape as a hierarchical graph of shock groups.
The graph is computed in the reaction-diffusion space, where diffusion
acts as a regularizer that determines the significance of each shock group.
The representation is stable under rotations, scale changes, occlusion,
movement of parts, noise, and other variations, even at very low resolutions.
We illustrate the suitability of this representation for recognition by
discussing several examples.
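
As a loose illustration of what "shocks that arise as a shape is deformed" means, the sketch below covers only the pure reaction case, in which the boundary moves inward at constant speed. It is not the subpixel detector developed in this paper. It relies on one standard fact: under constant-speed inward motion a front reaches an interior point at a time equal to that point's distance from the boundary, so colliding fronts (shocks) lie on ridges of the distance map. The function names, the brute-force distance computation, the pixel-level ridge test, and the rectangular test shape are all assumptions made for this example. The sketch performs no classification into types, no grammar-based pruning, and no diffusion.

```python
import numpy as np


def distance_to_background(mask):
    """Euclidean distance from every foreground pixel to its nearest
    background pixel (brute force; adequate for small illustrative images)."""
    fg = np.argwhere(mask)
    bg = np.argwhere(~mask)
    # Pairwise distances between foreground and background pixel coordinates.
    diff = fg[:, None, :] - bg[None, :, :]
    nearest = np.sqrt((diff ** 2).sum(axis=2)).min(axis=1)
    dist = np.zeros(mask.shape, dtype=float)
    dist[tuple(fg.T)] = nearest
    return dist


def shock_candidates(dist):
    """Flag pixels whose distance value is a strict local maximum along at
    least one of four directions (row, column, two diagonals). This is a
    crude pixel-level ridge test, used here as a stand-in for shock detection."""
    h, w = dist.shape
    padded = np.pad(dist, 1)  # zero padding so border pixels have neighbours
    centre = padded[1:-1, 1:-1]
    out = np.zeros(dist.shape, dtype=bool)
    for dr, dc in [(0, 1), (1, 0), (1, 1), (1, -1)]:
        ahead = padded[1 + dr:1 + dr + h, 1 + dc:1 + dc + w]
        behind = padded[1 - dr:1 - dr + h, 1 - dc:1 - dc + w]
        out |= (centre > ahead) & (centre > behind)
    return out & (dist > 0)


# Hypothetical test shape: a 7 x 15 rectangle inside an 11 x 19 image.
mask = np.zeros((11, 19), dtype=bool)
mask[2:9, 2:17] = True

dist = distance_to_background(mask)
cands = shock_candidates(dist)

# Print the candidate map: '#' marks a shock candidate, '.' everything else.
for row in cands:
    print("".join("#" if v else "." for v in row))
```

For the rectangle, the flagged pixels trace a central horizontal segment with diagonal branches running into the four corners, which is the familiar symmetric-axis picture. The value stored in `dist` at each flagged pixel is the formation time of the corresponding shock under unit-speed inward motion.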