Indexing into Large Databases
Main Difficulties
-
Lack of a suitable database
-
Computationally expensive metric
Main Contributions
-
Creation of a large shape database
-
Develop a coarse scale version of shock graph matching
-
Represent categories using a few exemplars
Creation of large shape database
-
1032 shapes from 40 categories
-
Acknowledgments: MPEG-7 test database, Farzin Mokhtarian (fish, sea animals),
Mike Tarr (greebles), and Stan Sclaroff (tools).
Development of coarse-scale shock graph matching
-
Edit matching of shock graphs is expensive (2-3 minutes per match)
-
Impractical for large database
-
Querying a database of 1000 shapes requires 1000 matches, 2000 minutes
(~35 hours!)
-
Most of the computational effort is in computing edit costs, in particular
deform costs
-
Deform cost computation for every pair of paths; requires
optimal
alignments
-
Main idea of coarse scale match
-
Compute deform costs by a directly comparing shock path properties without
using optimal alignment

-
Significant speed-up in computation time (2-3 seconds instead of 2-3 minutes)
Results
-
Top matches contain most of the shapes from the correct category
-
Recall (ratio of number of correct shapes retrieved to total number of
shapes retrieved) averaged over 200 exemplars
Coarse Matching with Notion of Categories
-
However, to retrieve all shapes in the category a large number of
matches need to be considered
-
Alternatively, if a shape is related to its neighboring shapes using a
notion of categories fewer top matches are required
Comparison of Coarse and Fine Matching
-
Fine-scale matching is used to refine top matches from coarse-scale matching

Indexing Results with Exemplars
-
Matching a query with all shapes in database is unnecessary
-
If the query is "distant" from a fish shape, it will be distant from other
fish shapes as well
-
In context of all shapes, a collection of fish will appear tightly clustered
-
Each category is represented using a few exemplars
-
A similarity function is defined for query Q to an exemplar
-
To account for variations in each category, a membership function based
on similarity is defined
-
Indexing results using 1032 shapes, 204 exemplars (3-6 exemplars per category)
-
Correct category is in top 1, 2 and 5 matches 78%, 91% and
99% respectively

-
Experiments with fewer exemplars suggest a hierarchical representation
at exemplar level
-
Using 2-3 primary examplars, 75% of categories can be ruled out
-
Additional 2-3 auxilliary exemplars can be used to rule out a further 50%
of the categories