Index-driven similarity search in metric spaces (Survey Article)

TitleIndex-driven similarity search in metric spaces (Survey Article)
Publication TypeJournal Articles
Year of Publication2003
AuthorsHjaltason GR, Samet H
JournalACM Trans. Database Syst.
Pagination517 - 580
Date Published2003/12//
ISBN Number0362-5915
Keywordsdistance-based indexing, Hiearchical metric data structures, nearest neighbor queries, range queries, Ranking, similarity searching

Similarity search is a very important operation in multimedia databases and other database applications involving complex objects, and involves finding objects in a data set S similar to a query object q, based on some similarity measure. In this article, we focus on methods for similarity search that make the general assumption that similarity is represented with a distance metric d. Existing methods for handling similarity search in this setting typically fall into one of two classes. The first directly indexes the objects based on distances (distance-based indexing), while the second is based on mapping to a vector space (mapping-based approach). The main part of this article is dedicated to a survey of distance-based indexing methods, but we also briefly outline how search occurs in mapping-based methods. We also present a general framework for performing search based on distances, and present algorithms for common types of queries that operate on an arbitrary "search hierarchy." These algorithms can be applied on each of the methods presented, provided a suitable search hierarchy is defined.