Finding comparable temporal categorical records: A similarity measure with an interactive visualization

Title: Finding comparable temporal categorical records: A similarity measure with an interactive visualization
Publication Type: Conference Papers
Year of Publication: 2009
Authors: Wongsuphasawat K, Shneiderman B
Conference Name: IEEE Symposium on Visual Analytics Science and Technology, 2009. VAST 2009
Date Published: 2009/10/12-13
ISBN Number: 978-1-4244-5283-5
Keywords: data visualisation, Educational institutions, Feedback, Information retrieval, interactive search tool, interactive systems, interactive visualization tool, large databases, M&M Measure, Match & Mismatch measure, Medical services, numerical time series, parameters customization, Particle measurements, Similan, similarity measure, Similarity Search, temporal categorical databases, Temporal Categorical Records, temporal databases, Testing, Time measurement, time series, transportation, usability, very large databases, visual databases, Visualization

An increasing number of temporal categorical databases are being collected: Electronic Health Records in healthcare organizations, traffic incident logs in transportation systems, or student records in universities. Finding similar records within these large databases requires effective similarity measures that capture the searcher's intent. Many similarity measures exist for numerical time series, but temporal categorical records are different. We propose a temporal categorical similarity measure, the M&M (Match & Mismatch) measure, which is based on the concept of aligning records by sentinel events, then matching events between the target and the compared records. The M&M measure combines the time differences between pairs of events and the number of mismatches. To accommodate customization of parameters in the M&M measure and results interpretation, we implemented Similan, an interactive search and visualization tool for temporal categorical records. A usability study with 8 participants demonstrated that Similan was easy to learn and enabled them to find similar records, but users had difficulty understanding the M&M measure. The usability study feedback led to an improved version with a continuous timeline, which was tested in a pilot study with 5 participants.
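
The abstract does not give the formula for the M&M measure, so the following Python sketch only illustrates the three ideas it names: aligning records by a sentinel event, matching events between the target and the compared record, and combining time differences with the number of mismatches. Everything concrete here is an assumption made for illustration: the record layout (a list of `(category, time)` pairs), the function names `align` and `mm_sketch`, the in-order pairing of events within each category, and the weighted sum with the hypothetical parameters `w_time` and `w_miss`. It should not be read as the authors' actual matching procedure or scoring.

```python
from collections import defaultdict

def align(record, sentinel):
    """Shift times so the first sentinel event sits at t = 0.

    Raises ValueError if the record has no event of the sentinel category.
    """
    t0 = min(t for cat, t in record if cat == sentinel)
    return [(cat, t - t0) for cat, t in record]

def mm_sketch(target, compared, sentinel, w_time=1.0, w_miss=1.0):
    """Illustrative distance: lower means more similar (assumed convention)."""
    times_t = defaultdict(list)
    times_c = defaultdict(list)
    for cat, t in align(target, sentinel):
        times_t[cat].append(t)
    for cat, t in align(compared, sentinel):
        times_c[cat].append(t)

    time_diff = 0.0
    mismatches = 0
    for cat in set(times_t) | set(times_c):
        a = sorted(times_t.get(cat, []))
        b = sorted(times_c.get(cat, []))
        # Assumed matching rule: pair same-category events in time order.
        time_diff += sum(abs(x - y) for x, y in zip(a, b))
        # Events left without a partner count as mismatches.
        mismatches += abs(len(a) - len(b))
    return w_time * time_diff + w_miss * mismatches

# Hypothetical patient records: (event category, time in hours).
target = [("Admit", 10), ("ICU", 12), ("Discharge", 20)]
compared = [("Admit", 100), ("ICU", 103), ("Floor", 105), ("Discharge", 110)]
score = mm_sketch(target, compared, sentinel="Admit")
```

In this made-up example the two records are aligned on "Admit", the "ICU" events differ by one hour after alignment, and the extra "Floor" event has no counterpart in the target, so it counts as one mismatch. Changing `w_time` and `w_miss` shifts the balance between timing and missing events, which is the kind of parameter customization the abstract says Similan was built to support.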