DiST: fully decentralized indexing for querying distributed multidimensional datasets

TitleDiST: fully decentralized indexing for querying distributed multidimensional datasets
Publication TypeConference Papers
Year of Publication2006
AuthorsNam B, Sussman A
Conference NameParallel and Distributed Processing Symposium, 2006. IPDPS 2006. 20th International
Date Published2006/04//
ISBN Number1-4244-0054-6
KeywordsComputer network management, distributed multidimensional dataset querying, failure recovery, fault tolerant computing, fully decentralized multidimensional indexing, grid computing, Indexing, large scale distributed resource management, Large-scale systems, Multidimensional systems, Network servers, P2P systems, Peer to peer computing, peer-to-peer computing, peer-to-peer systems, Publishing, Query processing, query routing, resource allocation, Resource management, telecommunication network routing, wide area networks

Grid computing and peer-to-peer (P2P) systems are emerging as new paradigms for managing large scale distributed resources across wide area networks. While grid computing focuses on managing heterogeneous resources and relies on centralized managers for resource and data discovery, P2P systems target scalable, decentralized methods for publishing and searching for data. In large distributed systems, a centralized resource manager is a potential performance bottleneck and decentralization can help avoid this bottleneck, as is done in P2P systems. However, the query functionality provided by most existing P2P systems is very rudimentary, and is not directly applicable to grid resource management. In this paper, we propose a fully decentralized multidimensional indexing structure, called DiST, that operates in a fully distributed environment with no centralized control. In DiST, each data server only acquires information about data on other servers from executing and routing queries. We describe the DiST algorithms for maintaining the decentralized network of data servers, including adding and deleting servers, the query routing algorithm, and failure recovery algorithms. We also evaluate the performance of the decentralized scheme against a more structured hierarchical indexing scheme that we have previously shown to perform well in distributed grid environments