%0 Conference Paper
%B Parallel and Distributed Processing Workshops and PhD Forum (IPDPSW), 2011 IEEE International Symposium on
%D 2011
%T Computing the Tree of Life: Leveraging the Power of Desktop and Service Grids
%A Bazinet, A. L.
%A Cummings, Michael P.
%K grid computing
%K service grids
%K BOINC software
%K GARLI jobs
%K HPC resource
%K lattice project
%K science portal
%K Web interface
%K user interfaces
%K phylogenetic analysis
%K molecular evolutionary systematics
%K evolutionary history
%K evolutionary model
%K evolutionary computation
%K genetic sequence data
%K maximum likelihood estimation
%K maximum likelihood method
%K machine learning
%K learning (artificial intelligence)
%K trees (mathematics)
%K heterogeneous grid resources
%K data handling
%K data analysis
%K data sets
%K computational load
%K Internet
%K portals
%X The trend in life sciences research, particularly in molecular evolutionary systematics, is toward larger data sets and ever-more detailed evolutionary models, which can generate substantial computational loads. Over the past several years we have developed a grid computing system aimed at providing researchers the computational power needed to complete such analyses in a timely manner. Our grid system, known as The Lattice Project, was the first to combine two models of grid computing - the service model, which mainly federates large institutional HPC resources, and the desktop model, which harnesses the power of PCs volunteered by the general public. Recently we have developed a "science portal" style web interface that makes it easier than ever for phylogenetic analyses to be completed using GARLI, a popular program that uses a maximum likelihood method to infer the evolutionary history of organisms on the basis of genetic sequence data. This paper describes our approach to scheduling thousands of GARLI jobs with diverse requirements to heterogeneous grid resources, which include volunteer computers running BOINC software. A key component of this system provides a priori GARLI runtime estimates using machine learning with random forests.
%P 1896 - 1902
%8 2011/05//
%G eng
%R 10.1109/IPDPS.2011.344
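The scheduling idea in the abstract above — estimate each GARLI job's runtime a priori, then route it to a suitable resource class — can be sketched as follows. The predictor, the cutoff, and names such as `predict_runtime_hours` are illustrative assumptions, not the paper's actual implementation (which uses a trained random-forest estimator):

```python
# Sketch of runtime-aware job routing, loosely following the abstract above.
# The predictor is a hypothetical stand-in for the paper's random-forest
# runtime estimator; the threshold and resource names are assumptions.

def predict_runtime_hours(n_taxa, n_sites):
    # Stand-in model: runtime grows with alignment size.
    return (n_taxa * n_sites) / 200000.0

def route_job(n_taxa, n_sites, volunteer_cutoff_hours=24.0):
    """Send short jobs to volunteer (BOINC) hosts, long ones to HPC."""
    est = predict_runtime_hours(n_taxa, n_sites)
    return "boinc" if est <= volunteer_cutoff_hours else "hpc"

jobs = [(50, 1000), (2000, 5000)]        # (taxa, alignment sites)
assignments = [route_job(*j) for j in jobs]
print(assignments)  # small job -> boinc, large job -> hpc
```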
%0 Conference Paper
%B Shape Modeling and Applications, 2005 International Conference
%D 2005
%T The half-edge tree: a compact data structure for level-of-detail tetrahedral meshes
%A Danovaro, E.
%A De Floriani, Leila
%A Magillo, P.
%A Puppo, E.
%A Sobrero, D.
%A Sokolovsky, N.
%K half-edge tree
%K level-of-detail model
%K tetrahedral meshes
%K compact encoding
%K data structures
%K computational geometry
%K mesh generation
%K iterative methods
%K edge detection
%X We propose a new data structure for the compact encoding of a level-of-detail (LOD) model of a three-dimensional scalar field based on unstructured tetrahedral meshes. This data structure, called a half-edge tree (HET), is built through the iterative application of a half-edge collapse, i.e., by contracting an edge to one of its endpoints. We also show that selectively refined meshes extracted from an HET contain on average about 34% and up to 75% fewer tetrahedra than those extracted from an LOD model built through a general edge collapse.
%P 332 - 337
%8 2005/06//
%G eng
%R 10.1109/SMI.2005.47
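The half-edge collapse that drives the HET construction — contracting an edge (u, v) to its endpoint u — can be sketched on a tetrahedral mesh represented as 4-sets of vertex indices. This is a minimal illustration of the operation only, not the paper's encoding, which additionally records the collapses in a tree for LOD extraction:

```python
# Minimal half-edge collapse on a tetrahedral mesh (illustrative sketch).
# Tetrahedra are frozensets of 4 vertex indices.

def half_edge_collapse(tets, u, v):
    """Contract edge (u, v) to endpoint u: tetrahedra containing both
    endpoints degenerate and are removed; elsewhere v is renamed to u."""
    out = set()
    for t in tets:
        if u in t and v in t:
            continue                        # degenerate: drop it
        if v in t:
            t = frozenset((t - {v}) | {u})  # rename v -> u
        out.add(t)
    return out

mesh = {frozenset(s) for s in [(0, 1, 2, 3), (1, 2, 3, 4), (2, 3, 4, 5)]}
coarse = half_edge_collapse(mesh, 1, 4)     # collapse edge (1, 4) onto vertex 1
print(len(coarse))  # 2: {1,2,3,4} dropped; {2,3,4,5} renamed to {1,2,3,5}
```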
%0 Conference Paper
%B Circuits and Systems, 2004. ISCAS '04. Proceedings of the 2004 International Symposium on
%D 2004
%T Arbitrate-and-move primitives for high throughput on-chip interconnection networks
%A Balkan, A. O.
%A Qu, Gang
%A Vishkin, Uzi
%K arbitrate-and-move primitive circuit
%K n-leaf pipelined balanced binary tree
%K 8-leaf binary tree
%K arbiter
%K asynchronous circuits
%K synchronous circuits
%K reduced synchrony circuit
%K mesh-of-trees interconnection network
%K explicit multi-threading architecture
%K multithreading
%K pipeline processing
%K power consumption
%K system-on-chip
%K interconnections
%X An n-leaf pipelined balanced binary tree is used for arbitration of order and movement of data from n input ports to one output port. A novel arbitrate-and-move primitive circuit for every node of the tree, based on a concept of reduced synchrony that combines attractive features of both asynchronous and synchronous designs, is presented. The design objective of the pipelined binary tree is to provide a key building block in a high-throughput mesh-of-trees interconnection network for the Explicit Multi-Threading (XMT) architecture, a recently introduced parallel computation framework. The proposed reduced synchrony circuit was compared with asynchronous and synchronous designs of arbitrate-and-move primitives. Simulations with 0.18 µm technology show that, compared to an asynchronous design, the proposed reduced synchrony implementation achieves a higher throughput, up to 2 giga-requests per second on an 8-leaf binary tree. Our circuit also consumes less power than the synchronous design, and requires less silicon area than both the synchronous and asynchronous designs.
%V 2
%P II-441-4 Vol.2
%8 2004/05//
%G eng
%R 10.1109/ISCAS.2004.1329303
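The n-leaf arbitration tree described above can be illustrated purely functionally, ignoring the circuit-level reduced-synchrony aspect that is the paper's actual contribution: a tournament in which each internal node alternates between its two children to keep grants fair. The round-robin policy and all names here are illustrative assumptions:

```python
# Behavioral sketch of an n-leaf arbitrate-and-move tree (functional model
# only; the paper's contribution is the circuit-level reduced-synchrony
# design). Each internal node holds a priority bit flipped after each grant.

def arbitrate(requests, lo, hi, state):
    """Pick one requesting leaf in [lo, hi); state holds per-node priority."""
    if hi - lo == 1:
        return lo if requests[lo] else None
    mid = (lo + hi) // 2
    left = arbitrate(requests, lo, mid, state)
    right = arbitrate(requests, mid, hi, state)
    if left is None or right is None:
        return left if right is None else right
    # Both subtrees request: grant the preferred side, then flip priority.
    prefer_left = state.setdefault((lo, hi), True)
    state[(lo, hi)] = not prefer_left
    return left if prefer_left else right

state = {}
reqs = [True, False, True, True]          # leaves 0, 2, 3 request
winner = arbitrate(reqs, 0, len(reqs), state)
print(winner)  # 0 on the first pass (root initially prefers its left subtree)
```

Repeated calls with the same requests rotate the grant across subtrees, which is the fairness property the tree structure is meant to provide.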
%0 Conference Paper
%B Image Analysis and Processing, 2003. Proceedings. 12th International Conference on
%D 2003
%T Depth-first k-nearest neighbor finding using the MaxNearestDist estimator
%A Samet, Hanan
%K MaxNearestDist estimator
%K k-nearest neighbor finding
%K depth-first branch-and-bound search
%K similarity searching
%K maximum possible distance
%K parameter estimation
%K data mining
%K pattern matching
%K query processing
%K tree searching
%K images
%K video
%K time series
%K text documents
%K DNA sequences
%X Similarity searching is an important task when trying to find patterns in applications which involve mining different types of data such as images, video, time series, text documents, DNA sequences, etc. Similarity searching often reduces to finding the k nearest neighbors to a query object. A description is given of how to use an estimate of the maximum possible distance at which a nearest neighbor can be found to prune the search process in a depth-first branch-and-bound k-nearest neighbor finding algorithm. Using the MaxNearestDist estimator (Larsen, S. and Kanal, L.N., 1986) in the depth-first k-nearest neighbor algorithm provides a middle ground between a pure depth-first and a best-first k-nearest neighbor algorithm.
%P 486 - 491
%8 2003/09//
%G eng
%R 10.1109/ICIAP.2003.1234097
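A generic depth-first branch-and-bound k-nearest-neighbor search over a k-d tree, of the kind the abstract refines, can be sketched as follows. This version prunes only with the standard lower-bound (MinDist) test on the splitting plane; the paper's MaxNearestDist estimator additionally supplies an upper bound that tightens the k-th-distance estimate earlier in the descent:

```python
import heapq

# Depth-first branch-and-bound k-NN on a k-d tree (generic sketch; the
# paper's MaxNearestDist estimator strengthens the pruning shown here).

def build(points, depth=0):
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return (points[mid], axis,
            build(points[:mid], depth + 1),
            build(points[mid + 1:], depth + 1))

def knn(node, q, k, heap=None):
    """heap holds (-dist2, point); its root gives the current k-th distance."""
    if heap is None:
        heap = []
    if node is None:
        return heap
    point, axis, left, right = node
    d2 = sum((a - b) ** 2 for a, b in zip(point, q))
    if len(heap) < k:
        heapq.heappush(heap, (-d2, point))
    elif d2 < -heap[0][0]:
        heapq.heapreplace(heap, (-d2, point))
    near, far = (left, right) if q[axis] <= point[axis] else (right, left)
    knn(near, q, k, heap)
    # MinDist prune: visit the far side only if the splitting plane is
    # closer to the query than the current k-th nearest distance.
    if len(heap) < k or (q[axis] - point[axis]) ** 2 < -heap[0][0]:
        knn(far, q, k, heap)
    return heap

pts = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build(pts)
result = sorted(p for _, p in knn(tree, (6, 3), 2))
print(result)
```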
%0 Conference Paper
%B Data Engineering, 2003. Proceedings. 19th International Conference on
%D 2003
%T PXML: a probabilistic semistructured data model and algebra
%A Hung, E.
%A Getoor, Lise
%A Subrahmanian, V. S.
%K PXML
%K probabilistic semistructured data model
%K relational algebra
%K query processing
%K probabilistic models
%K semistructured instances
%K tree data structures
%K XML
%K databases
%X Despite the recent proliferation of work on semistructured data models, there has been little work to date on supporting uncertainty in these models. We propose a model for probabilistic semistructured data (PSD). The advantage of our approach is that it supports a flexible representation that allows the specification of a wide class of distributions over semistructured instances. We provide two semantics for the model and show that the semantics are probabilistically coherent. Next, we develop an extension of the relational algebra to handle probabilistic semistructured data and describe efficient algorithms for answering queries that use this algebra. Finally, we present experimental results showing the efficiency of our algorithms.
%P 467 - 478
%8 2003/03//
%G eng
%R 10.1109/ICDE.2003.1260814
%0 Conference Paper
%B Scientific and Statistical Database Management, 2002. Proceedings. 14th International Conference on
%D 2002
%T Efficient techniques for range search queries on earth science data
%A Shi, Qingmin
%A JaJa, Joseph F.
%K range search queries
%K Earth science raster data
%K tree-of-regions
%K content based retrieval
%K data mining tasks
%K natural sciences computing
%K query processing
%K spatial factors
%K temporal data
%K large scale data
%K visual databases
%K data structures
%X We consider the problem of organizing large scale earth science raster data to efficiently handle queries for identifying regions whose parameters fall within certain range values specified by the queries. This problem seems to be critical to enabling basic data mining tasks such as determining associations between physical phenomena and spatial factors, detecting changes and trends, and content based retrieval. We assume that the input is too large to fit in internal memory and hence focus on data structures and algorithms that minimize the I/O bounds. A new data structure, called a tree-of-regions (ToR), is introduced and involves a combination of an R-tree and efficient representation of regions. It is shown that such a data structure enables the handling of range queries in an optimal I/O time, under certain reasonable assumptions. We also show that updates to the ToR can be handled efficiently. Experimental results for a variety of multi-valued earth science data illustrate the fast execution times of a wide range of queries, as predicted by our theoretical analysis.
%P 142 - 151
%8 2002///
%G eng
%R 10.1109/SSDM.2002.1029714
%0 Conference Paper
%B Information Visualization, 2002. INFOVIS 2002. IEEE Symposium on
%D 2002
%T SpaceTree: supporting exploration in large node link tree, design evolution and empirical evaluation
%A Plaisant, Catherine
%A Grosjean, J.
%A Bederson, Benjamin B.
%K SpaceTree
%K tree browser
%K node link diagrams
%K large node link tree
%K exploration
%K dynamic rescaling
%K optimized camera movement
%K preview icons
%K integrated search
%K filter functions
%K design evolution
%K tree topology
%K graphical user interfaces
%K data visualisation
%K data structures
%X We present a novel tree browser that builds on conventional node link tree diagrams. It adds dynamic rescaling of branches of the tree to best fit the available screen space, optimized camera movement, and the use of preview icons summarizing the topology of the branches that cannot be expanded. In addition, it includes integrated search and filter functions. This paper reflects on the evolution of the design and highlights the principles that emerged from it. A controlled experiment showed benefits for navigation to previously visited nodes and for estimation of overall tree topology.
%P 57 - 64
%8 2002///
%G eng
%R 10.1109/INFVIS.2002.1173148
%0 Conference Paper
%B Foundations of Computer Science, 1996. Proceedings., 37th Annual Symposium on
%D 1996
%T Efficient approximate and dynamic matching of patterns using a labeling paradigm
%A Sahinalp, S. C.
%A Vishkin, Uzi
%K labeling paradigm
%K approximate string matching
%K dynamic dictionary matching
%K dynamic text indexing
%K suffix tree construction
%K optimal parallel algorithm
%K string processing algorithmics
%K inserted characters
%K deleted characters
%K replaced characters
%K substrings
%K computational complexity
%K indexing
%K parallel algorithms
%K pattern matching
%K tree data structures
%X A key approach in string processing algorithmics has been the labeling paradigm, which is based on assigning labels to some of the substrings of a given string. If these labels are chosen consistently, they can enable fast comparisons of substrings. Until the first optimal parallel algorithm for suffix tree construction was given by the authors in 1994, the labeling paradigm was considered not to be competitive with other approaches. They show that this general method is also useful for several central problems in the area of string processing: approximate string matching, dynamic dictionary matching, and dynamic text indexing. The approximate string matching problem deals with finding all substrings of a text which match a pattern "approximately", i.e., with at most m differences. The differences can be in the form of inserted, deleted, or replaced characters. The text indexing problem deals with finding all occurrences of a pattern in a text, after the text is preprocessed. In the dynamic text indexing problem, updates to the text in the form of insertions and deletions of substrings are permitted. The dictionary matching problem deals with finding all occurrences of each pattern of a set of patterns in a text, after the pattern set is preprocessed. In the dynamic dictionary matching problem, insertions and deletions of patterns to the pattern set are permitted.
%P 320 - 328
%8 1996/10//
%G eng
%R 10.1109/SFCS.1996.548491
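The consistent-labeling idea at the heart of the paradigm can be illustrated with Karp-Miller-Rosenberg-style naming: give every power-of-two-length substring an integer label so that equal substrings get equal labels, after which any two equal-length substrings compare in O(1) via two overlapping blocks. This is a textbook sequential sketch, not the paper's optimal parallel construction:

```python
# Karp-Miller-Rosenberg-style consistent labeling (sequential sketch; the
# paper obtains such labels via an optimal parallel suffix-tree algorithm).

def build_labels(s):
    """labels[k][i] = integer name of s[i : i + 2**k]; equal substrings
    of the same power-of-two length receive equal names."""
    n = len(s)
    labels = [[ord(c) for c in s]]
    half = 1
    while half * 2 <= n:
        prev = labels[-1]
        pairs = [(prev[i], prev[i + half]) for i in range(n - 2 * half + 1)]
        names = {p: j for j, p in enumerate(sorted(set(pairs)))}
        labels.append([names[p] for p in pairs])
        half *= 2
    return labels

def substr_eq(labels, i, j, length):
    """Compare s[i:i+length] and s[j:j+length] in O(1) using two
    (possibly overlapping) power-of-two blocks."""
    if length == 0:
        return True
    k = length.bit_length() - 1          # largest power of two <= length
    p = 1 << k
    L = labels[k]
    return L[i] == L[j] and L[i + length - p] == L[j + length - p]

labels = build_labels("abracadabra")
print(substr_eq(labels, 0, 7, 4))   # "abra" vs "abra" -> True
print(substr_eq(labels, 0, 3, 4))   # "abra" vs "acad" -> False
```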
%0 Conference Paper
%B Data Compression Conference, 1993. DCC '93.
%D 1993
%T Algorithms for fast vector quantization
%A Arya, S.
%A Mount, Dave
%K fast vector quantization
%K neighborhood graph search algorithm
%K directed graph
%K tree search algorithms
%K running time
%K vector quantisation
%K search problems
%K graphs
%K performance
%X This paper shows that if one is willing to relax the requirement of finding the true nearest neighbor, it is possible to achieve significant improvements in running time with only a very small loss in the performance of the vector quantizer. The authors present three algorithms for nearest neighbor searching: standard and priority k-d tree search algorithms, and a neighborhood graph search algorithm in which a directed graph is constructed for the point set and edges join neighboring points.
%P 381 - 390
%8 1993///
%G eng
%R 10.1109/DCC.1993.253111
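The neighborhood-graph idea from the abstract — build a directed graph whose edges join neighboring points, then search it — can be illustrated with a greedy descent. For this sketch the graph is built by brute force and made complete so the walk cannot stall in a local minimum; the paper's construction and search are more refined:

```python
# Greedy nearest-neighbor search on a neighborhood graph (illustrative
# sketch; with the complete graph used below the search is exact, which
# is not guaranteed for sparser out-degrees).

def dist2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def build_graph(points, m):
    """Directed graph: each point gets edges to its m nearest others."""
    return {p: sorted((q for q in points if q != p),
                      key=lambda q: dist2(p, q))[:m]
            for p in points}

def greedy_search(graph, start, query):
    """Walk to the neighbor closest to the query until none improves."""
    cur = start
    while True:
        best = min(graph[cur], key=lambda q: dist2(q, query))
        if dist2(best, query) >= dist2(cur, query):
            return cur
        cur = best

pts = [(0, 0), (1, 0), (2, 2), (5, 5), (6, 5), (9, 9)]
g = build_graph(pts, m=len(pts) - 1)      # complete graph for the demo
print(greedy_search(g, (0, 0), (6, 6)))   # -> (6, 5)
```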
%0 Journal Article
%J Signal Processing, IEEE Transactions on
%D 1993
%T VLSI implementation of a tree searched vector quantizer
%A Kolagotla, R. K.
%A Yu, S.-S.
%A JaJa, Joseph F.
%K tree searched vector quantizer
%K VLSI implementation
%K VLSI design
%K 2 micron
%K 20 MHz
%K data compression
%K image coding
%K digital signal processing chips
%K vector quantisation
%K trees (mathematics)
%X The VLSI design and implementation of a tree-searched vector quantizer is presented. The number of processors needed is equal to the depth of the tree. All processors are identical, and data flow between processors is regular. No global control signals are needed. The processors have been fabricated using a 2 µm N-well process on a 7.9 × 9.2 mm die. Each processor chip contains 25,000 transistors and has 84 pins. The processors have been thoroughly tested at a clock frequency of 20 MHz.
%V 41
%P 901 - 905
%8 1993/02//
%@ 1053-587X
%G eng
%N 2
%R 10.1109/78.193225
%0 Conference Paper
%B Software Reliability Engineering, 1992. Proceedings., Third International Symposium on
%D 1992
%T An improved classification tree analysis of high cost modules based upon an axiomatic definition of complexity
%A Tian, Jianhui
%A Porter, Adam
%A Zelkowitz, Marvin V.
%K classification tree analysis
%K high cost modules
%K axiomatic definition
%K decision tree model
%K program complexity
%K computational complexity
%K software metrics
%K software reliability
%K overall system reliability
%K subroutines
%K trees (mathematics)
%K classification
%X Identification of high cost modules has been viewed as one mechanism to improve overall system reliability, since such modules tend to produce more than their fair share of problems. A decision tree model has previously been used to identify such modules. In this paper, a previously developed axiomatic model of program complexity is merged with the decision tree process, improving the ability to identify such modules. This improvement has been tested using data from the NASA Software Engineering Laboratory.
%P 164 - 172
%8 1992/10//
%G eng
%R 10.1109/ISSRE.1992.285848