# Streaming algorithms for k-center clustering with outliers and with anonymity

Title | Streaming algorithms for k-center clustering with outliers and with anonymity |

Publication Type | Journal Articles |

Year of Publication | 2008 |

Authors | Matthew McCutchen R, Khuller S |

Journal | Approximation, Randomization and Combinatorial Optimization. Algorithms and Techniques |

Pagination | 165 - 178 |

Date Published | 2008/// |

Abstract | Clustering is a common problem in the analysis of large data sets. Streaming algorithms, which make a single pass over the data set using small working memory and produce a clustering comparable in cost to the optimal offline solution, are especially useful. We develop the first streaming algorithms achieving a constant-factor approximation to the cluster radius for two variations of the k-center clustering problem. We give a streaming (4 + ε)-approximation algorithm using O(ε − 1 kz) memory for the problem with outliers, in which the clustering is allowed to drop up to z of the input points; previous work used a random sampling approach which yields only a bicriteria approximation. We also give a streaming (6 + ε)-approximation algorithm using O(ε − 1 ln (ε − 1) k + k 2) memory for a variation motivated by anonymity considerations in which each cluster must contain at least a certain number of input points. |

DOI | 10.1007/978-3-540-85363-3_14 |