Cluster Analysis of Massive Datasets in Astronomy

Woncheol Jang

Duke University

March 2006

Clusters of galaxies are a useful proxy for tracing the mass distribution of the universe. By measuring the mass of clusters of galaxies at different scales, one can follow the evolution of the mass distribution (Martinez and Saar, 2002). It can be shown that finding galaxy clusters is equivalent to finding density contour clusters (Hartigan, 1975): connected components of the level set S_c = {f > c}, where f is a probability density function and c is a threshold level. Cuevas et al. (2000, 2001) proposed a nonparametric method for estimating density contour clusters, which identifies them by means of a minimal spanning tree. While their algorithm is conceptually simple, it is computationally intensive for large datasets. We propose a more efficient clustering method that builds on their algorithm and uses the Fast Fourier Transform (FFT). The method is applied to a study of galaxy clustering in data from a large astronomical sky survey.

Keywords: Density contour cluster; level set; clustering; Fast Fourier Transform.
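
As a rough illustration of the idea of density contour clusters, the sketch below estimates a density on a grid with an FFT-based convolution and then labels the connected components of the level set {f > c}. It is not the algorithm of the manuscript or of Cuevas et al.: it swaps the minimal spanning tree step for a simple flood fill over grid cells, and the function names (fft_kde_2d, label_level_set), the Gaussian kernel, the bandwidth, the grid size, the threshold c, and the synthetic two-blob data are all assumptions made for the example.

```python
from collections import deque

import numpy as np


def fft_kde_2d(points, grid_size=64, bandwidth=0.03, extent=(0.0, 1.0)):
    """Binned Gaussian kernel density estimate on a square grid, via FFT."""
    lo, hi = extent
    n, m = grid_size, 2 * grid_size  # m: zero-padded size, avoids wrap-around
    cell = (hi - lo) / n
    # Bin the points; counts[i, j] is the number of points in grid cell (i, j).
    counts, _, _ = np.histogram2d(
        points[:, 0], points[:, 1], bins=n, range=[[lo, hi], [lo, hi]]
    )
    # Gaussian kernel sampled at signed grid offsets, stored in FFT order
    # (offset 0 at index 0, negative offsets at the end of the array).
    offsets = np.fft.fftfreq(m, d=1.0 / m) * cell
    k1 = np.exp(-0.5 * (offsets / bandwidth) ** 2)
    kernel = np.outer(k1, k1)
    kernel /= kernel.sum() * cell ** 2  # discrete kernel integrates to 1
    padded = np.zeros((m, m))
    padded[:n, :n] = counts / len(points)
    # Convolution theorem: multiply the transforms, then invert.
    smooth = np.fft.irfft2(np.fft.rfft2(padded) * np.fft.rfft2(kernel), s=(m, m))
    # Crop the padding and clip tiny negative values caused by rounding.
    return np.clip(smooth[:n, :n], 0.0, None), cell


def label_level_set(density, c):
    """Label 4-connected components of the grid cells where density > c."""
    mask = density > c
    labels = np.zeros(mask.shape, dtype=int)
    n_labels = 0
    for start in zip(*np.nonzero(mask)):
        if labels[start]:
            continue  # cell already belongs to a labelled component
        n_labels += 1
        labels[start] = n_labels
        queue = deque([start])
        while queue:  # breadth-first flood fill from the seed cell
            i, j = queue.popleft()
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                a, b = i + di, j + dj
                inside = 0 <= a < mask.shape[0] and 0 <= b < mask.shape[1]
                if inside and mask[a, b] and not labels[a, b]:
                    labels[a, b] = n_labels
                    queue.append((a, b))
    return labels, n_labels


# Synthetic data (an assumption for illustration): two well-separated blobs.
rng = np.random.default_rng(0)
points = np.vstack([
    rng.normal([0.25, 0.25], 0.03, size=(500, 2)),
    rng.normal([0.75, 0.75], 0.03, size=(500, 2)),
])
density, cell = fft_kde_2d(points)
labels, n_clusters = label_level_set(density, c=5.0)
print("components of the level set:", n_clusters)
```

In this sketch the cost of the density estimate is dominated by the FFT on the grid rather than by pairwise distances between points, which is the kind of saving an FFT-based approach is meant to provide. Raising or lowering c changes which components appear, so in practice the threshold would need to be chosen with care.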


The manuscript is available in PDF format.