Scott Chen, Ramesh Gopinath
High dimensional data modeling is difficult mainly because of the so-called "curse of dimensionality". We propose a technique called "Gaussianization" for high dimensional density estimation, which alleviates the curse of dimensionality by exploiting the independence structures in the data. Gaussianization is motivated by recent developments in the statistics literature: projection pursuit, independent component analysis and Gaussian mixture models with semi-tied covariances. We propose an iterative Gaussianization procedure which converges weakly: at each iteration, the data is first transformed to the least dependent coordinates and then each coordinate is marginally Gaussianized by univariate techniques. Gaussianization offers density estimation sharper than traditional kernel methods and radial basis function methods. Gaussianization can be viewed as an efficient solution of nonlinear independent component analysis and high dimensional projection pursuit.
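
The two-step iteration described above (a linear change of coordinates followed by univariate Gaussianization of each coordinate) can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on two loudly stated assumptions: a PCA rotation stands in for the "least dependent coordinates" step (PCA removes only second-order dependence, whereas the abstract points to independent component analysis), and a rank-based empirical CDF stands in for the unspecified "univariate techniques". The function names `marginal_gaussianize` and `gaussianize` are hypothetical.

```python
import numpy as np
from statistics import NormalDist


def marginal_gaussianize(x):
    """Map a 1-D sample to approximately N(0, 1) using its empirical CDF.

    Assumption: a rank transform replaces the paper's univariate technique.
    """
    n = len(x)
    ranks = np.argsort(np.argsort(x))      # rank of each point, 0 .. n-1
    u = (ranks + 0.5) / n                  # empirical CDF values, strictly in (0, 1)
    inv_cdf = NormalDist().inv_cdf         # standard normal quantile function
    return np.array([inv_cdf(float(p)) for p in u])


def gaussianize(X, n_iter=10):
    """Alternate a linear transform with coordinate-wise Gaussianization."""
    Z = np.asarray(X, dtype=float).copy()
    for _ in range(n_iter):
        # Linear step. Assumption: PCA rotation as a cheap stand-in for ICA.
        Zc = Z - Z.mean(axis=0)
        _, _, Vt = np.linalg.svd(Zc, full_matrices=False)
        Z = Zc @ Vt.T
        # Marginal step: Gaussianize each coordinate independently.
        Z = np.column_stack(
            [marginal_gaussianize(Z[:, j]) for j in range(Z.shape[1])]
        )
    return Z


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x1 = rng.uniform(-1.0, 1.0, size=500)
    X = np.column_stack([x1, x1**2 + 0.1 * rng.normal(size=500)])
    Z = gaussianize(X, n_iter=5)
    print(Z.mean(axis=0), Z.std(axis=0))
```

Because the sketch ends each iteration with the marginal step, every output coordinate has a near-standard-normal marginal by construction. Whether the joint distribution approaches a Gaussian depends on the linear step, which is where a PCA rotation is weaker than the least-dependent coordinates the abstract calls for.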