Gaussian distributions have been commonly assumed when clustering functional data. When the normality condition fails, biased results will follow. Additional challenges occur as the number of the clusters is often unknown a priori. This paper focuses on clustering non-Gaussian functional data without the prior information of the number of clusters. We introduce a semiparametric mixed normal transformation model to accommodate non-Gaussian functional data, and propose a penalized approach to simultaneously estimate the parameters, transformation function, and the number of clusters. The estimators are shown to be consistent and asymptotically normal. The practical utility of the methods is confirmed via simulations as well as an application of the analysis of Alzheimer's disease study. The proposed method yields much less classification error than the existing methods. Data used in preparation of this paper were obtained from the Alzheimer's Disease Neuroimaging Initiative database.