Purpose: Previous studies have shown enhanced pitch perception and impaired time perception in individuals with autism spectrum disorders (ASD). However, it remains unclear whether these atypical, dimension-specific patterns of auditory processing transfer to higher-level linguistic processing of pitch and time. In this study, we compared the categorical perception (CP) of lexical tones and voice onset time (VOT) in Mandarin Chinese, which rely on pitch and time changes, respectively, to convey phonemic contrasts.

Method: Data were collected from 22 Mandarin-speaking adolescents with ASD and 20 age-matched neurotypical controls. In addition to the identification and discrimination tasks used to test CP performance, all participants were assessed for language ability and phonological working memory. Linear mixed-effects models were constructed to evaluate the identification and discrimination scores across groups and conditions.

Results: The basic CP pattern of a cross-boundary benefit in perceiving both native lexical tones and VOT was largely preserved in high-functioning adolescents with ASD. The degree of CP of lexical tones in the ASD group was similar to that in typical controls, whereas the degree of CP of VOT in the ASD group was greatly reduced. Furthermore, the degree of CP of lexical tones correlated with language ability and digit span in participants with ASD.

Conclusions: These findings suggest that the unbalanced acoustic processing capacities for pitch and time in ASD generalize to higher-level linguistic processing. Furthermore, a higher degree of CP of lexical tones correlated with better language ability in Mandarin-speaking individuals with ASD.