The central dilemma in classification tasks is to find, among the many combinations of methods, techniques, and parameter values, a classifier model structure that achieves the best accuracy and efficiency. The aim of this article is to develop and practically verify a framework for the multi-criteria evaluation of classification models for credit scoring. The framework is based on the Multi-Criteria Decision Making (MCDM) method PROSA (PROMETHEE for Sustainability Analysis), which adds value to the modelling process by allowing the assessment of classifiers to take into account the consistency of results between the training set and the validation set, as well as the consistency of classification results for data acquired in different time periods. The study considered two aggregation scenarios, TSC (Time periods, Sub-criteria, Criteria) and SCT (Sub-criteria, Criteria, Time periods), which produced very similar evaluations of the classification models. The leading positions in the ranking were taken by borrower classification models that use logistic regression and a small number of predictive variables. The rankings obtained were compared with the assessments of an expert team and turned out to be very similar to them.