A clustering approach to identify multidimensional poverty indicators for the bottom 40 percent group.

Mariah Abdul Rahman, Nor Samsiah Sani, Rusnita Hamdan, Zulaiha Ali Othman, Azuraliza Abu Bakar
Published in: PLoS ONE (2021)
The Multidimensional Poverty Index (MPI) is an income-based poverty index that measures multiple deprivations alongside other relevant factors to determine and classify poverty. Implementing a reliable MPI is one of the Malaysian government's significant efforts to improve poverty-alleviation measures, in line with the recent policy for the Bottom 40 Percent (B40) group. However, under this measurement only 0.86% of Malaysians are regarded as multidimensionally poor, and the measurement has been claimed to be irrelevant for Malaysia, a country with rapid economic development. This study therefore proposes a K-Means clustering architecture with cosine similarity for the B40 group, to identify the right indicators and dimensions for a data-driven MPI measurement. To evaluate the approach, the study conducted extensive experiments on the Malaysian Census dataset. A series of data preprocessing steps was implemented: data integration, attribute generation, data filtering, data cleaning, data transformation and attribute selection. The clustering model produced eight clusters of the B40 group, and a comprehensive clustering analysis was carried out to understand each cluster. The analysis discovered seven indicators of multidimensional poverty from three dimensions: education, living standard and employment. Of these seven indicators, the study proposes adding six to the current MPI to give a more meaningful picture of the current poverty trend in Malaysia. The outcomes of this study may help the government properly identify members of the B40 group who suffer from financial burden and who may currently be misclassified.
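The abstract names K-Means with cosine similarity but gives no implementation detail, so the following is only a minimal sketch of that general technique (often called spherical K-Means), not the authors' method. The function name `cosine_kmeans`, its parameters, and the toy data are all hypothetical; the rows stand in for preprocessed numeric attributes and are not taken from the Malaysian Census dataset.

```python
import numpy as np

def cosine_kmeans(X, k, n_iter=100, seed=0):
    """Cluster rows of X by cosine similarity (hypothetical helper, illustration only)."""
    rng = np.random.default_rng(seed)
    # Scale every row to unit length so a dot product equals cosine similarity.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.where(norms == 0, 1.0, norms)
    # Start from k distinct rows chosen at random.
    centroids = Xn[rng.choice(len(Xn), size=k, replace=False)]
    labels = np.full(len(Xn), -1)
    for _ in range(n_iter):
        # Assign each row to the centroid with the highest cosine similarity.
        new_labels = (Xn @ centroids.T).argmax(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable
        labels = new_labels
        for j in range(k):
            members = Xn[labels == j]
            if len(members) == 0:
                continue  # keep the previous centroid for an empty cluster
            total = members.sum(axis=0)
            length = np.linalg.norm(total)
            if length > 0:
                # Re-normalise the mean direction back onto the unit sphere.
                centroids[j] = total / length
    return labels, centroids

# Toy data (invented): two groups of rows pointing in different directions.
X = np.array([
    [1.0, 0.1], [2.0, 0.3], [3.0, 0.1],
    [0.1, 1.0], [0.3, 2.0], [0.1, 3.0],
])
labels, centroids = cosine_kmeans(X, k=2)
```

Because rows are compared by direction rather than magnitude, two records with the same attribute profile at different scales fall into the same cluster. The study reports eight clusters, so `k=8` would be the corresponding setting on real data; the initialisation and stopping rule here are assumptions.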
Keyphrases
  • healthcare
  • electronic health record
  • big data
  • mental health
  • primary care
  • rna seq
  • physical activity
  • young adults
  • machine learning
  • risk factors
  • data analysis
  • health insurance