Evaluating machine learning models to classify occupants' perceptions of their indoor environment and sleep quality from indoor air quality.

Hagen Fritz, Mengjia Tang, Kerry Kinney, Zoltan Nagy

Published in: Journal of the Air & Waste Management Association (1995) (2022)

A variety of factors can affect a person's perception of their environment and health, but one factor that is often overlooked in indoor settings is air quality. To address this gap, we develop and evaluate four Machine Learning (ML) models on two disparate datasets, using Indoor Air Quality (IAQ) parameters as primary features and components of self-reported IAQ satisfaction and sleep quality as target variables. In each case, we compare the models to each other as well as to a simple model that always predicts the majority outcome. In the first analysis, we use open-source data collected from 93 California residences to predict occupants' satisfaction with their indoor environment. Results indicate that building ventilation rate, Relative Humidity (RH), and formaldehyde are the most influential predictors of IAQ perception, and the models achieve an accuracy greater than that of the simplified model. The second analysis uses IAQ data gathered from a field study we conducted with 20 participants over 11 weeks to train similar models. We obtain accuracy and F1 scores similar to those of the simplified model, with PM2.5 and total volatile organic compounds (TVOCs) as the most important predictors. Our results underscore the ability of IAQ to affect a person's perception of their built environment and health and highlight the utility of ML models to explore the strength of these relationships.

Implications: The results from this study show that two outcome variables, occupants' IAQ satisfaction and perceived sleep quality, are related to the measured IAQ parameters but are not heavily influenced by the typical values measured in apartments and homes. This study highlights the utility of machine learning models as exploratory analysis tools, both for determining underlying relationships within and across datasets and for understanding the importance of individual features to the outcome variable.
We compare four different models and find that the random forest classifier has the best performance in both analyses, on IAQ satisfaction and on perceived sleep quality. It is a suitable model for predicting IAQ-related subjective metrics and also provides valuable insight into the feature importance of the IAQ parameters. The accuracy of any of these machine learning models in predicting occupants' comfort or sleep quality is limited by the dataset size, how the data are collected, and the range of the data. This study identifies the factors that are important to IAQ perception: ventilation rate, relative humidity, and concentrations of formaldehyde, NO2, and particulate matter. It indicates that sensors that can measure these variables are necessary for future, related studies that model occupants' IAQ satisfaction. However, this study does not find strong relationships between any of the measured IAQ parameters and perceived sleep quality, despite the logical pathway between many of these pollutants and respiratory issues. A prediction model of IAQ perception or sleep quality can be integrated into home management systems to automatically adjust building operations, such as ventilation rates, in smart buildings. Once buildings are equipped with a network of low-cost sensors that measure pollutant concentrations and the operating conditions of the ventilation system, the prediction model can be used to predict the occupants' comfort and facilitate control of the ventilation system.
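The comparison against a model that always predicts the majority outcome can be illustrated with a minimal sketch. It is not the study's code: the labels, the classifier output, and the helper names (`majority_baseline`, `accuracy`, `f1`) are hypothetical, and the F1 score is computed for a single positive class, which may differ from the averaging used in the paper.

```python
from collections import Counter


def majority_baseline(train_labels, n_predictions):
    """Always predict the most common label seen in the training data."""
    majority_label = Counter(train_labels).most_common(1)[0][0]
    return [majority_label] * n_predictions


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1(y_true, y_pred, positive=1):
    """F1 score for one positive class; defined as 0.0 when there are no true positives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: 1 = dissatisfied with IAQ, 0 = satisfied.
y_train = [0] * 7 + [1] * 3
y_test = [0, 0, 0, 1, 0, 1, 0, 0, 1, 0]

# Hypothetical output of a trained classifier on the test set.
model_pred = [0, 0, 0, 1, 0, 0, 0, 1, 1, 0]
baseline_pred = majority_baseline(y_train, len(y_test))

for name, pred in [("baseline", baseline_pred), ("model", model_pred)]:
    print(f"{name}: accuracy={accuracy(y_test, pred):.2f}, f1={f1(y_test, pred):.2f}")
```

With imbalanced outcomes, the majority baseline can reach a respectable accuracy while its F1 score for the minority class is zero, which is one reason to report both metrics when judging whether a model adds anything beyond the baseline.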