Title Strategies to Enhance Performance in Machine Learning-based Construction Duration Estimation Through Imputation of Missing Data in Actual Construction Duration Dataset
Authors 이하늘(Lee, Ha-Neul) ; 강윤호(Kang, Yun-Ho) ; 윤영채(Yun, Yeong-Chae) ; 윤석헌(Yun, Seok-Heon)
DOI https://doi.org/10.5659/JAIK.2024.40.2.267
Page pp.267-273
ISSN 2733-6247
Keywords Machine learning, Missing values, Data preprocessing, Construction duration
Abstract In construction projects, the construction duration is often set from the project manager's experience or past construction records rather than from a quantitative analysis of the workload. Accurate prediction requires estimates based on actual construction duration data that account for the workload. However, training machine learning models to predict construction duration requires large datasets, and missing values are a common problem in such data. This study aims to improve the learning performance of construction duration prediction models by applying and comparing several imputation methods at the data preprocessing stage. Imputation methods suitable for machine learning model training were proposed based on the average error rate. The results showed that median imputation was the most suitable single imputation method, while random forest regression imputation performed best among the multiple imputation methods. In addition, as the volume of data increased, the regression imputation methods within multiple imputation proved more suitable than the single imputation methods.
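The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline. The feature name `floor_area`, the duration values, and the choice of which entries to hide are all made up for illustration. A simple least-squares line stands in for the random forest regression imputer named in the abstract, and the error is measured directly on the imputed values, whereas the study may have measured the error of the downstream prediction model.

```python
import statistics

def median_impute(values):
    """Replace None entries with the median of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = statistics.median(observed)
    return [fill if v is None else v for v in values]

def regression_impute(xs, ys):
    """Fill None entries in ys from a least-squares line fitted on complete pairs.

    Stand-in for a regression-based imputer; the abstract names random forest
    regression, which is not implemented here.
    """
    pairs = [(x, y) for x, y in zip(xs, ys) if y is not None]
    mean_x = statistics.fmean(x for x, _ in pairs)
    mean_y = statistics.fmean(y for _, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [intercept + slope * x if y is None else y for x, y in zip(xs, ys)]

def mean_abs_pct_error(true_vals, filled_vals, missing_idx):
    """Average error rate (percent) over the positions that were imputed."""
    errors = [abs(filled_vals[i] - true_vals[i]) / abs(true_vals[i])
              for i in missing_idx]
    return 100.0 * statistics.fmean(errors)

# Hypothetical toy data: gross floor area (m^2) and actual duration (days).
floor_area = [1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000]
duration_true = [152, 198, 251, 303, 348, 402, 449, 501]

# Hide two known durations so the imputed values can be scored against them.
missing_idx = [1, 6]
duration_obs = [None if i in missing_idx else d
                for i, d in enumerate(duration_true)]

median_filled = median_impute(duration_obs)
regression_filled = regression_impute(floor_area, duration_obs)

median_error = mean_abs_pct_error(duration_true, median_filled, missing_idx)
regression_error = mean_abs_pct_error(duration_true, regression_filled, missing_idx)

print(f"median imputation error:     {median_error:.1f}%")
print(f"regression imputation error: {regression_error:.1f}%")
```

On this toy data the regression fill has the lower error because duration is almost linear in floor area by construction. That outcome reflects the made-up data only and says nothing about the study's results.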