| Title |
Study on Office Vacancy Rate Prediction Models in Korea using Machine Learning |
| DOI |
https://doi.org/10.5659/JAIK.2026.42.5.241 |
| Keywords |
Office Vacancy Rate; Commercial Real Estate; Machine Learning; Panel Data Analysis; Forecasting Models |
| Abstract |
This study develops and evaluates prediction models for office vacancy rates in the Seoul office market by integrating traditional panel
regression methods with machine learning-based approaches. Office vacancy rates are a key indicator reflecting the balance between supply
and demand in commercial real estate markets, and they are closely linked to macroeconomic conditions, financial environments, and
corporate space demand. While previous studies have primarily relied on linear regression, time-series, and panel models to identify the
determinants of vacancy rates, such approaches are limited in capturing nonlinear relationships and complex interactions among variables that
have become increasingly prominent in recent office markets. Using a building-level panel dataset covering the period from the first quarter
of 2017 to the fourth quarter of 2024, this study constructs office building-quarter observations for Seoul. The dependent variable is the
vacancy rate measured at the individual building level, while explanatory variables include major macroeconomic indicators (consumer price
index, money supply, interest rates, industrial production, unemployment rate) as well as office market variables and building characteristics.
A fixed-effects panel regression model is employed as a benchmark explanatory model, based on the results of the Hausman test. In addition,
machine learning-based prediction models (LightGBM, XGBoost) are applied to capture nonlinear relationships and interaction effects among
variables. Prediction performance is evaluated using RMSE and MAE under a time-based train-test split to reflect realistic forecasting
conditions. The empirical results show that both LightGBM and XGBoost significantly outperform the traditional fixed-effects panel model in
terms of prediction accuracy, with XGBoost exhibiting the best overall performance. Furthermore, SHAP analysis reveals that macroeconomic
variables such as inflation, operating yields, interest rates, and industrial activity play a dominant role in explaining vacancy rate fluctuations,
and that their effects operate in a nonlinear and state-dependent manner. These findings suggest that office vacancy rates are not merely the
outcome of linear supply-demand adjustments, but rather the result of complex, nonlinear interactions among macroeconomic, financial, and
market-specific factors. This study contributes to the literature by extending office vacancy research from an explanation-oriented framework
to an explainable prediction framework, and by demonstrating the practical value of machine learning-based models for commercial real
estate market analysis and policy-relevant forecasting. |
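
The evaluation design described in the abstract (a time-based train-test split scored with RMSE and MAE) can be illustrated with a minimal sketch. This is not the authors' code: the toy panel, the cutoff quarter, the field names, and the mean-of-training placeholder predictor are all hypothetical and only show how a chronological split avoids evaluating on quarters the model has already seen.

```python
import math

def time_based_split(rows, cutoff):
    """Split building-quarter rows chronologically.

    Rows with quarter <= cutoff go to train, later quarters go to test.
    Quarters are (year, quarter) tuples, so tuple comparison is chronological.
    """
    train = [r for r in rows if r["quarter"] <= cutoff]
    test = [r for r in rows if r["quarter"] > cutoff]
    return train, test

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical toy panel: two buildings observed over four quarters.
# Vacancy values are made up for illustration, not taken from the study.
panel = [
    {"building": "A", "quarter": (2024, 1), "vacancy": 5.0},
    {"building": "A", "quarter": (2024, 2), "vacancy": 6.0},
    {"building": "A", "quarter": (2024, 3), "vacancy": 7.0},
    {"building": "A", "quarter": (2024, 4), "vacancy": 9.0},
    {"building": "B", "quarter": (2024, 1), "vacancy": 10.0},
    {"building": "B", "quarter": (2024, 2), "vacancy": 10.0},
    {"building": "B", "quarter": (2024, 3), "vacancy": 12.0},
    {"building": "B", "quarter": (2024, 4), "vacancy": 11.0},
]

# Assumed cutoff: train on Q1-Q3, test on Q4.
train, test = time_based_split(panel, cutoff=(2024, 3))

# Placeholder predictor (an assumption, standing in for any fitted model):
# predict the mean training vacancy rate for every test observation.
train_mean = sum(r["vacancy"] for r in train) / len(train)
y_true = [r["vacancy"] for r in test]
y_pred = [train_mean] * len(test)

print(f"RMSE: {rmse(y_true, y_pred):.3f}")
print(f"MAE:  {mae(y_true, y_pred):.3f}")
```

In a real comparison, the placeholder would be replaced by each fitted model (fixed-effects panel, LightGBM, XGBoost), with all models scored on the same held-out quarters.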