Title |
Comparison of Chlorophyll-a Prediction and Analysis of Influential Factors in Yeongsan River Using Machine Learning and Deep Learning |
Authors |
심선희 ( Sun-Hee Shim ) ; 김유흔 ( Yu-Heun Kim ) ; 이혜원 ( Hye Won Lee ) ; 김민 ( Min Kim ) ; 최정현 ( Jung Hyun Choi ) |
DOI |
https://doi.org/10.15681/KSWE.2022.38.6.292 |
Keywords |
Chlorophyll-a; Deep learning; Feature importance; Machine learning; Yeongsan River |
Abstract |
The Yeongsan River, one of the four largest rivers in South Korea, has faced persistent water quality management problems caused by algal blooms. These blooms have intensified, especially since the construction of two weirs on the mainstream of the Yeongsan River. Prediction of Chlorophyll-a (Chl-a) concentration and analysis of the factors that drive it are therefore needed for effective water quality management. In this study, Chl-a prediction models were developed using machine learning and deep learning methods, namely Deep Neural Network (DNN), Random Forest (RF), and eXtreme Gradient Boosting (XGBoost), and their performance was evaluated. In addition, the results of correlation analysis and feature importance were compared to identify the major factors affecting Chl-a concentration. All models showed high prediction performance, with R² values of 0.9 or higher. In particular, XGBoost showed the highest prediction accuracy, 0.95, on the test data. The feature importance results indicated that ammonia (NH3-N) and phosphate (PO4-P) were major factors common to all three models and are therefore important for managing Chl-a concentration. These results confirm that the three methods, DNN, RF, and XGBoost, are powerful tools for predicting water quality parameters. Furthermore, comparing feature importance with correlation analysis can provide a more accurate assessment of the major influencing factors. |
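As an illustration of the two quantities the abstract relies on, the following minimal sketch computes R² for a set of predictions and the Pearson correlation between one input variable and Chl-a. It assumes the standard coefficient of determination and the Pearson coefficient; the abstract does not state which definitions the authors used. The function names and all numeric values are hypothetical and are not taken from the study.

```python
from math import sqrt


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical values for illustration only (not data from the paper).
observed_chla = [12.0, 30.5, 8.2, 45.1, 22.3]   # measured Chl-a
predicted_chla = [13.1, 28.9, 9.0, 43.0, 24.0]  # output of some model
nh3_n = [0.10, 0.32, 0.05, 0.51, 0.20]          # ammonia nitrogen

print("R^2:", round(r_squared(observed_chla, predicted_chla), 3))
print("Pearson r (NH3-N vs Chl-a):", round(pearson_r(nh3_n, observed_chla), 3))
```

A correlation coefficient captures only the linear association between one variable and Chl-a, whereas a model's feature importance reflects how much that variable contributes to the fitted predictions, which is why the two rankings can differ and are worth comparing.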