The Journal of
the Korean Society on Water Environment

Bimonthly
  • ISSN : 2289-0971 (Print)
  • ISSN : 2289-098X (Online)
  • KCI Accredited Journal

The Impact of Data Anomalies on the Performance of Machine Learning Models for Algae Bloom Prediction

이은지(Eunji Lee) ; 박정수(Jungsu Park)

https://doi.org/10.15681/KSWE.2025.41.5.313

Field data often contain various anomalies due to natural variability and errors from sensors and experimental procedures. Since these anomalies can negatively affect model performance, it is crucial to detect and handle them. This study developed machine learning models to predict chlorophyll-a, a quantitative indicator of algal blooms, using water quality data collected in the field from 2015 to 2024 as independent variables. It also analyzed the impact of anomaly removal through an anomaly detection algorithm on model performance. First, datasets were constructed by randomly introducing anomalies into 5%, 10%, 15%, and 20% of the original data. Then, the Isolation Forest (IForest), an anomaly detection algorithm, was employed to detect and remove these anomalies. The effect of anomaly removal was assessed by applying the cleaned data to Extreme Gradient Boosting (XGBoost), an ensemble machine learning algorithm. The model trained on the original data achieved a root mean squared error (RMSE) of 7.541, while the RMSE of models trained on data with anomalies ranged from 8.777 to 17.503. Models trained on datasets with lower anomaly ratios demonstrated better performance. In contrast, models trained on data from which anomalies had been removed using IForest showed RMSE values ranging from 7.645 to 8.067. Similarly, better performance was observed in models trained on data with lower anomaly ratios prior to removal, although the performance differences based on the proportion of anomalies were relatively small. The results of this study demonstrate that anomaly removal can enhance the performance of machine learning models.
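
The evaluation idea described above (corrupt a fixed share of the data, then compare errors) can be sketched in a few lines. This is only an illustrative sketch under stated assumptions: it does not reproduce the Isolation Forest or XGBoost steps, the function names `rmse` and `inject_anomalies` are hypothetical, the anomaly scheme (multiplying selected values by a constant) is an assumption not taken from the abstract, and the values are made up.

```python
import math
import random


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(observed) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def inject_anomalies(values, fraction, scale=5.0, seed=0):
    """Return a corrupted copy of `values` and the corrupted indices.

    Assumed scheme (not from the paper): a random `fraction` of the
    entries is multiplied by `scale`.
    """
    rng = random.Random(seed)
    n_anomalies = round(len(values) * fraction)
    indices = rng.sample(range(len(values)), n_anomalies)
    corrupted = list(values)
    for i in indices:
        corrupted[i] = corrupted[i] * scale
    return corrupted, sorted(indices)


# Made-up values standing in for a chlorophyll-a series.
clean = [float(v) for v in range(1, 101)]
for fraction in (0.05, 0.10, 0.15, 0.20):
    corrupted, idx = inject_anomalies(clean, fraction)
    print(fraction, len(idx), round(rmse(clean, corrupted), 3))
```

In a fuller pipeline, the corrupted series would be passed to an anomaly detector and the cleaned data to a regressor before the RMSE is computed on held-out observations.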

Development of a Pollutant Load Delivery Model for Stream Water Quality: A Case Study of the Kyeongan TMDL Subwatershed

공동수(Dongsoo Kong)

https://doi.org/10.15681/KSWE.2025.41.5.321

This study introduces the Delivery Load-Flowrate-Discharge Load-Seasonality (LQLS) model, designed to assess pollutant dynamics in the Kyeongan Stream watershed, a key tributary of the Paldang Reservoir that supplies drinking water to Seoul. Utilizing monitoring data from 2021 to 2023 collected at two TMDL-designated terminal points, we categorized pollutant sources into discharges from sewage treatment plants (STPs), individual point sources, and non-point sources. The LQLS model, which integrates flow-dependent and seasonal functions, outperformed traditional L-Q and LQL models in predictive accuracy. The model performed well for BOD5 and total nitrogen across all statistical measures; however, it faced limitations in estimating total phosphorus, particularly in terms of Nash-Sutcliffe Efficiency (NSE) and RSR, despite showing acceptable bias and agreement metrics. Seasonal analysis indicated that BOD5 and phosphorus accumulated during the dry season, followed by gradual dilution during the monsoon, while total nitrogen exhibited a significant first-flush effect. The peak runoff depths for non-point source pollutant concentrations varied by parameter, with phosphorus showing sharp increases at high flow levels. Although non-point sources contributed the largest portion of total pollutant loads (BOD5: 63-74%, TN: 52-57%, TP: 61-68%), their impact on the annual mean delivered concentration was relatively low (BOD5: 13-20%, TN: 34-38%, TP: 18-26%). This discrepancy is attributed to the episodic nature of non-point source pollution and the exclusion of high-flow data from this study. Consequently, reductions in pollutant loads from STPs had the most significant effect on improving delivered concentrations, while reductions from non-point sources had the least impact.
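
For readers unfamiliar with the two fit statistics named above, the following minimal sketch computes them using their commonly cited definitions: NSE as one minus the ratio of residual to observed variance, and RSR as the RMSE divided by the standard deviation of the observations. The function names are hypothetical, and whether the paper applied exactly these forms is an assumption.

```python
import math


def _sums(observed, simulated):
    """Residual and observed sums of squares shared by both metrics."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return residual, variance


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the mean."""
    residual, variance = _sums(observed, simulated)
    return 1.0 - residual / variance


def rsr(observed, simulated):
    """RMSE-to-standard-deviation ratio: 0 is a perfect fit."""
    residual, variance = _sums(observed, simulated)
    return math.sqrt(residual) / math.sqrt(variance)


obs = [1.0, 2.0, 3.0, 4.0]          # made-up observations
print(nse(obs, obs), rsr(obs, obs))  # perfect simulation
```

Under these definitions the two statistics are linked (RSR squared equals one minus NSE), so a model that scores poorly on one will score poorly on the other.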

Evaluation of Optical Tracers for Quantifying TOC Contributions from Non-Point Source in Baseflow of Agricultural Watersheds

김정훈(Jeong-Hoon Kim) ; 박태준(Tae Jun Park) ; 김규범(Gyoo-Bum Kim) ; 허진(Jin Hur)

https://doi.org/10.15681/KSWE.2025.41.5.337

In agricultural watersheds, dissolved organic matter (DOM) from both natural and human-induced sources is transported to streams via baseflow after soil infiltration. Baseflow, a key hydrological component during low-flow periods, significantly contributes to the total organic carbon loads in receiving waters. However, no research has traced DOM sources through baseflow pathways. This study aimed to assess the effectiveness of optical indicators in identifying DOM sources in baseflow using end-member mixing analysis. We selected litter- and compost-derived DOM as contrasting end-members of terrestrial organic matter and compared their spectroscopic characteristics before and after soil interaction through batch adsorption and soil column infiltration experiments. Both end-members showed decreased specific UV absorbance (SUVA) and increased humification index (HIX) and biological index (BIX) after adsorption, indicating a preferential removal of smaller-sized aromatic compounds. Parallel factor analysis (PARAFAC) consistently revealed an increase in humic-like components (C1) and a decrease in protein-/polyphenol-like components (C2). Among the various spectroscopic indices, BIX and the fluorescence index (FI) demonstrated strong linearity (R²: BIX=0.99, FI=0.95) and high source sensitivity (slope-to-standard deviation ratio, S/SD: BIX=0.06, FI=0.05) across different mixing ratios and conditions. These findings suggest that BIX and FI are reliable tracers for quantifying DOM source contributions via baseflow, even after substantial soil interaction. In contrast, PARAFAC-derived components (%C1 and %C2) showed limited applicability under column conditions. This study underscores the value of fluorescence indices, particularly BIX and FI, as effective tools for tracking DOM sources in agricultural landscapes and provides a scientific foundation for managing non-point source pollution through baseflow pathways.
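
A two-source end-member mixing calculation of the kind referred to above can be sketched as follows. This is a minimal sketch assuming that the tracer value of a mixture varies linearly with the mixing ratio, which the reported linearity of BIX and FI suggests but which is an assumption here. The function name and the index values are hypothetical.

```python
def end_member_fraction(mixture_value, end_member_a, end_member_b):
    """Fraction of end-member A in a two-source mixture.

    Assumes linear mixing: mixture = f * A + (1 - f) * B.
    """
    if end_member_a == end_member_b:
        raise ValueError("end-members must differ in the tracer value")
    return (mixture_value - end_member_b) / (end_member_a - end_member_b)


# Made-up tracer values for two end-members and one mixed sample.
litter_index, compost_index, sample_index = 0.60, 0.90, 0.75
fraction_litter = end_member_fraction(sample_index, litter_index, compost_index)
print(round(fraction_litter, 3))
```

A result outside the range 0 to 1 would indicate that the sample is not bracketed by the two end-members, or that the linear-mixing assumption does not hold for that tracer.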

Coastal Sediment Pollution by Heavy Metals in Ulsan: A Comparative Analysis with Incheon and Yeosu

김수진(Sujin Kim) ; 조민기(Minkee Cho) ; 김동우(Dongwoo Kim) ; 고수근(Sugeun Go) ; 배효관(Hyokwan Bae)

https://doi.org/10.15681/KSWE.2025.41.5.350

This study analyzed national monitoring data from 2015 to 2024 to evaluate patterns of heavy metal contamination in sediments and seawater along the industrialized coasts of Incheon, Ulsan, and Yeosu in Korea. Ulsan exhibited notably higher levels of heavy metal contamination in sediment, with median concentrations of Zn (80.4 mg/kg), Hg (0.155 mg/kg), and As (20.4 mg/kg) frequently exceeding the NOAA threshold effect levels (TEL), indicating persistent high-risk conditions. The ecological hazard quotients (HQ_EC50) were significantly higher than those derived from chemical assessments, particularly for Hg (8.00) and As (10.20), underscoring the substantial underestimation of ecological risks when relying solely on chemical criteria. Time-series analysis indicated a significant decline in the concentrations of regulated metals, especially Cu in Ulsan, following targeted pollution control measures, with an annual rate of decrease of -6.21% (p = 0.015). Conversely, concentrations of non-regulated metals such as As (annual increase of 3.92%, p = 0.010) and Cr (annual increase of 1.96%, p = 0.006) continued to rise. Incheon frequently exceeded long-term ecological criteria for Cu in seawater, although Hg concentrations significantly decreased at an annual rate of -8.26% (p = 0.012). Yeosu maintained consistently low and stable levels of heavy metals in both sediment and seawater, reflecting the effectiveness of environmental management efforts implemented in 2012. These findings highlight the critical need for multi-criteria ecological risk assessment frameworks and strengthened management strategies for heavy metals in industrial coastal areas.

Performance Characteristics of Deep Learning Models for Algal Bloom Prediction in Rivers Using Transfer Learning

박정수(Jungsu Park)

https://doi.org/10.15681/KSWE.2025.41.5.365

Acquiring sufficient high-quality data is essential for developing machine learning models. However, collecting water quality data in river environments can be both costly and time-consuming, and the availability of adequate data is not always guaranteed. Transfer learning provides a promising solution by enabling the application of pre-trained models, developed using data from different locations, to a target site. In this study, we employed two deep learning models commonly used for time series forecasting: long short-term memory (LSTM) and one-dimensional convolutional neural network (1D CNN), to predict chlorophyll-a concentrations, a key indicator of algal blooms. For the model input data, we tested various sequence lengths (Seq): 1, 3, 5, 7, and 9 for the LSTM model, and 3, 5, 7, and 9 for the 1D CNN model. The results indicated that the LSTM model utilizing transfer learning achieved the best performance at sequence lengths of 7 and 9, with a Nash-Sutcliffe efficiency (NSE) of 0.843. In contrast, the corresponding models without transfer learning yielded significantly lower NSE values of 0.349 and -0.014, respectively. For the 1D CNN model, the highest performance using transfer learning was observed with an NSE of 0.814 at Seq 5, while the model without transfer learning had an NSE of 0.608. Although the degree of improvement varied by model type and sequence length, the results clearly demonstrate that transfer learning has the potential to enhance the performance of algal bloom predictions.
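
The sequence length (Seq) mentioned above determines how many consecutive time steps form one model input. The sketch below shows one common way to build such windows from a single series. It is an assumption that the study framed its inputs this way (one-step-ahead target, univariate series), and the function name `make_windows` is hypothetical.

```python
def make_windows(series, seq_len):
    """Split a series into (input window, next value) pairs.

    Each input holds `seq_len` consecutive values; the target is the
    value that immediately follows the window.
    """
    inputs, targets = [], []
    for start in range(len(series) - seq_len):
        inputs.append(series[start:start + seq_len])
        targets.append(series[start + seq_len])
    return inputs, targets


series = list(range(10))  # made-up stand-in for a chlorophyll-a series
for seq_len in (1, 3, 5, 7, 9):
    x, y = make_windows(series, seq_len)
    print(seq_len, len(x))
```

Longer windows give the model more context per sample but leave fewer training samples, which is one reason performance can vary with sequence length when data are scarce.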

Functional Resonance Analysis Method (FRAM) for Structural Diagnosis and Resonance Scenario Analysis in Domestic Sewage Treatment Operations

최영환(Choi, Younghwan) ; 김세영(Kim, Seyeong) ; 권진홍(Kwon, Jinhong) ; 양현명(Yang, Hyeonmyeong) ; 전항배(Jun, Hangbae)

https://doi.org/10.15681/KSWE.2025.41.5.375

Domestic sewage treatment plants are critical for environmental protection and public health. Ensuring their operational efficiency and stability requires a systematic analysis of the complex interactions between various processes. Current technical diagnoses have largely focused on quantitative assessments of individual processes or equipment-centered evaluations, which fall short in revealing structural relationships and root causes of complex operational issues arising from cascading functional interactions. This study introduces the Functional Resonance Analysis Method (FRAM) as an innovative approach to diagnosing complex systems, applying it to domestic sewage treatment plants. We developed a systematic functional network modeling framework and identified key resonance pathways for a structural interpretation of diagnostic results. Utilizing the “Domestic Sewage Treatment Facility Technical Diagnosis Case Study” (2018-2019) published by the Korea Environment Corporation, we systematically analyzed 368 major cases from 54 facilities with processing capacities exceeding 3,000 m³/day. A comprehensive analytical template was created for systematically mapping problems and improvements, resulting in the identification of 16 major technical operational functions. Each function was structured according to FRAM's six essential elements, and the functional interconnection pathways were visualized using the FRAM Model Visualizer program. Quantitative analysis of 22 cases during winter’s low temperatures revealed that the aeration tank (F07) serves as a hub, demonstrating dominant cascade patterns from effluent to aeration to secondary clarification. Based on this hub-and-spoke structure, we developed a five-stage integrated response strategy aimed at fundamentally preventing cascade resonance, showcasing structural advantages over traditional individual response methods. This study establishes a new diagnostic paradigm for addressing complex issues in sewage treatment systems and lays the groundwork for future expansion into socio-technical system analysis.

Analysis on the Mesohabitat Specificity of Benthic Macroinvertebrates Community Indices in the Jojong Stream

공동수(Dongsoo Kong) ; 권용주(Yongju Kwon) ; 김예지(Ye Ji Kim)

https://doi.org/10.15681/KSWE.2025.41.5.386

This study examined the relationship between community indices and area for benthic macroinvertebrates across different mesohabitats (riffle, run, pool, riparian) in the Jojong Stream, Gyeonggi Province, Korea, utilizing four non-probabilistic and fifteen probabilistic distribution models. Rarefaction-based estimates of expected species richness revealed significant discrepancies from observed values in smaller survey areas, indicating potential distortions in ecological interpretation. Most community indices, including species richness, abundance, diversity, dominance, evenness, and the benthic macroinvertebrate (BMI) saprobic index, varied with sampling area. Specifically, species richness and abundance consistently increased with area, while diversity and dominance tended to align even at smaller scales. In riffle habitats, BMI appeared relatively unaffected by area. Although the four-parameter generalized logistic power function showed a high overall fit, its complexity and risk of overfitting limit its practical application. In contrast, three-parameter models such as lognormal, Weibull, inverse Weibull, gamma, and generalized exponential distributions provided comparable accuracy with greater efficiency. The estimated habitat carrying capacity for species diversity was highest in main channel flows and riffles. Additionally, differential entropy indicated high heterogeneity in main channels and riparian zones, attributed to complex environmental factors such as substrate variability and vegetation. To accurately assess species-area relationships and habitat capacity at the site level, increased sampling effort is necessary. The lognormal model exhibited the highest accuracy, suggesting that larger stream systems support greater biodiversity, particularly where spatial heterogeneity enhances ecological richness.

Sampling Method for Community Structure of Benthic Macroinvertebrates in Freshwater Sediment

송재하(Jea-ha Song) ; 곽인실(Ihn-Sil Kwak) ; 공동수(Dong-Soo Kong)

https://doi.org/10.15681/KSWE.2025.41.5.403

Effective biomonitoring of freshwater ecosystems necessitates the implementation of standardized and efficient methodologies for the sampling of benthic macroinvertebrates. This study investigated four critical methodological factors (core sampler diameter, number of replicates, sampling depth, and sieve mesh size) to optimize sediment sampling in Korean rivers. Field surveys were conducted at 15 sites across major river systems in South Korea from June 2022 to October 2023. Three core sizes (Φ5, 7.5, and 10 cm) were compared using 15 replicates each to assess collection efficiency. The optimal number of replicates was determined through the application of the Weibull model to evaluate species accumulation curves and stabilization of community indices. To explore vertical macroinvertebrate distribution patterns, sediment samples were stratified into six depth layers. Three sieve sizes (1.0, 0.5, and 0.2 mm) were assessed to examine their impact on the recovery of small-bodied taxa. The results indicated that the Φ7.5 cm core sampler provided an optimal balance between efficiency and representativeness. A sampling effort of six replicates (0.026 m²) was found to be sufficient to stabilize most community indices. The majority of taxa were concentrated in the upper 0-6 cm layer, while extending sampling to depths of 10-20 cm increased coverage to over 90% and revealed additional deep-burrowing species, suggesting that a depth of 15-20 cm is appropriate for comprehensive assessments. The 0.2 mm sieve was critical for capturing small-bodied taxa, which were significantly underestimated when utilizing 0.5 or 1.0 mm sieves. These findings provide evidence-based guidelines for standardized benthic macroinvertebrate sampling in Korean freshwater ecosystems, thereby enhancing the reliability of ecological assessments and long-term biomonitoring efforts.
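
The sampled area quoted above for six replicates can be checked with simple geometry. The sketch below assumes a circular core cross-section and that Φ denotes the sampler diameter in centimetres; the function name is hypothetical.

```python
import math


def sampled_area_m2(core_diameter_cm, replicates):
    """Total sediment surface area covered by circular core replicates."""
    radius_m = core_diameter_cm / 100.0 / 2.0  # cm diameter -> m radius
    return replicates * math.pi * radius_m ** 2


# Six replicates of a 7.5 cm core, as in the abstract.
print(round(sampled_area_m2(7.5, 6), 4))
```

Under these assumptions the result is close to the 0.026 m² stated in the abstract.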

A Study on Effect of Repeated Freeze-Thaw Perturbation on Acclimation of Anammox Bacteria in Low-temperature

김수빈(Subin Kim) ; 김상균(Sangkyun Kim) ; 박면호(Myeonho Park) ; 김성아(Seunga Kim) ; 이민주(Minjoo Lee) ; 박준홍(Joonhong Park)

https://doi.org/10.15681/KSWE.2025.41.4.245

In our previous study, we found that Freeze-Thaw Cycle (FTC) pre-treatment under substrate-free conditions (with ammonia and nitrite absent) effectively enriched Anammox bacteria for low-temperature environments. However, it remains unclear whether FTC-treated Anammox bacteria can acclimate during low-temperature reactivation when substrates are added. This study investigated the survival and functional recovery of Anammox bacteria subjected to repeated FTC pre-treatment during low-temperature reactivation. The results indicated that specific Anammox activity (SAA) increased with the number of FTC cycles compared to the control (30°C) and No Freeze-Thaw (NFT) conditions (15°C without FTC). However, this increase was primarily driven by nitrite removal efficiency (NRE), while ammonia removal efficiency (ARE) remained low. Subsequent qPCR analysis revealed a decline in RNA/DNA ratios, suggesting reduced transcriptional activity and survival of Anammox bacteria. This indicates that repeated FTC perturbations hindered their acclimation and recovery during reactivation. Additionally, our previous NGS-based study demonstrated that repetitive FTC significantly enriched Pseudomonas within the same inoculum. Consequently, the increased NRE observed in this study is likely facilitated by cold-adaptive denitrifiers such as Pseudomonas. These selectively grown Pseudomonas can utilize nitrite effectively even at low temperatures, potentially outcompeting Anammox bacteria for nitrite and playing a critical role in overall nitrogen removal. These findings suggest that while FTC pre-treatment selectively enriches cold-adaptive Anammox bacteria, successful reactivation of these FTC-pretreated Anammox bacteria at low temperatures is not guaranteed. This highlights gaps in our understanding of bacterial population dynamics during the low-temperature reactivation of FTC-pretreated Anammox bacteria.

Analysis on the Water Quality Change of Kyeongan TMDL Target Site Using Probability Distribution Models

공동수(Dongsoo Kong)

https://doi.org/10.15681/KSWE.2025.41.4.256

This study examined water quality trends in the Kyeongan Stream, which is a tributary of the Paldang Reservoir, a crucial drinking water source for the Seoul metropolitan area. Monitoring focused on a designated Total Maximum Daily Load (TMDL) site and was conducted in two phases: Phase 1 (2005-2011), after TMDL implementation, and Phase 2 (2012-2023), during which water quality showed improvement. To describe water quality variables, ten shifted and truncated probability distribution models were utilized, with least squares estimation demonstrating better goodness-of-fit compared to the method of moments and maximum likelihood estimation. Among the goodness-of-fit tests, the Anderson-Darling test proved to be the most stringent, followed by the chi-square and Kolmogorov-Smirnov tests. The modeling results indicated that electrical conductivity (negatively skewed) was best represented by zero-truncated Weibull or two-sided truncated beta distributions. Dissolved oxygen (slightly positively skewed) was modeled using a truncated beta distribution, while pH was best described by a shifted logistic power model. BOD5 and TOC were modeled using shifted generalized exponential or lognormal distributions, and truncated logistic or gamma distributions, respectively. Total nitrogen and total phosphorus were aligned with shifted lognormal or gamma models, and truncated beta or logistic power models. Total suspended solids (TSS) followed shifted Weibull or logistic power distributions. In Phase 2, median, mean, standard deviation, and differential entropy increased for electrical conductivity, pH, and dissolved oxygen, while other parameters showed a decline. These changes were attributed to variations in rainfall (which impacted pH, conductivity, TSS, and DO) and policy-driven improvements (affecting BOD5, TOC, total nitrogen, and total phosphorus). Notably, the strengthened phosphorus effluent standards in 2012 led to significant reductions and altered flow-dependent behavior.

Development of a Sustainable Disinfection Technology using Induction-Heated Blast Furnace Slag and Green Tea Extract

김현중(Hyunjung Kim) ; 김이중(Ijung Kim)

https://doi.org/10.15681/KSWE.2025.41.4.276

In response to the growing demand for resource circularity and environmentally sustainable practices, there is increasing interest across various sectors in developing innovative technologies that both ensure environmental protection and facilitate the reuse of recyclable materials. This study explored a novel disinfection method using blast furnace slag (BFS) and a natural antimicrobial agent derived from green tea leaves. Induction heating was applied to water contaminated with Escherichia coli (E. coli) and containing varying amounts of BFS (0-10 g). When more than 8 g of BFS was used, nearly complete inactivation of E. coli was achieved within 10 minutes, with the temperature rising from 22.1°C to 40.1°C. BFS demonstrated consistent disinfection performance over multiple cycles without significant degradation, indicating its potential for repeated use in practical applications. The antimicrobial effect of green tea extract, attributed to its catechin content, was confirmed through a disk diffusion test, which showed an inhibition zone of 0.52 mm at a catechin concentration of 0.06 g/mL. Additionally, batch experiments revealed that the combined application of green tea extract and saponin resulted in a 2.3 log removal (99.49%) of E. coli within 5 hours. These findings underscore the potential of utilizing recyclable industrial residues and plant-derived antimicrobial compounds to develop effective and sustainable disinfection technologies, thereby contributing to environmental protection and circular resource utilization.
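
The relationship between a log removal value and a percentage removal, as quoted above, can be sketched as follows. This assumes the usual base-10 definition of log removal (the logarithm of the ratio of initial to final counts); the function name is hypothetical.

```python
def log_removal_to_percent(log_removal):
    """Convert a base-10 log removal value to percent of organisms removed."""
    surviving_fraction = 10.0 ** (-log_removal)
    return (1.0 - surviving_fraction) * 100.0


# The 2.3 log removal reported in the abstract.
print(log_removal_to_percent(2.3))
```

Under this definition, a 2.3 log removal corresponds to roughly 99.5% removal, in line with the figure given in the abstract.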

Statistical Assessment of Flow-Dependent, Time Series, and Seasonal Patterns in Long-Term Water Quality of the Kyeongan Stream

공동수(Dongsoo Kong)

https://doi.org/10.15681/KSWE.2025.41.4.285

This study analyzed long-term water quality trends at the Kyeongan B site, a TMDL (Total Maximum Daily Load) target point along the Kyeongan stream, which flows into the Paldang reservoir, a key water source for the Seoul metropolitan area. A log-linear model that incorporated flow, time-series data, and seasonal functions was utilized. Upgrading the time-series component from a second-order to a third-order polynomial resulted in minimal statistical changes but improved the model’s capacity to capture complex trends, especially for TOC and total nitrogen. Each water quality parameter responded differently to the model components: TOC was mainly influenced by flow, BOD5 and total nitrogen were affected by seasonality, while total phosphorus was driven by the time-series component. These variations reflect the characteristics of each parameter in terms of environmental pool, chemical speciation, and regulatory controls. BOD5 and total nitrogen showed strong seasonal fluctuations due to monsoonal wash-off, whereas TOC and total phosphorus, which were more influenced by short-term rainfall events, displayed weaker seasonal patterns. A significant decrease in total phosphorus since 2012 was linked to the implementation of stricter effluent standards, highlighting the impact of policy-driven time-series effects. After controlling for the confounding influence of flow variability, a comparison between the early TMDL period (2005-2007) and recent years (2022-2024) indicated substantial improvements in water quality: total phosphorus decreased by 74%, BOD5 by 52%, total nitrogen by 38%, and TOC by 17%. These findings underscore the effectiveness of water quality management efforts, even in the face of increasing pollutant sources.
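
A log-linear model with flow, a third-order time polynomial, and a seasonal term, as described above, could take a form like the one sketched below. The abstract does not give the exact functional form, so the single annual sine/cosine pair used for seasonality, the coefficient ordering, and the function names are all assumptions made for illustration.

```python
import math


def design_row(flow, t_years):
    """Predictors for ln(concentration): intercept, ln(flow),
    cubic time polynomial, and one annual harmonic (assumed form)."""
    return [
        1.0,
        math.log(flow),
        t_years,
        t_years ** 2,
        t_years ** 3,
        math.sin(2.0 * math.pi * t_years),
        math.cos(2.0 * math.pi * t_years),
    ]


def predict_concentration(coefficients, flow, t_years):
    """Back-transform the log-linear prediction to concentration units."""
    row = design_row(flow, t_years)
    log_c = sum(b * x for b, x in zip(coefficients, row))
    return math.exp(log_c)


# Hypothetical coefficients: concentration proportional to flow only.
coefficients = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
print(predict_concentration(coefficients, flow=2.0, t_years=0.5))
```

With a structure like this, comparing predictions for two periods at the same flow value is one way to separate a time trend from flow variability.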