IEIESPC(IEIE Transactions on Smart Processing and Computing)

IEIESPC Vol. 12, No. 03, p.215-222

ISSN (online) : 2287-5255

Received : 6 December 2023; 28 February 2023; 30 June 2023

DOI : https://doi.org/10.5573/IEIESPC.2023.12.3.215

Regular Paper

Comparative Study of Machine Learning Models and Distributed Runoff Models for Predicting Flood Water Level

Kubo Tasuku¹ Okazaki Takeo²

(Graduate school of Engineering and Science, University of the Ryukyus / Senbaru, Nishihara, Japan k218587@ie.u-ryukyu.ac.jp )
(Department of Computer Science and Intelligent Systems, University of the Ryukyus / Senbaru, Nishihara, Japan okazaki@ie.u-ryukyu.ac.jp )

* Corresponding Author: Tasuku Kubo

License :

This is an Open-Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.(www.theieie.org).

Abstract

Conventional flood forecasting methods can be roughly classified as physics-based models and data-oriented models, which both require parameter optimization. In parameter optimization, a search is generally done to minimize the magnitude of overall errors. However, under- and overestimation errors are not equivalent in flood forecasting since underestimation of the water level leads to delays in decision making. We propose a risk-aware forecasting method that uses a weighted loss function. We applied the proposed method to both physics-based models and machine learning models and compared the prediction results to clarify the difference in the prediction results according to the base model used. The results show that the model optimized by the weighted loss function reduced the underestimation error while maintaining the overall error.

Keywords

Flood forecasting, Parameter optimization, Machine learning, Physics-based model, Underestimation error

1. Introduction

Much of Japan is covered with mountains, and the rivers flowing through these mountains are steeper than those on the continent. Floods are now one of the most familiar disasters to the people living in Japan, and they cause much damage every year. In order to reduce the damage, it is necessary to predict water level fluctuations in rivers based on rainfall conditions and to take countermeasures. However, the process from rainfall to river discharge is very complicated and difficult to predict accurately because it is affected by many factors, such as topography, land cover, soil classification, and wetness. Various models have been used in studies on flood forecasting. A physics-based model is derived from the Navier-Stokes equations ^[1], which describe fluid motion. Because its predictions are calculated from physical equations, they do not deviate significantly from the rules defined by the governing equations. Because of its clear physical background and high reproducibility, this type of model has long been used in the field of runoff analysis.

Physics-based models can be divided into lumped types and distributed types based on the treatment of their parameters. A lumped model describes the rainfall-runoff relationship for the entire watershed; typical examples are the storage function method ^[2] and the tank model ^[3]. In a lumped model, the parameters are representative of the entire watershed, so there are few parameters that need to be adjusted, and it has been used as a simple forecasting method. However, many of the factors that influence the runoff process, such as rainfall, elevation, and soil, are spatially heterogeneous, so the accuracy is often lower than that of a distributed model, and it is often impossible to reproduce flood cases.

A distributed model expresses the water movement in the entire watershed by dividing it into fine meshes and calculating the water exchange between them. The Rainfall-Runoff-Inundation model proposed by Sayama et al. ^[4] is a typical distributed model. It expresses the process from the arrival of rainfall at the ground surface to its discharge into a river using four differential equations for slope flows, lateral flows, vertical permeable flows, and channel flows.

In forecasting using a physics-based model, the future runoff volume or water level is calculated by identifying the parameters of the model beforehand and providing the predicted rainfall as input. Therefore, there are some problems, such as the difficulty of identifying the parameters and the inability to reflect the observed water level obtained at each time. To deal with these problems, some methods that utilize data have been proposed. Kobayashi et al. ^[5] proposed applying the Levenberg-Marquardt (LM) method, a mathematical optimization technique, to determine the parameters of a distributed model by treating the task as a search for parameters that minimize the error between the observed and calculated flow rates. Miyake et al. ^[6] proposed a method that reflects the water level obtained at each time in a forecast by sequentially correcting the water level in the model each time observations are obtained, using an optimal interpolation method for data assimilation.

Recently, some data-oriented approaches have been proposed, which focus only on the input-output relationship of the data without considering the physical aspects. Hitokoto et al. ^[7] proposed a forecasting method based on deep neural networks. A deep neural network is a basic machine learning model that improves the expressive capability of the model by using many intermediate layers of neural networks. They constructed a model that expresses the time-series nature of the water level by adding several timesteps of data obtained from stations in a watershed to the characteristic quantities. The model achieved highly accurate prediction.

In prediction using machine learning, the model itself optimizes its parameters to represent the rainfall-runoff process during a flood by using past flood data and rainfall data at the time of the flood. Therefore, the model can be constructed even if the user does not have much knowledge of hydrology, and it can make accurate predictions relatively easily. The background and focus of the physics-based models and machine learning models are different. However, they both search for optimal parameters from data, and the parameters have a significant impact on the forecasting. In parameter optimization, an objective function focusing only on the magnitude of error, such as the least-squares error, is generally used to search for parameters that minimize the error with respect to the observed data.

Underestimation of water levels during floods leads to a delay in decision making, which may aggravate the damage. In order to address this issue, Miyamoto et al. ^[8] proposed a method for optimizing model parameters in terms of the difference in risk of underestimation and overestimation of forecast errors. However, it is unclear whether this method is practical since they did not actually verify its accuracy in real-time forecasting. In this study, we examined the effectiveness and practicability of the models adjusted by a weighted loss function. Furthermore, we clarified the difference in the forecasting of the physics-based model and the machine learning model by comparing the results of flood forecasting.

2. Forecasting Models

2.1 Distributed Runoff Model

In this study, the distributed runoff model consists of four types of flows: surface flow, subsurface flow, infiltration flow, and river channel flow. The kinematic wave method was applied to the surface flow and the subsurface flow. The surface flow and the subsurface flow are described by Eqs. (1)-(3) and Eqs. (4)-(6), respectively:

(1)

$\frac{\partial h_{s}}{\partial t}+\frac{\partial q_{sx}}{\partial x}+\frac{\partial q_{sy}}{\partial y}=r-f$,

(2)

$q_{sx}=-\frac{1}{n}h_{s}^{5/3}\frac{\partial H_{s}}{\partial x}$,

(3)

$q_{sy}=-\frac{1}{n}h_{s}^{5/3}\frac{\partial H_{s}}{\partial y}$,

where h$_{s}$ is the water depth on the local surface, H$_{s}$ is the height of water from the datum, q$_{sx}$ and q$_{sy}$ are the unit discharges in the x and y directions, respectively, r is rainfall, n is Manning's roughness parameter, and f is the infiltration.

(4)

$\phi \frac{\partial h_{g}}{\partial t}+\frac{\partial q_{gx}}{\partial x}+\frac{\partial q_{gy}}{\partial y}=f$,

(5)

$q_{gx}=-h_{g}k_{a}\frac{\partial H_{g}}{\partial x}$,

(6)

$q_{gy}=-h_{g}k_{a}\frac{\partial H_{g}}{\partial y}$,

where h$_{g}$ is the water depth in the ground, H$_{g}$ is the height of water from the datum, q$_{gx}$ and q$_{gy}$ are the unit discharges in the x and y directions, respectively, ${\phi}$ is the porosity, and k$_{a}$ is the permeability coefficient.

For the infiltration f, we applied the Green-Ampt infiltration model:

(7)

$f=k_{v}\left[1+\frac{\left(\phi -\theta \right)S_{f}}{F}\right]$,

where k$_{v}$ is the effective saturated hydraulic conductivity, $\theta $ is the initial water volume content, $S_{f}$ is the suction at the vertical wetting front, and F is the cumulative infiltration. To solve Eqs. (1)-(7), the finite element method and the fifth-order adaptive-time Runge-Kutta method were applied to approximate the spatial direction and time direction, respectively. The relationship between the surface water depth, groundwater depth, rainfall, and infiltration in each grid cell is shown in Fig. 1.
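As a minimal sketch of Eq. (7), the infiltration rate can be evaluated as follows. The function name and the values chosen for the porosity, the initial water content, and the cumulative infiltration are illustrative assumptions and are not taken from the experiments; k$_{v}$ and S$_{f}$ use the values listed in Table 1.

```python
def green_ampt_rate(k_v, phi, theta_i, s_f, cum_infiltration):
    """Infiltration rate f of Eq. (7): f = k_v * [1 + (phi - theta) * S_f / F]."""
    if cum_infiltration <= 0.0:
        # Eq. (7) divides by F, so F must be strictly positive.
        raise ValueError("cumulative infiltration F must be positive")
    return k_v * (1.0 + (phi - theta_i) * s_f / cum_infiltration)


# k_v and S_f follow Table 1; phi, theta_i and F are hypothetical values.
K_V = 8.33e-7
S_F = 3.163e-2

for F in (0.001, 0.01, 0.1):
    rate = green_ampt_rate(K_V, phi=0.5, theta_i=0.2, s_f=S_F, cum_infiltration=F)
    print(f"F = {F}: f = {rate:.3e}")
```

The rate decreases toward k$_{v}$ as the cumulative infiltration F grows, which is the behavior expected from Eq. (7).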

For the channel flow, a one-dimensional version of the equation for the surface flow was used. However, it is necessary to consider the cross-sectional area of the channel flow because the amount of change in water depth depends on the width of the channel at each point, even when the same amount of water flows into the channel. In this study, the cross section of the channel was assumed to be a rectangle, and its width and depth were determined by using a function of the catchment area ^[4]. The width W and the depth D of the river channel at a certain point are represented by Eqs. (8) and (9):

(8)

$W=C_{w}A^{{S_{w}}}$,

(9)

$D=C_{d}A^{{S_{d}}}$,

where C$_{w}$, S$_{w}$, C$_{d}$, and S$_{d}$ are parameters for each river, and A is the catchment area for each point.
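Eqs. (8) and (9) can be sketched as a small helper. The function name and the coefficient values below are hypothetical placeholders; in this study the coefficients were decided from the cross-section of the target river.

```python
def channel_geometry(area, c_w, s_w, c_d, s_d):
    """Rectangular channel width W and depth D from the catchment area A.

    Implements Eqs. (8) and (9): W = C_w * A**S_w and D = C_d * A**S_d.
    """
    if area <= 0.0:
        raise ValueError("catchment area A must be positive")
    width = c_w * area ** s_w
    depth = c_d * area ** s_d
    return width, depth


# Hypothetical coefficients, chosen only to illustrate the power law.
for area in (10.0, 100.0, 800.0):
    w, d = channel_geometry(area, c_w=5.0, s_w=0.35, c_d=1.0, s_d=0.2)
    print(f"A = {area}: W = {w:.2f}, D = {d:.2f}")
```

With positive exponents, both the width and the depth grow monotonically with the catchment area, so downstream cells receive a larger cross section.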

The water supply to the river channel was calculated using the inflow from the surface flow. The outflow from the subsurface flow was also added to the surface water depth as a return flow when it exceeded the depth of the seepage layer so that the final amount of water flowing into the river would be determined by only the surface water depth. We adopted a step-down equation ^[4] in which inflow occurs when the river water level is lower than the ground surface. The lateral inflow into the river channel at that time, q, is calculated by Eq. (10):

(10)

$q=\sqrt{\left(\frac{2}{3}h_{s}\right)^{3}g}$,

where g is gravitational acceleration.
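The lateral inflow of Eq. (10) can be sketched as follows. Returning zero when the river level is not below the ground surface is an assumption made here for illustration, since the text only describes the case in which inflow occurs; the function and argument names are also hypothetical.

```python
import math

GRAVITY = 9.81  # gravitational acceleration in m/s^2


def lateral_inflow(h_s, river_level, ground_level, g=GRAVITY):
    """Lateral inflow q of Eq. (10): q = sqrt((2/3 * h_s)**3 * g).

    Inflow is computed only when the river water level is lower than the
    ground surface; otherwise 0.0 is returned (an assumption of this sketch).
    """
    if h_s <= 0.0 or river_level >= ground_level:
        return 0.0
    return math.sqrt((2.0 / 3.0 * h_s) ** 3 * g)


print(lateral_inflow(h_s=0.3, river_level=1.0, ground_level=2.0))
print(lateral_inflow(h_s=0.3, river_level=2.5, ground_level=2.0))
```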

Data assimilation is also required to update the state of the model each time a value is observed. In this study, optimal interpolation was used as the data assimilation method. An optimal interpolation method is widely used as a simple data assimilation method and in real-time forecasting because it is considered to have a lower computational cost at each time step than other methods ^[9]. In the field of runoff analysis, Miyake et al. ^[6] applied an optimal interpolation method to the runoff calculation of a distributed model and demonstrated its usefulness by calculating and comparing the water level one hour later from the pre-assimilated and assimilated states.

In the optimal interpolation method, the assimilated value (analytical value) x$^{{a}}$ is expressed by Eq. (11).

(11)

$x^{a}=x^{b}+BH^{T}\left(HBH^{T}+R\right)^{-1}\left(y-Hx^{b}\right)$,

where x$^{{b}}$ and y are the first estimates (pre-assimilation values computed by the model) and the observed values, respectively. B is the covariance matrix of background errors, R is the covariance matrix of observation errors, and H is an observation matrix representing interpolation from model space to observation space.

Fig. 1. Relationships among rainfall, surface flow, subsurface flow, and infiltration in a distributed runoff model.

Both R and B need to be determined in advance. We set the background error variance to 1.0 and the observed error variance to 0.5 based on a previous study ^[6]. For the covariance matrix of the observation errors, we assumed that each observation error is independent and uncorrelated, and we set the matrix as a diagonal matrix. The spatial correlation coefficients of the background error covariance matrix were calculated from the water levels in each grid cell at the peak time of the multiple simulations.
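To illustrate Eq. (11), the following sketch handles the special case of a single observation located at one grid cell, for which H selects one element of the state and the matrix inverse reduces to a scalar division. The three-cell state, the correlation values, and the function name are hypothetical; only the variances (1.0 for the background and 0.5 for the observation) follow the settings described above.

```python
def oi_update_single_obs(x_b, B, obs_index, y, r):
    """Optimal interpolation update of Eq. (11) for one observation.

    With H selecting the element obs_index, B H^T is column obs_index of B
    and H B H^T + R is the scalar B[obs_index][obs_index] + r.
    """
    innovation = y - x_b[obs_index]        # y - H x^b
    denom = B[obs_index][obs_index] + r    # H B H^T + R
    return [x_b[i] + B[i][obs_index] / denom * innovation
            for i in range(len(x_b))]


# Hypothetical water depths (m) in three channel cells before assimilation.
x_b = [2.0, 2.5, 3.0]
# Hypothetical spatial correlations, scaled by the background variance 1.0.
corr = [[1.0, 0.5, 0.25],
        [0.5, 1.0, 0.5],
        [0.25, 0.5, 1.0]]
B = [[1.0 * c for c in row] for row in corr]

# Observation of 3.25 m at the middle cell with observation variance 0.5.
x_a = oi_update_single_obs(x_b, B, obs_index=1, y=3.25, r=0.5)
print(x_a)  # approximately [2.25, 3.0, 3.25]
```

The observed cell moves two thirds of the way toward the observation, and the neighboring cells are corrected in proportion to their background covariance with it.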

The prediction method by the physics-based model in this study has the following steps.

[Step 1.] Generate grid cells as a basis for calculation using elevation, flow direction, and catchment area data and then determine the river channel parameters using the cross-sectional profile of the target river.

[Step 2.] Prepare flood data for calculating the weights of the optimal interpolation and calculate the runoff using the rainfall data at the time. Calculate the weight matrix W of the optimal interpolation using the difference between the calculated water level and the observed water level at the peak time.

[Step 3.] Optimize the parameters using the training data. The initial water level of the channel mesh is calculated using the observed water level at each station, the weight matrix W obtained in Step 2, and Eq. (11). A quasi-Newton method is used for optimization, and the parameters are updated based on the errors between the calculated water level and the observed water level.

[Step 4.] Calculate the runoff using the optimized parameters. The calculation proceeds in steps of 10 minutes using the observed rainfall data. Each time a water level is observed (every hour), the data are assimilated using the optimal interpolation method. However, the assimilated values are only used in Step 5 and are not reflected in the next calculation steps.

[Step 5.] Make predictions using the assimilated values. Separately from Step 4, simulations are run using the assimilated values as the river water depths. In this simulation, the surface and subsurface water depths are taken from the data in Step 4, and the observed rainfall is used as the predicted rainfall data. The calculated water level is regarded as the predicted value.

2.2 Neural Network Model

For the machine learning model, we constructed a model based on a deep neural network (DNN) ^[7]. A neural network is a model that consists of layers of artificial neurons that imitate neurons in the human brain. Such a network is generally called an artificial neural network when there is one intermediate layer and a DNN when there are two or more intermediate layers. An artificial neuron is a simple model in which the output value is determined by a function called the activation function applied to the weighted sum of the input signals. The basic structure of a DNN is shown in Fig. 2, and an artificial neuron is represented by Eq. (12):

(12)

$z=u\left(\sum _{i=1}^{n}w_{i}x_{i}+\theta \right)$,

where x$_{i}$ is an input variable, w$_{i}$ is a weight parameter, $\theta $ is a threshold value, and u is an activation function.
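A single artificial neuron as in Eq. (12) can be sketched in a few lines. ReLU is used here as one common choice for the activation function u; the input values, weights, and threshold are hypothetical.

```python
def relu(a):
    """ReLU activation: passes positive inputs through and maps the rest to 0."""
    return a if a > 0.0 else 0.0


def neuron(x, w, theta, activation=relu):
    """Output z of Eq. (12): z = u(sum_i w_i * x_i + theta)."""
    if len(x) != len(w):
        raise ValueError("x and w must have the same length")
    weighted_sum = sum(wi * xi for wi, xi in zip(w, x))
    return activation(weighted_sum + theta)


# Hypothetical inputs and parameters.
print(neuron([1.0, 2.0], [0.5, 1.0], theta=0.25))   # positive pre-activation
print(neuron([1.0, 2.0], [0.5, -1.0], theta=0.25))  # negative pre-activation
```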

In this study, we used the ReLU function as the activation function of the intermediate layers and the identity function as the activation function of the output layer. The ReLU function outputs the input signal as it is when the input signal is positive and 0 when the input signal is negative. It is widely used as an activation function for intermediate layers because of its good computational efficiency and stable learning. The water level prediction method by machine learning in this study has the following steps.

[Step 1.] Determine the network structure according to the forecast condition. The input layer is determined based on the data to be used for the forecast, such as the number of stations in the watershed. The output layer is determined by the forecast target, such as the number of time points to forecast.

[Step 2.] Train the model using multiple flood cases.

[Step 3.] The observed water level, observed rainfall, and predicted rainfall are given to the trained model at each time, and the amount of change from the water level at the current time is calculated. The predicted values are obtained by adding them to the current water level. Since it is necessary to consider the model performance and the predicted rainfall separately, the rainfall prediction is assumed to be a perfect prediction, and the observed rainfall is used for training and forecasting.

Fig. 2. Basic structure of a DNN.

3. Estimate Parameters using Weighted Loss

The parameter estimation of a distributed model can be formulated as an optimization problem that minimizes an objective function evaluating the error between the observed values and the values calculated by the model. There are various functions for evaluating the error, but the mean squared error (MSE) or the mean absolute error (MAE) is generally used. MSE squares the errors, so larger errors are penalized more heavily. Therefore, the overall error becomes small, but the result is strongly affected by outliers. On the other hand, MAE is robust to outliers because it evaluates only the absolute magnitude of the error.

In addition, there is Huber loss, which has characteristics of both MAE and MSE. Huber loss works as MAE for errors larger than a given threshold value and as MSE for smaller values. Hence, the optimization becomes more robust to errors than with MSE, and the overall error is smaller than with MAE. Huber loss is expressed by the following equation:

(13)

$L_{Huber}=\begin{cases} \frac{1}{2}\left(y-\hat{y}\right)^{2} & \text{for } \left| y-\hat{y}\right| \leq \delta \\ \delta \left(\left| y-\hat{y}\right| -\frac{1}{2}\delta \right) & \text{otherwise} \end{cases}$,

where $\delta $ is an arbitrary threshold value, and y and $\hat{y}$ are the observed and calculated values, respectively.
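Eq. (13) translates directly into code. The following is a minimal sketch for a single pair of observed and calculated values; the default threshold of 1.0 is an arbitrary choice for illustration.

```python
def huber_loss(y, y_hat, delta=1.0):
    """Huber loss of Eq. (13) for one observed value y and calculated value y_hat."""
    err = abs(y - y_hat)
    if err <= delta:
        # Quadratic (MSE-like) branch for small errors.
        return 0.5 * err ** 2
    # Linear (MAE-like) branch for large errors.
    return delta * (err - 0.5 * delta)


print(huber_loss(1.0, 0.5))  # small error, quadratic branch -> 0.125
print(huber_loss(3.0, 0.0))  # large error, linear branch    -> 2.5
```

The two branches meet at an error of exactly $\delta $, so the loss is continuous while growing only linearly for outliers.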

These evaluation functions do not distinguish the sign of the errors, although the evaluated values vary depending on the magnitude of the errors. However, in flood forecasting, underestimation of the water level leads to delays in evacuation decisions. Therefore, underestimation error should be given a larger penalty than overestimation error. In this study, we searched for parameters that do not underestimate the water level by optimizing with a weighted evaluation function ^[10]. We used Huber loss L$_{Huber}$ as the base function, and the weighted Huber loss (WHL) is expressed by the following equation:

(14)

$Loss=\begin{cases} \alpha L_{Huber} & \text{if } \hat{y}<y \text{ (underestimation)} \\ \left(1-\alpha \right)L_{Huber} & \text{otherwise} \end{cases}$,

where $\alpha $ is a sensitivity parameter. In the weighted loss function, the weights of overestimation and underestimation errors are adjusted according to the value of $\alpha $. When $\alpha =0.5$, both cases receive the same weight, so the function is equivalent to the general Huber loss up to a constant factor, and as $\alpha $ is increased, a larger evaluation value is given to the underestimation error.
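A minimal sketch of the weighted Huber loss is given below. It assumes that the weight $\alpha $ applies when the calculated value is below the observed value (underestimation) and 1 - $\alpha $ otherwise, which is how the description of $\alpha $ above is read here; the function names and example values are hypothetical.

```python
def huber_loss(y, y_hat, delta=1.0):
    """Huber loss of Eq. (13)."""
    err = abs(y - y_hat)
    if err <= delta:
        return 0.5 * err ** 2
    return delta * (err - 0.5 * delta)


def weighted_huber_loss(y, y_hat, alpha=0.5, delta=1.0):
    """Weighted Huber loss of Eq. (14).

    Assumption of this sketch: alpha weights underestimation (y_hat < y)
    and (1 - alpha) weights overestimation.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    weight = alpha if y_hat < y else 1.0 - alpha
    return weight * huber_loss(y, y_hat, delta)


# Same error magnitude (1.0 m), opposite signs, with alpha = 0.75.
print(weighted_huber_loss(2.0, 1.0, alpha=0.75))  # underestimation -> 0.375
print(weighted_huber_loss(1.0, 2.0, alpha=0.75))  # overestimation  -> 0.125
```

With $\alpha =0.75$, an underestimation is penalized three times as heavily as an overestimation of the same size, which pushes the optimized parameters toward the safe side.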

4. Comparative Experiments

4.1 Target Watersheds and Used Data

The target watershed is Hiwatari water level station in the Oyodo River basin, Miyazaki Prefecture, Japan ^[7]. Hiwatari water level station has a watershed area of 816 km$^{2}$, as shown in Fig. 3, and is surrounded by mountains and hills. There are 5 water level stations and 14 rainfall stations in the upstream area.

For the machine learning model, the rainfall amounts and water levels observed at the stations were obtained from the sluicegate water-quality database. Elevation data and radar rainfall data were used to construct the distributed model. C-band radar was used for the radar rainfall data, and the functions of the RRI model were used to obtain the elevation data and to partition the mesh.

Fig. 3. Basin boundaries and observatories for water level and rain in Hiwatari.

4.2 Experimental Conditions

Manning's roughness coefficient n and porosity $\phi $ for the slope and river have a great influence on the velocity of the downstream flow, flood arrival time, and peak water level, so they were selected for optimization. Since the approximate ranges of these values have been clarified by previous studies ^[11,12], we used those ranges for optimization. For other parameters, common values in studies of distributed runoff models were used. Let n$_{s}$ be the roughness coefficient on the slope and n$_{r}$ be the roughness coefficient in the channel. The parameters are summarized in Table 1.

For the calculation of the background error variance for data assimilation, the peak water levels of four cases in September 2011, June 2012, July 2012, and June 2016 were used. The case of July 2007 was used for parameter optimization. The predicted water level was calculated by correcting the calculated water level with the optimal interpolation method and doing the runoff calculation separately. Therefore, the observed water levels do not affect the original runoff calculations and are reflected in only the predictions. When constructing the deep learning model, we used a DNN with two intermediate layers based on another model ^[7]. The detailed settings of each layer and the learning conditions were determined by trial and error, as shown in Table 2.

For the features, we used the water level at 1 hour before and at the current time of the target station, the water level fluctuation up to 4 hours before from each station, and the rainfall from 4 hours before to 9 hours after. The output layer corresponds to the water level change from 1 hour to 9 hours after. As training data, we used 5 days of data from the top 20 cases for the period of 2007-2016.

Table 1. Experiment parameters of distributed runoff model.

Parameters	Values
Roughness coefficient on the slope: n_s	0.5 - 1.0
Roughness coefficient in the channel: n_r	0.04 - 0.3
Porosity: $\phi $	0.2 - 0.95
Vertical saturated hydraulic conductivity: k_v	8.33*10^-7
Suction at the vertical wetting front: S_f	3.163*10^-2
Soil depth (m)	1.0
River parameter: S_w, S_d, C_w, C_d	Decided based on the cross-section of the river channel.

Table 2. Various settings related to deep learning model.

Settings	Values
The number of perceptrons in each layer	233-80-40-9
Dropout	0.5
Epoch	100
Activation function	ReLU
Learning rate	0.001
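The network shape in Table 2 can be illustrated with a forward pass written in plain Python. This is only a sketch: the weights are random rather than trained, dropout is omitted because it applies only during training, the output layer is assumed to be linear, and the current water level is a hypothetical value. As described in Section 2.2, the nine outputs are treated as water level changes that are added to the current water level.

```python
import random

LAYER_SIZES = [233, 80, 40, 9]  # input, two intermediate layers, output (Table 2)


def init_layers(sizes, seed=0):
    """Create (weights, biases) per layer with small random weights (untrained)."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def forward(layers, x):
    """ReLU on intermediate layers, linear output layer (assumed)."""
    for idx, (weights, biases) in enumerate(layers):
        z = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        is_output = idx == len(layers) - 1
        x = z if is_output else [max(0.0, v) for v in z]
    return x


def count_parameters(sizes):
    """Number of weights and biases in a fully connected network."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes[:-1], sizes[1:]))


layers = init_layers(LAYER_SIZES)
features = [0.0] * LAYER_SIZES[0]   # placeholder feature vector
level_changes = forward(layers, features)
current_level = 2.0                 # hypothetical current water level (m)
predicted_levels = [current_level + d for d in level_changes]

print(len(predicted_levels), count_parameters(LAYER_SIZES))
```

A fully connected network of this shape has 22,329 trainable weights and biases.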

4.3 Results

Figs. 3 and 4 show the predicted water levels from 1 to 9 hours after each observation time. Fig. 3 shows the results of predictions by the physics-based model, and Fig. 4 shows the results of predictions by the machine learning model. Panels (a-1) and (b-1) were obtained by using MSE in parameter optimization, while panels (a-2) and (b-2) were obtained by using the weighted Huber loss (WHL) in parameter optimization. Fig. 5 shows, for each lead time, the predictions of the models with parameters optimized by WHL, superimposed on the observed water levels. Figs. 5(c-1) and (c-2) show the results of the physics-based model and the machine learning model, respectively.

The prediction by the model optimized by MSE had a small error in the rising part of the water level, but the underestimation near the peak was significant, which indicates that the parameters were optimized so that the overall error is small. For the model optimized by WHL, the peak water level and the rising part of the water level were overestimated, and there was some increase in the overall error. However, the peak flood was predicted earlier and with a slight overestimation. As mentioned, the prediction for the extended lead time converged to a certain value, but this value was also overestimated compared to that of MSE.

In Fig. 5, we can see that the prediction by the physics-based model converged to a certain value even when the lead time was extended. This was due to correcting only the depth of the river channel sequentially using the observed water level. When the lead time was short, the effect of the correction was strong because the upstream water corrected by the optimal interpolation method flowed downstream, but the effect of the correction became smaller as the time step was increased. As time passed after the correction, the effect of the runoff component due to soil moisture increased, yet the values were almost constant because they were not corrected by the optimal interpolation method. These constant values were almost consistent with those obtained in the overall simulation, and the correction was effective only for prediction up to two or three hours ahead.

The prediction results by the machine learning model show that the prediction values were generally scattered and that the accuracy varied with the lead time. According to the prediction results for each lead time, the prediction accuracy worsened as the lead time increased, although it was almost error free in the short-term prediction. It is also considered that the contribution of rainfall increases as the lead time increases, while the change of water level in the upper stream contributes significantly to the prediction accuracy in the short-term prediction.

Table 3 shows the values of various evaluation indices for the prediction results. In terms of the overall error magnitude, the prediction by the physics-based model had a larger error than that by the machine learning model: its RMSE was roughly two to four times as large as that of the machine learning model at every lead time. This is considered to result from the influence of the channel geometry in the physics-based model. Since the physics-based model calculates the amount of water movement between meshes using the difference in elevation and channel depth, it often maintains a constant value when the fluctuation of water depth is moderate, such as before the rise of the water level or after the peak. This constant value was determined by the difference in elevation from the surrounding mesh and the depth of the river channel, but in the distributed model used in this study, the channel cross section was assumed to be rectangular, and there was a discrepancy with the actual cross section. As a result, the model stabilized at a higher position than the actual water level at many points in time, resulting in a large overall error.

In terms of the peak time, the predictions by the physics-based model were stable, while those by the machine learning model were unstable, and the largest delay was three hours. It is considered that the peak time of the predictions by the physics-based model did not change even if the water depth in the river channel was corrected by the optimal interpolation method because the water depth at the ground surface and underground determines the discharge into the river channel. The peak time of the prediction by the machine learning model was unstable because the model does not know the storage volume of the entire watershed, although it uses rainfall data before and after the starting time of the prediction.

Fig. 3. Predicted water levels 1 to 9 hours after the time of each observation by the machine learning models using (a-1) MSE and (a-2) WHL for parameter optimization.

Fig. 4. Predicted water levels 1 to 9 hours after the time of each observation by the physics-based model using (b-1) MSE and (b-2) WHL for parameter optimization.

Fig. 5. Prediction results for each lead time by physics-based model (c-1) and machine learning model (c-2).

Table 3. Values of various evaluation indices for the forecasting results for lead time (LT)=1, 3, 5, 7, 9 hours. Each column compares the evaluation values for each lead time and for each objective function used in the optimization. The rows show the values of the various evaluation indices for each forecasting method.

Objective function		MSE					WHL
		Lead time					Lead time
Model	Metrics	1	3	5	7	9	1	3	5	7	9
Distributed Runoff model	RMSE	0.618	0.909	1.130	1.210	1.235	0.765	1.138	1.401	1.502	1.539
	Peak time	2	2	2	2	2	2	2	2	2	2
	Peak level	-0.556	-0.738	-0.733	-0.732	-0.732	0.484	0.861	0.870	0.870	0.871
	Total underestimation	5.536	7.998	8.173	8.118	8.099	3.224	2.811	2.754	2.724	2.674
Machine Learning model	RMSE	0.253	0.440	0.350	0.326	0.459	0.229	0.403	0.471	0.507	0.546
	Peak time	0	-1	-1	0	0	0	-1	-3	0	-1
	Peak level	0.035	-0.395	-1.009	0.923	0.078	0.079	0.337	-0.060	-0.369	0.222
	Total underestimation	5.184	8.513	8.513	8.212	11.619	1.953	2.851	2.580	2.054	3.915

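Regarding the difference between the objective functions, MSE produced a noticeable underestimation error in the peak water level, although its overall error was smaller. In contrast, WHL produced little underestimation of the peak water level, and the total underestimation decreased for both models. However, WHL generally had a larger overall error than MSE: according to Table 3, RMSE increased by up to about 25% for the distributed runoff model and by up to about 56% for the machine learning model (at a lead time of 7 hours), although it decreased slightly for the machine learning model at lead times of 1 and 3 hours.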
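The relative change in RMSE between the two objective functions can be checked directly from the RMSE rows of Table 3. The short script below copies those values and computes the percentage change from MSE to WHL for each lead time; the variable and function names are chosen here for illustration.

```python
LEAD_TIMES = [1, 3, 5, 7, 9]

# RMSE values copied from Table 3.
RMSE = {
    "Distributed Runoff model": {
        "MSE": [0.618, 0.909, 1.130, 1.210, 1.235],
        "WHL": [0.765, 1.138, 1.401, 1.502, 1.539],
    },
    "Machine Learning model": {
        "MSE": [0.253, 0.440, 0.350, 0.326, 0.459],
        "WHL": [0.229, 0.403, 0.471, 0.507, 0.546],
    },
}


def percent_change(mse_values, whl_values):
    """Percentage change in RMSE when switching from MSE to WHL."""
    return [100.0 * (w - m) / m for m, w in zip(mse_values, whl_values)]


changes = {model: percent_change(v["MSE"], v["WHL"]) for model, v in RMSE.items()}

for model, values in changes.items():
    formatted = ", ".join(f"LT={lt}: {c:+.1f}%" for lt, c in zip(LEAD_TIMES, values))
    print(f"{model}: {formatted}")
```

For the distributed runoff model, the increase stays close to 25% at every lead time, whereas for the machine learning model the change ranges from a small decrease at short lead times to an increase of more than 50% at a lead time of 7 hours.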

5. Conclusion

In this study, we proposed a method for predicting river water levels by considering flood risk and applied it to both physics-based models and machine learning models. The experimental results showed that the proposed method improved the accuracy of both models in terms of peak water level error and underestimation error, suggesting its effectiveness in practical use. In addition, the comparison of the prediction results by each model showed differences depending on the selection of the base model.

ACKNOWLEDGMENTS

Radar rainfall data were collected and distributed by the Research Institute for Sustainable Humanosphere, Kyoto University (http://database.rish.kyoto-u.ac.jp/index-e.html).

REFERENCES

[1] C. R. Doering, "The 3D Navier-Stokes problem," Annual Review of Fluid Mechanics, Vol. 41, pp. 109-128, 2009.

[2] T. Kimura, "The recent progress in Storage Function Method," Proceedings of the Japanese Conference on Hydraulics, Vol. 22, pp. 191-196, 1978.

[3] M. Sugawara, "Tank model," Journal of Geography (Chigaku Zasshi), Vol. 94, pp. 209-221, 2004.

[4] T. Sayama et al., "Rainfall-runoff-inundation analysis of the 2010 Pakistan flood in the Kabul River basin," Hydrological Sciences Journal, Vol. 57, pp. 298-312, 2012.

[5] K. Kobayashi et al., "Parameter estimation of a distributed rainfall-runoff model by a Levenberg-Marquardt optimization algorithm," Proceedings of Hydraulic Engineering, Vol. 51, pp. 409-414, 2007.

[6] S. Miyake et al., "Data assimilation of river water levels by distributed hydrological models based on optimal interpolation," Journal of JSCE, Ser. B1, Vol. 72, pp. I_175-I_180, 2016.

[7] M. Hitokoto et al., "Development of the real-time river stage prediction method using deep learning," Journal of JSCE, Ser. B1, Vol. 72, pp. I_187-I_192, 2016.

[8] M. Miyamoto et al., "Calibration considering flood forecasting aptitudes for hydrological parameters of a distributed runoff model," Journal of JSCE, Ser. B1, Vol. 72, pp. I_175-I_180, 2016.

[9] L. H. Zheng et al., "Improvement of the real-time PM2.5 forecast over the Beijing-Tianjin-Hebei region using an optimal interpolation data assimilation method," Aerosol and Air Quality Research, Vol. 18, pp. 1305-1316, 2018.

[10] M. M. Petri et al., "Accelerated query processing via similarity score prediction," in Proc. of SIGIR '19: The 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 485-494, 2019.

[11] H. Horino et al., "Experimental studies on roughness coefficients for overland flow under a controlled rainfall," Transaction of JSIDRE, Vol. 152, pp. 87-94, 1992.

[12] Y. Asano et al., "Measuring the flow and Manning's roughness coefficient of mountain streams," Journal of the Japan Society of Erosion Control Engineering, Vol. 65, pp. 62-68, 2012.

Author

Tasuku Kubo

Tasuku Kubo received his B.S. degree in computer science and engineering from University of the Ryukyus, Japan, in 2021, where he is now pursuing an M.S. degree.

Takeo Okazaki

Takeo Okazaki received B.Sc. and M.Sc. degrees from Kyushu University in 1987 and 1989, respectively. He was a research assistant at Kyushu University from 1989 to 1995. He earned his Ph.D. from University of the Ryukyus in 2014. He is currently a professor at the University of the Ryukyus. His research interests are statistical data normalization for analysis, statistical analysis, data analysis, genome informatics, tourism informatics, geographic information systems, and data science. He is a member of JSCS, IEICE, JSS, GISA, and BSJ Japan.
