Zhao Tingting
(School of Traffic and Transportation, Lanzhou Jiaotong University, Lanzhou 730070, China; zhaotingting163@hotmail.com)
Copyright © The Institute of Electronics and Information Engineers(IEIE)
Keywords
Intelligent congestion prediction, Knowledge graph technology, Urban road traffic
1. Introduction
Urban traffic congestion, exacerbated by globalization and urbanization, strains city
development. Rapid urban population growth and motor vehicle proliferation outpace
road infrastructure expansion, overloading urban transport systems. In 2019, U.S.
commuters lost 88 billion hours to congestion, impacting economic vitality and national
growth potential [1]. Traffic is a major source of carbon emissions, contributing 16% of global greenhouse
gases, with congestion increasing idling emissions. In Beijing, vehicle emissions
account for 30-50% of PM2.5 on polluted days. Health risks include heightened stress,
discomfort, and increased disease prevalence linked to traffic noise and pollution
[2]. London residents near main roads suffer a 20% higher rate of cardiovascular issues.
Socially, congestion in cities like Mumbai adds 40% extra commute time, eroding personal
life and productivity. Sao Paulo's peak hour speeds drop below bicycle pace, straining
public transport and urban mobility. These issues highlight the urgent need for efficient
urban planning and sustainable transport solutions [3-4].
In recent years, the rise of big data technology and the rapid development of
artificial intelligence algorithms have moved urban traffic congestion prediction research
into a new stage. Researchers in China and abroad have explored advanced information
technologies to improve the accuracy and timeliness of traffic state prediction and to
provide a scientific basis for urban traffic management decisions. On the big data side,
researchers integrate multi-source data (such as floating car data, GPS data, traffic
surveillance video, and social media information) and apply big data processing techniques
to mine the inherent regularities and abnormal behaviors of traffic flow. These studies have
improved data processing capacity and provided richer and more accurate inputs for
prediction models. From the perspective of urban density, the density ranking of new
first-tier cities is shown in Fig. 1.
Fig. 1. City density ranking.
In the fast-evolving urban landscape, traditional traffic management and prediction
models struggle, hindered by their reliance on linear extrapolation from limited historical
data. This oversight neglects the complex, dynamic nature of traffic networks, including
unpredictable events, weather impacts, and seasonal variations, leading to diminished
forecasting accuracy and sluggish responsiveness. Confronted with vast, high-dimensional
traffic datasets, conventional methodologies fail to discern nuanced patterns and
interconnections within traffic flows, compromising real-time, precise state predictions
and thus, traffic management efficacy. To address these limitations, we propose an
intelligent prediction approach leveraging knowledge graph technology. Acting as a
robust data integration and analysis instrument, knowledge graphs handle multi-source,
heterogeneous data, including GIS data, social media insights, weather forecasts,
and historical traffic records. This facilitates the construction of a comprehensive,
dynamic urban traffic network framework. Employing deep learning algorithms, our methodology
autonomously uncovers intricate data correlations, enabling accurate congestion predictions.
The immediate goal is to bridge gaps in current traffic forecasting technologies,
offering a smarter, more adaptable solution. Long-term aspirations involve advancing
smart city initiatives through the dissemination and application of this novel technique.
Smart transportation, a cornerstone of smart cities, strives to enhance urban traffic
system efficiency, alleviate congestion, and improve air quality, thereby uplifting
residents' quality of life. Our research empowers traffic authorities with decision-making
support for optimizing signal controls, road planning, and emergency response strategies,
while concurrently informing the public about real-time traffic conditions to facilitate
informed travel choices, minimize wait times, and boost traveler satisfaction [5-6].
The core contents of this study include the following aspects: (1) This study attempts,
for the first time, to introduce knowledge graph technology into the urban transportation
field and constructs a comprehensive knowledge graph covering traffic network structure,
historical traffic flow, weather conditions, social events, holidays and other factors.
The graph provides a semantic representation of traffic information and, through
relational reasoning between entities, supports a deeper understanding of the causes of
traffic congestion. (2) Traffic-related data sources, including but not limited to traffic
sensor data, satellite remote sensing data and social media data, are integrated, and data
fusion techniques are adopted to keep the knowledge graph timely and complete and to provide
richer input for the prediction model. (3) Based on the constructed knowledge graph, a graph
neural network (GNN) model is designed and implemented to capture the spatial dependence
and temporal dynamics of traffic networks. The model makes full use of entity relationships
in the knowledge graph to improve prediction accuracy and generalization ability [7].
The core objectives of this study are focused on three main areas. The first task
is to construct a highly integrated knowledge graph that maps the complex characteristics
of urban transportation systems. This knowledge graph integrates traffic network structure,
historical traffic flow patterns, meteorological conditions, special events, and social and
economic activities, providing a deeply integrated data support platform for traffic
congestion prediction. Through this platform, we aim to break down data silos, facilitate
efficient flow of information and cross-validation, and gain insight into the multiple
drivers behind traffic congestion.
2. Literature Review
2.1. Knowledge Graph Concept
As a structured form of knowledge representation, the knowledge graph has rapidly become
a core component of information retrieval, natural language processing, recommendation
systems and other fields since Google introduced its ``Knowledge Graph'' project
in 2012. Simply put, a knowledge graph is a network structure composed of entities,
relationships, and attributes that describes real-world entities and the relationships
between them. Mathematically, a knowledge graph can be represented as $G=(E, R, F)$,
where $E$ is the set of entities, $R$ is the set of relationships, and $F$ is the set of
facts, i.e., triples in which two entities are connected by a relationship [8-9].
2.2. Knowledge Graph Construction Method
Knowledge graph construction is a process involving information extraction, knowledge
representation and storage, mainly including the following steps:
Entity recognition is the first step in knowledge graph construction; it aims to
automatically recognize specific types of entities in text. Common methods
include rule-based methods, dictionary matching, and machine learning models (e.g.
CRF, BiLSTM-CRF). In a basic named entity recognition (NER) task, let
$x=(\omega_{1}, \omega_{2}, \ldots, \omega_{n})$ be the input word sequence and
$y=(y_{1}, y_{2}, \ldots, y_{n})$ the corresponding label sequence; the goal
of the model ${f(x)}$ is then to maximize $P(y{\mid}x)$, as shown in Eq. (1) [10].
Relationship extraction (RE) aims to extract relationships between entities from text
and is one of the key steps in constructing knowledge graphs. Common techniques include
template-based methods, supervised learning methods (e.g. SVM, logistic regression),
and deep learning models (e.g. CNN, RNN, BERT). Taking deep-learning-based RE as
an example, given two entities and the context $c$ between them, the model aims
to predict the relationship type $r$ between them, as shown in Eq. (2) [11].
Ontology design is the skeleton of the knowledge graph; it defines entity types,
relationship types and their constraints. It usually follows standards such as the Web
Ontology Language (OWL) and is formally described using description logic (DL). For
example, a simple ontology definition is shown in Eq. (3), which states that all instances
of the Person class must have an associated hasAge attribute with an integer value [12].
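Eqs. (1) to (3) are referenced in the three steps above, but their bodies are not reproduced in this text. Under the definitions given there, plausible forms can be sketched as follows. The symbols $\hat{y}$, $\hat{r}$, $e_{1}$ and $e_{2}$ (the two entities) are notation assumed here for illustration and are not taken from the source.

```latex
\begin{align}
\hat{y} &= \arg\max_{y} P(y \mid x) \\
\hat{r} &= \arg\max_{r \in R} P(r \mid e_{1}, e_{2}, c) \\
\mathrm{Person} &\sqsubseteq \exists\, \mathrm{hasAge}.\mathrm{Integer}
\end{align}
```

The third line is a description-logic inclusion axiom: every Person is something that has at least one hasAge value of integer type.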
2.3. Brief Introduction of Related Art
Machine learning techniques, especially supervised learning, are widely used for entity
recognition and relationship extraction in knowledge graph construction. By training on
large amounts of labeled data, a model learns to identify and classify entities
and relationships in text. For example, when an SVM is used for relation classification,
the objective of the model may be to minimize the hinge loss [13], as shown in Eq. (4).
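The hinge-loss objective mentioned above can be illustrated with a minimal sketch. It assumes the standard soft-margin form $\frac{1}{2}\|w\|^{2} + C\sum_{i}\max(0, 1 - y_{i}(w^{\top}x_{i} + b))$ with labels in $\{-1, +1\}$; the exact form of Eq. (4) is not reproduced in this text, and the function name and toy data are hypothetical.

```python
def hinge_objective(weights, bias, samples, labels, c=1.0):
    """Soft-margin SVM objective: 0.5*||w||^2 + C * sum(max(0, 1 - y*(w.x + b))).

    Assumed standard form; labels must be -1 or +1.
    """
    regularizer = 0.5 * sum(w * w for w in weights)
    slack = 0.0
    for x, y in zip(samples, labels):
        score = sum(w * xi for w, xi in zip(weights, x)) + bias
        # A sample contributes only if it lies inside the margin or is misclassified.
        slack += max(0.0, 1.0 - y * score)
    return regularizer + c * slack


# Toy example: the first sample is outside the margin, the second is inside it.
objective = hinge_objective(
    weights=[1.0, 0.0],
    bias=0.0,
    samples=[[2.0, 0.0], [-0.5, 0.0]],
    labels=[1, -1],
)
```

In this toy case the regularizer contributes 0.5 and the second sample contributes a slack of 0.5, so the objective is 1.0.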
Deep learning, especially deep neural networks (DNNs), convolutional neural networks
(CNNs), and recurrent neural networks (RNNs), has demonstrated a strong ability to
process large-scale unstructured data and extract high-level features. In the domain
of knowledge graphs, these techniques are used for more complex semantic understanding
and relational reasoning.
A graph convolutional network (GCN) can be used for knowledge graph embedding, as shown
in Eq. (5) [14].
Eq. (5) is the layer-wise propagation rule of a GCN,
$h_{i}^{(l+1)} = \sigma\left(\sum_{j\in N_{i}} \frac{1}{c_{ij}} W^{(l)} h_{j}^{(l)}\right)$.
In this equation, $h_{i}^{(l+1)}$ denotes the feature vector (representation)
of node $i$ at layer $l+1$; $\sigma(\cdot )$ is the activation function, usually ReLU
or the sigmoid function; $\sum_{j\in N_{i} } h_{j}^{(l)}$ is the sum of the feature vectors
of node $i$'s neighbors $j$, where $N_{i}$ is the set of neighbors of node $i$;
$W^{(l)}$ is the weight matrix used to update the feature vectors; and $c_{ij}$
is an optional normalization factor that keeps the aggregated messages on a comparable
scale for neighborhoods of different sizes. Common normalization methods are L2
normalization and degree normalization.
Graph neural networks (GNNs) are deep learning models designed for graph data; they
learn representations of nodes and edges while preserving graph structure information.
In knowledge graphs, GNNs are often used for tasks such as link prediction and node
classification. A simple GNN layer propagation rule is shown in Eq. (6) [15], where $A$
is the adjacency matrix reflecting the connection structure of the knowledge graph.
Eq. (6) is the propagation rule of a GNN layer,
$h_{i}^{(l+1)} = \sigma\left(\sum_{j\in N_{i}} A_{ij} W^{(l)} h_{j}^{(l)}\right)$.
In this equation, $h_{i}^{(l+1)}$ denotes the feature vector (representation) of node
$i$ at layer $l+1$; $\sigma(\cdot )$ is the activation function, usually ReLU or
the sigmoid function; $\sum_{j\in N_{i} } A_{ij} W^{(l)} h_{j}^{(l)}$ is the
weighted sum of the transformed feature vectors of the neighbors $j$ of node $i$, where
$A_{ij}$ is the element of the adjacency matrix indicating whether node $i$ and node $j$
are connected (1 if yes, 0 otherwise); $N_{i}$ is the set of nodes adjacent to node $i$;
$W^{(l)}$ is the weight matrix used to update the feature vectors; and $h_{j}^{(l)}$
is the feature vector of node $j$ at layer $l$.
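The propagation rule described for Eq. (6) can be sketched in a few lines of dependency-free Python. This is only an illustration under the definitions above (binary adjacency matrix, shared weight matrix, ReLU activation); the helper names and the three-node toy graph are hypothetical.

```python
def matvec(matrix, vector):
    """Multiply an (out_dim x in_dim) matrix by a length-in_dim vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def gnn_layer(adjacency, features, weight):
    """One layer of h_i' = ReLU(sum_j A[i][j] * W h_j)."""
    transformed = [matvec(weight, h) for h in features]  # W h_j for every node j
    out_dim = len(weight)
    next_features = []
    for i in range(len(features)):
        aggregated = [0.0] * out_dim
        for j, a_ij in enumerate(adjacency[i]):
            if a_ij:  # only neighbors (A_ij != 0) send a message
                aggregated = [s + a_ij * t for s, t in zip(aggregated, transformed[j])]
        next_features.append([max(0.0, s) for s in aggregated])  # ReLU
    return next_features


# Hypothetical path graph 0 - 1 - 2 with 2-dimensional node features.
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W = [[1.0, -1.0],
     [0.5, 0.5]]
H_next = gnn_layer(A, H, W)
```

Because the adjacency matrix here has no self-loops, a node's own features do not enter its update; adding the identity to $A$ is a common way to include them.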
2.4. Research Status
Urban traffic congestion is a multifaceted challenge with causes including peak hour
demand imbalances, suboptimal road network design, and increasing private car usage.
Ranjan et al. [17-18] note that inadequate traffic evacuation channels and private car dependence hinder
public transport efficiency. Factors like urban planning, economic growth, and policy
also play roles, with Guo et al. [6] highlighting urban sprawl and chaotic land use. Short-term policies like parking
fees can mitigate demand but lack long-term impact [19]. Traditional traffic flow theory models, such as the Fundamental Diagram [20], are limited in handling traffic system complexities and often overlook the significance
of real-time data, leading to poor responsiveness to unexpected events [21].
3. Construction of Urban Traffic Model Based on Knowledge Graph
3.1. Data Sources and Pre-processing
To build a knowledge graph-based urban traffic model, we first need to integrate high-quality
data from different channels. Data sources include:
Traffic data: Collected from traffic sensors, GPS tracking devices, floating cars, buses
and taxis, these data provide real-time and historical traffic flow, speed, density and
other information. They must be cleaned: outliers are removed, missing values are
filled (e.g. using the mean, median, or interpolation), and values are normalized, e.g.
traffic flow is normalized to vehicles per hour (VPH), as shown in Eq. (7) [22].
Weather data: These include temperature, humidity, wind speed, rainfall, etc., and can
be obtained from weather stations. Preprocessing must ensure time synchronization
so that weather data and traffic data correspond to the same time interval, with
interpolation performed where necessary to maintain continuity, as shown in Eq. (8) [23],
where $(t_{1}, T_{1})$ and $(t_{2}, T_{2})$ are the time and temperature values at two
adjacent time points and $t$ is the target interpolation time.
Event data: Covering traffic accidents, construction, holidays, large-scale events,
etc, which can be obtained through social media mining, government announcements,
news reports and other channels. Pre-processing includes event classification, time
location and impact range definition to ensure uniform data format.
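The preprocessing steps described for the three data sources can be sketched as below. Eqs. (7) and (8) are not reproduced in this text, so the sketch assumes the usual forms: a count observed over an interval is scaled to vehicles per hour, and weather values are linearly interpolated between two adjacent observations. All function names, thresholds and sample values are hypothetical.

```python
from statistics import median


def clean_counts(counts, max_valid):
    """Fill missing (None) or out-of-range counts with the median of the valid ones."""
    def is_valid(c):
        return c is not None and 0 <= c <= max_valid

    valid = [c for c in counts if is_valid(c)]
    if not valid:
        raise ValueError("no valid readings to impute from")
    fill = median(valid)
    return [c if is_valid(c) else fill for c in counts]


def to_vehicles_per_hour(count, interval_seconds):
    """Scale a count observed over interval_seconds to vehicles per hour (VPH)."""
    return count * 3600.0 / interval_seconds


def interpolate(t, t1, v1, t2, v2):
    """Linear interpolation between (t1, v1) and (t2, v2), evaluated at time t."""
    if t1 == t2:
        raise ValueError("t1 and t2 must differ")
    return v1 + (v2 - v1) * (t - t1) / (t2 - t1)


# Hypothetical 5-minute detector counts with one gap and one outlier.
raw_counts = [10, None, 12, 999, 14]
cleaned = clean_counts(raw_counts, max_valid=200)
vph = [to_vehicles_per_hour(c, interval_seconds=300) for c in cleaned]

# Temperature known at minute 0 (10.0) and minute 60 (16.0); estimate minute 30.
temperature_at_30 = interpolate(30, 0, 10.0, 60, 16.0)
```

Median imputation is only one of the options the text lists; the mean or interpolation between neighboring readings could be substituted in `clean_counts`.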
3.2. Traffic Knowledge Graph Design
Knowledge graph construction is the core of the whole model, including entity definition,
relationship definition, ontology design and graph example display.
Entity definition: Including Road, Intersection, Zone, Vehicle, Weather, Event, etc.
Relationship definitions: Such as ``belongs_to'', ``connects_to'', ``affects'', ``occurs_at'',
etc. The directionality and attributes of each relationship shall be clearly defined.
For example, the ``affects'' relationship can carry an influence degree attribute
(mild, moderate or severe), as shown in Eq. (9) [24].
Ontology construction: Define the classification system of entities and relationships,
such as defining traffic flow as Traffic Volume class, including average speed (avg_speed),
vehicle number (vehicle_count) and other attributes. Ontology design ensures semantic
consistency of data, as shown in Eq. (10) [25].
Graph instance display: Display the connection between entities through graphical
interface, such as displaying that a certain road (entity ID: R001) belongs to a certain
area (entity ID: Z001) and is affected by a traffic accident (entity ID: E012) at
a certain time point.
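The entity, relationship and instance definitions above can be illustrated with a small in-memory triple store. This is a sketch only: the class and method names are hypothetical, and the example reuses the identifiers R001, Z001 and E012 from the text with an assumed ``severe'' influence degree.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """A directed fact (head, relation, tail)."""
    head: str
    relation: str
    tail: str


class TrafficKG:
    """Minimal knowledge graph: typed entities plus attributed, directed relations."""

    def __init__(self):
        self.entities = {}  # entity id -> {"type": ..., other attributes}
        self.edges = {}     # Triple -> relation attributes

    def add_entity(self, entity_id, entity_type, **attrs):
        self.entities[entity_id] = {"type": entity_type, **attrs}

    def add_relation(self, head, relation, tail, **attrs):
        for entity_id in (head, tail):
            if entity_id not in self.entities:
                raise KeyError(f"unknown entity: {entity_id}")
        self.edges[Triple(head, relation, tail)] = attrs

    def tails(self, head, relation):
        """Entities reached from head via relation."""
        return sorted(t.tail for t in self.edges
                      if t.head == head and t.relation == relation)

    def heads(self, relation, tail):
        """Entities that point to tail via relation."""
        return sorted(t.head for t in self.edges
                      if t.relation == relation and t.tail == tail)


kg = TrafficKG()
kg.add_entity("R001", "Road")
kg.add_entity("Z001", "Zone")
kg.add_entity("E012", "Event", kind="traffic_accident")
kg.add_relation("R001", "belongs_to", "Z001")
kg.add_relation("E012", "affects", "R001", degree="severe")
```

Storing the influence degree on the edge rather than on the event keeps it specific to one road, so the same accident can affect different roads to different degrees.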
3.3. Graph Fusion and Dynamic Update Mechanism
Multi-source data integration: The processed traffic data, weather data and
event data are mapped to entities and relationships of the knowledge graph through entity
recognition and relationship extraction. For example, a traffic flow record is mapped to a
Traffic Volume entity in the graph, with attributes corresponding to the original data
fields, as shown in Eq. (11) [26]. The update mechanism of the knowledge graph is shown in Fig. 2.
Fig. 2. Update mechanism of knowledge graph.
Real-time data access strategy: In order to maintain the timeliness of the graph,
it is necessary to establish a data flow processing mechanism, such as using Apache
Kafka and other message queue technologies to receive real-time data and trigger dynamic
updates of the graph. Real-time data is first subjected to lightweight preprocessing
to ensure correct data format, and then graph insertion, deletion or update operations
are performed according to changes in entities and relationships, as shown in Eq.
(12) [27].
Dynamic updating mechanisms also include periodic maintenance of the graph, such as
periodic checking of data consistency, handling of data conflicts, updating the ontology
to accommodate new data types or changing needs, and ensuring that the knowledge graph
continues to accurately reflect the latest state of the urban transportation system
[28-29].
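The insert, update and delete operations triggered by incoming real-time data can be sketched as follows. A plain Python list stands in for the message stream here; in a deployment the messages would arrive from a queue such as Apache Kafka, which this sketch does not model. The message layout (`op`, `triple`, `attrs`) is an assumption made for illustration.

```python
def apply_update(graph, message):
    """Apply one update message to a graph stored as {(head, relation, tail): attrs}."""
    op = message["op"]
    key = tuple(message["triple"])
    if op == "insert":
        graph[key] = dict(message.get("attrs", {}))
    elif op == "update":
        if key not in graph:
            raise KeyError(f"cannot update missing triple: {key}")
        graph[key].update(message.get("attrs", {}))
    elif op == "delete":
        graph.pop(key, None)  # deleting an absent triple is a no-op
    else:
        raise ValueError(f"unknown operation: {op}")
    return graph


# Hypothetical stream: an accident is reported, escalates, and is later cleared.
stream = [
    {"op": "insert", "triple": ("R001", "belongs_to", "Z001")},
    {"op": "insert", "triple": ("E012", "affects", "R001"), "attrs": {"degree": "moderate"}},
    {"op": "update", "triple": ("E012", "affects", "R001"), "attrs": {"degree": "severe"}},
    {"op": "delete", "triple": ("E012", "affects", "R001")},
]

graph = {}
history = []
for msg in stream:
    apply_update(graph, msg)
    history.append(graph.get(("E012", "affects", "R001"), {}).get("degree"))
```

Keeping each message idempotent where possible (as with `delete` here) makes replaying a stream after a failure safer.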
To sum up, urban traffic model construction based on knowledge graph is a comprehensive
process involving data integration, knowledge representation, real-time processing
and dynamic maintenance. Through elaborate entity and relationship system, efficient
multi-source data fusion strategy and flexible dynamic update mechanism, the model
can provide a comprehensive, real-time and semantic-rich data support platform for
traffic congestion prediction.
4. Knowledge Graph Driven Traffic Congestion Prediction Model
4.1. Model Framework Design
In this section, we describe in detail the design of the traffic congestion prediction
model based on knowledge graph and graph neural network (GNN). The model aims to achieve
accurate prediction of urban traffic congestion by integrating rich semantic information
provided by knowledge graphs and deep learning capabilities of graph neural networks.
The model architecture is divided into four main parts: data preparation, knowledge
graph embedding, graph neural network modeling, and prediction layer [30].
In the data preparation stage, the preprocessed traffic data, weather data, event
data, etc. are transformed into entities and relationships in the knowledge graph
to form an initial graph structure.
Knowledge graph embedding maps entities and relationships into a low-dimensional vector
space through methods such as TransE, so that structural information in the knowledge
graph can be encoded [31]. TransE learns embeddings by making $d(h + r, t)$ small for
triples that hold, where $h$ and $t$ are the vector representations of the head and tail
entities, $r$ is the vector representation of the relationship,
and $d$ is a distance function, such as Euclidean distance or cosine distance.
Graph neural network modeling is the core of the model. It uses the embedded node
and edge features of the knowledge graph to learn context-aware node representations
through a message-passing mechanism. For example, the information aggregation
formula of a graph convolutional network (GCN) is given in Eq. (13),
$h_{i}^{(l+1)} = \sigma\left(\sum_{j\in N_{i}} \frac{1}{c_{ij}} W^{(l)} h_{j}^{(l)}\right)$.
In this equation, $h_{i}^{(l+1)}$ denotes the feature vector (representation) of node $i$
at layer $l+1$; $\sigma(\cdot)$ is the activation function, usually ReLU or the sigmoid
function; $\sum_{j\in N_{i} } \frac{1}{c_{ij} } W^{(l)} h_{j}^{(l)}$ is the normalized sum
of the transformed feature vectors of node $i$'s neighbors $j$, where $N_{i}$ is the
neighbor set of node $i$; $c_{ij}$ is an optional normalization coefficient that keeps the
aggregated messages on a comparable scale for neighborhoods of different sizes; $W^{(l)}$
is the inter-layer weight matrix; and $h_{j}^{(l)}$ is the feature vector of node $j$ at
layer $l$.
The prediction layer outputs the final traffic congestion state prediction based on
the learned node representation through the fully connected layer or other prediction
models (such as LSTM for time series prediction).
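The knowledge graph embedding step of this architecture can be illustrated with a minimal TransE-style sketch. It assumes the Euclidean distance $d(h + r, t)$ and the margin-based ranking loss commonly used to train TransE; the source does not state its training loss, and the vectors below are hypothetical.

```python
import math


def transe_distance(head, relation, tail):
    """Euclidean distance d(h + r, t); small values mean the triple is plausible."""
    return math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))


def margin_ranking_loss(positive, negative, margin=1.0):
    """Penalize a corrupted triple that is not at least `margin` farther than the true one.

    Assumed training objective (standard for TransE, not confirmed by the source).
    """
    return max(0.0, margin + transe_distance(*positive) - transe_distance(*negative))


# Hypothetical 2-dimensional embeddings.
h = [1.0, 0.0]             # head entity, e.g. a road
r = [0.0, 1.0]             # relationship, e.g. belongs_to
t_true = [1.0, 1.0]        # the matching tail entity (h + r == t_true)
t_corrupted = [4.0, 5.0]   # a randomly substituted tail entity

d_true = transe_distance(h, r, t_true)
d_corrupted = transe_distance(h, r, t_corrupted)
loss = margin_ranking_loss((h, r, t_true), (h, r, t_corrupted))
```

The learned entity vectors can then serve as the initial node features for the graph neural network stage.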
4.2. Feature Selection and Representation Learning
Feature selection is crucial because it determines the learning ability and generalization
performance of the model. Feature selection based on knowledge graphs should consider
the following aspects:
Traffic flow characteristics: Including historical average speed, flow count, congestion
duration, etc.
Weather characteristics: Temperature, humidity, precipitation, etc, which have an
indirect effect on traffic conditions.
Road network structure characteristics: Such as road length, number of intersections,
reflecting physical attributes.
In representation learning, a GNN can effectively fuse the above features. For example,
in an R-GCN (Relational Graph Convolutional Network), a separate transformation matrix
is used for each relationship type to capture the different influences of
features under different relationships, as shown in Eq. (14),
where $\mathcal R$ is the set of relationships, ${\mathcal N}_{i}^r$ is the set of
neighbors connected to node $i$ by relationship $r$, and $W_{r}^{(l)}$ is the weight
matrix for relationship $r$.
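A single R-GCN layer can be sketched as below. Eq. (14) is not reproduced in this text, so the sketch follows the commonly used R-GCN form: messages are averaged per relationship (normalization by $|{\mathcal N}_{i}^r|$), a self-connection matrix is added, and ReLU is applied. The self-connection term, the normalization choice and all names and values are assumptions.

```python
def matvec(matrix, vector):
    """Multiply an (out_dim x in_dim) matrix by a length-in_dim vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def rgcn_layer(features, neighbors_by_relation, relation_weights, self_weight):
    """h_i' = ReLU(W_self h_i + sum_r sum_{j in N_i^r} (1/|N_i^r|) W_r h_j)."""
    next_features = []
    for i, h_i in enumerate(features):
        acc = matvec(self_weight, h_i)  # assumed self-connection term
        for relation, neighbor_map in neighbors_by_relation.items():
            neighbors = neighbor_map.get(i, [])
            for j in neighbors:
                message = matvec(relation_weights[relation], features[j])
                acc = [a + m / len(neighbors) for a, m in zip(acc, message)]
        next_features.append([max(0.0, a) for a in acc])  # ReLU
    return next_features


# Hypothetical 3-node graph: node 0 connects to nodes 1 and 2, and is affected by node 2.
H = [[1.0, 2.0], [3.0, 0.0], [0.0, 1.0]]
neighbors = {
    "connects_to": {0: [1, 2]},
    "affects": {0: [2]},
}
W_rel = {
    "connects_to": [[1.0, 0.0], [0.0, 1.0]],
    "affects": [[2.0, 0.0], [0.0, 2.0]],
}
W_self = [[1.0, 0.0], [0.0, 1.0]]
H_next = rgcn_layer(H, neighbors, W_rel, W_self)
```

Node 2 reaches node 0 through two relationships and is transformed by a different matrix each time, which is how the layer separates the influence of, say, road connectivity from that of an event.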
4.3. Prediction Algorithm Design and Optimization
The core of the prediction algorithm design is to predict the traffic congestion state
from the learned node representations. Because traffic congestion is a time-series
problem, the graph neural network can be combined with a recurrent neural network (RNN) or
a long short-term memory network (LSTM) to form a spatio-temporal graph attention network
with LSTM (ST-GAT-LSTM). The model structure is as follows:
(1) Spatio-temporal feature fusion: First, the GNN extracts spatio-temporal
features from the knowledge graph, yielding a node representation for each time step.
(2) Attention mechanism: In the ST-GAT layer, an attention mechanism assigns different
weights to different nodes to emphasize the influence of important nodes. The attention
score is calculated by Eq. (15), where $a$ is the activation function and $W$ denotes the weight matrices.
In Eq. (15), $\alpha _{ij}$ is the attention weight of node $i$ to node $j$, reflecting
how strongly node $i$ attends to node $j$; $\mathrm{softmax}(\cdot )$ is a normalization
function that guarantees that the attention weights sum to 1; $a(\cdot )$ is a non-linear
activation function, usually the sigmoid function; $W_{q}$ and $W_{h}$ are the weight
matrices used to calculate the attention weights; and $h_{i}$ and $h_{j}$ are the feature
vectors of node $i$ and node $j$, respectively.
(3) LSTM prediction: The weighted node features are fed into the LSTM unit, which learns
long-term dependencies and finally outputs the traffic congestion state prediction. The
update formulas of the LSTM unit include the forget gate, input gate,
cell state update and output gate.
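The attention step in item (2) can be sketched as follows. Eq. (15) is not reproduced in this text, so the score form used here, $e_{ij} = a\left((W_{q} h_{i})^{\top}(W_{h} h_{j})\right)$ with $a$ the sigmoid function followed by a softmax over the neighbors of node $i$, is an assumption that merely matches the symbols described above. Names and values are hypothetical.

```python
import math


def matvec(matrix, vector):
    """Multiply an (out_dim x in_dim) matrix by a length-in_dim vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def attention_weights(i, neighbors, features, w_q, w_h):
    """Return {j: alpha_ij} with alpha_ij = softmax_j(sigmoid((W_q h_i) . (W_h h_j)))."""
    query = matvec(w_q, features[i])
    scores = []
    for j in neighbors:
        key = matvec(w_h, features[j])
        scores.append(sigmoid(sum(q * k for q, k in zip(query, key))))
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return {j: e / total for j, e in zip(neighbors, exps)}


def weighted_sum(alpha, features):
    """Aggregate neighbor features with the attention weights."""
    dim = len(next(iter(features)))
    out = [0.0] * dim
    for j, a in alpha.items():
        out = [o + a * f for o, f in zip(out, features[j])]
    return out


# Hypothetical features: node 1 resembles node 0, node 2 points the opposite way.
H = [[1.0, 0.0], [1.0, 0.0], [-1.0, 0.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
alpha = attention_weights(0, [1, 2], H, identity, identity)
context = weighted_sum(alpha, H)
```

The resulting `context` vector for each time step is what item (3) would pass to the LSTM unit.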
Model optimization focuses on the design of the loss function and on parameter
tuning. Mean squared error (MSE) is chosen as the loss function, as shown in Eq. (16),
and Bayesian optimization is used for hyperparameter tuning to improve model
performance.
The specific prediction flow is shown in Fig. 3. Through carefully designed model framework, feature selection and representation
learning, and optimized prediction algorithm, the traffic congestion prediction model
based on knowledge graph can fully mine the complex spatiotemporal features of traffic
system, realize accurate prediction of traffic congestion, and provide powerful decision
support for urban traffic management.
5. Experimental Design and Results Analysis
5.1. Experimental Data Set Description
The data set used in this study is derived from comprehensive traffic information
records for a metropolitan area covering one full year, from the end of 2022 to the end
of 2023. It totals 16500 records, with traffic flow details covering all 24 hours of each
day. To reflect seasonal effects, the data cover all four seasons. The research focuses
on core urban areas, and four representative areas are selected: the urban center, a
prosperous commercial area, a dense residential area and a busy industrial area. Ten key
intersection monitoring points are set up in each area, for a total of 40 monitoring
points collecting traffic flow. By integrating the number of vehicles per unit time,
average speed, road occupancy rate, weather parameters (such as temperature, humidity and
rainfall) and special event information (including traffic accident records and road
construction conditions), a multidimensional, high-density data set is constructed for
mining traffic behavior patterns and predicting congestion trends.
To deploy the intelligent traffic congestion prediction model effectively, the system
must meet certain hardware and software requirements. On the hardware side, a
high-performance CPU is required, a GPU is recommended to accelerate deep learning
computation, memory should be at least 16 GB, and storage should be sufficient for the
large data sets and model files. A stable Linux distribution such as Ubuntu or CentOS is
recommended for scientific computing and resource management. The software environment
should include Python 3.7 or above and a deep learning framework such as TensorFlow or
PyTorch for model training and inference, together with data processing and visualization
libraries such as Pandas, NumPy and Matplotlib. Because the model may run online in real
time, the server needs a stable network connection to support real-time data transmission
and continuous model updates. Finally, the system should have adequate security measures
to protect data privacy and network security.
5.2. Result Analysis
In the evaluation system of this study, we adopt the following core indicators to
measure model performance, where $y_{i}$ is the actual value, $\hat{y}_{i}$ the predicted
value, $\bar{y}$ the mean of the actual values and $n$ the number of observations. Mean
absolute percentage error (MAPE) assesses the relative error between predictions and
actual observations; it divides the absolute value of each prediction error by the
corresponding actual value and averages over all observations,
$\mathrm{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{y_{i}-\hat{y}_{i}}{y_{i}}\right|$.
Mean squared error (MSE) is the mean of the squared differences between predicted and
actual values, $\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}(y_{i}-\hat{y}_{i})^{2}$, and
reflects the overall magnitude of the prediction error. The R$^2$ score, also known as the
coefficient of determination, reflects the proportion of variability explained by the
model and is calculated as
$R^{2} = 1 - \frac{\sum_{i=1}^{n}(y_{i}-\hat{y}_{i})^{2}}{\sum_{i=1}^{n}(y_{i}-\bar{y})^{2}}$;
a value close to 1 indicates that the model fits well and explains the variability of the
data.
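The three indicators can be computed with a few lines of Python. The sketch below uses the standard definitions of MAPE, MSE and R$^2$; the sample values are hypothetical and are not taken from the experimental data.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Actual values must be non-zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean of the squared prediction errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def r2_score(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical hourly flows (vehicles per hour) and model predictions.
y_true = [100.0, 200.0, 400.0]
y_pred = [110.0, 180.0, 400.0]

scores = {
    "MAPE (%)": mape(y_true, y_pred),
    "MSE": mse(y_true, y_pred),
    "R2": r2_score(y_true, y_pred),
}
```

MAPE is undefined when an actual value is zero, which matters for traffic counts at night; such intervals are usually excluded or handled with a different error measure.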
Table 1 shows the basic information of traffic congestion prediction model, including model
name, model type, training data source and prediction target. The model name is ``Traffic
Congestion Prediction Model Based on Knowledge Graph'', the model type is ``Machine
Learning Model'', and the training data sources include historical traffic flow data,
road condition data and meteorological data. The prediction target is the traffic
congestion situation in a period of time in the future, including congestion time
and congestion degree.
Table 1. Comparison of MAPE, MSE and R$^{2}$ scores of each model.
| Model | MAPE (%) | MSE | R$^2$ score |
|---|---|---|---|
| ARIMA | 12.3 | 0.87 | 0.78 |
| SVM | 9.5 | 0.65 | 0.72 |
| GCN+LSTM | 7.2 | 0.52 | 0.76 |
| KG-GNN | 5.8 | 0.41 | 0.83 |
Table 2. F1 Scores (for congestion classification tasks).
| Model | F1 score |
|---|---|
| SVM | 0.78 |
| GCN+LSTM | 0.81 |
| KG-GNN | 0.85 |
Note: The smaller the standard deviation, the smaller the fluctuation of the model
prediction results and the higher the stability. The deviation range reflects the
fluctuation of model prediction performance in different time periods or under different
conditions. The narrower the range, the more stable the model is.
Table 2 presents training and validation results with accuracy at 90% and 85% respectively.
Table 3 evaluates the model's performance in traffic prediction, control, and public transit
optimization, highlighting its ability to predict traffic congestion 30 minutes ahead.
Fig. 4 assesses the model's impact, showing a decrease in peak congestion time from 15.6
to 12.8 minutes (-17.9%), a reduction in the average delay index from 1.3 to 1.1 (-15.4%),
and a rise in public satisfaction from 3.2 to 4.0 (+25.0%) based on a 10,000-sample
survey.
Table 3. Stability analysis of model prediction results.
| Model | Standard deviation (%) | Deviation range (minimum to maximum MAPE, %) | Stability evaluation |
|---|---|---|---|
| ARIMA | 2.1 | 10.2-14.1 | medium |
| SVM | 2.8 | 6.7-12.1 | less stable |
| GCN+LSTM | 1.5 | 5.7-8.8 | stable |
| KG-GNN | 1.2 | 4.9-6.9 | very stable |
Table 4. Comparison of model calculation efficiency.
| Model | Average training time (minutes) | Average prediction time (seconds) | Overall efficiency evaluation |
|---|---|---|---|
| ARIMA | 5 | <1 | high |
| SVM | 15 | 2 | medium |
| GCN+LSTM | 30 | 5 | medium |
| KG-GNN | 45 | 10 | lower |
Note: Computational efficiency includes the time required to train the model and the
execution time of a single prediction. The average training time reflects the complexity
and convergence speed of the model, and the prediction time directly affects the feasibility
of the model in real-time application. The overall efficiency evaluation combines
the time cost of training and prediction, and fast prediction time is especially important
for real-time prediction systems.
5.3. Application Cases and Effect Evaluation
In Eastern China's ``Lanhai City,'' known for dense traffic and complex road networks,
our model is applied specifically in the CBD and on Chaoyang Road, Bibo Avenue, and
Xingqiao Road. Integrating data processing, graph construction, model deployment,
and application, we summarize yearly traffic, weather, and incident data. Using Neo4j,
a composite knowledge graph is built, encompassing infrastructure, events, and environmental
factors. The KG-GNN model, hosted on a cloud server, extracts real-time data hourly
for predictive analysis, feeding forecasts of congestion probability, severity, and
causes back to the traffic management center for the next two hours.
Policy formulation: Based on the model predictions, the Lanhai Municipal Government
adjusted bus routes and increased the number of peak-hour services, which raised the
public transport mode share by about 10% and eased dependence on private
cars.
Public service: Through real-time traffic notifications, the public can plan travel routes
in advance, reduce the probability of encountering congestion and improve travel efficiency.
According to statistics, active app users increased by 25%, and users saved an average
of 5 minutes in travel time.
Table 5 summarizes user feedback on the application of the model, including comments
on prediction accuracy, usefulness, and functional recommendations. In terms of prediction
accuracy, 78% of users find the predictions consistent with reality, 16% report
occasional large deviations, and 6% report frequent inaccuracies. In
terms of practicality, 85% of users find the model very practical and effective
in guiding travel, 12% find it helpful but in need of improvement, and 3% use it
rarely and perceive little benefit. In terms of function suggestions,
30% of users suggested strengthening navigation linkage, 25% suggested providing more
travel alternatives, and 15% suggested adding voice reminders.
Fig. 4. Comparison of traffic efficiency improvement before and after model application.
Note: Average congestion time is calculated at peak hours of the weekday; delay index
reflects the ratio of actual travel time to free flow time; public satisfaction survey
collected 10000 samples through online questionnaire.
Table 6 lists suggestions for future optimization directions, including improvements in extreme
event response, multimodal fusion, and user experience. Based on user feedback,
the model's prediction ability under severe weather or unexpected events should be
optimized and dedicated event modules added. Integration with navigation systems and
public transport information should be strengthened to provide comprehensive travel
suggestions. In addition, a voice reminder function should be developed to improve safety
and convenience while driving and to increase user interaction. Through a continuous
feedback loop, the model's functions will be further improved and its application value in
urban traffic management expanded.
Table 5. Summary of user feedback.
| Category | Feedback content | Proportion |
|---|---|---|
| Predictive accuracy | High (forecast agrees with actual situation) | 78% |
| | Fair (occasionally large deviation) | 16% |
| | Low (frequently inaccurate) | 6% |
| Practicability | Very practical, effective in guiding travel | 85% |
| | It helps, but needs improvement. | 12% |
| | Less used, less perceived | 3% |
| Feature recommendations | Strengthen navigation linkage | 30% |
| | More travel alternatives | 25% |
| | Added voice reminder function | 15% |
Table 6. Future optimization directions driven by user feedback.
| Optimization Direction | Target |
|---|---|
| Extreme Event Response Enhancement | Reduce forecast deviations in extreme conditions |
| Multimodal Information Integration | Expand service scope with integrated travel options |
| User Experience Improvement | Enhance driving safety and operational convenience |
| Data Timeliness and Accuracy | Ensure predictions reflect real-time traffic |
| Social Media Synergy | Detect early incident signals |
| Cross-Platform Compatibility | Broaden the user base through device optimization |
As shown in Table 6, the feedback collected from users has been instrumental in identifying areas where
the model can be improved to better serve the community. The enhancement of extreme
event response aims to make the model more robust against unforeseen circumstances,
ensuring that predictions remain reliable even under challenging conditions. By integrating
multimodal information, the system can cater to a wider audience, offering tailored
travel suggestions that consider various modes of transportation. Improvements in
user experience through the introduction of voice alerts and a refined interface will
contribute to safer and more intuitive interactions. Regular calibration and the incorporation
of real-time data are crucial for maintaining the relevance and precision of predictions.
Analyzing social media data allows the system to detect potential issues before they
escalate, providing timely warnings to users. Lastly, ensuring cross-platform compatibility
will help reach a broader audience, making the application accessible to more people
regardless of their device choice. These enhancements, driven by user feedback, will
collectively elevate the model's performance and utility in urban traffic management.
6. Conclusion
This study develops an intelligent traffic congestion prediction model based on a knowledge
graph and a graph neural network, enabling deeper understanding and more accurate
prediction of urban traffic flow. Through the integration and careful preprocessing
of multi-source data, the constructed knowledge graph contains rich traffic
information and also incorporates weather and event factors, forming a highly semantic
data representation framework. In terms of model design, the combination of GNN and
LSTM fuses static road network structure with dynamic traffic flow variation; in
particular, the entity representations learned through knowledge graph embedding
significantly strengthen the model's ability to capture spatiotemporal features.
Experimental results show that, compared with traditional models, the KG-GNN model significantly
improves prediction accuracy, stability, and interpretability, and its application
helps reduce average congestion time and improve traffic smoothness, bringing substantial
improvement to urban traffic management. In the practical application case, the deployment
in Lanhai City verified the effectiveness of the model: both congestion time and the
delay index decreased, and higher public satisfaction showed that the model improves
the public travel experience. In addition, the follow-up optimization suggestions
derived from user feedback, such as enhancing extreme event response, multimodal
data fusion, and user experience, point the way for continued iteration and
development of the model. In summary, this study provides an advanced technical means
for urban traffic congestion prediction and lays a foundation for traffic management
decision support systems in future smart cities, giving it both theoretical significance
and practical value.
REFERENCES
J. B. Chen, D. M. Li, G. L. Zhang, and X. L. Zhang, ``Localized space-time autoregressive
parameters estimation for traffic flow prediction in urban road networks,'' Applied
Sciences, vol. 8, no. 2, 20, 2018.

Z. Chen, Y. Jiang, and D. H. Sun, ``Discrimination and prediction of traffic congestion
states of urban road network based on spatio-temporal correlation,'' IEEE Access,
vol. 8, pp. 3330-3342, 2020.

W. Elleuch, A. Wali, and A. M. Alimi, ``Neural congestion prediction system for trip
modelling in heterogeneous spatio-temporal patterns,'' International Journal of Systems
Science, vol. 51, no. 8, pp. 1373-1391, 2020.

R. A. A. Khalil, ``Building the public transportation system in Libya,'' Engineering
Heritage Journal, vol. 8, no. 1, pp. 7-12, 2024.

R. Feng, H. Q. Cui, Q. Feng, S. X. Chen, X. N. Gu, and B. Z. Yao, ``Urban traffic
congestion level prediction using a fusion-based graph convolutional network,'' IEEE
Transactions on Intelligent Transportation Systems, vol. 24, no. 12, pp. 14695-14705,
2023.

Y. J. Guo, L. C. Yang, S. X. Hao, and J. Gao, ``Dynamic identification of urban traffic
congestion warning communities in heterogeneous networks,'' Physica A - Statistical
Mechanics and Its Applications, vol. 522, pp. 98-111, 2019.

W. B. Hu, H. Wang, Z. Y. Qiu, L. P. Yan, C. Nie, and B. Du, ``An urban traffic simulation
model for traffic congestion predicting and avoiding,'' Neural Computing & Applications,
vol. 30, no. 6, pp. 1769-1781, 2018.

D. R. Huang, Z. P. Deng, S. H. Wan, B. Mi, and Y. Liu, ``Identification and prediction
of urban traffic congestion via cyber-physical link optimization,'' IEEE Access, vol.
6, pp. 63268-63278, 2018.

R. Jia, P. C. Jiang, L. Liu, L. Z. Cui, and Y. L. Shi, ``Data driven congestion trends
prediction of urban transportation,'' IEEE Internet of Things Journal, vol. 5, no.
2, pp. 581-591, 2018.

U. Jilani, M. Asif, M. Y. I. Zia, M. Rashid, S. Shams, and P. Otero, ``A systematic
review on urban road traffic congestion,'' Wireless Personal Communications, vol.
140, pp. 81-109, 2025.

M. Q. Lv, Y. F. Li, T. M. Chen, and Y. L. Li, ``Urban traffic congestion index estimation
with open ubiquitous data,'' Journal of Information Science and Engineering, vol.
34, no. 3, pp. 781-799, 2018.

B. Medina-Salgado, E. Sanchez-DelaCruz, P. Pozos-Parra, and J. E. Sierra, ``Urban
traffic flow prediction techniques: A review,'' Sustainable Computing-Informatics
& Systems, vol. 35, 100739, 2022.

E. E. Mon, H. Ochiai, C. Saivichit, and C. Aswakul, ``Bottleneck based gridlock prediction
in an urban road network using long short-term memory,'' Electronics, vol. 9, no.
9, 1412, 2020.

M. Pi, H. Yeon, H. Son, and Y. Jang, ``Visual cause analytics for traffic congestion,''
IEEE Transactions on Visualization and Computer Graphics, vol. 27, no. 3, pp. 2186-2201,
2021.

B. Priambodo, A. Ahmad, and R. A. Kadir, ``Predicting traffic flow propagation based
on congestion at neighbouring roads using hidden Markov model,'' IEEE Access, vol.
9, pp. 85933-85946, 2021.

K. Ramana, G. Srivastava, M. R. Kumar, T. R. Gadekallu, J. C. W. Lin, M. Alazab, and
C. Iwendi, ``A vision transformer approach for traffic congestion prediction in urban
areas,'' IEEE Transactions on Intelligent Transportation Systems, vol. 24, no. 4,
pp. 3922-3934, 2023.

N. Ranjan, S. Bhandari, P. Khan, Y. S. Hong, and H. Kim, ``Large-scale road network
congestion pattern analysis and prediction using deep convolutional autoencoder,''
Sustainability, vol. 13, no. 9, 5108, 2021.

S. Ranjan, Y. C. Kim, N. Ranjan, S. Bhandari, and H. Kim, ``Large-scale road network
traffic congestion prediction based on recurrent high-resolution network,'' Applied
Sciences, vol. 13, no. 9, 5512, 2023.

I. Stan, D. A. Ghere, P. I. Dan, and R. Potolea, ``Urban congestion avoidance methodology
based on vehicular traffic thresholding,'' Applied Sciences, vol. 13, no. 4, 2143,
2023.

N. Wang, B. H. Zhang, J. Gu, H. H. Kong, S. Hu, and S. C. Lu, ``Urban road traffic
spatiotemporal state estimation based on multivariate phase space-LSTM prediction,''
Applied Sciences, vol. 13, no. 21, 12079, 2023.

X. Wang, R. H. Zeng, F. M. Zou, L. Y. C. Liao, and F. L. Huang, ``STTF: An efficient
transformer model for traffic congestion prediction,'' International Journal of Computational
Intelligence Systems, vol. 16, 2, 2023.

X. M. Wang, Y. Chen, and J. L. Zhang, ``Urban-road average-speed prediction method
based on graph convolutional networks,'' Transportation Research Record, vol. 2678,
no. 5, pp. 771-788, 2024.

D. W. Xia, B. Q. Shen, J. Geng, Y. Hu, Y. T. Li, and H. Q. Li, ``Attention-based spatial-temporal
adaptive dual-graph convolutional network for traffic flow forecasting,'' Neural Computing
& Applications, vol. 35, pp. 17217-17231, 2023.

Z. P. Xie, W. F. Lv, S. F. Huang, Z. L. Lu, B. W. Du, and R. H. Huang, ``Sequential
graph neural network for urban road traffic speed prediction,'' IEEE Access, vol.
8, pp. 63349-63358, 2020.

X. Xing and X. Y. Li, ``Recommendation of urban vehicle driving routes under traffic
congestion: A traffic congestion regulation method considering road network equilibrium,''
Computers & Electrical Engineering, vol. 110, 108863, 2023.

Y. M. Xing, X. J. Ban, X. Liu, and Q. Shen, ``Large-scale traffic congestion prediction
based on the symmetric extreme learning machine cluster fast learning method,'' Symmetry-Basel,
vol. 11, no. 6, 730, 2019.

B. Yang, H. Zhang, M. X. Du, A. N. Wang, and K. Xiong, ``Urban traffic congestion
alleviation system based on millimeter wave radar and improved probabilistic neural
network,'' IET Radar Sonar and Navigation, vol. 18, no. 2, pp. 327-343, 2024.

K. Zhang, Z. X. Chu, J. P. Xing, H. G. Zhang, and Q. X. Cheng, ``Urban traffic flow
congestion prediction based on a data-driven model,'' Mathematics, vol. 11, no. 19,
4075, 2023.

T. R. Zhang, J. A. Xu, S. R. Cong, C. S. Qu, and W. B. Zhao, ``A hybrid method of
traffic congestion prediction and control,'' IEEE Access, vol. 11, pp. 36471-36491,
2023.

X. Y. Zheng, N. Huang, Y. N. Bai, and X. Zhang, ``A traffic-fractal-element-based
congestion model considering the uneven distribution of road traffic,'' Physica A
- Statistical Mechanics and Its Applications, vol. 632, part 1, 129354, 2023.

Z. J. Zheng, Z. L. Wang, S. Liu, and W. Ma, ``Exploring the spatial effects on the
level of congestion caused by traffic accidents in urban road networks: A case study
of Beijing,'' Travel Behaviour and Society, vol. 35, 100728, 2024.

Author
Tingting Zhao was born in 1981 in Lanzhou, Gansu Province, China. In 2003, she obtained
her bachelor's degree in computer science and technology from Lanzhou Jiaotong University.
In 2009, she obtained a master's degree in cartography and geographic information
systems from Lanzhou Jiaotong University. From 2003 to 2008, she served as a teaching
assistant in the Information Management and Information Systems program at the School
of Transportation, Lanzhou Jiaotong University. Since 2008, she has been a lecturer
in the same program. Her research interests include the application of knowledge graphs
and traffic flow prediction. She has published one book and authored over 15 articles.