The Transactions of the Korean Institute of Electrical Engineers

  1. (Dept. of Electrical and Electronics Engineering, Kangwon National University, Korea.)



Railway Safety Platform, Big Data Architecture, MQTT (Message Queuing Telemetry Transport), Kafka, MongoDB, YOLOv5

1. Introduction

Operational errors in the railway information system, such as poor management of railroad tracks, train defects, and signal errors, can lead to train derailments and collisions. In particular, derailments caused by cracks and subsidence in railroad tracks result in many casualties. Advanced safety technologies using IoT, artificial intelligence, and big data are developing rapidly at many industrial sites, and the railway industry is also attempting to apply them alongside its own research and development of advanced safety technologies. As a result, the smart railway market was valued at USD 28.9 billion in 2022 and is expected to grow by 8.3% annually through 2027 [1]. Despite these efforts, however, railway accidents are increasing every year, and train accidents such as derailments and collisions may result in large-scale casualties, especially where outdated trains and railway infrastructure are in operation.

According to the UIC (International Union of Railways) Safety Report, the number of serious train accidents rose from 1,643 in 2020 to 1,765 in 2021, an increase of about 7.4% over the previous year. In 2021, train-to-person collisions accounted for 58% of significant railway accidents, while obstacle collisions and derailments, which can lead to large-scale casualties, accounted for 34% [2].

2. Related Works and Research Methodology

2.1 Literature review on big data architecture design

To prevent such railway and industrial safety accidents, prior studies have examined failure analysis and reliability evaluation methods such as FTA (Fault Tree Analysis), big data and artificial intelligence technologies, and big data architecture design.

This section introduces that prior research and the related technologies. First, in a previous study on big data railway architecture, the big data railway safety platform architecture was divided into five parts: collection, API gateway, preprocessing, storage, and analysis [3].

Additionally, in research on a service layer-centered big data architecture for predictive maintenance of railroad points, the architecture was divided into four layers: storage, processing, service, and ingestion. The main functions of that architecture are to collect data from external sources, process and retrieve the collected data, and perform data aggregation, modeling, analysis, and visualization; it was designed on the basis of Hadoop and Apache NiFi [4].

FTA is a quantitative failure analysis and reliability evaluation method that uses an FT (Fault Tree), which logically expresses the relationships among the causes of a system failure, to find vulnerable parts and improve system reliability [5]. FTA is a well-founded method for failure and defect analysis. If FTA is used as a standard for artificial intelligence and big data analysis, the reliability of defect analysis can be increased.

MQTT (Message Queuing Telemetry Transport) is a lightweight message transmission and reception protocol for large-scale IoT communication among small devices; it was standardized in 2016 (as ISO/IEC 20922).

MQTT has the following three technical characteristics [6].

ⓐ A client that requests a connection to the MQTT broker establishes a TCP/IP socket connection and stays connected until it explicitly disconnects or the connection is lost due to network conditions.

ⓑ MQTT's publish-subscribe messaging pattern communicates only through a broker. When a message is published on a given topic, the broker delivers it to the clients subscribed to that topic, so both one-to-one and one-to-many communication are possible.

ⓒ QoS has three levels: level 0 delivers a message at most once, level 1 delivers it at least once, and level 2 delivers it exactly once.
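The broker-mediated, topic-based delivery described in ⓑ can be illustrated with a small in-memory sketch. This is not an MQTT implementation: `TinyBroker`, `make_client`, the topic name, and the payload fields are hypothetical names chosen for illustration, there is no networking, and the QoS value is only validated, not enforced.

```python
import json
from collections import defaultdict


class TinyBroker:
    """Toy in-memory stand-in for an MQTT broker (no network, no real QoS)."""

    def __init__(self):
        # topic -> list of subscriber callbacks
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, payload, qos=0):
        if qos not in (0, 1, 2):
            raise ValueError("MQTT defines QoS levels 0, 1 and 2 only")
        message = json.dumps(payload)
        # Publishers never talk to subscribers directly; the broker fans out.
        for callback in self._subscribers[topic]:
            callback(topic, message, qos)
        return len(self._subscribers[topic])


received = []


def make_client(name):
    """Create a subscriber callback that records what it receives."""
    def on_message(topic, message, qos):
        received.append((name, topic, json.loads(message)))
    return on_message


broker = TinyBroker()
broker.subscribe("train/101/vibration", make_client("monitor"))
broker.subscribe("train/101/vibration", make_client("storage"))

# One publish reaches both subscribers (one-to-many communication).
delivered = broker.publish("train/101/vibration", {"rms": 0.42}, qos=1)
```

With a single subscriber on a topic, the same mechanism gives one-to-one communication; a topic with no subscribers simply delivers to nobody.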

Kafka, developed by LinkedIn, is a distributed data streaming platform based on message queues that can publish, subscribe to, store, and process data streams in real time. Unlike conventional message transmission systems, Kafka manages messages as event queues in the file system rather than in memory [7].

MongoDB differs from relational databases such as Oracle and MySQL, which store data in tables, have a row-centered storage structure, and are accessed using SQL. MongoDB is a NoSQL database with a document-centered storage structure, in which data is stored as keys and values in Binary JSON (BSON) format. It consists of collections, documents, and fields, which correspond to tables, rows, and columns, respectively [8].

In a study of big data architecture for an IoT-based smart manufacturing system, MQTT and Kafka were combined to collect, relay, and store sensor data, and MongoDB, relational databases, and Elasticsearch were adopted as consumers [9].

YOLO (You Only Look Once) is a deep learning-based network for real-time object detection. It is a one-stage detection algorithm that performs classification and localization simultaneously, and it can detect objects faster than two-stage detection algorithms based on R-CNN, such as Fast R-CNN and SPPNet [10].

In this study, a railway safety platform application model was presented using the IoT-based big data platform architecture. Additionally, using YOLOv5, an object detection algorithm, an experiment was conducted on how image data on a railroad track can be used in anomaly detection for safe railway operation, and the results of the experiment are presented.

Fig. 1. Big data platform architecture design process


2.2 Research Method

Referring to previous studies, this study applied the following research method to design the railway safety big data platform architecture. The research process is shown in Fig. 1.

First, the essential elements for big data platform design were defined in five areas: ① data collection area, ② transmission area, ③ storage area, ④ monitoring and control area, and ⑤ artificial intelligence analysis area.

Second, we investigated technological details and application cases to analyze whether the technologies in the five areas defined above are appropriate for IoT device communication and sensor data storage and analysis.

Third, we combined the technologies of each area to design the optimal railway safety big data platform architecture.

Lastly, based on the designed railway safety big data platform architecture, we presented an application model that identifies and classifies railroad track status images collected from trains through a deep learning algorithm.

3. Big Data Platform Architecture Design

3.1 Big Data Architecture Design Considerations

To design the big data railway safety platform architecture, the items ⓐ~ⓔ in Fig. 2 must be defined in advance. In other words, the architecture should be designed by considering data collection, transmission, storage, database, and analysis at every stage of the data life cycle.

Fig. 2. Stages of data processing


Additionally, once the six predefined design tasks are completed, it must be decided which IoT devices to use in each configuration area where data is processed and which communication method and database solution to select.

Major devices and solutions for designing big data architectures can be applied by dividing them by function and role for each processing stage of the data life cycle, as shown in Fig. 3.

Fig. 3. Essential elements for each step of a big data platform

3.2 MQTT and Kafka Comparison and Operation Mechanism

MQTT supports reliable message delivery between IoT devices and servers over low-cost, low-bandwidth, and unstable connections. Kafka, on the other hand, supports fault tolerance, low latency, and high throughput and can effectively handle large volumes of real-time stream event data that are difficult to handle with traditional message brokers. In this respect, MQTT and Kafka can compensate for each other's shortcomings in IoT device communication, as shown in the comparison in Table 1.

Table 1 Comparison of MQTT and Kafka

| Category | MQTT | Kafka |
|---|---|---|
| Broker type | Message broker | Event broker |
| Data location in broker | Memory | Disk |
| Features | Low cost, low bandwidth, unstable N/W support | Fault tolerance, low latency, high throughput |
| Purpose | IoT use case | Stable event streaming platform |
| Device access | Over 10k (single node) | Relatively small number of IoT devices |
| IoT connection | Keep alive, LWT (Last Will and Testament) supported | Keep alive, LWT not supported |
| Part to be supplemented | Stream processing, data integration | Connecting multiple IoT devices over unstable N/W |

In MQTT, when a publisher such as an IoT device transmits data in JSON format along with a topic to an MQTT broker, subscribers that have registered the topic with the broker in advance receive the data, as shown in Fig. 4. Data received from the MQTT broker is stored in a database such as MongoDB through an application program such as Node.js. Communication between IoT devices and MQTT brokers uses TCP-based WebSocket or RESTful services.

Fig. 4. Operation mechanism of MQTT


Kafka consists of a producer that produces data, a consumer that consumes data, and a Kafka cluster that relays data using topics, as shown in Fig. 5 [11]. The Kafka cluster in turn consists of several brokers, a ZooKeeper ensemble, and partitions. Kafka can process event streams that cannot be processed on messaging platforms such as MQTT. Additionally, Kafka writes events to pre-registered topics on disk and later allows consumers to read the topics of interest from disk, unlike MQTT, which processes messages in memory.
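The producer, partition, and consumer roles described above can be modeled with a toy append-only log. This is a sketch under stated assumptions, not Kafka itself: `TinyLog` and the group names are hypothetical, Python lists stand in for the on-disk log, and CRC32 stands in for Kafka's own key hashing.

```python
import zlib
from collections import defaultdict


class TinyLog:
    """Toy model of one Kafka topic: partitioned, append-only, offset-addressed."""

    def __init__(self, num_partitions=3):
        self.partitions = [[] for _ in range(num_partitions)]
        # (consumer group, partition) -> next offset to read
        self.offsets = defaultdict(int)

    def produce(self, key, value):
        # Records with the same key always land in the same partition,
        # which preserves their relative order.
        partition = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        self.partitions[partition].append((key, value))
        return partition, len(self.partitions[partition]) - 1

    def consume(self, group, partition):
        # Reading does not delete anything; it only advances the group's offset.
        start = self.offsets[(group, partition)]
        records = self.partitions[partition][start:]
        self.offsets[(group, partition)] = start + len(records)
        return records


log = TinyLog()
p1, first_offset = log.produce("train-101", {"rms": 0.42})
p2, second_offset = log.produce("train-101", {"rms": 0.40})

analytics = log.consume("analytics", p1)   # first consumer group
archive = log.consume("archive", p1)       # independent second group
again = log.consume("analytics", p1)       # nothing new since last read
```

Because events stay in the log after being read, each consumer group sees the full stream at its own pace, which is the behavior the text contrasts with in-memory message handling.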

Fig. 5. Structure and operation process of Kafka cluster (Apache Kafka Tutorial- Apache Kafka Cluster Architecture)


As for the physical locations of MQTT and Kafka, as shown in Fig. 6, MQTT processes data transmission and reception based on publisher and subscriber in IoT devices and edge areas, and Kafka clusters process distributed event stream data in data centers.

Fig. 6. Physical position of MQTT and Kafka


Based on this analysis of the technical characteristics of MQTT and Kafka, connecting the MQTT server to the Kafka cluster, as shown in Fig. 7, is considered the optimal combination for collecting the various data generated during train operations through many IoT devices.
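One way to picture the hand-off between the two brokers is a small mapping function that turns an incoming MQTT message into a Kafka record. The topic layout `railway/<train_id>/<sensor>`, the Kafka topic naming, and the function name are assumptions made for this sketch; the paper does not specify them.

```python
def mqtt_to_kafka(mqtt_topic, payload):
    """Map 'railway/<train_id>/<sensor>' to a Kafka (topic, key, value) triple."""
    parts = mqtt_topic.split("/")
    if len(parts) != 3 or parts[0] != "railway":
        raise ValueError(f"unexpected MQTT topic: {mqtt_topic!r}")
    _, train_id, sensor = parts
    kafka_topic = f"railway.{sensor}"  # assumed: one Kafka topic per sensor type
    key = train_id                     # assumed: key by train to keep its records ordered
    return kafka_topic, key, payload


record = mqtt_to_kafka("railway/101/vibration", b'{"rms": 0.42}')
```

Keying by train identifier is a design choice of this sketch: it keeps each train's readings in one partition so that downstream consumers see them in order.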

Fig. 7. Combination of MQTT and Kafka


3.3 Document-oriented Database and NoSQL, Schemaless MongoDB

As shown in the comparison in Table 2, MongoDB differs from Oracle and MySQL, which store data in tables, have a row-oriented storage structure, and are accessed using SQL. MongoDB is a NoSQL database with a document-oriented storage structure, in which data is stored as keys and values in Binary JSON (BSON) format. It consists of collections, documents, and fields, which correspond to tables, rows, and columns, respectively.

Table 2 Comparison of relational databases and MongoDB

| Category | Relational DB | MongoDB (NoSQL) |
|---|---|---|
| Database | Row oriented | Document oriented (JSON; key, value) |
| Combination of data | Join | Embed (embedding the join target data in the document) |
| Database structure | Table > Rows > Columns | Collection > Documents > Fields |
| Schema | Strict schema structure | Schemaless (soft schema) |
| Distributed storage | Partitioning (primarily vertical partitioning) | Sharding (distributed to different databases) |
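The "Join" versus "Embed" row can be made concrete with plain Python dictionaries standing in for table rows and documents. No database is involved, and the field names and values are hypothetical.

```python
# Relational style: two "tables" linked by a foreign key and combined by a join.
trains = [{"train_id": 101, "model": "demo"}]
readings = [
    {"train_id": 101, "sensor": "vibration", "value": 0.42},
    {"train_id": 101, "sensor": "noise", "value": 71.5},
]


def join_readings(train):
    """Emulate a join: look up the rows whose foreign key matches the train."""
    return [row for row in readings if row["train_id"] == train["train_id"]]


# Document style: the related data is embedded in one document, and individual
# entries may carry extra fields without any schema change.
train_doc = {
    "train_id": 101,
    "model": "demo",
    "readings": [
        {"sensor": "vibration", "value": 0.42},
        {"sensor": "noise", "value": 71.5, "unit": "dB"},
    ],
}

joined = join_readings(trains[0])
embedded = train_doc["readings"]
```

Both views return the same two readings; the embedded form needs no lookup across collections, which is why it suits variable sensor data.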

A key feature of MongoDB is that the database is designed around the way documents are structured, as shown in Fig. 8.

MongoDB is well suited to sharding, which supports load balancing, faster processing, and efficient backup and recovery strategies. The structure and operation mechanism of MongoDB are shown in Fig. 9: the client accesses the database through the mongos router, and the data is stored distributed across three shards. Each shard forms a replica set that replicates data to prevent data loss, and the config server manages the metadata required by mongos, such as the location of the distributed data.
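The routing role of mongos can be sketched with a hash over a shard key. This is a simplified stand-in: `TinyRouter`, `pick_shard`, and the `train_id` shard key are hypothetical, MD5 is used only as a convenient stable hash, and MongoDB's actual chunk-based distribution, config server, and replica sets are not modeled.

```python
import hashlib

SHARDS = ("shard0", "shard1", "shard2")


def pick_shard(shard_key):
    """Choose a shard from a stable hash of the shard key."""
    digest = hashlib.md5(str(shard_key).encode("utf-8")).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]


class TinyRouter:
    """Toy stand-in for the router: sends each operation to a single shard."""

    def __init__(self):
        self.shards = {name: [] for name in SHARDS}

    def insert(self, doc):
        name = pick_shard(doc["train_id"])
        self.shards[name].append(doc)
        return name

    def find(self, train_id):
        # Only the shard that owns the key is searched.
        owner = self.shards[pick_shard(train_id)]
        return [doc for doc in owner if doc["train_id"] == train_id]


router = TinyRouter()
for train_id in range(100, 130):
    router.insert({"train_id": train_id, "sensor": "vibration"})

hit = router.find(105)
```

Because the same key always hashes to the same shard, a lookup touches one shard instead of all three, which is the load-distribution benefit the text refers to.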

Fig. 8. Structure comparison of MongoDB and RDBMS

Fig. 9. Structure and operation mechanism of MongoDB


3.4 Big Data Railway Safety Platform Final Architecture

The final architecture designed in this study, centered on MQTT, Kafka, and MongoDB as the main elements for data publishing (producer), subscription (consumer), and storage, is shown in Fig. 10. The architecture is divided into six areas, as follows.

ⓐ IoT Devices

ⓑ MQTT Server

ⓒ Kafka Cluster

ⓓ Information System of Relevant Institution

ⓔ Data Analysis & Visualization

ⓕ Risk Assessment and Analysis Technique

Fig. 11 shows the application model for the AI big data platform architecture design for railway safety [12]. Information such as noise, vibration, video, and sound is collected and measured by a large number of sensors installed in the train.

Fig. 10. The Layout of railway safety platform architecture


The gathered data is registered as topics in MQTT and transmitted to the Kafka cluster through MQTT CONNECT, and consumers that have registered the topics in advance receive the data and perform statistical or artificial intelligence analysis.

Fig. 11. The layout of railway safety platform architecture (H. Kim, "The Streaming Transformation and the Rise of Apache Kafka®," Confluent, 2023)

4. Railroad Track Object Detection Application Model and Experiments

4.1 Railroad Track Object Detection Application Model

As shown in Fig. 12, this study proposes an application model that classifies rail components and detects their damage status by training on and analyzing images of railroad tracks, collected from CCTVs installed at the front of the train, using the YOLOv5 object detection algorithm.

Fig. 12. Application model of track object detection-based railway safety platform


4.2 Acquisition and Pre-processing of Learning Data for Application Model Verification

The training data, consisting of images and annotation files for high-speed railroad tracks, was obtained from the AIHUB data hub portal operated by the Korean National Information Society Agency. The data includes 24,615 image files with their annotation files. The data was divided into training, validation, and test sets as shown in Table 3, and the objects labeled in the image files were classified into 16 object classes, comprising eight normal components and eight abnormal components, as shown in Table 4.

Table 3 Training/experimental data for railroad track object detection

| Category | Total | Training | Validation | Test |
|---|---|---|---|---|
| Track data (image & annotation files) | 24,615 | 19,692 | 2,461 | 2,462 |
| Split ratio | 1.0 | 0.8 | 0.1 | 0.1 |
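The counts in Table 3 follow from an 80/10/10 split of 24,615 files with the remainder assigned to the test set. A minimal sketch of such a split is shown here; the function name, the shuffling, and the seed are assumptions, since the paper does not state how files were assigned to each subset.

```python
import random


def split_indices(total, train_pct=80, val_pct=10, seed=0):
    """Shuffle file indices and cut them into train/validation/test subsets."""
    indices = list(range(total))
    random.Random(seed).shuffle(indices)
    n_train = total * train_pct // 100   # integer arithmetic avoids float rounding
    n_val = total * val_pct // 100
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]     # whatever is left goes to the test set
    return train, val, test


train, val, test = split_indices(24615)
sizes = (len(train), len(val), len(test))
```

Here `sizes` reproduces the Training, Validation, and Test columns of Table 3 (19,692, 2,461, and 2,462).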

Table 4 Types of rail components (16 classification classes: 8 normal component conditions and 8 abnormal component conditions)

| Class | 1(9) | 2(10) | 3(11) | 4(12) | 5(13) | 6(14) | 7(15) | 8(16) |
|---|---|---|---|---|---|---|---|---|
| Normal | rail | weld_zone | pandrol_e-clip | fast_clip | system300-1 | screw_spike | Insulation_block | tie |
| Abnormal | ab_rail | ab_weld_zone | ab_pandrol_e-clip | ab_fast_clip | ab_system300-1 | ab_screw_spike | ab_Insulation_block | ab_tie |

The distribution information of the objects of normal components and abnormal components of the railroad track is shown in Fig. 13, and the quantity of each object of abnormal components is shown in Table 5.

Fig. 13. Distribution of normal and abnormal components (Left: railroad tracks, Right: rail components)

Here, the number of abnormal objects is smaller than the number of normal objects, but no data augmentation techniques were used to correct this imbalance.

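Table 5 Information on abnormal objects

| No. | Abnormal object | Annotation format | Quantity |
|---|---|---|---|
| 1 | ab_rail | polygon | 1,891 |
| 2 | ab_weld_zone | bounding box | 468 |
| 3 | ab_pandrol_e-clip | bounding box | 733 |
| 4 | ab_fast_clip | bounding box | 3,057 |
| 5 | ab_system300-1 | bounding box | 1,037 |
| 6 | ab_screw_spike | bounding box | 731 |
| 7 | ab_Insulation_block | bounding box | 633 |
| 8 | ab_tie | bounding box | 3,511 |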
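The imbalance among the abnormal classes in Table 5 can be quantified with a few lines of arithmetic. This is only a descriptive summary of the table values, not part of the authors' pipeline; the variable names are illustrative.

```python
# Quantities copied from Table 5.
ABNORMAL_COUNTS = {
    "ab_rail": 1891,
    "ab_weld_zone": 468,
    "ab_pandrol_e-clip": 733,
    "ab_fast_clip": 3057,
    "ab_system300-1": 1037,
    "ab_screw_spike": 731,
    "ab_Insulation_block": 633,
    "ab_tie": 3511,
}

total = sum(ABNORMAL_COUNTS.values())
# Share of each abnormal class among all abnormal objects.
shares = {name: count / total for name, count in ABNORMAL_COUNTS.items()}
rarest = min(ABNORMAL_COUNTS, key=ABNORMAL_COUNTS.get)
most_common = max(ABNORMAL_COUNTS, key=ABNORMAL_COUNTS.get)
# How many times more frequent the largest class is than the smallest.
imbalance = ABNORMAL_COUNTS[most_common] / ABNORMAL_COUNTS[rarest]
```

The table lists 12,061 abnormal objects in total, and the most frequent class (`ab_tie`) occurs about 7.5 times as often as the rarest (`ab_weld_zone`).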

4.3 Railroad Track Object Detection Experimental Process

Fig. 14 shows the deep learning and inference process using the YOLOv5 algorithm and training data.

Fig. 14. Object detection experiment process using YOLOv5


First, the original annotations, provided as JSON files in Pascal VOC format, were parsed and converted to the YOLO format. In this process, normalization and coordinate transformation were performed, and the object classes were divided into eight normal components and eight abnormal components. We then created the directory structure required for YOLOv5 training and trained the YOLOv5 Large model. The annotation type of railroad tracks is a polygon for image segmentation, whereas the annotation type of the other railroad components is a bounding box for object detection. When the model was trained on both annotation types combined (polygon and bounding box), no objects were detected.

Therefore, the railroad tracks and the remaining rail components were separated, and YOLOv5 training, inference, and evaluation were performed independently for each. In this experiment, the hyperparameters were set to 30 epochs and a batch size of 8.
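The normalization and coordinate transformation mentioned above can be sketched as follows, assuming the usual conventions: a Pascal VOC box is given by pixel corners (xmin, ymin, xmax, ymax), and a YOLO label line holds the class index followed by the box center, width, and height as fractions of the image size. The function names and the example box are hypothetical.

```python
def voc_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert pixel corner coordinates to a normalized center/size box."""
    x_center = (xmin + xmax) / 2 / img_w
    y_center = (ymin + ymax) / 2 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height


def yolo_label_line(class_id, box, img_w, img_h):
    """Format one object as a line of a YOLO .txt label file."""
    values = voc_to_yolo(*box, img_w, img_h)
    return f"{class_id} " + " ".join(f"{v:.6f}" for v in values)


# A box covering the central quarter of a 1920x1080 image.
line = yolo_label_line(3, (480, 270, 1440, 810), 1920, 1080)
```

For this example every normalized value is 0.5, so `line` is `3 0.500000 0.500000 0.500000 0.500000`.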

Furthermore, we fine-tuned a YOLOv5 large model using railroad track images and annotation files as training data for custom object detection training and inference.
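The training setup described above (YOLOv5 Large, 30 epochs, batch size 8) might be prepared along the following lines. This is a sketch under assumptions, not the authors' configuration: the directory layout, file names, class order, and the 640-pixel image size are all hypothetical, and the command is only built, not executed.

```python
# Assumed class list for the rail-component run: seven normal parts and their
# "ab_" counterparts from Table 4 (rails are trained separately in the paper).
NORMAL = ["weld_zone", "pandrol_e-clip", "fast_clip", "system300-1",
          "screw_spike", "Insulation_block", "tie"]
CLASS_NAMES = NORMAL + [f"ab_{name}" for name in NORMAL]


def dataset_yaml(root, names):
    """Render a YOLOv5-style dataset description as YAML text."""
    lines = [
        f"path: {root}",
        "train: images/train",
        "val: images/val",
        "test: images/test",
        f"nc: {len(names)}",
        "names:",
    ]
    lines += [f"  - {name}" for name in names]
    return "\n".join(lines) + "\n"


def train_command(data_file, epochs=30, batch_size=8, img_size=640):
    """Build (but do not run) a YOLOv5 train.py invocation."""
    return [
        "python", "train.py",
        "--img", str(img_size),            # assumed; the paper gives no image size
        "--batch-size", str(batch_size),
        "--epochs", str(epochs),
        "--data", data_file,
        "--weights", "yolov5l.pt",         # pretrained "large" checkpoint
    ]


yaml_text = dataset_yaml("datasets/rail_components", CLASS_NAMES)
command = train_command("rail_components.yaml")
```

Starting from pretrained weights rather than random initialization corresponds to the fine-tuning the text describes.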

4.4 Learning results of railroad tracks and rail components detection using YOLOv5 algorithm

Training for 30 epochs took about 6.6 hours on a computer equipped with an NVIDIA GeForce RTX 3060 graphics card. The performance of object detection and image classification algorithms in computer vision is evaluated by average precision (AP). If there are multiple object classes, the mean average precision (mAP) is obtained by calculating the AP for each class and dividing the sum of all APs by the number of object classes, as shown in Equation (1). In this experiment, mAP@.50 is 0.919 and mAP@.50-.95 is 0.853.

(1)
$\begin{align*} mAP=\dfrac{1}{n}\sum_{k=1}^{n}AP_{k}\quad (AP_{k}=the \;AP\; of\; class\; k,\: \\ n= the\; number\; of\; classes) \end{align*}$
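Equation (1) is a plain average over classes, which can be written directly in code. The per-class AP values used here are hypothetical and chosen only to make the arithmetic easy to check; they are not results from this experiment.

```python
def mean_average_precision(ap_per_class):
    """Average the per-class AP values, as in Equation (1)."""
    if not ap_per_class:
        raise ValueError("at least one class is required")
    return sum(ap_per_class.values()) / len(ap_per_class)


# Hypothetical AP values for three classes (n = 3).
example_ap = {"rail": 1.0, "tie": 0.75, "ab_tie": 0.5}
example_map = mean_average_precision(example_ap)  # (1.0 + 0.75 + 0.5) / 3
```

With these values the sum of APs is 2.25 and the mean is 0.75. Because every class carries equal weight, a weak class such as a rarely seen abnormal component lowers mAP as much as a common one would.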

The results of the confusion matrix are shown in Fig. 15. In the confusion matrix, the predictive performance for abnormal tie components among eight normal components and eight abnormal components was 0.6, and the predictive performance for the remaining normal and abnormal components was 0.88 or higher.

Fig. 15. Confusion matrix of other normal components and abnormal components


Fig. 16. PR (Precision-Recall) curve of normal components and abnormal components


Fig. 17. Inference results of railroad tracks and rail components using learning model weight (best.pt)


The precision-recall curve in Fig. 16 is similar to the confusion matrix results. The inference results for the track image of the test data are shown in Fig. 17. In the left figure, the damaged rail was detected well, and in the right figure, it can be confirmed that the damaged fast clip components and tie components were accurately detected. Other normal components were also detected accurately.

5. Conclusion

The research results on the railway safety big data platform architecture design proposed in this study are as follows.

First, MQTT enables stable message delivery in IoT device communication environments with low cost, low bandwidth, and unstable connections. Kafka, on the other hand, supports fault tolerance, low latency, and high throughput, and it can effectively process large amounts of real-time stream event data that are difficult to process with message brokers. Thus, connecting an MQTT server and a Kafka cluster to exploit the complementary characteristics of these two brokers is a suitable network configuration for railway IoT device communication.

Second, to reliably store and utilize data collected from various IoT devices, it is necessary to consider several factors, such as load distribution, fast processing speed, and backup and recovery strategy. Therefore, MongoDB, which offers flexible NoSQL features for sharding and a document-oriented architecture, is an efficient database for storing the unstructured or variable data collected from IoT devices installed on trains or railroads.

Third, using MQTT, Kafka, and MongoDB as the core technical elements, we designed and proposed a big data platform architecture that can be used as a reference for designing railway big data platforms, as shown in Fig. 10.

Fourth, we proposed an application model and experiments that identify railroad components and classify defective conditions using the YOLOv5 object detection algorithm on the railroad track condition images collected from CCTV installed in the train. As a result of the experiment on the application model, we found that the application model proposed in this study can be used as an empirical model for railroad safety big data to classify important rail components and detect defective conditions.

A standardized big data platform architecture is essential to reliably and efficiently collect and analyze the data necessary for railway safety from various IoT devices, such as video, sound, acceleration, and temperature sensors. This study is the first step in developing the railway safety big data platform, consisting of five open source-centered elements. Therefore, we expect the design of big data railway safety platform architecture to be the reference model for designing an IoT-based railway big data platform in the future.

Lastly, we plan to conduct follow-up research to verify and improve the architecture by implementing the designed architecture as a platform and by collecting and analyzing data from various connected IoT devices.

Acknowledgements

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT) (2020R1A2C1101867).

References

[1] “Smart Railways Market by Offering Region - Global Forecast to 2027,” https://www.researchandmarkets.com/reports/5393505, January 2023.

[2] F. Davenne, “UIC Safety Report 2022 (Significant Accidents 2021 Public Report),” UIC Safety Unit, October 2022.

[3] S. Lee and R. Oh, “Deep Learning and Big Data Railway Safety Platform Architecture Design for Anomaly Detection in Railway Operation,” International Conference on Electrical Engineering & Computing Convergence and Applications, August 2023.

[4] G. Salierno, S. Morvillo, L. Leonardi, and G. Cabri, “An Architecture for Predictive Maintenance of Railway Points Based on Big Data Analytics,” Advanced Information Systems Engineering Workshops, vol. 382, pp. 29-40, 2020.

[5] E. Ruijters and M. Stoelinga, “Fault tree analysis: A survey of the state-of-the-art in modeling, analysis and tools,” Computer Science Review, vol. 15-16, pp. 29-62, 2015.

[6] N. Naik, “Choice of effective messaging protocols for IoT systems: MQTT, CoAP, AMQP and HTTP,” 2017 IEEE International Systems Engineering Symposium (ISSE), pp. 1-7, 2017.

[7] J. Kreps, N. Narkhede, and J. Rao, “Kafka: A distributed messaging system for log processing,” in Proceedings of the NetDB, vol. 11, pp. 1-7, 2011.

[8] A. Boicea, F. Radulescu, and L. I. Agapin, “MongoDB vs Oracle - Database Comparison,” 2012 Third International Conference on Emerging Intelligent Data and Web Technologies, pp. 330-335, 2012.

[9] M. Helu, T. Sprock, D. Hartenstine, R. Venketesh, and W. Sobel, “Scalable data pipeline architecture to support the industrial internet of things,” CIRP Annals, vol. 69, pp. 385-388, 2020.

[10] Y. Zhang, Z. Guo, J. Wu, Y. Tian, H. Tang, and X. Guo, “Real-Time Vehicle Detection Based on Improved YOLO v5,” Sustainability, vol. 14, no. 19, 2022.

[11] “Apache Kafka Tutorial: Apache Kafka - Cluster Architecture,” https://www.tutorialspoint.com/apache_kafka/apache_kafka_cluster_architecture.htm.

[12] H. Kim, “The Streaming Transformation and the Rise of Apache Kafka®,” https://www.cloocus.com/storage/2023/05/%EC%BB%A8%ED%94%8C%EB%A3%A8%EC%96%B8%ED%8A%B8-%EA%B8%B0%EB%B3%B8%EC%86%8C%EA%B0%9C-%ED%99%9C%EC%9A%A9%EC%82%AC%EB%A1%80.pdf

About the Authors

Seung-shin Lee (이승신)

He received the B.S. degree in Computer Science from Soongsil University, Korea, in 2002, and the M.S. degree in Urban and Regional Planning from Yonsei University, Korea, in 2013. He is currently working in the Data Convergence Department of the Korea Road Traffic Authority. His research interests include deep learning, big data analysis, and IoT platforms for the railway environment.

Ryum-duck Oh (오염덕)

He received the B.S., M.S., and Ph.D. degrees in Computer Science from Hongik University, Korea, in 1986, 1988, and 1993, respectively. He is currently a professor in the Department of Computer Software at Korea National University of Transportation. His research interests include databases, big data analysis, and IoT platforms for the railway environment.