1. Introduction
As essential national infrastructure, railway transportation has received significant public attention for its safety [1]. With the rapid development and spread of high-speed rail technology, higher requirements are placed on the speed and safety of trains running on rail lines. In addition to scheduling issues during train operation, it is also necessary to consider how to improve the detection of track conditions ahead of the train [2]. With the application of intelligent railway video monitoring systems and the development of a new generation of fully automatic driving signal systems, intelligent monitoring of rail lines has become an active research topic [3,4], covering problems such as track obstacle recognition [5,6], rail crack detection [7,8], and foreign object intrusion [9]. However, the factors that cause rail accidents are complex and varied, including bad weather, worn-out tracks, malfunctions of electronic equipment, and the condition of drivers.
In making railway transportation intelligent and automated, the primary task is to detect the railway tracks ahead during operation, so that the train receives basic information about the environment ahead in time [10]. In this way, the train can sense the track’s condition in advance and adjust its speed in time, avoiding accidents such as speeding and derailment on curves. Detecting the rail line area in advance also supports foreign object intrusion detection, because it frames the detection range and reduces the amount of processing. The operational safety of the train can thus be ensured in real time. At present, computer vision is the mainstream approach to rail line detection. Rail line detection algorithms based on computer vision fall into two directions. One is based on image processing algorithms, using edge detection and similar techniques to search for rail line features and then fit curves. The other is based on deep convolutional neural networks, whose strong semantic information extraction capabilities allow them to capture high-level features such as the edges, color, and texture of the rail lines and to segment the railway tracks from the background, even in images with more complex content.
Before the rise of deep learning, rail line detection mainly relied on traditional image processing, that is, on how a specific attribute of the target's pixels differs from that of the surrounding pixels in the image. Algorithms of this type use the pattern of change between the target pixels and their surroundings to locate the rail lines in the image and thereby perform line detection. In one of the early works, Zhong Ren et al. [11] proposed a rail recognition algorithm based on prior knowledge. Its key techniques are rail modeling and template matching: the position of the rails in the current frame is determined by matching. Although this method has drawbacks, such as susceptibility to environmental interference and low accuracy, it set a precedent for rail line detection. Afterward, based on the characteristics of rails in monitoring images, Q. Wang et al. [12] proposed a rail line identification and detection method based on the Radon transform and the Bresenham straight-line detection algorithm. However, this method has limited applicability and is only suitable for straight sections. Building on traditional image processing, Zhao Wu et al. [13] added post-processing steps such as segment merging, slope culling, single-frame comprehensive decision-making, segment rebuilding, and multi-frame fusion of recognition results to improve the accuracy of rail recognition, but again only for straight lines. Lei Zhang et al. [1] studied the extraction of rail tracks from infrared images: the target area and edges of the tracks are obtained through image segmentation, the extracted target area is refined, and the curve is finally obtained from the shape and location of the tracks. However, this method still needs considerable improvement in both detection speed and detection accuracy. The choice of curve model directly influences the accuracy and computational complexity of a rail line detection algorithm. Kaleli [14] and Badino et al. [15] suggested extracting line features with a median filter and detecting lines with dynamic programming. Still, the model is susceptible to environmental interference, and its robustness needs to be improved. Although a complex curve model can fit a wider variety of boundary curves, it is more susceptible to noise. Recently, Yunze Wang et al. [16] used a curvature map-based rail recognition algorithm to identify the rails at near range, then took seed points from the near-range result and, based on local gradient information, recognized the rails at long range with an improved seeded region growing algorithm that incorporates direction. The algorithm overcomes shortcomings of the previous methods, but it still needs improvement in identifying multiple rail lines. More generally, the accuracy and real-time performance of rail line detection algorithms based on traditional image processing still need further breakthroughs.
With the success of deep learning, researchers have gradually investigated its application to rail line detection. Ziguan Wang et al. [17] were among the first to use deep learning in railway track detection. Their model is based on Mask R-CNN, which scans the image, produces candidate boxes containing the rails, calculates the position of each box, creates a mask covering the rails, and finally obtains the position of the rails in the image. They collected frames from the surveillance video of a subway company and compiled them into a dataset for training and evaluating their system. However, the presence of speculation in their final output compromises the recognition quality. Moreover, they did not report the accuracy or detection speed of their method, which hinders further comparison. Recently, Xiaoyong Guan et al. [5] used ResNet101 and Feature Pyramid Networks (FPN) as the backbone network. Input images generate feature maps of various sizes, forming a pyramid of feature maps at different levels, which strengthens the network's feature extraction. By building railway datasets, constructing network models, and training network parameters, they achieved recognition and segmentation of the rail area, metro trains, and signal lamps. The network can adapt to changes in the metro train operating environment. Nevertheless, although the complex network structure ensures detection accuracy, it hinders the real-time performance of rail line detection.
In this paper, an algorithm based on state-of-the-art deep convolutional neural networks is proposed to overcome the deficiencies of the aforementioned detection methods. The algorithm is mainly intended for local trains and city railways. First, RailNet processes the input images, extracts the key information, and outputs binary segmentation maps that are robust to irrelevant noise. The rail lines are segmented from the background, and the features of the tracks are preserved without interference from other objects [18]. Afterward, the binary segmentation maps pass through the post-processing part of RailNet, namely the sliding window detection algorithm, which consists of three main steps: Inverse Perspective Transformation (IPT), Feature Point Extraction (FPE), and Rail Lines Curve Fitting. Finally, the fitting results are mapped back to the original images, and the rail lines are marked on them. An overview of the entire process of the algorithm can be seen in Figure 1.
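The feature point extraction and curve fitting steps described above can be illustrated with a minimal sketch. It assumes the binary segmentation map has already been rectified to a bird's-eye view by the inverse perspective transformation, and it is not the implementation used in this paper: the function names, the window count, the margin, and the second-order polynomial are all illustrative assumptions. The curvature helper applies the standard formula for a curve x = f(y).

```python
import numpy as np

def sliding_window_points(mask, x_start, n_windows=8, margin=8, min_pixels=1):
    """Collect rail pixels window by window, from the bottom row upward.

    mask: 2-D binary array (non-zero = rail pixel), assumed bird's-eye view.
    x_start: column where the rail is expected at the bottom of the image.
    Rows left over when the height is not divisible by n_windows are skipped.
    """
    height = mask.shape[0]
    window_height = height // n_windows
    ys, xs = np.nonzero(mask)
    x_center = x_start
    kept = []
    for w in range(n_windows):
        y_high = height - w * window_height
        y_low = y_high - window_height
        inside = ((ys >= y_low) & (ys < y_high)
                  & (xs >= x_center - margin) & (xs <= x_center + margin))
        idx = np.nonzero(inside)[0]
        kept.append(idx)
        # Re-center the next window on the mean column of the pixels found.
        if idx.size >= min_pixels:
            x_center = int(round(xs[idx].mean()))
    kept = np.concatenate(kept)
    return ys[kept], xs[kept]

def fit_rail(ys, xs, degree=2):
    # Fit x as a polynomial in y; returns coefficients, highest power first.
    return np.polyfit(ys, xs, degree)

def curvature(coeffs, y):
    # Curvature of x = a*y^2 + b*y + c at row y: |2a| / (1 + (2ay + b)^2)^(3/2).
    a, b, _ = coeffs
    return abs(2 * a) / (1 + (2 * a * y + b) ** 2) ** 1.5
```

Fitting x as a function of y is a common choice for near-vertical lines in a bird's-eye view, because each image row then maps to a single column per rail.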
The main contributions of our algorithm are four-fold:
A novel lightweight deep learning network, RailNet, is proposed. The encoder-decoder structure of RailNet ensures detection accuracy. Depthwise convolution (DWconv) is introduced in RailNet, which reduces the number of network parameters and thereby supports real-time detection. Compared with existing state-of-the-art feature extraction methods, RailNet achieves fast detection and higher accuracy.
The Segmentation Soul (SS) module is added to the RailNet structure. It enhances the feature representation during training and can be discarded at test time, so it improves segmentation performance without any additional inference time.
A rail line fitting algorithm based on sliding window detection is proposed as the post-processing part of RailNet. The algorithm further improves detection accuracy. At the same time, the rail lines are accurately marked in the original image, and the mathematical expression and curvature of the tracks are calculated.
A dataset of rail lines, RAWRail, has been created for deep learning network training and testing. The dataset can be used for algorithm performance evaluation, which would help enrich the research and development of rail line detection.
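The parameter saving attributed to depthwise convolution in the first contribution can be checked with simple arithmetic. The sketch below assumes the common depthwise separable arrangement (a k x k depthwise convolution followed by a 1 x 1 pointwise convolution, biases ignored); the layer sizes are hypothetical and are not taken from RailNet.

```python
def standard_conv_params(k, c_in, c_out):
    # Each of the c_out filters spans all c_in channels with a k x k kernel.
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    # Depthwise step: one k x k kernel per input channel.
    depthwise = k * k * c_in
    # Pointwise step: a 1 x 1 convolution that mixes the channels.
    pointwise = c_in * c_out
    return depthwise + pointwise

# Hypothetical layer: 3 x 3 kernel, 64 input channels, 128 output channels.
standard = standard_conv_params(3, 64, 128)          # 73728 weights
separable = depthwise_separable_params(3, 64, 128)   # 8768 weights
print(standard, separable, standard / separable)
```

For this hypothetical layer the separable form needs roughly one eighth of the weights. In general, under the stated assumptions, the ratio of separable to standard parameters is 1/c_out + 1/k^2.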