
Deep Learning for Object Detection, Classification and Tracking in Industry Applications

A special issue of Sensors (ISSN 1424-8220). This special issue belongs to the section "Intelligent Sensors".

Deadline for manuscript submissions: closed (31 May 2021) | Viewed by 30986

Special Issue Editors


Dr. Dadong Wang
Guest Editor
Imaging and Computer Vision Research Group, Data61, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Sydney, NSW 2122, Australia
Interests: image analysis; computer vision; machine learning; artificial intelligence

Dr. Jian-Gang Wang
Guest Editor
Institute for Infocomm Research (I2R), Agency for Science, Technology and Research (A*STAR), 1 Fusionopolis Way, #21-01, Connexis (South Tower), Singapore 138632, Singapore
Interests: computer vision; biometrics; autonomous vehicles

Prof. Dr. Ke Xu
Guest Editor
Collaborative Innovation Center of Steel Technology, University of Science and Technology Beijing, Beijing 100083, China
Interests: machine learning; machine vision; surface inspection

Special Issue Information

Dear Colleagues,

With the advancement of computing hardware and imaging sensors, images and videos have become ubiquitous, and deep learning-powered computer vision has become the core of the artificial intelligence (AI) revolution. While this is changing the way we see the world, many challenges remain in real-world applications, such as object detection, classification, and tracking under difficult conditions, the limited availability of labeled data, and uncertainty quantification in deep learning. Researchers are investigating new technologies to address these challenges, such as deep active learning, synthetic data, and edge AI. As a result, applications of deep learning-equipped computer vision have grown rapidly across different industries.

This Special Issue calls for innovative solutions that address the challenges of solving real-world problems using advanced machine learning and computer vision technologies. Topics of interest include, but are not limited to:

  • Deep learning for object detection, classification, and tracking
  • Object detection under challenging conditions such as weather, illumination, and background
  • Image classification
  • Image segmentation
  • Scene understanding
  • 3D computer vision
  • Biomedical imaging
  • Deep video analytics
  • Multispectral and hyperspectral imaging
  • Image and video synthesis
  • Active learning
  • Multiview learning
  • Federated learning
  • Edge AI

Dr. Dadong Wang
Dr. Jian-Gang Wang
Prof. Dr. Ke Xu
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Sensors is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • machine learning
  • artificial intelligence
  • computer vision
  • image analysis

Published Papers (9 papers)


Editorial


3 pages, 155 KiB  
Editorial
Deep Learning for Object Detection, Classification and Tracking in Industry Applications
by Dadong Wang, Jian-Gang Wang and Ke Xu
Sensors 2021, 21(21), 7349; https://0-doi-org.brum.beds.ac.uk/10.3390/s21217349 - 5 Nov 2021
Cited by 19 | Viewed by 3232
Abstract
Object detection, classification and tracking are three important computer vision techniques [...] Full article

Research


12 pages, 2734 KiB  
Communication
Multi-Directional Scene Text Detection Based on Improved YOLOv3
by Liyun Xiao, Peng Zhou, Ke Xu and Xiaofang Zhao
Sensors 2021, 21(14), 4870; https://0-doi-org.brum.beds.ac.uk/10.3390/s21144870 - 16 Jul 2021
Cited by 11 | Viewed by 2222
Abstract
To address the low detection rate caused by closely aligned, multi-directional text in practical applications, and to improve detection speed, this paper proposes a multi-directional text detection algorithm based on an improved YOLOv3 and applies it to natural scene text detection. To detect text in multiple directions, a box definition method based on sliding vertices is introduced. A new rotated-box loss function, MD-Closs, based on CIOU is then proposed to improve detection accuracy, and a step-by-step NMS method is used to further reduce the amount of computation. Experimental results on the ICDAR 2015 dataset show an accuracy of 86.2%, a recall of 81.9%, and a processing speed of 21.3 fps, demonstrating that the proposed algorithm performs well on text detection in natural scenes. Full article
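
As a rough illustration of the non-maximum suppression building block that a step-by-step NMS scheme refines, the sketch below shows plain axis-aligned NMS in NumPy. It is a minimal sketch only: the rotated-box formulation, sliding-vertex definition, and MD-Closs loss from the paper are not reproduced, and all names and thresholds are illustrative assumptions.

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    """Plain axis-aligned non-maximum suppression.

    boxes: (N, 4) array of [x1, y1, x2, y2]; scores: (N,) confidences.
    Returns the indices of the boxes that are kept.
    """
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]          # highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        # Intersection of the best box with the remaining candidates.
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        # Keep only candidates that overlap less than the threshold.
        order = order[1:][iou < iou_thresh]
    return keep

# Example: two overlapping detections of the same word and one separate word.
boxes = np.array([[10, 10, 110, 40], [12, 12, 112, 42], [200, 10, 300, 40]], float)
scores = np.array([0.9, 0.8, 0.7])
print(nms(boxes, scores))  # -> [0, 2]
```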

13 pages, 4627 KiB  
Communication
An Improved Character Recognition Framework for Containers Based on DETR Algorithm
by Xiaofang Zhao, Peng Zhou, Ke Xu and Liyun Xiao
Sensors 2021, 21(13), 4612; https://0-doi-org.brum.beds.ac.uk/10.3390/s21134612 - 5 Jul 2021
Cited by 6 | Viewed by 2739
Abstract
An improved DETR (detection with transformers) object detection framework is proposed to realize accurate detection and recognition of characters on shipping containers. ResNeSt is used as a backbone network with split attention to extract features of different dimensions through multi-channel weighted convolution operations, increasing the overall feature acquisition ability of the backbone. In addition, multi-scale location encoding is introduced on top of the original sinusoidal position encoding, improving the transformer's sensitivity to input position information. Compared with the original DETR framework, our model produces more confident and accurate detections, with detection accuracy improved by 2.6%. In a test of character detection and recognition on a self-built dataset, the overall accuracy reaches 98.6%, which meets the requirements of logistics information acquisition. Full article
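
For readers unfamiliar with the sinusoidal position encoding that DETR builds on (and that this paper extends to multiple scales), the following NumPy sketch computes a basic 2D sine/cosine encoding over a feature map. It is an assumption-laden illustration: the paper's multi-scale variant and the ResNeSt backbone are not reproduced, and the function name and defaults are made up for this sketch.

```python
import numpy as np

def sine_position_encoding(h, w, num_feats=128, temperature=10000.0):
    """2D sinusoidal position encoding in the spirit of DETR.

    Returns an array of shape (h, w, 2 * num_feats): half of the channels
    encode the y coordinate and half encode the x coordinate.
    """
    ys = np.arange(1, h + 1, dtype=np.float32)[:, None].repeat(w, axis=1)
    xs = np.arange(1, w + 1, dtype=np.float32)[None, :].repeat(h, axis=0)
    dim_t = temperature ** (2 * (np.arange(num_feats) // 2) / num_feats)
    pos_x = xs[..., None] / dim_t          # (h, w, num_feats)
    pos_y = ys[..., None] / dim_t
    # Sine on even channels, cosine on odd channels, then interleave.
    pos_x = np.stack([np.sin(pos_x[..., 0::2]), np.cos(pos_x[..., 1::2])], axis=-1).reshape(h, w, -1)
    pos_y = np.stack([np.sin(pos_y[..., 0::2]), np.cos(pos_y[..., 1::2])], axis=-1).reshape(h, w, -1)
    return np.concatenate([pos_y, pos_x], axis=-1)

pe = sine_position_encoding(20, 32)
print(pe.shape)  # (20, 32, 256)
```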

13 pages, 2184 KiB  
Article
Interp-SUM: Unsupervised Video Summarization with Piecewise Linear Interpolation
by Ui-Nyoung Yoon, Myung-Duk Hong and Geun-Sik Jo
Sensors 2021, 21(13), 4562; https://0-doi-org.brum.beds.ac.uk/10.3390/s21134562 - 2 Jul 2021
Cited by 16 | Viewed by 2617
Abstract
This paper addresses the problem of unsupervised video summarization. Video summarization helps people browse large-scale videos easily by presenting a summary built from selected frames of the video. We propose an unsupervised video summarization method with piecewise linear interpolation (Interp-SUM), which aims to improve summarization performance and generate a natural sequence of keyframes by predicting the importance score of each frame using the interpolation method. To train the video summarization network, we exploit a reinforcement learning-based framework with an explicit reward function, and we employ the objective function of the exploring under-appreciated reward method for efficient training. In addition, we present a modified reconstruction loss to promote the representativeness of the summary. We evaluate the proposed method on two datasets, SumMe and TVSum. The experimental results show that Interp-SUM generates a more natural sequence of summary frames than other state-of-the-art methods, while remaining comparable in performance to state-of-the-art unsupervised video summarization methods, as shown and analyzed in the experiments of this paper. Full article
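
A minimal sketch of the piecewise linear interpolation idea described above: predict importance scores only at a set of anchor frames and linearly interpolate the scores of the remaining frames. The network, reward function, and losses from the paper are not reproduced, and the anchor spacing and keyframe count below are assumptions for illustration.

```python
import numpy as np

def interpolate_importance(anchor_idx, anchor_scores, num_frames):
    """Piecewise linear interpolation of per-frame importance scores.

    anchor_idx: sorted frame indices where scores were predicted.
    anchor_scores: predicted scores at those indices (same length).
    Returns a score for every frame in [0, num_frames).
    """
    frames = np.arange(num_frames)
    return np.interp(frames, anchor_idx, anchor_scores)

# Example: scores predicted every 15th frame of a 60-frame shot (assumed spacing).
anchor_idx = np.array([0, 15, 30, 45, 59])
anchor_scores = np.array([0.1, 0.8, 0.4, 0.9, 0.2])
scores = interpolate_importance(anchor_idx, anchor_scores, 60)
keyframes = np.argsort(scores)[::-1][:5]   # pick the top-scoring frames as a summary
print(scores.shape, sorted(keyframes.tolist()))
```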

27 pages, 11887 KiB  
Article
SAVSDN: A Scene-Aware Video Spark Detection Network for Aero Engine Intelligent Test
by Jie Kou, Xinman Zhang, Yuxuan Huang and Cong Zhang
Sensors 2021, 21(13), 4453; https://0-doi-org.brum.beds.ac.uk/10.3390/s21134453 - 29 Jun 2021
Cited by 1 | Viewed by 1750
Abstract
Due to carbon deposits, lean flames, or damaged metal parts, sparks can occur in aero engine chambers. At present, the detection of such sparks depends heavily on laborious manual work. Because interference shares the same visual features as sparks, almost no existing object detector can replace humans in carrying out high-precision spark detection. In this paper, we propose a scene-aware spark detection network, consisting of an information fusion-based cascading video codec-image object detector structure, which we name SAVSDN. Unlike video object detectors that use candidate boxes from adjacent frames to assist the current prediction, we find that effort should instead go into extracting the spatio-temporal features of adjacent frames to reduce over-detection. Visualization experiments show that SAVSDN can learn the difference in spatio-temporal features between sparks and interference. To address the lack of aero engine anomalous spark data, we introduce a method to generate simulated spark images based on the Gaussian function. In addition, we publish the first simulated aero engine spark data set, which we name SAES. In our experiments, SAVSDN far outperformed state-of-the-art detection models for spark detection in terms of five metrics. Full article
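
A rough sketch, under assumed parameters, of how a simulated spark could be rendered as a Gaussian-shaped bright blob and composited onto a background frame. The paper's actual generation procedure and the SAES data set are not reproduced here; the function name, sigma, and intensity values are illustrative assumptions.

```python
import numpy as np

def add_gaussian_spark(frame, center, sigma=2.5, intensity=255.0):
    """Composite a Gaussian-shaped bright spot onto a grayscale frame.

    frame: (H, W) float array in [0, 255]; center: (cx, cy) in pixels.
    sigma and intensity control the spark size and brightness (assumed values).
    """
    h, w = frame.shape
    ys, xs = np.mgrid[0:h, 0:w]
    cx, cy = center
    blob = intensity * np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))
    return np.clip(frame + blob, 0, 255)

# Example: a dark 128x128 background with two simulated sparks.
frame = np.full((128, 128), 20.0)
frame = add_gaussian_spark(frame, (40, 60))
frame = add_gaussian_spark(frame, (90, 30), sigma=1.5, intensity=180.0)
print(frame.max())  # close to 255 at the spark centers
```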

14 pages, 2506 KiB  
Article
A Sawn Timber Tree Species Recognition Method Based on AM-SPPResNet
by Fenglong Ding, Ying Liu, Zilong Zhuang and Zhengguang Wang
Sensors 2021, 21(11), 3699; https://0-doi-org.brum.beds.ac.uk/10.3390/s21113699 - 26 May 2021
Cited by 7 | Viewed by 2321
Abstract
Sawn timber is an important material in furniture manufacturing, decoration, construction and other industries. The mechanical properties, surface colors, textures, uses and other properties of sawn timber differ between tree species. To meet the needs of reasonable timber use and to guarantee the quality of sawn timber products, sawn timber must therefore be identified by tree species to ensure the best use of materials. In this study, an optimized convolutional neural network was proposed to process sawn timber image data and identify the tree species of the sawn timber. Spatial pyramid pooling and an attention mechanism were used to improve the convolution layers of ResNet101 for extracting feature vectors from sawn timber images. The optimized ResNet (called "AM-SPPResNet") was used to identify the sawn timber images, yielding a basic recognition model. The weight parameters of the feature extraction layers of this basic model were then frozen, the fully connected layer was removed, and support vector machine (SVM) and XGBoost classifiers, which are commonly used in machine learning, were trained on the 21 × 1024-dimensional feature vectors extracted by the feature extraction layers. A number of comparative experiments showed that the model using an SVM with a linear kernel on the feature vectors extracted from the improved convolution layers performed best, with the F1 score and overall accuracy above 99% for all classes of samples. Compared with traditional methods, the accuracy was improved by up to 12%. Full article
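
A minimal sketch of the final stage described above: freeze a pretrained CNN backbone, use it as a feature extractor, and train a linear-kernel SVM on the extracted vectors. It uses a stock torchvision ResNet-101 and random stand-in data instead of the paper's AM-SPPResNet and timber images, so it only illustrates the workflow, not the reported results.

```python
import torch
import torchvision
from sklearn.svm import SVC

# Frozen feature extractor (stock, untrained ResNet-101 here; the paper uses an
# improved AM-SPPResNet backbone, which is not reproduced in this sketch).
backbone = torchvision.models.resnet101(weights=None)
backbone.fc = torch.nn.Identity()          # drop the fully connected layer
backbone.eval()
for p in backbone.parameters():
    p.requires_grad = False

def extract_features(images):
    """images: (N, 3, 224, 224) tensor -> (N, 2048) feature matrix."""
    with torch.no_grad():
        return backbone(images).numpy()

# Stand-in data: 16 random "timber" images from 4 classes (illustrative only).
images = torch.rand(16, 3, 224, 224)
labels = torch.randint(0, 4, (16,)).numpy()

features = extract_features(images)
clf = SVC(kernel="linear")                 # a linear kernel performed best in the paper
clf.fit(features, labels)
print(clf.score(features, labels))         # training accuracy of the sketch
```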

13 pages, 5261 KiB  
Communication
FPGA-Based Acceleration on Additive Manufacturing Defects Inspection
by Yawen Luo and Yuhua Chen
Sensors 2021, 21(6), 2123; https://0-doi-org.brum.beds.ac.uk/10.3390/s21062123 - 18 Mar 2021
Cited by 11 | Viewed by 3426
Abstract
Additive manufacturing (AM) has gained increasing attention over the past years due to its fast prototyping, easier modification, and ability to produce devices with complex internal textures compared to traditional manufacturing processes. However, internal defects can occur during AM processes, and real-time inspection is required to minimize costs by either aborting the process or repairing the defect. To perform defect inspection, the defect database NEU-DET is first used for training, and a convolutional neural network (CNN) is applied to perform defect classification. For real-time operation, Field Programmable Gate Arrays (FPGAs) are utilized for acceleration, and a binarized neural network (BNN) is proposed to best fit FPGA bit operations. Finally, for images labeled as defective, the selective search and non-maximum suppression algorithms are implemented to help locate the coordinates of the defects. Experiments show that the BNN model on NEU-DET can achieve 97.9% accuracy in identifying whether an image is defective or defect-free. As for image classification speed, the FPGA-based BNN module can process one image within 0.5 s. The BNN design is modularized and can be duplicated in parallel to fully utilize the logic gates and memory resources in FPGAs. The proposed FPGA-based BNN can therefore perform real-time defect inspection with high accuracy and can easily scale up to larger FPGA implementations. Full article
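
To illustrate why binarized networks map well to FPGA bit operations, the sketch below binarizes weights and activations to ±1 and computes a dot product with XNOR and popcount instead of multiplications. It is a conceptual Python illustration of the general BNN trick, not the paper's FPGA design; the vector length and encoding are assumptions.

```python
import numpy as np

def binarize(x):
    """Map real values to {+1, -1} via the sign function (0 -> +1)."""
    return np.where(x >= 0, 1, -1).astype(np.int8)

def xnor_popcount_dot(a_bits, w_bits):
    """Dot product of two {+1, -1} vectors using XNOR + popcount.

    Encode +1 as bit 1 and -1 as bit 0; then
    dot(a, w) = 2 * popcount(XNOR(a, w)) - n.
    """
    a = (a_bits > 0).astype(np.uint8)
    w = (w_bits > 0).astype(np.uint8)
    matches = np.count_nonzero(~(a ^ w) & 1)   # XNOR restricted to the low bit
    n = a_bits.size
    return 2 * matches - n

rng = np.random.default_rng(0)
activations, weights = rng.normal(size=64), rng.normal(size=64)
a_bits, w_bits = binarize(activations), binarize(weights)
assert xnor_popcount_dot(a_bits, w_bits) == int(np.dot(a_bits, w_bits))
print(xnor_popcount_dot(a_bits, w_bits))
```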

15 pages, 15343 KiB  
Article
AgriPest: A Large-Scale Domain-Specific Benchmark Dataset for Practical Agricultural Pest Detection in the Wild
by Rujing Wang, Liu Liu, Chengjun Xie, Po Yang, Rui Li and Man Zhou
Sensors 2021, 21(5), 1601; https://0-doi-org.brum.beds.ac.uk/10.3390/s21051601 - 25 Feb 2021
Cited by 49 | Viewed by 7420
Abstract
The recent explosion of large-scale standard datasets of annotated images has offered promising opportunities for deep learning techniques in effective and efficient object detection applications. However, due to the large quality gap between these standardized datasets and practical raw data, how to maximize the utilization of deep learning techniques in practical agricultural applications remains a critical problem. Here, we introduce a domain-specific benchmark dataset for tiny wild pest recognition and detection, called AgriPest, providing researchers and communities with a standard, large-scale dataset of practical wild pest images and annotations, as well as evaluation procedures. Over the past seven years, AgriPest has captured 49.7K images of four crops containing 14 species of pests using our purpose-designed image collection equipment in field environments. All of the images are manually annotated by agricultural experts with up to 264.7K bounding boxes locating the pests. This paper also offers a detailed analysis of AgriPest, in which the validation set is split into four types of scenes that are common in practical pest monitoring applications. We explore and evaluate the performance of state-of-the-art deep learning techniques on AgriPest. We believe that the scale, accuracy, and diversity of AgriPest can offer great opportunities to researchers in computer vision as well as in pest monitoring applications. Full article

25 pages, 6767 KiB  
Article
Meta-Transfer Learning Driven Tensor-Shot Detector for the Autonomous Localization and Recognition of Concealed Baggage Threats
by Taimur Hassan, Muhammad Shafay, Samet Akçay, Salman Khan, Mohammed Bennamoun, Ernesto Damiani and Naoufel Werghi
Sensors 2020, 20(22), 6450; https://0-doi-org.brum.beds.ac.uk/10.3390/s20226450 - 12 Nov 2020
Cited by 40 | Viewed by 3758
Abstract
Screening baggage against potential threats has become one of the prime aviation security concerns all over the world, and manual detection of prohibited items is a time-consuming and hectic process. Many researchers have developed autonomous systems to recognize baggage threats using security X-ray scans. However, all of these frameworks are vulnerable when screening cluttered and concealed contraband items. Furthermore, to the best of our knowledge, no framework possesses the capacity to recognize baggage threats across multiple scanner specifications without an explicit retraining process. To overcome this, we present a novel meta-transfer learning-driven tensor-shot detector that decomposes the candidate scan into dual-energy tensors and employs a meta-one-shot classification backbone to recognize and localize cluttered baggage threats. In addition, the proposed detection framework generalizes well to multiple scanner specifications owing to its capacity to generate object proposals from the unified tensor maps rather than from diversified raw scans. We have rigorously evaluated the proposed tensor-shot detector on the publicly available SIXray and GDXray datasets (containing a cumulative 1,067,381 grayscale and colored baggage X-ray scans). On the SIXray dataset, the proposed framework achieved a mean average precision (mAP) of 0.6457, and on the GDXray dataset, it achieved a precision of 0.9441 and an F1 score of 0.9598. Furthermore, it outperforms state-of-the-art frameworks by 8.03% in terms of mAP on SIXray, and by 1.49% in terms of precision and 0.573% in terms of F1 score on GDXray. Full article
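
As a reminder of how the reported detection metrics relate to one another, the short sketch below computes precision, recall, and F1 from true positive, false positive, and false negative counts. The counts are made up for illustration and are not taken from the SIXray or GDXray experiments.

```python
def detection_metrics(tp, fp, fn):
    """Precision, recall and F1 from detection counts.

    tp: correctly detected threats, fp: false alarms, fn: missed threats.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Illustrative counts only.
p, r, f1 = detection_metrics(tp=950, fp=50, fn=30)
print(f"precision={p:.4f} recall={r:.4f} f1={f1:.4f}")
```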
