
3D Object and Scene Detection, Reconstruction, Segmentation Based on Advanced Sensing Technology

A special issue of Sensors (ISSN 1424-8220). This special issue belongs to the section "Radar Sensors".

Deadline for manuscript submissions: closed (1 August 2022) | Viewed by 22238

Special Issue Editors


Prof. Dr. Liang Zhang
Guest Editor
Embedded Technology and Visual Processing Research Center, School of Computer Science and Technology, Xidian University, Xi’an 710071, China
Interests: 3D vision; scene understanding; robotics; deep learning

Prof. Dr. Mohammed Bennamoun
Guest Editor
School of Physics, Maths and Computing, Computer Science and Software Engineering, The University of Western Australia, 35 Stirling Highway, Perth 6009, Australia
Interests: object recognition; face recognition; biometrics; computer vision; deep learning

Dr. Mingtao Feng
Guest Editor
School of Artificial Intelligence, Xidian University, Xi'an 710126, China
Interests: computer vision; 3D vision; scene understanding

Special Issue Information

Dear Colleagues,

In recent years, 3D computer vision has continued to grow at a dramatic pace thanks to major recent developments, including the availability of low-cost consumer 3D sensors (various types of 3D scanners, LiDARs, and RGB-D cameras), rapid advances in deep learning, and the public availability of large-scale 2D image and 3D geometry datasets. From a research standpoint, the additional spatial information measured by 3D sensors provides great opportunities for many vision tasks that traditionally rely on 2D information, often from a single viewpoint. Several recent papers have demonstrated that utilizing 3D information can simplify tasks ranging from reconstruction, detection, and segmentation to pattern recognition. Although remarkable progress has been achieved in this area, 3D data still pose challenges to the 3D computer vision community because of the typically high noise, irregularity, and poor quality of acquired 3D information. The need for more contributions is also motivated by emerging applications in entertainment, education, medicine, architectural and urban design, engineering, robotics, fine arts, and cultural heritage.

The aim of this Special Issue is to showcase state-of-the-art results and to provide a cross-fertilization ground for stimulating discussions on the next steps in the area of 3D computer vision based on advanced sensing technology. We welcome contributions of novel work in 3D computer vision, as well as its applications in different areas.

Papers are solicited on, but are not limited to, the following and related topics:

(1) New 3D sensors and technologies for computer vision;

(2) 3D modelling and scene reconstruction;

(3) 3D object detection, recognition, and classification;

(4) 3D object estimation and tracking;

(5) 3D scene understanding;

(6) Stereo vision and RGB-D image acquisition;

(7) 3D point cloud processing;

(8) Applications of RGB-D vision, e.g., robotics, augmented reality, and autonomous driving.

Prof. Dr. Liang Zhang
Prof. Dr. Mohammed Bennamoun
Dr. Mingtao Feng
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Sensors is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • 3D sensors
  • 3D computer vision
  • 3D reconstruction
  • 3D object detection
  • semantic segmentation

Published Papers (8 papers)


Research

18 pages, 4960 KiB  
Article
Intelligent Object Tracking with an Automatic Image Zoom Algorithm for a Camera Sensing Surveillance System
by Shih-Chang Hsia, Szu-Hong Wang, Chung-Mao Wei and Chuan-Yu Chang
Sensors 2022, 22(22), 8791; https://doi.org/10.3390/s22228791 - 14 Nov 2022
Cited by 2 | Viewed by 2191
Abstract
Current surveillance systems frequently use fixed-angle cameras and record a feed from those cameras. Such systems have several disadvantages, including low resolution for faraway objects, a limited frame range, and wasted disk space. This paper presents a novel algorithm for automatically detecting, tracking, and zooming in on active targets. The object tracking system is connected to a camera that has a 360° horizontal and 90° vertical movement range. The combination of tracking, movement identification, and zoom allows the system to effectively improve the resolution of small or distant objects. The object detection system conserves disk space by ceasing to record when no valid targets are detected. Using an adaptive object segmentation algorithm, the shape of moving objects can be detected efficiently. When processing multiple targets, each target is assigned a color and is treated separately. The tracking algorithm adapts to targets moving at different speeds and controls the camera according to a predictive formula to prevent the loss of image quality caused by the camera trailing behind the target. In the test environment, the zoom can reliably lock onto the head of a moving human; however, simultaneous tracking and zooming occasionally results in a failure to track. If this system is deployed with a facial recognition algorithm, the recognition accuracy can be effectively improved.
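
The predictive control idea in this abstract can be sketched in a few lines. This is a hypothetical constant-velocity illustration, not the authors' formula: the gain, field-of-view values, and the pan/tilt mapping are all assumptions.

```python
# Hypothetical sketch of predictive pan/tilt control for target tracking.
# A constant-velocity model extrapolates the target so the camera leads it
# instead of trailing behind; gains and FOV values below are assumed.
import numpy as np

def predict_position(p_prev, p_curr, dt, lead_time):
    """Extrapolate the target centroid with a constant-velocity model."""
    velocity = (p_curr - p_prev) / dt          # pixels per second
    return p_curr + velocity * lead_time       # predicted centroid (pixels)

def pan_tilt_command(p_pred, frame_size, fov_deg=(360.0, 90.0), gain=0.5):
    """Map the predicted pixel offset from frame center to pan/tilt angles."""
    center = np.asarray(frame_size) / 2.0
    offset = (p_pred - center) / center        # normalized offset in [-1, 1]
    return gain * offset * (np.asarray(fov_deg) / 2.0)

p_prev, p_curr = np.array([300.0, 240.0]), np.array([320.0, 240.0])
p_pred = predict_position(p_prev, p_curr, dt=1 / 30, lead_time=0.2)
print(pan_tilt_command(p_pred, frame_size=(640, 480)))
```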

17 pages, 3629 KiB  
Article
An Accurate and Robust Method for Absolute Pose Estimation with UAV Using RANSAC
by Kai Guo, Hu Ye, Xin Gao and Honglin Chen
Sensors 2022, 22(15), 5925; https://doi.org/10.3390/s22155925 - 8 Aug 2022
Cited by 8 | Viewed by 2782
Abstract
In this paper, we proposed an accurate and robust method for absolute pose estimation with a UAV (unmanned aerial vehicle) using RANSAC (random sample consensus). Because acquiring high-accuracy artificial 3D control points is time-consuming and a small point set may lead to low measuring accuracy, we designed a customized UAV to efficiently obtain a large set of 3D points. A light source was mounted on the UAV and used as a 3D point. The position of the 3D point was given by RTK (real-time kinematic) equipment mounted on the UAV, and the position of the corresponding 2D point was given by feature extraction. The 2D–3D point correspondences contained some outliers because of failures of feature extraction, RTK errors, and wrong matches. Hence, RANSAC was used to remove the outliers and obtain the coarse pose. Then, we proposed a method to refine the coarse pose: the refinement was formulated as the minimization of a reprojection-error cost function, derived from an error-transfer model, and solved by gradient descent. Before that, all valid 2D–3D point correspondences were normalized to improve the estimation accuracy. In addition, we manufactured a prototype UAV with RTK and a light source to obtain large sets of 2D–3D point correspondences for real images. Lastly, we provided a thorough evaluation on synthetic data and real images, comparing our method with several state-of-the-art perspective-n-point solvers. Experimental results showed that, even with a high outlier ratio, our proposed method had better performance in terms of numerical stability, noise sensitivity, and computational speed.
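
The RANSAC stage described above follows the standard hypothesize-and-verify pattern. A minimal sketch, with OpenCV's generic solvePnP standing in for the minimal solver; the iteration count and inlier threshold are assumed values, not the paper's:

```python
# Generic RANSAC over 2D-3D correspondences: sample a minimal set, solve a
# pose hypothesis, score it by reprojection error, keep the best.
import numpy as np
import cv2

def ransac_pose(pts3d, pts2d, K, n_iters=500, thresh_px=3.0,
                rng=np.random.default_rng(0)):
    """pts3d: (N,3) float array, pts2d: (N,2) float array, K: 3x3 intrinsics."""
    best_inliers, best_pose = None, None
    for _ in range(n_iters):
        idx = rng.choice(len(pts3d), size=4, replace=False)  # minimal sample
        ok, rvec, tvec = cv2.solvePnP(pts3d[idx], pts2d[idx], K, None,
                                      flags=cv2.SOLVEPNP_AP3P)
        if not ok:
            continue
        proj, _ = cv2.projectPoints(pts3d, rvec, tvec, K, None)
        err = np.linalg.norm(proj.reshape(-1, 2) - pts2d, axis=1)
        inliers = err < thresh_px                            # consensus set
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers, best_pose = inliers, (rvec, tvec)
    return best_pose, best_inliers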

11 pages, 14963 KiB  
Article
Machine Learning-Based View Synthesis in Fourier Lightfield Microscopy
by Julen Rostan, Nicolo Incardona, Emilio Sanchez-Ortiga, Manuel Martinez-Corral and Pedro Latorre-Carmona
Sensors 2022, 22(9), 3487; https://doi.org/10.3390/s22093487 - 3 May 2022
Cited by 3 | Viewed by 2846
Abstract
Current interest in Fourier lightfield microscopy is increasing due to its ability to acquire 3D images of thick dynamic samples. This technique is based on simultaneously capturing, in a single shot and with a monocular setup, a number of orthographic perspective views of 3D microscopic samples. An essential feature of Fourier lightfield microscopy is that the number of acquired views is low, due to the trade-off between the number of views and their lateral resolution. Therefore, it is important to have a tool for generating a large number of synthesized views without compromising their lateral resolution. In this context, we investigate the use of a neural radiance field view synthesis method, originally developed for macroscopic scenes acquired with a moving digital camera (or an array of static ones), on images acquired with a Fourier lightfield microscope. The results presented in this paper are analyzed in terms of lateral resolution and of continuous, realistic parallax. We show that, in terms of these requirements, the proposed technique works efficiently in the epi-illumination microscopy mode.
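
For readers unfamiliar with neural radiance fields, the core of view synthesis is a volume-rendering quadrature along each camera ray. A minimal sketch with stand-in densities and colors (in practice a trained network supplies these per sample):

```python
# NeRF-style volume rendering for one ray: per-sample densities are turned
# into opacities, composited front to back via accumulated transmittance.
import numpy as np

def render_ray(sigmas, colors, deltas):
    """Composite per-sample density/color along one ray (NeRF quadrature)."""
    alphas = 1.0 - np.exp(-sigmas * deltas)                  # opacity per segment
    trans = np.cumprod(np.concatenate(([1.0], 1.0 - alphas[:-1])))  # transmittance
    weights = trans * alphas
    return (weights[:, None] * colors).sum(axis=0)           # rendered RGB

sigmas = np.array([0.1, 0.8, 2.0])                           # stand-in densities
colors = np.array([[0.2, 0.2, 0.2], [0.9, 0.1, 0.1], [0.1, 0.1, 0.9]])
print(render_ray(sigmas, colors, deltas=np.full(3, 0.1)))
```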

17 pages, 3270 KiB  
Article
A New Method for Absolute Pose Estimation with Unknown Focal Length and Radial Distortion
by Kai Guo, Hu Ye, Honglin Chen and Xin Gao
Sensors 2022, 22(5), 1841; https://doi.org/10.3390/s22051841 - 25 Feb 2022
Cited by 9 | Viewed by 2636
Abstract
Estimating the absolute pose of a camera is one of the key steps in computer vision. In some cases, especially when using a wide-angle or zoom lens, the focal length and radial distortion also need to be considered. Therefore, in this paper, an efficient and robust method that yields a single solution is proposed to estimate the absolute pose of a camera with unknown focal length and radial distortion, using three 2D–3D point correspondences and a known camera position. The problem is decomposed into two sub-problems, which makes the estimation simpler and more efficient. The first sub-problem is to estimate the focal length and radial distortion. An important geometric characteristic of radial distortion is used to solve it: the orientation of a 2D image point with respect to the center of distortion (i.e., the principal point in this paper) is unchanged by radial distortion. With this characteristic, the focal length and up to fourth-order radial distortion can be determined, and the approach applies to multiple distortion models. The values obtained assuming no radial distortion are used as initial values; these are close to the globally optimal solutions, so the sub-problem can be solved efficiently and accurately. The second sub-problem is to determine the absolute pose with geometric linear constraints. After estimating the focal length and radial distortion, the undistorted image can be obtained, and the absolute pose can then be efficiently determined from the point correspondences and the known camera position. Experimental results on synthetic data and real images indicate the method’s accuracy and numerical stability for pose estimation with unknown focal length and radial distortion.
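
The invariant this abstract relies on is easy to demonstrate: polynomial radial distortion rescales a point's distance from the distortion center but preserves its direction. A small sketch with an illustrative distortion model and assumed coefficients, not the authors' exact formulation:

```python
# Radial distortion moves image points only along the ray from the
# distortion center (the principal point here), so the polar angle of a
# point about that center is invariant; only the radius changes.
import numpy as np

def distort(pt, center, k1, k2):
    """Polynomial radial distortion: r_d = r_u * (1 + k1*r_u^2 + k2*r_u^4)."""
    v = pt - center
    r = np.linalg.norm(v)
    return center + v * (1.0 + k1 * r**2 + k2 * r**4)

center = np.array([320.0, 240.0])
p_u = np.array([400.0, 300.0])                     # undistorted point
p_d = distort(p_u, center, k1=-1e-7, k2=0.0)       # distorted point
# Same direction w.r.t. the principal point; only the radius differs.
print(np.arctan2(*(p_u - center)[::-1]), np.arctan2(*(p_d - center)[::-1]))
```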

17 pages, 10616 KiB  
Article
Rich Structural Index for Stereoscopic Image Quality Assessment
by Hua Zhang, Xinwen Hu, Ruoyun Gou, Lingjun Zhang, Bolun Zheng and Zhuonan Shen
Sensors 2022, 22(2), 499; https://doi.org/10.3390/s22020499 - 10 Jan 2022
Cited by 3 | Viewed by 1896
Abstract
The human visual system (HVS), which is affected by viewing distance when perceiving stereo image information, is of great significance to the study of stereoscopic image quality assessment. Many stereoscopic image quality assessment methods do not comprehensively consider human visual perception characteristics. Accordingly, we propose a Rich Structural Index (RSI) method for objective stereoscopic image quality assessment (SIQA) based on multi-scale perception characteristics. To begin with, we put the stereo pair into an image pyramid based on the Contrast Sensitivity Function (CSF) to obtain sensitive images at different resolutions. Then, we obtain a local Luminance and Structural Index (LSI) in a locally adaptive manner on gradient maps, accounting for luminance masking and contrast masking. At the same time, we use Singular Value Decomposition (SVD) to obtain a Sharpness and Intrinsic Structural Index (SISI) that effectively captures the changes introduced into the image by distortion. Meanwhile, considering disparity edge structures, we use a gradient cross-mapping algorithm to obtain a Depth Texture Structural Index (DTSI). After that, we apply the standard deviation method to the above results to obtain the contrast index of the reference and distortion components. Finally, to compensate for the loss caused by the randomness of the parameters, we train Support Vector Machine Regression based on a Genetic Algorithm (GA-SVR) to obtain the final quality score. We conducted a comprehensive evaluation against state-of-the-art methods on four open databases. The experimental results show that the proposed method has stable performance and a strong competitive advantage.
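
The SVD-based comparison can be illustrated in miniature: the singular values of an image patch summarize its structural energy, and their deviation between reference and distorted patches grows with distortion. This sketch uses an assumed patch size and distance measure, not the paper's exact SISI definition:

```python
# Toy SVD-based structural comparison: identical patches score 0, and the
# score grows as distortion perturbs the patch's singular-value spectrum.
import numpy as np

def svd_index(ref_patch, dist_patch):
    """Distance between the singular-value spectra of two image patches."""
    s_ref = np.linalg.svd(ref_patch, compute_uv=False)
    s_dis = np.linalg.svd(dist_patch, compute_uv=False)
    return np.linalg.norm(s_ref - s_dis)       # 0 when patches match exactly

rng = np.random.default_rng(0)
ref = rng.random((8, 8))                        # assumed 8x8 patch
print(svd_index(ref, ref), svd_index(ref, ref + 0.1 * rng.random((8, 8))))
```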

16 pages, 21230 KiB  
Article
An Efficient Closed Form Solution to the Absolute Orientation Problem for Camera with Unknown Focal Length
by Kai Guo, Hu Ye, Zinian Zhao and Junhao Gu
Sensors 2021, 21(19), 6480; https://doi.org/10.3390/s21196480 - 28 Sep 2021
Cited by 12 | Viewed by 2041
Abstract
In this paper, we propose an efficient closed form solution to the absolute orientation problem for cameras with an unknown focal length, from two 2D–3D point correspondences and the camera position. The problem can be decomposed into two simple sub-problems and solved with angle constraints. A polynomial equation in one variable is solved to determine the focal length, and then a geometric approach is used to determine the absolute orientation. The geometric derivations are easy to understand and significantly improve performance. Rewriting the camera model with the known camera position leads to a simpler and more efficient closed form solution that yields a single answer, without the multi-solution phenomenon of perspective-three-point (P3P) solvers. Experimental results on synthetic data and real images demonstrated that our proposed method has better performance in terms of numerical stability, noise sensitivity, and computational speed.
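
The angle constraint at the heart of such methods can be sketched directly: with a known camera center, the angle between the two viewing rays is fixed by the 3D points, and equating it with the angle between the back-projected image rays [u, v, f] yields a quadratic in f². This is a simplified sketch (tested with an identity-rotation camera); the full method and the orientation recovery are in the paper:

```python
# Recover an unknown focal length from two 2D-3D correspondences and a known
# camera center C: cos(theta) between the rays C->P1, C->P2 must equal
# cos(theta) between [u1,v1,f] and [u2,v2,f], giving a quadratic in x = f^2.
import numpy as np

def focal_from_angle(p1, p2, P1, P2, C):
    """p1, p2: image points centered on the principal point; P1, P2: 3D points."""
    cos_t = np.dot(P1 - C, P2 - C) / (np.linalg.norm(P1 - C) * np.linalg.norm(P2 - C))
    a = np.dot(p1, p2)
    b, c, k = np.dot(p1, p1), np.dot(p2, p2), cos_t**2
    # (a + x)^2 = k * (b + x) * (c + x), a quadratic in x = f^2
    roots = np.roots([1.0 - k, 2.0 * a - k * (b + c), a * a - k * b * c])
    f2 = [x.real for x in roots if abs(x.imag) < 1e-9 and x.real > 0
          and np.sign(a + x.real) == np.sign(cos_t)]
    return [np.sqrt(x) for x in f2]

C = np.zeros(3)
P1, P2 = np.array([1.0, 0.5, 4.0]), np.array([-0.5, 1.0, 5.0])
f_true = 800.0
p1, p2 = f_true * P1[:2] / P1[2], f_true * P2[:2] / P2[2]
print(focal_from_angle(p1, p2, P1, P2, C))   # should contain ~800.0
```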

16 pages, 4518 KiB  
Article
Phase Error Analysis and Correction for Crossed-Grating Phase-Shifting Profilometry
by Fuqian Li and Wenjing Chen
Sensors 2021, 21(19), 6475; https://doi.org/10.3390/s21196475 - 28 Sep 2021
Cited by 6 | Viewed by 1822
Abstract
Crossed-grating phase-shifting profilometry (CGPSP) has great utility in three-dimensional shape measurement due to its ability to acquire horizontal and vertical phase maps in a single measurement. However, CGPSP is extremely sensitive to the non-linearity of a digital fringe projection system, an effect that has not yet been studied in depth. In this paper, a mathematical model is established to analyze the phase error caused by the non-linearity effect. Subsequently, two methods for eliminating the non-linearity error are discussed in detail. Specifically, a double five-step algorithm based on the mathematical model is proposed to passively suppress the second-order non-linearity. Furthermore, a precoding gamma correction method based on the probability distribution function is introduced to actively attenuate the non-linearity of the captured crossed fringe. The comparison results show that the active gamma correction method requires fewer fringe patterns and reduces the non-linearity error more effectively than the passive method. Finally, employing CGPSP with gamma correction, faster and reliable inverse pattern projection is realized with fewer fringe patterns.
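
Precoding gamma correction can be sketched in a few lines: if the projector response is approximately I_out = I_in^γ, projecting the fringe raised to the power 1/γ restores the sinusoid. The γ value below is assumed; the paper estimates it from the probability distribution function of the captured fringe:

```python
# Pre-compensate a sinusoidal fringe for projector gamma so that the
# projector's non-linear response reproduces the intended sinusoid.
import numpy as np

def precode_fringe(phase, gamma):
    """Generate a fringe pre-encoded with the inverse projector exponent."""
    fringe = 0.5 + 0.5 * np.cos(phase)          # ideal fringe in [0, 1]
    return fringe ** (1.0 / gamma)              # projector gamma undoes this

phase = np.linspace(0, 2 * np.pi, 8)
projected = precode_fringe(phase, gamma=2.2) ** 2.2   # simulated response
print(np.allclose(projected, 0.5 + 0.5 * np.cos(phase)))  # True
```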

21 pages, 21383 KiB  
Article
Parallel Structure from Motion for Sparse Point Cloud Generation in Large-Scale Scenes
by Yongtang Bao, Pengfei Lin, Yao Li, Yue Qi, Zhihui Wang, Wenxiang Du and Qing Fan
Sensors 2021, 21(11), 3939; https://doi.org/10.3390/s21113939 - 7 Jun 2021
Cited by 7 | Viewed by 3717
Abstract
Scene reconstruction uses images or videos as input to reconstruct a 3D model of a real scene and has important applications in smart cities, surveying and mapping, the military, and other fields. Structure from motion (SFM) is a key step in scene reconstruction, recovering sparse point clouds from image sequences. However, large-scale scenes cannot be reconstructed on a single compute node, and image matching and geometric filtering account for much of the runtime in the traditional SFM pipeline. In this paper, we propose a novel divide-and-conquer framework to solve the distributed SFM problem. First, we use the global navigation satellite system (GNSS) information attached to the images to compute GNSS neighborhoods. The number of image pairs to match is greatly reduced by matching each image only to its valid GNSS neighbors, yielding a robust matching relationship. Second, the computed matching relationship is used as the initial camera graph, which is divided into multiple subgraphs by a clustering algorithm. Local SFM is executed on several computing nodes to register the local cameras. Finally, all of the local camera poses are integrated and optimized to complete the global camera registration. Experiments show that our system can accurately and efficiently solve the structure from motion problem in large-scale scenes.
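
The GNSS-neighborhood step can be sketched with a standard spatial index: each image is matched only against images whose GNSS positions (here assumed to be in local metric coordinates) lie within a radius. The radius and the use of scipy's KD-tree are assumptions for illustration, not the authors' implementation:

```python
# Restrict image matching to GNSS neighbors: a KD-tree radius query turns
# the O(N^2) all-pairs matching problem into a sparse candidate list.
import numpy as np
from scipy.spatial import cKDTree

def gnss_neighbors(positions, radius_m=150.0):
    """Return, for each image, the indices of images within radius_m meters."""
    tree = cKDTree(positions)
    pairs = tree.query_ball_tree(tree, r=radius_m)
    return [sorted(set(nbrs) - {i}) for i, nbrs in enumerate(pairs)]

positions = np.array([[0, 0], [100, 0], [400, 0], [120, 50]], dtype=float)
print(gnss_neighbors(positions))   # only nearby images become match candidates
```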
