Conference Agenda
ICWG II/Ia: Autonomous Sensing Systems and their Applications
Session Topics: Autonomous Sensing Systems and their Applications (ICWG II/Ia)
External Resource: http://www.commission2.isprs.org/icwg-2-1a
Presentations
8:30am - 8:45am
GCP Deployment and Recognition System based on Light-Marker UAV
Wuhan University, China

This paper addresses the heavy reliance on manual operations in control point acquisition for UAV photogrammetry and proposes an encoded control point deployment and recognition method based on a Light-Marker UAV (LMUAV). Conventional approaches rely on manual placement of control points and manual identification and measurement in images for aerial triangulation, resulting in low efficiency. To address this limitation, an LMUAV equipped with an LED array actively broadcasts its positional information as quaternary optical signals. The observing UAV performs coarse localization of the target region by integrating communication priors with the imaging model, followed by light spot segmentation and graph construction within the region of interest (ROI). Node correspondences are then recovered by constructing a template graph and an observation graph and applying Reweighted Random Walks for Graph Matching (RRWM). The matching robustness is further enhanced by incorporating directional point constraints and RANSAC-based geometric filtering. Based on the recovered correspondences, the encoded information is decoded through color recognition and validation, enabling automatic control point recovery. Experimental results in a cross-flight-line scenario with a single target UAV demonstrate that the proposed method achieves stable node matching and encoding–decoding, with a sequence-level accuracy of 76.32% and a final effective decoding rate of 71.05%, while maintaining centimeter-level positioning accuracy, thereby validating its effectiveness for automatic control point acquisition in UAV mapping.

8:45am - 9:00am
6D Strawberry Pose Estimation: Real-time and Edge AI Solutions Using Purely Synthetic Training Data
Fraunhofer IGD, Germany; Delft University of Technology, Netherlands

Automated and selective harvesting of fruits is increasingly vital due to high costs and seasonal labor shortages in advanced economies. This paper explores 6D pose estimation of strawberries using synthetic data generated through a procedural pipeline for photorealistic rendering. We use the YOLOX-6D-Pose algorithm, a single-shot method built on the YOLOX backbone, which is known for its balance of speed and accuracy and its suitability for edge inference. To counter the lack of training data, we develop a robust and flexible pipeline for generating synthetic strawberry data from various 3D models in Blender, focusing on enhancing realism compared to prior efforts, thus providing a valuable resource for training pose estimation algorithms. Quantitative evaluations show that our models achieve comparable accuracy on both the NVIDIA RTX 3090 and Jetson Orin Nano across several ADD-S metrics, with the RTX 3090 offering superior processing speed. However, the Jetson Orin Nano is particularly effective for resource-constrained environments, making it well suited for deployment in agricultural robotics. Qualitative assessments further validate the model's performance, demonstrating accurate pose inference for ripe and partially ripe strawberries, although challenges remain in detecting unripe specimens. This highlights opportunities for future enhancements, particularly in improving detection for unripe strawberries by exploring color variations. Moreover, the presented methodology can be easily adapted for other fruits, such as apples, peaches, and plums, broadening its applicability in agricultural automation.

9:00am - 9:15am
A Comparison of Multi-View Stereo Methods for Photogrammetric 3D Reconstruction: From Traditional to Learning-Based Approaches
Faculty of Geo-Information Science and Earth Observation (ITC), University of Twente, Enschede, The Netherlands

Photogrammetric 3D reconstruction has long relied on traditional Structure-from-Motion (SfM) and Multi-View Stereo (MVS) methods, which provide high accuracy but face challenges in speed and scalability. Recently, learning-based MVS methods have emerged, aiming for faster and more efficient reconstruction. This work presents a comparative evaluation between a representative traditional MVS pipeline (COLMAP) and state-of-the-art learning-based approaches, including geometry-guided methods (MVSNet, PatchmatchNet, MVSAnywhere, MVSFormer++) and end-to-end frameworks (Stereo4D, FoundationStereo, DUSt3R, MASt3R, Fast3R, VGGT). Two experiments were conducted on different aerial scenarios. The first experiment used the MARS-LVIG dataset, where ground-truth 3D reconstruction was provided by LiDAR point clouds. The second experiment used a public scene from the Pix4D official website, with ground truth generated by Pix4Dmapper. We evaluated accuracy, coverage, and runtime across all methods. Experimental results show that although COLMAP can provide reliable and geometrically consistent reconstruction results, it requires more computation time. In cases where traditional methods fail in image registration, learning-based approaches exhibit stronger feature-matching capability and greater robustness. Geometry-guided methods usually require careful dataset preparation and often depend on camera pose or depth priors generated by COLMAP. End-to-end methods such as DUSt3R and VGGT achieve competitive accuracy and reasonable coverage while offering substantially faster reconstruction. However, they exhibit relatively large residuals in 3D reconstruction, particularly in challenging scenarios.

9:15am - 9:30am
Automatic detection models for building exterior wall cracks in drone imagery based on CNN and Transformer
National Quality Inspection and Testing Center for Surveying and Mapping Products, China; Hohai University, China; State Grid Zhejiang Electric Power Co., Ltd. Logistics Service Company, China

This study presents a comprehensive evaluation of six deep learning models for building exterior crack detection using UAV imagery. Our framework systematically compares Standard U-Net, ResNet34-UNet, UNet-Attention, UNet-Residual, HybridUNet, and TransUNet through rigorous ablation experiments. The models were trained on dedicated drone-captured crack imagery and evaluated using multiple loss functions and performance metrics. Results show that TransUNet achieves the best performance (87.66% F1 Score, 90.43% Precision, 89.99% Recall) by leveraging Transformer-based global context modeling. Notably, the performance gap among all six models remains minimal (<0.5% F1 Score difference), suggesting limited returns from increased architectural complexity alone. F1 Loss demonstrates the most balanced performance across architectures, while Focal-Dice-Loss offers superior optimization stability. The study provides practical guidance for model selection: TransUNet with F1 Loss suits high-accuracy requirements, while simpler attention-enhanced U-Net variants offer cost-effective solutions for large-scale applications. These findings advance intelligent crack detection methodologies and emphasize balancing accuracy with computational efficiency for real-world structural health monitoring.

9:30am - 9:45am
Towards real-time UAV path replanning based on photogrammetry and learning-based approaches
University of Campinas, Brazil; IFSULDEMINAS, Brazil

Unmanned Aerial Vehicles (UAVs) have contributed to a wide range of applications and have become faster and more sustainable. However, given the significant increase in the number of UAVs, concerns regarding operational safety have grown. Autonomous UAV path planning must ensure compliance with safety requirements. This study proposes a real-time path replanning method focused on ensuring compliance with regulations governing UAV operations. Considering no-fly zones (NFZs) defined by both static (buildings) and dynamic (people) obstacles, a low-cost and replicable solution was implemented in four main steps: 3D offline path planning using the A* algorithm and Digital Elevation Models; human detection in UAV imagery using the YOLO11m model; estimation of the person’s 3D coordinates using Monoplotting; and real-time path replanning experiments. During flight execution, imagery acquired by the UAV is transmitted to a server and, if a person is detected, path replanning is performed. The replanned route is then sent to the UAV controller to be executed via an SDK-based application. For flights at reduced speeds, the proposed method demonstrated feasibility in a computational environment (replanning time of 2.79 s). Simulated flight execution using the DJI Mobile SDK was successful. However, when relying on data transmission over Wi-Fi, the replanning duration on a local server (17.96 s) remained unsuitable for real-time operations. As future work, alternative solutions should be explored to ensure real-time processing. Despite the challenges, this study contributes by validating the open and free DJI MSDK application for path execution in a simulated environment, integrated with a listener application.

9:45am - 10:00am
PC2Model: ISPRS benchmark on 3D point cloud to model registration
Technische Universität Braunschweig, Institute of Geodesy and Photogrammetry, Germany; Department of Infrastructure Engineering, University of Melbourne, Australia; Civil & Construction Engineering, Oregon State University, USA

Point cloud registration involves aligning one point cloud with another or with a three-dimensional (3D) model, enabling the integration of multimodal data into a unified representation. This is essential in applications such as construction monitoring, autonomous driving, robotics, and virtual or augmented reality (VR/AR). With the increasing accessibility of point cloud acquisition technologies, such as Light Detection and Ranging (LiDAR) and structured light scanning, along with recent advances in deep learning, the research focus has increasingly shifted towards downstream tasks, particularly point cloud-to-model (PC2Model) registration. While data-driven methods aim to automate this process, they struggle with sparsity, noise, clutter, and occlusions in real-world scans, which limit their performance. To address these challenges, this paper introduces the PC2Model benchmark, a publicly available dataset designed to support the training and evaluation of both classical and data-driven methods. Developed under the leadership of ICWG II/Ib, the PC2Model benchmark adopts a hybrid design that combines simulated point clouds with, in some cases, real-world scans and their corresponding 3D models. Simulated data provide precise ground truth and controlled conditions, while real-world data introduce sensor and environmental artefacts. This design supports robust training and evaluation across domains and enables the systematic analysis of model transferability from simulated to real-world scenarios. The dataset is publicly accessible at: https://zenodo.org/records/17581812.
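
As a rough illustration of how a registration result could be scored against the precise ground truth that simulated data provide, the sketch below computes rotation and translation errors between a ground-truth and an estimated rigid transform. These two metrics and the helper names (`rot_z`, `rotation_error_deg`, `translation_error`) are assumptions made for illustration; the abstract does not state the benchmark's evaluation protocol.

```python
import math


def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose3(a):
    """Transpose a 3x3 matrix given as nested lists."""
    return [[a[j][i] for j in range(3)] for i in range(3)]


def rot_z(deg):
    """Rotation matrix for a rotation of `deg` degrees about the z-axis."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rotation_error_deg(r_gt, r_est):
    """Angle (degrees) of the relative rotation R_gt^T * R_est."""
    r_rel = matmul3(transpose3(r_gt), r_est)
    trace = r_rel[0][0] + r_rel[1][1] + r_rel[2][2]
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))


def translation_error(t_gt, t_est):
    """Euclidean distance between two translation vectors."""
    return math.dist(t_gt, t_est)


# Hypothetical ground-truth and estimated poses (illustrative values only).
rot_err = rotation_error_deg(rot_z(30.0), rot_z(32.0))
trans_err = translation_error((0.0, 0.0, 0.0), (3.0, 4.0, 0.0))
print(f"rotation error: {rot_err:.3f} deg, translation error: {trans_err:.3f}")
```

Separating the two errors keeps rotational and translational misalignment distinguishable, which may matter when comparing behaviour on simulated and real-world scans.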

