Conference Agenda
WG II/2C: Point Cloud Generation and Processing
Session Topics: Point Cloud Generation and Processing (WG II/2)
External Resource: http://www.commission2.isprs.org/wg2
Presentations
8:30am - 8:45am
Differentiable deep consistency for point cloud registration
Technion - Israel Institute of Technology, Israel
Point cloud registration is a key facilitator for scan alignment in mapping, autonomous driving, and robotic applications. Current pipelines increasingly adopt neural-based paradigms, where most research focuses on learning view-consistent descriptors for correspondence matching. Due to outliers, matching is typically followed by a geometric verification phase that assesses correspondences by enforcing distance or angular consistency to support transformation estimation. Although effective, this verification stage scales quadratically, creating a computational bottleneck that hampers efficient registration. More importantly, since matching and verification are usually optimized separately, the verification stage cannot guide the learned descriptors or foster their geometric awareness. To address both limitations, we introduce a novel end-to-end neural registration framework that unifies correspondence learning and verification within a single differentiable formulation. Specifically, we propose a new consistency-driven cross-attention module that dynamically correlates cross-scan neighborhoods to suppress inconsistent matches and reinforce inter-scan feature coherence. In doing so, it produces robust and discriminative descriptors without incurring the quadratic cost of explicit pairwise verification. Our formulation is readily applicable, and we demonstrate its seamless integration into the GeoTransformer and RoITr state-of-the-art architectures without additional supervision or post-processing. Results show that our method excels in challenging low-overlap scenarios, where competing methods often yield few correct correspondences or fail entirely. It consistently achieves superior inlier ratios and the lowest registration errors on 3DMatch, 3DLoMatch, and KITTI, improving registration recall by up to 2.6%. Beyond accuracy, it converges faster during training and achieves the quickest inference among state-of-the-art methods.
8:45am - 9:00am
Cross-source Point Cloud Registration in the Bird’s-eye Domain: Aligning Street-level LiDAR with High-resolution Aerial Orthoimagery
1Kakao Mobility, Republic of Korea; 2University of Seoul, Republic of Korea; 3Yonsei University, Republic of Korea
Combining terrestrial Mobile Mapping System (MMS) point clouds with aerial photogrammetric data offers a practical route to comprehensive 3D urban models that integrate street-level geometric detail with wide-area coverage. However, direct 3D-to-3D registration between these data sources often fails because of large differences in viewpoint, point density, scale, and scene composition. This study presents an orthoimage-based registration framework that reformulates cross-source alignment in the Bird's-Eye-View (BEV) domain. After removing transient objects and extracting ground-level points from the MMS cloud, the data are rasterised into a synthetic orthoimage aligned in resolution and projection with a geo-referenced Unmanned Aerial Vehicle (UAV) orthoimage. A learned dense matcher establishes image correspondences, which are geometrically verified and lifted to 3D for coarse alignment, followed by tile-wise point-to-plane Iterative Closest Point (ICP) refinement and global trajectory regularisation via robust factor-graph optimisation. The aligned MMS and UAV point clouds are then integrated through reliability-driven voxel-level fusion. Experiments on a 3.7 km urban corridor in Seoul demonstrate that the proposed framework achieves a 3D root-mean-square error of 6.19 cm, indicating that BEV-domain orthoimage matching combined with local 3D refinement and trajectory regularisation provides a viable approach for large-scale MMS-UAV registration in dense urban environments.
9:00am - 9:15am
Automated Alignment Enhancement of Backpack Image-LiDAR Data in a Forest Environment
Purdue University, United States of America
In recent years, backpack mobile mapping systems (MMS) have shown great promise for under-canopy forest mapping. These systems integrate cameras, LiDAR sensors, and Global Navigation Satellite System/Inertial Navigation System (GNSS/INS) units to provide multi-modal geospatial data essential for modern forest applications that require both geometric and spectral information. However, transportation logistics and improper handling can degrade the system calibration. Moreover, canopy-induced GNSS signal outages cause trajectory errors. The resulting misalignments between the image and LiDAR data necessitate image–LiDAR registration. Such algorithms can be broadly classified as 2D–3D, 3D–3D, or 2D–2D, depending on the domain in which image–LiDAR features are identified. Due to the inherent modality differences, 2D–3D methods often struggle with feature matching. These methods typically require manual feature selection (Habib et al., 2005) or the availability of prominent features in urban environments (Liao et al., 2023). In contrast, 3D–3D methods rely on generating 3D image point clouds, which imposes strict requirements on image overlap (Yang et al., 2015). Although 2D–2D approaches are less demanding on image data (Hu et al., 2023), none have been applied in under-canopy forests, where establishing multi-modal correspondences remains challenging. To overcome these limitations, this study introduces a post-processing framework for automated image–LiDAR alignment enhancement for backpack MMS in forest environments. This method utilizes a 2D–2D image–LiDAR registration approach based on semantic tree-trunk features.
9:15am - 9:30am
A Marker-based Method for precise 3D Registration between CT-Data and photogrammetric Datasets
1TU Dresden, Germany; 2HTW Dresden, Germany
In order to enable photogrammetric tracking of objects from a computed tomography (CT) dataset with a multi-camera system, a transformation between the CT data space and a photogrammetric reference frame is required, typically based on control points. To achieve a robust and precise registration between CT and photogrammetric datasets, this work proposes a marker-based approach. The main goal is to use a marker model that allows straightforward segmentation and control point estimation in CT voxel space, while also supporting reliable and precise control point estimation in the photogrammetric images. As a proof-of-concept, spherical markers were investigated, since they allow centre estimation in both domains. In the CT data, marker centres were determined by intensity-based thresholding followed by sphere fitting, while in the photogrammetric data they were estimated by intensity-based thresholding, edge detection, circle fitting, and multi-image spatial intersection. Two different marker models were tested. The results show that the proposed method is feasible and yields sub-millimetre standard deviations of unit weight for both marker types. However, since a sufficient stochastic model is not yet available, the reported accuracy measures may be optimistic and should therefore be interpreted with caution. Future work will address these limitations, in particular uncertainty modelling as well as remaining lighting and contrast issues.
9:30am - 9:45am
Advances in Historical Aerial Image Analysis: Boosting SfM Pipelines with Learned Models
1University of Zurich, Switzerland; 2University of Magallanes; 3University of British Columbia
Scanned aerial images acquired with film cameras over the past century (hereafter referred to as historical images) are a unique source for deriving Digital Elevation Models (DEMs) and orthoimages to reconstruct the Earth's past surface and quantify long-term changes, from glaciers to landscapes and urban development. The Historical Structure-from-Motion (HSfM) pipeline (Knuth et al., 2023) currently represents the state of the art for fully automatic generation of these historical DEMs. However, it struggles with inconsistent image quality, distortions, and distinct geometries and, above all, relies on the commercial software Metashape. Therefore, we aim to: (1) develop a fully open-source solution in the COLMAP environment, (2) integrate learned models in different SfM steps to better handle the complex properties that come with historical imagery, and (3) compare our output against HSfM. Our work is based on 180 historical aerial images acquired above the challenging terrain of Gran Campo Nevado Glacier. The results show that our photogrammetric workflow leads to a 0.26 px smaller mean reprojection error as well as roughly nine times more tie points for the sparse point cloud compared to HSfM. The mean DEM difference with a reference DEM on stable terrain and the 95%-quantile DEM difference are also smaller in our experiments (0.71 m vs. 10.10 m and 73.62 m vs. 99.03 m). Further tests of our workflow include employing alternative models for feature extraction, matching, and dense reconstruction as well as evaluating multitemporal approaches (as adopted in Knuth et al., 2023) to enable a more representative comparison.
9:45am - 10:00am
Trinocular Multi-Object 3D Reconstruction in Camera-Simulating virtual Environments for Knee Arthroplasty
1Jade University of Applied Sciences, Institute for Applied Photogrammetry and Geoinformatics, Oldenburg, Germany; 2Jade University of Applied Sciences, Institute for Technical Assistive Systems, Oldenburg, Germany
In knee arthroplasty, computer-assisted navigation enhances the accuracy of prosthesis placement. However, current methods rely on invasively drilled locators to track the knee position during surgery, prolonging the healing process. For this reason, research is focused on markerless approaches capable of determining knee orientation and transferring preoperative planning into the surgical environment. This work presents a trinocular multi-object 3D reconstruction system designed for intraoperative acquisition of the knee surface, providing a foundation for markerless navigation. Due to the scarcity of real surgical data with ground truth, a synthetic dataset was created using Blender to simulate optical image acquisition of a virtual knee model under controlled camera and lighting conditions. The dataset enables a systematic evaluation of how camera motion and viewpoint affect pose estimation and 3D reconstruction accuracy. The results demonstrate that moderate camera deflections between 15° and 25° achieve the best balance between accurate camera pose estimation and surface reconstruction quality. The work confirms the potential of trinocular SLAM for robust bone surface tracking while also identifying the limitations of synthetic data, such as the absence of real-world visual variability. These results form the basis for future work on 3D reconstruction during dynamic knee movements and their tracking, as well as on the integration of markerless optical navigation systems into surgery.

