Conference Agenda

Overview and details of the sessions of this conference. Please select a date or location to show only sessions on that day or at that location. Please select a single session for a detailed view (with abstracts and downloads if available).

Daily Overview

Date: Tuesday, 07-July-2026

8:30am - 10:00am

WG II/2B: Point Cloud Generation and Processing
Location: 713A

8:30am - 8:45am

Multi-Source Fusion of Roof Skeletons, LiDAR and Street-View Imagery for Semi-Automated LoD-2 Building Modelling

Vaibhav Rajan¹, Sander Münster¹, Jonas Bruschke¹, Ferdinand Maiwald²

¹Digital Humanities, Friedrich-Schiller-Universität Jena, Germany; ²Chair of Optical 3D-Metrology, TUD Dresden University of Technology, Germany

LoD-2 building models are more informative and practically more useful than LoD-1 representations because they capture the roof structure that defines the essential three-dimensional form of a building. They are important for applications such as urban planning, environmental simulation, and digital heritage. Although recent roof shape extraction methods can derive vectorised 2D roof structures from very-high-resolution imagery, transforming these image-based representations into fully textured 3D buildings remains challenging. In this paper, we present a semi-automated LoD-2 reconstruction pipeline that integrates HEAT-derived roof geometry with airborne LiDAR, satellite and Google Street View imagery. The 2D outputs are reprojected into map coordinates, fused with LiDAR through a two-stage roof reconstruction strategy to derive roof shapes and combined with an adaptive, LiDAR-based ground base initialisation to create a complete 3D wireframe. Roofs are textured using VHR orthophotos while the walls are textured via a process of Street View panorama selection, geometric filtering, Mask2Former segmentation, and homography rectification. Across a large-scale evaluation on 1000 buildings, the proposed two-stage reconstruction strategy improves geometric agreement with the LiDAR reference data achieving a roof-surface RMSE of 0.445 m. The wall texturing process produces convincing facades when suitable panoramas are available. While minor challenges such as sensitivities to LiDAR outliers, incomplete roof geometry, and facade occlusions persist, this pipeline effectively bridges 2D roof parsing and textured LoD-2 model generation, providing a robust and scalable foundation for advancing toward fully automated workflows.

8:45am - 9:00am

BIM-to-Labelled Point Cloud: Automated Point Cloud Annotation from BIM Models using Bounding Boxes and Solid Geometry

Saad Boudarbala¹,², Tania Landes², Hélène Macher², Thibault Bavoux¹

¹Futurmap Lyon, France; ²INSA-Strasbourg, France

This paper presents an automated framework for generating semantically labelled building point clouds from their corresponding BIM models. The proposed methodology aims to facilitate the creation of training datasets for deep learning–based indoor semantic segmentation. Two complementary labelling strategies are introduced. The first relies on bounding boxes (BBX) extracted from BIM elements to efficiently assign labels to points based on volumetric inclusion. The second approach uses solid geometry and a nearest-neighbour principle (SG-NN) to compute distances between BIM object meshes and the point cloud, enabling a more precise spatial correspondence. In addition, a room-based geometric grouping strategy is proposed to structure the annotated point clouds into spatial units compatible with common indoor segmentation datasets. The methods are evaluated through a qualitative analysis on several real building datasets of different typologies and acquisition conditions, as well as through a quantitative evaluation based on a manually segmented reference point cloud. Results show that the SG-NN approach achieves higher performance, with an average Recall of 92% and IoU of 88%, compared to a Recall of 87% and an IoU of 78% for the BBX approach. While the BBX approach provides faster processing, the SG-NN strategy achieves higher labelling accuracy, particularly for geometrically complex elements. The proposed workflow enables scalable dataset generation from Scan-to-BIM projects while significantly reducing manual annotation effort.
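As a rough illustration of labelling by volumetric inclusion, the following minimal sketch assigns each point the label of the first axis-aligned box that contains it. The box representation, the element names, and the function names are assumptions made for this example only; the abstract does not specify how the authors extract or test their bounding boxes.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float, float]
# Axis-aligned box given as (min corner, max corner); a hypothetical representation.
Box = Tuple[Point, Point]


def inside(point: Point, box: Box) -> bool:
    """Return True if the point lies within the box (bounds inclusive)."""
    lo, hi = box
    return all(lo[i] <= point[i] <= hi[i] for i in range(3))


def label_points(points: List[Point], boxes: Dict[str, Box]) -> List[Optional[str]]:
    """Give each point the label of the first box containing it, else None."""
    labels: List[Optional[str]] = []
    for p in points:
        label = next((name for name, box in boxes.items() if inside(p, box)), None)
        labels.append(label)
    return labels


# Illustrative boxes and points (made-up coordinates in metres).
boxes = {
    "wall": ((0.0, 0.0, 0.0), (0.2, 5.0, 3.0)),
    "floor": ((0.0, 0.0, -0.3), (4.0, 5.0, 0.0)),
}
points = [(0.1, 2.0, 1.5), (2.0, 2.0, -0.1), (3.0, 3.0, 2.0)]
labels = label_points(points, boxes)
```

Points that fall in no box stay unlabelled (`None`), and overlapping boxes are resolved by dictionary order, which is one plausible reason a box-based scheme can be less precise than a distance-to-mesh scheme for complex elements.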

9:00am - 9:15am

Enhanced SegNet-based Building Extraction Framework via Image Segmentation and Point Cloud Fusion

Chi Tien Nguyen, Dinh Minh Bui, Somin Han, Changjae Kim

Department of Civil Engineering and Environment, College of Engineering, Myongji University

This paper presents an enhanced building extraction framework that combines deep learning-based image segmentation with photogrammetric point cloud refinement for urban roof detection. The method first applies a modified SegNet model to orthophotos from the ISPRS Vaihingen dataset to generate initial building masks. These results are then refined using geometric information from point clouds through ground filtering, clustering, and normal-guided region growing. By integrating spectral information from imagery with structural cues from 3D data, the proposed framework improves roof boundary delineation and reduces spurious detections. Experimental results on Areas 35 and 37 show that the method achieves strong overall performance, with a precision of 0.96, recall of 0.81, IoU of 0.78, and F1-score of 0.88. The findings indicate that point cloud refinement helps produce cleaner and more reliable building objects than image-based segmentation alone, especially in complex urban scenes. However, the approach remains sensitive to the density and quality of the point cloud. Overall, the study demonstrates that fusing orthophoto segmentation with point cloud processing is an effective strategy for more accurate and geometrically consistent building extraction.
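The reported metrics can be cross-checked with the standard definitions: F1 is the harmonic mean of precision and recall, and IoU follows from the same true-positive, false-positive, and false-negative counts via 1/IoU = 1/P + 1/R - 1. The short sketch below applies these identities to the rounded values quoted in the abstract, so small rounding differences should be expected; the function names are chosen for this example.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def iou_from_pr(precision: float, recall: float) -> float:
    """IoU = TP / (TP + FP + FN), rewritten in terms of precision and recall."""
    return precision * recall / (precision + recall - precision * recall)


# Rounded values as quoted in the abstract.
precision, recall = 0.96, 0.81
f1 = f1_score(precision, recall)
iou = iou_from_pr(precision, recall)
print(round(f1, 2), round(iou, 2))
```

With these inputs both derived values round to the figures stated in the abstract (F1 of 0.88 and IoU of 0.78).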

9:15am - 9:30am

Application Of Multi-Source Photogrammetric Data For Fast Building Inventory

Anna Fryskowska-Skibniewska, Patryk Wróblewski, Klaudia Pasternak, Julia Gotowiec

Military University of Technology, Poland

The rapid expansion of urban areas and the continuous demand for their monitoring make remote sensing data a highly valuable tool for collecting large volumes of geospatial information in a relatively short time and with high repeatability. The main objective of this paper is to examine the potential offered by different types of geospatial data, as well as the relationships based on their scope, in comparison with measured reference data.

Architectural inventory tasks are useful not only for engineering projects but also for broader applications, such as environmental impact assessments, spatial planning, and related fields. This article introduces a rapid and cost-effective mixed-mode data collection framework for building inventory development, integrating terrestrial laser scanning, UAV imagery, and traditional ground measurements.

The paper will discuss the latest measurement technologies and their practical applications in building surveying, illustrated with a selected case study. The criteria for selecting appropriate measurement methods will also be analyzed, depending on the investor’s requirements and the intended use of the documentation.

This paper presents a set of techniques for updating the geometric information of buildings using laser scanning and imagery. It begins with an introduction to the fundamental concepts, terminology, and principles of 3D information. Subsequently, various measurement techniques are described, along with a discussion of potential sources of error and data incompleteness. The extracted geometric values are validated against independent survey data.

9:30am - 9:45am

Conjugate Feature-Guided Dense Stereo Matching for High-Precision Attribute-Enriched Urban Point Clouds

Yung-Ching Yang, Jen-Jer Jaw

National Taiwan University, Taiwan

Accurate 3D reconstruction of urban scenes from multi-view images is essential for city planning, digital twins, and autonomous navigation. Traditional dense image matching relies on low-level cues such as intensity or gradients, which often produce noisy or incomplete point clouds in complex urban environments. This study introduces an attribute-enriched dense matching framework that embeds both geometric features and semantic attributes from multi-view images to guide dense image matching.

The framework first extracts semantic labels and geometric feature correspondences to generate intermediate products: conjugate features, feature seeds, an attribute map, and an initialized disparity map. These elements provide reliable priors that constrain dense matching, reduce search ranges, and prevent mismatches across structural boundaries. Dense image matching then propagates these constraints, producing an attribute-enriched disparity map and point cloud in which each 3D point carries both geometric and semantic information.

Evaluated on urban datasets, the proposed approach improves corner and edge localization, enhances edge continuity, reduces outliers in low-texture areas, and preserves semantic and structural attributes throughout 3D scene reconstruction. By integrating feature-based initialization with attribute-enriched dense image matching, the method delivers more accurate, interpretable, and robust 3D urban reconstructions, supporting downstream tasks such as precise measurement, object recognition, and scene analysis.

9:45am - 10:00am

Efficient Extraction and Specification-Compliant Optimization of Railway Alignment Parameters from UAV LiDAR Point Clouds

Zhaochen Han, Xuming Ge, Min Chen, Han Hu

Faculty of Geosciences and Engineering, Southwest Jiaotong University

The rapid acquisition of high-precision parametric railway alignment is a fundamental prerequisite for intelligent railway construction and maintenance. Traditional measurement techniques and alignment fitting methods heavily rely on manual operations, often resulting in inefficiency, high costs, and insufficient accuracy control. To address these challenges, this study proposes an automated method for extracting and optimizing railway alignment from UAV-based LiDAR point clouds. Initially, track centerlines are extracted by leveraging the geometric smoothness of the railway and the structural characteristics of the track. A multi-constraint energy model integrating distance, orientation, and curvature is constructed to fit the geometric parameters of alignment elements, thereby providing high-quality initial values for subsequent alignment engineering parameter optimization. Finally, a global optimization strategy based on the simulated annealing algorithm is applied to jointly refine the engineering parameters of the standardized alignment composition, ensuring strict compliance with railway design specification. Experimental results demonstrate that the proposed method can efficiently and robustly extract high-precision alignment parameters with well-defined engineering semantics from complex railway point clouds, thereby providing reliable technical support for intelligent construction and full lifecycle management of railway systems.

8:30am - 10:00am

WG III/1I: Remote Sensing Data Processing and Understanding
Location: 713B

8:30am - 8:45am

OG-TPTV: A texture-preserving regularizer for hyperspectral image denoising

Zhangping Wu, Mi Wang

Wuhan University, China

Hyperspectral images (HSIs) are often severely degraded by mixed noise, such as Gaussian, stripe, and impulse noise during acquisition and transmission, which seriously impedes their subsequent applications. Therefore, HSI denoising is both crucial and challenging. In this work, we present a gradient-domain outlier-guided texture-preserved total variation (OG-TPTV) regularizer designed to remove mixed noise in HSIs. First, we utilize the mode-3 low-rank property of HSI gradient maps along the spectral dimension and apply a low-rank decomposition model to extract their spatial representation coefficients (SRCs). To improve the sparsity characterization of SRCs in the gradient subspace, an outlier-guided strategy is introduced. Specifically, we perform outlier detection on gradient maps to distinguish noise from texture structures and remove outliers to generate precise texture weighting maps. The resulting texture weight maps offer adaptive guidance for adjusting the strength of the sparsity constraints. Finally, a denoising method for HSIs is developed based on OG-TPTV. Extensive experiments on both synthetic and real HSIs demonstrate the superior denoising performance of our method.

8:45am - 9:00am

SpectralNet-X: Transformer-based Lossy Compression for Hyperspectral Satellite Data

Jannik Sheikh¹, Wolfgang Groß¹, Jannick Kuester¹, Andreas Michel¹, Martin Weinmann²

¹Fraunhofer IOSB, Germany; ²Karlsruhe Institute of Technology (KIT)

Hyperspectral satellite missions generate massive data volumes that are difficult to transmit and store under tight onboard resource constraints, making effective lossy compression a key enabling technology. We propose SpectralNet-X, a transformer-based autoencoder for spectral-only compression of spaceborne hyperspectral imagery at a fixed compression ratio of 16. The encoder maps each spectrum to a low-dimensional latent code using a 1D convolutional projection followed by stacked self-attention layers with rotary position embeddings, and aggregates information via cross-attention pooling. The decoder reconstructs full-band spectra through an upsampling stack and per-band affine calibration. To improve reconstruction fidelity and generalization, SpectralNet-X is first pretrained in a masked-signal reconstruction task inspired by SimMIM and then fine-tuned with a mixed objective that combines mean-squared error and spectral angle mapper (SAM) terms using a scheduled weighting scheme. We evaluate SpectralNet-X on the large-scale HySpecNet-11k benchmark and in a mission-realistic cross-sensor setting, where models trained on HySpecNet-11k are tested on PRISMA hyperspectral scenes. Across PSNR, SSIM, and SAM, and when compared to three different compression autoencoders, SpectralNet-X achieves the lowest angular reconstruction errors while maintaining competitive distortion metrics and substantially reducing the fraction of spectra with large SAM outliers. These results indicate that transformer-based spectral compression is a promising candidate for robust, mission-realistic onboard hyperspectral data reduction.
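To make the mixed objective concrete, here is a minimal per-spectrum sketch that adds a spectral angle term to the mean-squared error, with the angle term's weight ramped up over training. The linear ramp, the parameter names, and the additive combination are assumptions for illustration; the abstract only states that a scheduled weighting scheme is used and does not give its form.

```python
import math
from typing import Sequence


def mse(x: Sequence[float], y: Sequence[float]) -> float:
    """Mean-squared error between a spectrum and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


def spectral_angle(x: Sequence[float], y: Sequence[float]) -> float:
    """Angle in radians between two non-zero spectra (the SAM measure)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    cosine = max(-1.0, min(1.0, dot / (norm_x * norm_y)))  # guard against rounding
    return math.acos(cosine)


def sam_weight(step: int, total_steps: int, w_max: float = 1.0) -> float:
    """Hypothetical schedule: linear ramp from 0 to w_max over training."""
    return w_max * min(1.0, step / total_steps)


def mixed_loss(x: Sequence[float], y: Sequence[float], step: int, total_steps: int) -> float:
    """MSE plus the scheduled SAM term for one spectrum."""
    return mse(x, y) + sam_weight(step, total_steps) * spectral_angle(x, y)
```

The spectral angle is insensitive to a uniform scaling of the spectrum, which MSE penalises, so combining the two constrains both the magnitude and the shape of the reconstruction.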

9:00am - 9:15am

Sensitivity of Deep Learning Validation to Spatial Scale–Sample Size Interactions in Hyperspectral Imaging

Yanfang Sun¹,², Yuan Jiang¹,², Yongze Song³, Rui Qu⁴

¹College of Civil Engineering, Taiyuan University of Technology, Taiyuan, China; ²Shanxi Key Laboratory of Civil Engineering Disaster Prevention and Control, Taiyuan, China; ³School of Design and the Built Environment, Curtin University, Perth, Australia; ⁴School of Computer Science and Technology, Aba Teachers College, Aba Zhou

Validating the performance of deep learning models in satellite imagery is essential for ensuring model generalizability, decision reliability, and spatial transferability, particularly in the context of hyperspectral images, which contain high-dimensional, spatially complex data. While it is well recognized that multiple spatial characteristics influence deep learning model performance, few studies have systematically examined how the interactions among these characteristics affect model validation sensitivity in hyperspectral contexts. This study aims to investigate how the interaction between spatial scale (e.g., surrounding 3, 5, 7 grids) and training sample size (e.g., 10%, 30%, 50% of all data) influences the validation accuracy and sensitivity of deep learning models. An innovative validation sensitivity index is developed to quantify the change in accuracy per unit of spatial scale and sample size, enabling a more refined assessment of model robustness. The index is applied to three representative hyperspectral datasets, covering diverse environmental and spectral conditions. Results show that spatial scale accounts for 0–21.0% accuracy variation, training sample size contributes 5.6–36.5% variation, but their interaction leads to 5.4–70.3% variation, indicating a nonlinear amplification effect. These findings may be explained by the compounded influence of data contextuality, spatial redundancy, and model overfitting dynamics. This study demonstrates the critical need to consider spatial interactions in validation design, offering new insights for enhancing the reliability of geospatial artificial intelligence (GeoAI) applications in remote sensing and spatial data science.

9:15am - 9:30am

Assessment of RTM-induced Surface Reflectance Differences between 6SV and VLIDORT under a Single Atmospheric-correction Framework

Seungwon Kim¹, Suyoung Sim¹, Jongho Woo¹, Sungwoo Park¹, Seungkyoo Lee¹, Chaeyun Kim¹, Huiji Yu¹, Kyung-Soo Han²

¹Division of Earth Environmental Science (Major of Spatial Information Engineering), Pukyong National University, Republic of Korea; ²Professor, Division of Earth Environmental Science (Major of Spatial Information Engineering), Pukyong National University, Republic of Korea

Surface reflectance is a foundational variable in optical remote sensing, as inaccuracies introduced during atmospheric correction can propagate and amplify across subsequent satellite-derived products. Nonetheless, the extent to which the choice of Radiative Transfer Model (RTM) affects reflectance retrieval has not been sufficiently examined. This study investigates how two widely used RTMs—6SV and VLIDORT—produce different surface reflectance outcomes when applied under consistent atmospheric and geometric conditions for the GEO-KOMPSAT-2B/GEMS instrument. To ensure comparability, both models were driven by identical GEMS aerosol properties and an equivalent LUT configuration.

The comparison shows that while the two RTMs reproduce broadly similar spatial patterns, systematic quantitative differences remain in the retrieved reflectance. These differences vary depending on atmospheric and viewing conditions, particularly under higher aerosol loading. A sensitivity analysis further indicates that aerosol amount and scattering characteristics, alongside viewing geometry, are key factors influencing the magnitude of RTM divergence.

Overall, this study provides a structured assessment of RTM-dependent variability in atmospheric correction and highlights the importance of model choice when interpreting or harmonizing surface reflectance products. The findings offer a basis for improving consistency in future GEMS-based retrievals and for advancing reliable surface reflectance generation in geostationary remote sensing.

9:30am - 9:45am

Attention-driven Cross-modal Self-supervised Learning for Label-efficient Hyperspectral-LiDAR DSM Classification

Jonathan Gonzalez Santiago¹, Wolfgang Gross¹, Karsten Schulz¹, Wolfgang Middelmann¹, Uwe Soergel²

¹Fraunhofer IOSB, Germany; ²Institute for Photogrammetry and Geoinformatics (ifp), University of Stuttgart, Germany

Remote sensing acquisition systems rely on a range of platforms, from drones to satellite missions, to record multimodal Earth surface data. This fact encourages the preparation of datasets with complementary properties, thereby increasing their discriminative potential. A common complementary combination is between Hyperspectral and LiDAR-generated digital surface model data. While engaging, this fusion poses challenges for specific applications. Multiple works fuse these modalities at the feature level using vector concatenation, maximization, or averaging. Although functional, these methods omit target interactions between the modalities. Another challenge in remote sensing is the quantity and quality of labels required by deep learning methods, which are expensive, error-prone, and difficult to scale. We address the challenges above by proposing a self-supervised processing framework based on cross-modal attention that effectively fuses features at multiple levels, thereby exploiting complementary information across data streams. Specifically, our method is founded on a pseudo-Siamese network that reweights each modality’s features with information from the other via a mirrored cross-modal attention. The network’s objective is to maximize the similarity between the feature representations of both streams. A fusion network builds a latent representation using the learned encoders and attention modules. Then, a k-Nearest Neighbor classifier categorizes each sample within the representation using ten labels per class. Our experiments show that our spatial- and channel-spatial cross-modal attention approaches outperform well-established fusion methods for label-efficient land cover classification across datasets. Our findings lay the groundwork for fusion methods that effectively exploit inter-stream data relationships to encourage complementarity.

9:45am - 10:00am

GAN-based pan-to-rgb Image Translation for remote sensing Data

Xiaowei He¹, Yingzi Xiong¹, Xiao Ling¹, Qinghong Sheng¹, Xiao Xu²

¹Nanjing University of Aeronautics and Astronautics, China, People's Republic of; ²Yangtze Delta Region Institute of Intelligent Sensing (Nantong)

Despite the rapid development of satellite sensors, acquiring high-resolution RGB images remains a challenge. In this paper, a GAN-based multiscale feature-based pan-to-rgb model is proposed to establish a novel framework for high-resolution, high-fidelity RGB image generation from remote sensing panchromatic images. The spatial structure, texture, and color of the results are consistent with the real images, and the colors are naturally realistic and vibrant. Multiscale features and symmetric luminance color decoders are utilized to overcome color desaturation, inaccuracy, and distortion in conventional algorithms. By combining CNNs for local feature modeling and transformers for global feature modeling, this approach learns pan-to-rgb mappings to produce high-resolution, high-fidelity RGB images in CIELAB space. In addition, the luminance distance loss and the color distance loss are utilized to prevent the coupling of luminance and color. We also conducted experimental validation on Gaofen-7 satellite data, and the results demonstrated that the FID, CF, and ΔCF indicators of the proposed algorithm improved by 2.90%, 11.77%, and 64.51%, respectively, compared to the comparison algorithms.

8:30am - 10:00am

WG I/6B: Orientation, Calibration and Validation of Sensors
Location: 714A

8:30am - 8:45am

Evaluation and performance assessment of a novel UAV-borne laser scanner system

Gottfried Mandlburger¹, Elisabeth Ötsch¹, Philipp Knopf²

¹TU Wien, Department of Geodesy and Geoinformation, Austria; ²Knopfhoch GmbH, Austria

Miniaturized UAV laser scanning systems have advanced rapidly over the past decade, especially in the low-cost sector. DJI entered this field with the Zenmuse L-series, integrating GNSS/INS with compact scanners. While the first-generation L1 showed moderate precision, the L2 improved notably through reduced beam divergence. In November 2025, DJI released the Zenmuse L3. In this contribution, we assess its performance.

The main upgrade from L2 to L3 lies in the LiDAR unit: L3 uses a single 1535 nm laser instead of multiple 905 nm diodes, offers a symmetric 0.25 mrad beam divergence, and supports pulse repetition rates from 350 kHz to 2 MHz. High PRR operation is limited to altitudes ≤50 m due to missing multiple-time-around resolution. Scan modes include linear, non-repetitive, and a new star-shaped pattern.

L2 and L3 were tested at three sites in Lower Austria covering a warehouse, power-lines, and forests. Flights were conducted at 80 m AGL (350 kHz) and, for the warehouse, 50 m AGL (2 MHz). Precision, strip consistency, point density, feature separability, and vegetation penetration were evaluated using the scientific software OPALS.

L3 data showed sharper edges, reduced noise, and higher separability, yielding spline-fit residuals of 0.9 cm versus 2.6 cm for L2 for reconstructing a double-threaded power-line. Ground point coverage in forests increased from 18 % (L2) to 51 % (L3). Strip height differences are around 2 cm for both sensors and L3 achieved sub-centimeter precision on sealed surfaces. Overall, L3 offers substantial gains in spatial resolution, precision, and vegetation penetration.

8:45am - 9:00am

Geometric and radiometric Calibration of a rotating multi-beam Lidar using a rotating tilted Platform

Heikki Hyyti, Matias Mäki-Leppilampi, Harri Kaartinen

Finnish Geospatial Research Institute FGI, Finland

Intrinsic calibration of rotating multi-beam lidars (RMBL) enables more precise measurements. We calibrated our sensor to improve its geometric and radiometric accuracy using a rotating tilted platform. The rotating mechanism widens the field of view of each lidar channel and allows all lasers of the sensor to measure the same areas in a room containing planar wall and floor sections. Therefore, we can collect measurements for geometric and radiometric calibration with a minimal number of calibration targets. Furthermore, we used data-based numerical minimization to estimate the calibration parameters for all 128 lidar channels in our RMBL sensor. For the intrinsic geometric calibration of the sensor, we estimated the elevation and azimuth angles of each laser. For the radiometry, we estimated a linear model for each laser to correct the intensity measurement. For a linear model, two different known diffuse reflectance targets are sufficient for the radiometric calibration. We tested our methods in two different environments: an office room and a longer corridor. We showed that the methods can improve the precision of the RMBL sensor significantly. Regarding geometry, we were able to reduce the error on average from 16.1 mm to 15.1 mm (6.2% improvement). For radiometry, we were able to improve the reflectance measuring accuracy on average from 9.5% error down to -0.9% error (91% improvement).
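The statement that two known reflectance targets suffice for a linear model can be seen from a simple two-point fit: two (intensity, reflectance) pairs determine a gain and an offset exactly. The sketch below uses made-up intensity and reflectance values and hypothetical function names; it does not reproduce the authors' per-channel estimation, which relies on numerical minimization over the collected data.

```python
from typing import Tuple


def fit_linear(i1: float, r1: float, i2: float, r2: float) -> Tuple[float, float]:
    """Solve reflectance = gain * intensity + offset from two reference targets."""
    if i1 == i2:
        raise ValueError("the two targets must give different raw intensities")
    gain = (r2 - r1) / (i2 - i1)
    offset = r1 - gain * i1
    return gain, offset


def correct(intensity: float, gain: float, offset: float) -> float:
    """Map a raw intensity reading to calibrated reflectance."""
    return gain * intensity + offset


# Illustrative values only: a dark target (10 % reflectance) read as 40
# and a bright target (90 % reflectance) read as 200 by one laser channel.
gain, offset = fit_linear(40.0, 0.10, 200.0, 0.90)
mid_reflectance = correct(120.0, gain, offset)
```

In a multi-beam sensor such a fit would be repeated per laser, giving each channel its own gain and offset.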

9:00am - 9:15am

Tightly-coupled joint Adjustment of static and kinematic Laser Scanning Data

Florian Pöppl, Philipp Amon, Nikolaus Studnicka, Martin Pfennigbauer, Andreas Ullrich

RIEGL Laser Measurement Systems GmbH, Austria

In recent years, laser scanning has evolved into a core surveying technology for 3D mapping, both statically from stationary scan positions (terrestrial laser scanning, TLS) and kinematically from moving platforms (kinematic laser scanning, KLS). Consequently, there is a growing demand for methods that efficiently and coherently support both static and kinematic data acquisition modes. This contribution presents a tightly-coupled approach for the co-registration of TLS and KLS data, which simultaneously integrates GNSS positions, inertial measurements, planar features extracted from both static and kinematic point clouds, and control information in a joint non-linear least-squares adjustment. This is neither just a transformation of the kinematic onto the static point cloud nor a simple correction of the trajectory in e.g., a strip adjustment, but rather a tightly coupled adjustment of static and kinematic data. This approach avoids the need for additional survey control for kinematic data by leveraging the static scan data as a proxy, enabling accurate georeferencing even in scenarios where the individual datasets cannot be reliably tied to control points. Results show that the co-registration notably improves the relative consistency of kinematic datasets with respect to a static reference. Such co-registration enables new use-cases for multi-modal data acquisition, such as change-detection in repeated kinematic data acquisitions with respect to a static reference dataset, or more flexible ways of integrating ground control in kinematic surveys.

9:15am - 9:30am

Position and Orientation from Asynchronous Lidar in GNSS Denied Environments

Craig Glennie, Francisco Haces-Garcia

University of Houston, United States of America

This study investigates the use of a distributed asynchronous lidar system for augmented position and orientation determination in Global Navigation Satellite Systems (GNSS) denied environments. An asynchronous lidar design is one in which the laser transmitter and detectors/receivers are disconnected and carried on separate platforms. This unique geometry offers observational redundancy that can be used to estimate the trajectory of the receiver platforms. The paper presents the results of simulation experiments, first examining single-epoch solutions and then considering estimates of position and orientation along simulated flight trajectories. The results show that as long as the laser transmitter is operated above the GNSS denied environment, the system is able to simultaneously estimate position and orientation for multiple receiver drones, even for extended periods of GNSS outages. The accuracy of position and orientation estimation is dependent on the exact flight path and the number of lidar receivers in the solution, but with favorable geometry the accuracy of position estimation can approach that provided by a high precision GNSS solution.

9:30am - 9:45am

Extraction of Image-to-Lidar Correspondences and their Impact on Optimal Sensor Fusion

Kyriaki Mouzakidou, Aurélien Arnaud Brun, Jan Skaloud

Earth Sensing & Observation Laboratory (ESO), Ecole Polytechnique Fédérale de Lausanne (EPFL), Switzerland

This work extends our initial proof-of-concept via emulations on the benefits of relative spatial constraints between imagery and lidar point clouds in a factor graph based optimization with satellite positioning (GNSS) and raw inertial readings (Mouzakidou et al., 2025). Here, we demonstrate practically the automatic extraction and integration of 2D-3D correspondences established in the 3D domain within rough natural terrain flown over by an aircraft with sensors of high quality. We show that considering cross-domain (i.e. 2D-3D) constraints enables the calibration of internal camera parameters and its boresight on job, i.e. within mapping flight configurations, where conventional approaches fail. The common optimization of raw IMU data with such constraints improves the respective agreements between the lidar and image dense clouds, achieving consistency at ground resolution level, which is not the case for the conventional (standard) processing of acquired data.

9:45am - 10:00am

GNSS-Constrained Motion Estimation for Robust Visual-Inertial-Odometry Initialization

Chunqi Dai, Sagi Filin

Technion - Israel Institute of Technology, Haifa, Israel

Visual-inertial odometry (VIO) plays a key role in modern navigation and mapping systems. For their successful integration, an initialization phase, in which IMU-related bias factors are estimated, becomes a fundamental step. Without one, the subsequent nonlinear estimation of the platform pose may fail to converge or completely diverge. As reliance on visual and inertial information may exhibit instability due to error accumulation with time, incorporating absolute positioning information through global navigation satellite system (GNSS) measurements may enhance its robustness and accuracy. Accordingly, GNSS and visual-inertial initialization frameworks have been receiving growing attention in recent years, where current strategies tend to follow a loosely-coupled formulation that first initializes the VIO trajectory and then aligns it with GNSS measurements. Such strategies are multi-stage, nonlinear, and computationally expensive, motivating us to introduce an alternative framework in which GNSS position is integrated with the raw visual-inertial measurements to form absolute translation constraints. Accordingly, we achieve a closed-form, linear, and globally consistent drift-free solution which is computationally efficient and requires neither 3D reconstruction nor nonlinear refinement, as common approaches do. Tested on benchmark multi-sensor datasets, our initialization formulation outperforms current baselines while exhibiting robustness in challenging scenarios.

8:30am - 10:00am

WG IV/9B: Spatially Enabled Urban and Regional Digital Twins
Location: 714B

8:30am - 8:45am

A BIM and LLM Framework for Automated Construction and Demolition Waste Management

Nolan Porther, Amirhossein Nourbakhshrezaei, Mojgan Jadidi

Lassonde School of Engineering, York University, Canada

Artificial Intelligence (AI) integration has become an essential part of modern AEC workflows, yet it has failed to gain a position in waste management. This gap is particularly prominent given the urgent environmental and legal imperatives for the sector to mitigate its demolition outputs. Existing approaches to waste classification and diversion cost estimation rely on manual interpretation of project documentation, a process that is both resource-intensive and structurally incompatible with the machine-readable data environments established by Building Information Modelling (BIM). This paper presents a framework that bridges Industry Foundation Classes (IFC) compliant BIM data and Large Language Model (LLM) capabilities to automate Construction and Demolition Waste (C&DW) classification and probabilistic cost optimisation. The framework utilizes IfcOpenShell to extract element geometry and material data, channeling this information into a Retrieval-Augmented Generation (RAG) pipeline. To ensure rigorous compliance during classification, a FAISS-indexed knowledge base grounds a locally deployed Llama3 model against the specific mandates of Ontario Regulation 102/94 (Province of Ontario, Canada). Diversion cost scenarios are computed through a Bayesian cost module coupled to a multi-objective genetic algorithm (MOGA) optimiser. The RAG classifier of the proposed approach is evaluated against a labelled dataset of 104 IFC type-and-material combinations. Performance thresholds were established a priori based on multi-class classification benchmarks and Bayesian cost model uncertainty tolerances. The framework achieved a macro-average F1 of 0.84 and overall accuracy of 88%, satisfying the minimum criteria for automated C&DW characterization under Ontario Regulation 102/94.
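The two reported metrics, macro-average F1 and overall accuracy, can be illustrated with a minimal sketch. The class names and label lists below are hypothetical and are not taken from the 104-item evaluation set; the function name `macro_f1_and_accuracy` is likewise an assumption made for illustration.

```python
from collections import Counter


def macro_f1_and_accuracy(y_true, y_pred):
    """Return (macro-averaged F1, overall accuracy) for two equal-length label lists."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    f1_scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        f1_scores.append(2 * tp[label] / denom if denom else 0.0)
    accuracy = sum(tp.values()) / len(y_true)
    # Macro averaging weights every class equally, regardless of its frequency.
    return sum(f1_scores) / len(labels), accuracy


# Hypothetical waste classes, for illustration only.
y_true = ["concrete", "concrete", "wood", "steel", "steel", "wood"]
y_pred = ["concrete", "wood", "wood", "steel", "steel", "wood"]
macro, acc = macro_f1_and_accuracy(y_true, y_pred)
```

Because macro averaging gives rare classes the same weight as common ones, it can sit below overall accuracy when a small class is classified poorly, which is consistent with an F1 of 0.84 alongside an accuracy of 88%.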

8:45am - 9:00am

Open Data for large-scale geospecific 3D Simulation for Security Applications - A Case Study

Dirk Frommholz

German Aerospace Center (DLR), Germany

This case study details the integration of official large-scale open 2D and 3D geospatial data of the city of Berlin, Germany, into the Virtual Battlespace 4 (VBS4) simulator for security applications. Realistic scenery with elements specific to the target area is obtained from a digital terrain model, true-ortho mosaic, and high-resolution land use/land cover layer rasterized from OpenStreetMap vector primitives. For the central Mitte borough with its government institutions and foreign embassies, almost 20000 buildings are prepared from textured CityGML data in an automatic multi-stage process. This process involves pre-wrapping the texture images, which are referenced by the semantic 3D models using non-canonical coordinates, and the rapid creation of compact atlases to reduce the bitmap count by three orders of magnitude. To ensure that the building meshes blend seamlessly into the terrain, vertical adjustment methods are discussed, and ground extrusion is implemented to approach the model's base surfaces from below. Data import into VBS4 happens through its Geo interface for the terrain, ortho, and land cover, while the buildings are compiled into an add-on with a custom workflow that involves reprojection, collision component setup, and damage behavior configuration. During interactive convoy training in the virtual environment, a high recognition value compared to the real landscape could be attested visually. Simulation exhibited acceptable frame rates, but required considerable computing resources.

9:00am - 9:15am

An Adaptive Digital Twin Framework Based on Online Learning for Smart Water Management in Campus Buildings

Gopika Rajan, Songnian Li

Toronto Metropolitan University, Canada

Water scarcity and increasing demand have made sustainable water management a global priority, reflected in UN SDG 6, which emphasizes water-use efficiency and reducing water scarcity. Smart Water Management (SWM) has emerged as an advanced, data-driven approach that leverages ICT and IoT systems to monitor, analyze, and optimize water use. Digital Twin (DT) technology enhances SWM by creating dynamic virtual replicas of physical systems to support predictive analytics and operational intelligence. While DTs are widely used in large-scale Water Distribution Networks, these implementations typically do not require detailed 3D modelling.

Campus-scale water systems present unique challenges due to the integration of external and interior water networks, variable building functions, and the need for detailed spatial representation. This study proposes a comprehensive DT framework for Smart Water Management at Toronto Metropolitan University. It integrates BIM, GIS, sensor data, and graph-based modelling to capture 3D interior utilities and enable real-time monitoring, hydraulic simulation, and network analysis. The framework adopts Tao et al.’s five-layer DT architecture and introduces the IFCGraph Model, which combines IFC multipatch geometry with a Neo4j knowledge graph for enhanced interoperability and topological analysis. Overall, the framework supports operational intelligence, proactive management, and scalable campus-level water system optimization.

9:15am - 9:30am

An OGC standards-based Urban Digital Twin platform supporting co-creation of Positive Energy Districts: Case study of the Nordbahnhof district in Stuttgart, Germany

Rushikesh Padsala¹,³, Amando Reber², Christina Simon-Philipp², Volker Coors¹

¹Centre for Geodesy and Geoinformatics, Stuttgart Technical University of Applied Sciences (HFT Stuttgart), Stuttgart, Germany; ²Centre for Sustainable Urban Development, Stuttgart Technical University of Applied Sciences (HFT Stuttgart), Stuttgart, Germany; ³Department of Building, Civil, and Environmental Engineering, Concordia University, 1515 St. Catherine St. West, Montreal, QC, H3G 2W1, Canada

Urban Digital Twins (UDTs) are increasingly recognized as enablers of evidence-based planning and citizen engagement. While the involvement of civil society in planning the built environment is well established, its role and motivation in advancing the clean energy transition remain largely unexplored. This paper presents the development and application of an Open Geospatial Consortium (OGC) standards-based UDT platform for the co-creation of Positive Energy Districts (PEDs), as demonstrated through the Nordbahnhof district case study in Stuttgart. The platform integrates interoperable 3D city and energy data using CityGML 2.0 with its Energy ADE 3.0 extension, both compliant with OGC standards to ensure semantic consistency and cross-domain interoperability. SimStadt energy simulation results are stored in the Energy ADE schema within a PostgreSQL/3DCityDB database. These data are published through an OGC Web Feature Service (WFS), while 3D city geometries are served as 3D Tiles. In the CesiumJS web-viewer, both services are linked via GML identifiers, enabling coordinated interaction between geometry and energy data for real-time visualization of the district-scale energy balance. The platform was tested with citizens, who learned about load profiles, photovoltaic (PV) potential, and energy efficiency while acting as “district energy planners.” Their stated willingness to adopt PV and/or modify energy-use behavior was translated into slider inputs to visualize real-time energy-balance outcomes through the platform. Results demonstrate the potential of interoperable, OGC-compliant UDTs to connect data providers, planners, and citizens in a shared decision-support environment. The architecture’s open, modular design enables wider replication, promoting scalability and long-term municipal adoption for participatory energy-transition planning.

9:30am - 9:45am

Developing BIM-Based Data Analytics Dashboards for Sustainable Construction and Demolition Waste Management and Environmental Evaluation

Sara Abbasian, Mojgan Jadidi

Department of Civil Engineering, Lassonde School of Engineering, York University, Canada

Building Information Modeling (BIM) is increasingly mandated worldwide as part of the digital transformation of the construction industry. While widely used in design and construction, its potential for managing construction and demolition waste (C&DW) remains underexplored, despite demolition accounting for 70–90% of building-related waste and 30–40% of global solid waste. Revit models provide rich data but are computationally intensive and require specialist expertise, limiting their direct use for waste quantification and sustainability evaluation. This study develops a BIM-enabled data integration and visualization framework that automates waste estimation, material classification, and environmental evaluation by linking BIM data with heterogeneous datasets through Speckle connectors and Power BI dashboards. Supplementary datasets included material densities, expansion coefficients, recycling rates, and environmental factors such as CO₂ emissions and energy intensities. A case study of York University’s Bergeron Centre illustrates the framework’s effectiveness across three demolition stages. The non-invasive dismantling phase highlighted significant opportunities for material recovery, while semi-invasive deconstruction captured recyclable structural components with moderate landfill requirements. The final core demolition stage revealed the greatest potential for recycling, particularly in concrete and steel, though it also underscored the challenges of diverting large volumes of residual waste from disposal. By integrating BIM with environmental datasets and interactive dashboards, the system delivered holistic insights into recovery, landfill diversion, and CO₂ reduction. Findings confirm its scalability, accessibility, and value as a decision-support tool for sustainable demolition and circular economy objectives.

9:45am - 10:00am

Urban Intervention Effects on Land Surface Temperature: A Prototype EO-Based Simulation Framework for Urban Digital Twin Applications

Daniele Oxoli, Alberto Vavassori, Maria Antonia Brovelli

Dept. of Civil and Environmental Engineering, Politecnico di Milano, Milano, Italy

This contribution presents a prototype Earth Observation-based simulation framework to assess how large-scale urban interventions affect Land Surface Temperature (LST). Focusing on the Metropolitan City of Milan (Northern Italy), the framework integrates thermal (Landsat 8/9) and multispectral (Sentinel-2) satellite imagery with Local Climate Zone (LCZ) maps, urban morphology and material fraction layers. Random Forest regression models are trained to predict seasonal LST patterns. A simulation module, based on raster algebra, enables scenario testing by modifying predictor layers to reflect planned urban transformations, generating corresponding LST responses. The framework is conceived for integration into Urban Digital Twin platforms to support “what-if” scenario analyses for climate-resilient urban planning and adaptation.

8:30am - 10:00am

WG III/8H: Remote Sensing for Agricultural and Natural Ecosystems
Location: 715A

8:30am - 8:45am

Integrating multi-source remote sensing and soil attributes through ensemble learning for large-scale soil organic carbon estimation

Jayantrao Diliprao Mohite¹, Suryakant Ashok Sawant¹, Danielle Berard², Sonali Kulkarni¹, Ankur Pandit¹, Dineshkumar Singh¹

¹Tata Consultancy Services, India; ²EMILI, Manitoba, Canada

Accurate estimation of Soil Organic Carbon (SOC) is essential for sustainable land management, agricultural productivity, and climate change mitigation. This study presents a novel framework for SOC estimation using machine learning models and diverse predictors, including spectral bands, vegetation and soil indices, topographical features, soil texture components, and HSV-derived soil color proxies. SOC data from 180 samples collected between 2007 and 2020 across 21 fields in Manitoba, Canada, were used for model training and validation. Landsat 5, 7, and 8 data were utilized to extract spectral and soil indices, while SoilGrids and SRTM DEM provided texture and topographical features. Random Forest (RF), Extreme Gradient Boosting (XGB), and a BC-VW-based ensemble model were evaluated across five feature scenarios. The ensemble model achieved the highest accuracy, with an R² of 0.57, RMSE of 0.25, and RMSPE of 7.87%, outperforming individual models. SHAP-based feature selection identified Clay%, SWIR1, and Value (HSV) as the most critical predictors. Independent validation using data from 2021 and 2023 confirmed the model's robustness, with RMSPE values of 10.93% and 12.83%, respectively. This study demonstrates the importance of integrating soil-specific indices, texture, and color features with ensemble modeling to improve SOC predictions. The framework offers a scalable and reliable approach for large-scale SOC monitoring, contributing to sustainable agriculture and carbon sequestration efforts. The findings underscore the need for robust uncertainty analysis and independent validation, setting a benchmark for future SOC modeling studies.

8:45am - 9:00am

Leveraging Post-Rainfall Spectral Proxies and Multi-Sensor Imagery to Refine Soil Salinity Maps in Dryland Environments

Jamal-Eddine OUZEMOU¹,², Ahmed LAAMRANI¹,²,³,⁴, Ali EL BATTAY²,⁵,⁶, Joann Whalen¹,², Abdelghani Chehbouni¹,²

¹Center for Remote Sensing Applications (CRSA), Mohammed VI Polytechnic University (UM6P), Ben Guerir 43150, Morocco; ²College of Agriculture and Environmental Sciences (CAES), UM6P, Ben Guerir 43150, Morocco; ³Department of Geography, Environment & Geomatics, University of Guelph, Guelph, ON N1G 2W1, Canada; ⁴Institut de Recherche sur les Forêts (IRF), Université du Québec (UQAT), Rouyn-Noranda, Québec, Canada; ⁵Center for Sustainable Soil Sciences (C3S), UM6P, Ben Guerir 43150, Morocco; ⁶Department of Natural Resource Sciences, McGill University, Ste-Anne-de-Bellevue, Québec, Canada

Soil salinization is a major form of land degradation in drylands, where closed hydrological systems, shallow water tables, and strong evaporative demand favor the recurrent buildup of salts at the surface. Accurate and spatially explicit salinity assessment is crucial for guiding agricultural management and land rehabilitation, yet conventional soil sampling remains spatially restrictive and most remote-sensing approaches insufficiently capture the hydrological and pedological processes that drive seasonal salt redistribution. This study evaluates whether post-rainfall spectral information can improve soil salinity mapping in a large endorheic depression in central Morocco (Sehb El Masjoune). A dataset of 121 ECe-measured topsoil samples was combined with multi-sensor optical imagery from Sentinel-2, Landsat-9, and PlanetScope. In addition to standard salinity, soil, vegetation, and moisture indices, two new post-rainfall predictors were developed: a Depression Proxy (DP), delineating moisture-retentive micro-depressions where salts accumulate, and a Soil Cluster Proxy (SCP), capturing soil textural and compositional contrasts from spectral responses. These predictors were integrated into Random Forest and Gradient Boosting Regressor models and evaluated using repeated cross-validation on Box–Cox-transformed ECe. The combination of DP and SCP with Sentinel-2 predictors yielded the highest performance (R² = 0.92; RMSE = 20.53 dS·m⁻¹), outperforming models relying only on spectral indices and topographic covariates. Seasonal salinity maps revealed strong intra-annual dynamics associated with rainfall events and subsequent evaporative concentration. The proposed DP–SCP framework offers transferable, physically interpretable predictors for dryland salinity assessment and provides a scalable step toward process-informed remote-sensing approaches supporting climate-resilient land-use planning.

9:00am - 9:15am

Enhancing Soil Nitrogen Mapping Using Reconstructed Water Vapor Bands in PRISMA Hyperspectral Imagery

Nadir El Bouanani¹,², Ahmed Laamrani¹,³, Jamal-Eddine Ouzemou¹, Adam Bouhnidira², Hamd Ait Abdelali², Mohamed Bourriz¹,²,⁴, Francois Bourzeix², Abdelghani Chehbouni¹

¹CRSA, Mohammed VI Polytechnic University (UM6P), Campus Ben Guerir 43150, Morocco; ²Analytic Laboratory (Alab), UM6P, Campus Rabat 11103, Morocco; ³Department of Geography, Environment & Geomatics, University of Guelph, Guelph, ON N1G 2W1, Canada; ⁴Friedrich Schiller University Jena, Department of Geography, Jena 07743, Germany

Soil total nitrogen (TN) is a critical nutrient for sustainable agricultural management, yet large-scale mapping remains constrained by high laboratory analysis costs. Spaceborne hyperspectral remote sensing offers a promising alternative, but its effectiveness is limited by spectral gaps caused by atmospheric water-vapor absorption in nitrogen-sensitive NIR and SWIR regions. This study evaluates the contribution of reconstructing missing spectral domains to improve soil TN estimation from PRISMA hyperspectral imagery. A spectral gap-filling framework combining a conditional generative adversarial network (cGAN) with a self-supervised masked autoencoder pretraining strategy was developed to reconstruct reflectance spectra across water-vapor absorption intervals (950–990 nm, 1320–1500 nm, and 1780–2050 nm), achieving R² = 0.95 on PRISMA test data and R² = 0.91 against ASD FieldSpec III measurements. Applied to 1,037 samples across three Moroccan agricultural regions, incorporating reconstructed bands consistently improved TN prediction: R² increased from 0.83 to 0.89 in Al Haouz, 0.73 to 0.79 in Doukkala, with R² = 0.73 in Khouribga. Feature-selection analyses identified reconstructed water-vapor bands among the most informative predictors (1050–1450 nm, 1800–2100 nm, and 2300–2400 nm). These findings demonstrate that spectral gap filling enhances spaceborne hyperspectral data usability for operational soil TN monitoring and precision agriculture.

9:15am - 9:30am

Evaluation of a High-Resolution L-Band RPAS-Mounted Sensor for Soil Moisture Estimation

Hunter Rusk¹, Aaron Berg¹, Jaison Thomas Ambadan¹, Alexander McLaren¹, Laxhman Ramsahoi¹, Maik Wolleben²

¹University of Guelph, Canada; ²Skaha Labs, Canada

This study investigates the performance of a novel L-band passive microwave radiometer mounted on a Remotely Piloted Aerial System (RPAS) for high-resolution soil moisture retrieval. Soil moisture is a critical variable for predicting crop stress, scheduling field operations, and optimizing irrigation, yet traditional measurement approaches have limitations. Satellite radiometers provide broad spatial coverage but coarse resolution, while in situ sensors offer high accuracy with limited spatial representativeness. RPAS-based sensing offers an intermediate solution, enabling fine-scale mapping with flexible deployment. The sensor evaluated in this research, developed by Skaha Remote Sensing Ltd., measures brightness temperature (Tb) at 1.4 GHz, a frequency where soil emissivity varies strongly with moisture content.

Field campaigns were conducted from May to October 2025 at the Elora Research Station in Ontario, with weekly flights over plots containing different crops and tillage conditions. Concurrent ground measurements of soil moisture, leaf area index (LAI), and vegetation water content (VWC) supported evaluation of vegetation impacts. Statistical analyses, including Pearson correlation and linear regression, revealed the relationships between microwave emissions, soil moisture, and vegetation properties.

Results show a strong inverse relationship between microwave emissions and soil moisture, with vertically polarized signals exhibiting the highest sensitivity. Vegetation effects were crop-dependent due to the unique canopy structures. These findings demonstrate that RPAS-mounted radiometers can provide reliable, high-resolution soil moisture measurements and highlight the importance of crop geometry in interpreting microwave observations.

9:30am - 9:45am

Unmasking drought dynamics: a physically interpretable GMM-MST framework for high-resolution diagnostic monitoring

Jinhui Hu¹, Qiuwen Zhang², Changtao Deng³

¹Huazhong University of Science and Technology - Main Campus; ²Huazhong University of Science and Technology - Main Campus; ³Pearl River Water Resources Research Institute

Drought represents one of the most devastating natural hazards, causing billions in economic losses and threatening global food security. Conventional single-variable drought indices often fail to capture drought's multifaceted nature, while existing composite indices are frequently constrained by linear assumptions or operate as 'black boxes,' obscuring physical drivers. This study introduces the State-Space Gradient Drought Index (SSGDI), developed via a novel Gaussian Mixture Model–Minimum Spanning Tree (GMM–MST) framework that re-conceptualizes drought as a trajectory within a physical system. By modeling a 3D state-space composed of the Standardized Precipitation Index (SPI), Standardized Soil Moisture Index (SSMI), and Standardized Runoff Index (SRI) with a Gaussian Mixture Model (GMM), the framework learns distinct hydro-climatic archetypes; a Minimum Spanning Tree (MST) then imposes physically plausible connections among these archetypes to define the principal wet-to-dry gradient. The final SSGDI is derived from a data point's probabilistic position along this gradient and is complemented by a classification system that diagnoses the drought's physical type. Applied to the Central China Triangle, the framework successfully uncovered the hydro-climatic system's intrinsic, non-linear structure. Validation showed the SSGDI provides a significantly more robust measure, with SSGDI-6 achieving a spatially-averaged Pearson correlation of r = 0.80 against the PDSI benchmark—a marked improvement over any single component. The SSGDI framework bridges robust statistical aggregation with clear physical interpretation, offering a powerful tool that provides not just a severity score but a diagnostic narrative for proactive drought management.
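The minimum-spanning-tree step described above can be sketched as follows. This is only an illustration of the general idea, not the authors' implementation: the archetype centres are hypothetical values in (SPI, SSMI, SRI) space, the GMM fitting that would produce them is omitted, and Prim's algorithm with Euclidean distances is an assumed choice.

```python
import math


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph over `points`.

    Returns the tree as a list of (i, j) index pairs, in the order the
    edges were added, starting from point 0.
    """
    n = len(points)
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        # Cheapest edge leaving the current tree.
        _, i, j = min(
            (math.dist(points[i], points[j]), i, j)
            for i in in_tree
            for j in range(n)
            if j not in in_tree
        )
        edges.append((i, j))
        in_tree.add(j)
    return edges


# Hypothetical archetype centres (SPI, SSMI, SRI), ordered here from wet to dry.
centres = [
    (1.5, 1.2, 1.0),
    (0.2, 0.1, 0.0),
    (-1.0, -0.8, -0.5),
    (-2.0, -1.9, -1.6),
]
edges = minimum_spanning_tree(centres)
```

For these made-up centres the tree is a simple chain linking each archetype to its nearest drier neighbour, which is the kind of path along which a wet-to-dry gradient could be defined.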

9:45am - 10:00am

Applications of Coherent Fine Resolution Synthetic Aperture Radar Imagery for Mid-Season Corn Yield Prediction

Abbey Papadimitriou¹, John Lindsay¹, Aaron Berg¹, Qiaoping Zhang²

¹University of Guelph, Canada; ²ICEYE Oy, Finland

Synthetic Aperture Radar (SAR) has become a popular form of remotely sensed data for agricultural management due to its ability to acquire cloud-free images at extremely high temporal resolutions. A particularly useful product that can be derived from SAR imagery is coherence, which visualizes structural target changes over time based on phase decorrelation. In a crop management context, coherence is largely unexplored. This is in part due to the fine resolution image requirements that field-scale vegetation monitoring demands. Within agricultural fields, high image coherence should correlate with areas of minimal to no crop growth, whereas low image coherence should correlate with areas where crops are consistently growing. Based upon this hypothesis, our research investigates how a time series of ICEYE fine spatial resolution X-band SAR imagery can be used to detect low yielding regions within corn fields using coherent change detection.
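For reference, the coherence mentioned above is commonly estimated from two co-registered complex SAR images over a local window of N pixels. The symbols below (s₁, s₂ for the complex pixel values, N for the window size) are notation introduced here for illustration and are not taken from the abstract.

```latex
\hat{\gamma} =
\frac{\left| \sum_{k=1}^{N} s_{1,k}\, s_{2,k}^{*} \right|}
     {\sqrt{\sum_{k=1}^{N} \left| s_{1,k} \right|^{2} \; \sum_{k=1}^{N} \left| s_{2,k} \right|^{2}}},
\qquad 0 \le \hat{\gamma} \le 1
```

Values near 1 indicate that the scene changed little between acquisitions, while values near 0 indicate strong phase decorrelation, for example from a growing canopy.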

8:30am - 10:00am

WG IV/1C: Spatial Data Representation and Interoperability
Location: 715B

8:30am - 8:45am

Hierarchical Polygon-to-Point Collapsing for Multi-Scale Representation Based on the Straight Skeleton and Dual Half-Edge Data Structure

Amin Gholami¹, Pawel Boguslawski¹, Martijn Meijers²

¹Wroclaw University of Environmental and Life Sciences, Institute of Geodesy and Geoinformatics, Grunwaldzka 53, 50-357 Wroclaw, Poland; ²GIS Technology, Faculty of Architecture and the Built Environment, Delft University of Technology, Julianalaan 134, 2628 BL Delft, The Netherlands

This paper presents a hierarchical method for collapsing a polygon to point within a structured multi-scale representation. The approach is based on the straight skeleton, which drives the shrinking process through event-based transformations such as edge and split events. These events define how the polygon changes during collapse and produce a hierarchy of intermediate geometric states between the initial polygon and the final point.

The resulting hierarchy is integrated into a Dual Half-Edge (DHE) structure, where the primal space represents successive geometric states and the dual space represents the hierarchical relations between them. This produces a connected 2D+1D representation in which the third dimension corresponds to scale rather than physical height. The resulting model is interpreted as a LoD Transition Space (LTS), allowing the full polygon-to-point transition to be represented continuously across scale.

The proposed framework contributes to model-based multi-scale representation by explicitly linking geometric transformation, topological change, and hierarchical structure within a unified representation. In addition to its relevance for vario-scale cartography and generalisation, the method also has potential applicability in domains where gradual geometric transformation is required, such as procedural modeling, animation, and related geometric applications.

8:45am - 9:00am

The Research on Renewal Theory and Method for the CGCS2000 Reference Framework

Zhihao Jiang, Ju Bai, Hao Yu, Yunlu Peng, Linlin Che, He Zhang

National Geomatics Center of China

The CGCS2000 (China Geodetic Coordinate System 2000) reference framework, which has been in use since July 1, 2008, is based on the ITRF97 reference framework and only meets the requirements of regional applications within China. With the sustained development of China's economy and society, and the globalization of the applications of the BeiDou navigation satellite system (BDS), there is a need to establish a global CGCS2000 reference framework. This paper studies a mathematical method for constructing a global CGCS2000 reference framework and analyses the theory and algorithm of a two-step method based on inner constraints. The constraint conditions of the coordinate reference are redefined according to the minimum standard of frame transition parameters and rate variation. As a result, the adjusted network achieves the highest degree of fit to the shape of the initial network and maintains the inherent purity of the coordinate network obtained with different observation technologies. These results can improve the basic theory of terrestrial reference framework determination and provide scientific methods for the globalization of the CGCS2000.

9:00am - 9:15am

Open Source 3D Cadastre Visualisation Pipeline

Pavan Sai Goud Goddu, Sisi Zlatanova, Mohsen Kalantari

University of New South Wales, Australia

Interpreting multi-storey property rights is difficult when information is scattered across 2D plans and text or locked inside desktop projects. We present a web-based pathway that communicates strata lots and common property consistently across levels in a standard browser. Aligned with the 3D Cadastral Survey Data Model and Exchange (3D CSDM) of Australia, we propose an open-source, web-first approach. The method couples a lightweight browser viewer (level/tenure filters, plan overlay, search, readable legend) with an explicit conversion step that standardises common GIS inputs into a fixed core JSON profile, with limited official CSDM-aligned JSON-LD hooks applied only to selected keys that have exact matches in the published vocabularies. Using a New South Wales case study, we evaluated the viewer against ISO 9241-11 criteria (effectiveness, efficiency). Across repeated trials (cache disabled/enabled), mean page-open times were 0.60 s (Chrome) and 1.48 s (Edge); interaction averaged 50–60 FPS; level filters applied in 40–55 ms; all five tasks succeeded. Practically, this delivers fast, consistent 3D communication of lots and common property without installs, lowering access barriers for agencies and owners while aligning with 3D CSDM’s web-first direction. Next, we will finalise parity between Upload-and-View and the Reference Viewer and add a light in-viewer validation panel.

9:15am - 9:30am

Shadow Geometric Analysis Utilising CityGML Models and FME

Pawel Boguslawski¹, Malgorzata Jarzabek-Rychard¹, Stanislaw Biernat²

¹Wroclaw University of Environmental and Life Sciences, Poland; ²infoSolutions Sp. z o.o.

This research presents a methodology for conducting shadow geometric analysis, specifically of the shadow boundary in an urban model. Input data include a georeferenced CityGML LoD2 model and a terrain model. Additional land cover data is used to exclude some parts of the model from analysis. Shadow computation is based on a sunray vector, which is computed from the sun position on the given day and time. The geometry of the original models is divided into parts classified as either exposed to the sun or shaded. By accurately identifying sunlit and shaded areas within 3D city models, the result can be used for analytical purposes in other applications, such as urban planning, energy assessment, and photovoltaic potential analysis. The analysis is performed in the FME software package, which is a general-purpose ETL tool.
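A minimal sketch of the sunray-vector idea follows. It is written in plain Python, not as an FME workflow, and assumes the sun position is already available as azimuth (clockwise from north) and elevation angles in a local east-north-up frame. The function names are hypothetical. The test covers only whether a surface is turned towards the sun; shadows cast by other buildings or terrain would need an additional ray intersection step.

```python
import math


def sun_ray_vector(azimuth_deg, elevation_deg):
    """Unit vector pointing from the ground towards the sun (east, north, up).

    Azimuth is measured clockwise from north, elevation above the horizon.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    return (
        math.sin(az) * math.cos(el),  # east component
        math.cos(az) * math.cos(el),  # north component
        math.sin(el),                 # up component
    )


def faces_sun(normal, to_sun):
    """True if the outward surface normal has a positive component towards the sun."""
    return sum(n * s for n, s in zip(normal, to_sun)) > 0.0


# Sun due south, 45 degrees above the horizon.
to_sun = sun_ray_vector(180.0, 45.0)
south_wall_lit = faces_sun((0.0, -1.0, 0.0), to_sun)
north_wall_lit = faces_sun((0.0, 1.0, 0.0), to_sun)
flat_roof_lit = faces_sun((0.0, 0.0, 1.0), to_sun)
```

With the sun due south, the south-facing wall and the flat roof are turned towards it and the north-facing wall is turned away, so only the first two can be sunlit.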

9:30am - 9:45am

Software Development for Producing Texture Images Mapped on a Building Surface of a 3D City Model Using Aerial Images

Ryuji Matsuoka, Masato Ishikawa, Tomoaki Inazawa, Yoshihiko Nakanishi, Futa Kawamata, Masahito Takada, Takuya Danjo

Kokusai Kogyo Co., Ltd., Japan

It is desirable that a 3D city model at level of detail 2 (LOD2) has texture images mapped on building surfaces. Owing to the cost of image collection, aerial images are currently the most practical source for texture mapping. Although aerial oblique images provide higher-resolution texture images, their use suffers from a major issue of occlusion. Accordingly, we develop software for texture mapping to a 3D city model using aerial nadir and oblique images, aiming to minimize the impact of occlusion. The software, designed for use in ordinary operation, automatically detects occlusions on building surfaces within images by utilizing the geometry of a 3D city model and automatically selects appropriate oblique and nadir images for texture mapping. The major feature of the developed software is its ability to process a building surface grid cell by grid cell. The validation experiment results confirm the software's satisfactory performance in practice. Moreover, the experiment results indicate that the performance of the software depends on the ability of a 3D city model to represent buildings. Since we have recognized that it would be effective if each pixel of a texture image had its own resolution, we plan to modify the software so that each pixel can have its own resolution.

9:45am - 10:00am

Automatic detection and condition assessment of agricultural plastic greenhouses using deep learning and aerial RGB images

Davoud Omarzadeh¹, Mehran Alizadeh Pirbasti², Hamed Bahrevar³, Hoda Khalaghi⁴, Gavin McArdle², Bahram Salehi⁵

¹Institut d’Estudis Espacials de Catalunya (IEEC), Barcelona, Spain; ²School of Computer Science, University College Dublin, Dublin, Ireland; ³University of Tabriz, East Azerbaijan, Iran; ⁴Universitat Autònoma de Barcelona, Barcelona, Spain; ⁵State University of New York College of Environmental Science and Forestry (SUNY ESF), Department of Environmental Resources Engineering, Syracuse, USA

Rapid urbanization in developing countries such as Iran has intensified pressure on agricultural land, highlighting the need for sustainable and efficient food production systems. Agricultural Plastic Greenhouses (APGs) have become a scalable alternative by enabling year-round cultivation and optimized land utilization. However, their rapid expansion necessitates continuous monitoring to evaluate structural integrity and environmental impacts, including soil degradation, plastic waste accumulation, and water consumption. This study presents a deep learning-based framework for the automated detection and condition assessment of APGs using 0.5 m resolution Google Earth imagery across four major agricultural regions in Tehran County: Pakdasht, Qarchak, Pishva, and Varamin. The proposed pipeline integrates YOLOv11 for precise APG segmentation with a U-Net architecture employing a MobileNetV2 backbone for classifying damaged and intact structures. Out of 158,912 analyzed image tiles, 6,835 contained APGs, covering an estimated area of 18.73 km². Among these, 1,863 damaged structures were identified, predominantly located in Pakdasht and Pishva. Around 20% of the annotated greenhouses were verified on-site, improving labeling reliability, and the relatively standardized design of APGs in Iran suggests the model could generalize to similar regions, with minor fine-tuning using local samples if necessary. GIS-based spatial analysis further delineated potential plastic waste risk zones, supporting targeted environmental management. Comparison with government statistics and Sentinel-2 imagery from 2021 and 2024 revealed a continued shift toward greenhouse farming in response to declining cropland availability. The proposed framework provides a scalable and replicable tool for periodic APG monitoring, facilitating data-driven policymaking and sustainable agricultural planning.

8:30am - 10:00am

IvS1: Recent Advances in Iceberg Monitoring and Tracking
Location: 716A

8:30am - 8:45am

Ocean Target Discrimination in SAR Imagery through Machine Learning: Towards a Fully Automated Approach

Maria Yulmetova, Murilo Silva, Pradeep Bobby, Kelley Dodge

C-CORE, Canada

Accurate discrimination of ocean targets using satellite images is crucial for marine safety, environmental monitoring, dark vessel detection, and search and rescue operations. Artificial intelligence technologies are rapidly advancing as state-of-the-art solutions for computer vision problems, including satellite imagery target classification. This research assesses the capability of machine learning (ML) for ocean target discrimination using SAR images. Unlike other studies focusing on binary iceberg-ship classification, this paper goes a step further to investigate the opportunity for multi-class discrimination between icebergs, ships, and false alarms, both within and outside sea ice. The proposed approach enables the fully automated elimination of false alarms while accurately classifying icebergs and ships. As part of a research initiative, the first large dataset of ocean targets was compiled and utilized to train an ML model. The targets were detected in RADARSAT Constellation Mission (RCM) images over Canadian waters. During the evaluation phase, the model achieved classification accuracies of 93% for binary classification and 95% for three-class discrimination. The robustness of the fully automated approach was further validated through an additional test, yielding an overall accuracy of 91%. Moreover, the system exhibited high reliability in reducing false alarms, correctly identifying 96% of them. The implementation of the developed algorithms significantly enhances the efficiency of target detection and classification processes, thereby reducing the workload of human analysts. Such advancements are especially significant in light of the rapidly increasing volume of satellite data and the growing demand for automated, scalable solutions in maritime surveillance.

8:45am - 9:00am

Is Pre-Training Enough? Towards Multi-Task Foundation Models for Sea Ice Classification

Javier Noa Turnes¹, Muhammed Patel¹, Fernando J. Pena Cantu¹, Linlin Xu², David A. Clausi¹

¹University of Waterloo, Canada; ²University of Calgary, Canada

Synthetic aperture radar (SAR) is the primary data source for operational sea ice monitoring, providing coverage independent of illumination or weather conditions. However, annotation scarcity and the domain gap between sea ice and land-based scenes hinder the direct reuse of existing pretrained models. Recent studies (Allen 2023; Wang 2025) point toward self-supervised learning (SSL) as a way to leverage abundant unlabeled SAR imagery. In particular, masked autoencoders (MAE) (He et al., CVPR 2022) have shown promise in remote sensing contexts by reconstructing masked inputs and learning transferable representations. We investigate whether MAE pre-training is sufficient to yield a foundation model transferable across multiple downstream sea ice tasks: concentration (SIC), stage of development (SOD), and floe size (FLOE).

9:00am - 9:15am

Automated Iceberg Detection in RADARSAT Constellation Mission (RCM) Imagery

Abigail Dalton, Lynn Pogson, Mélanie Lacelle, Benjamin Deschamps

Environment and Climate Change Canada (Canadian Ice Service), Canada

Since the 1980s, the Canadian Ice Service (CIS) has provided iceberg information for navigation in the North Atlantic. Following the breakup of the Milne Ice Shelf on Northern Ellesmere Island in 2020 and increasing risk to ships navigating bergy waters in the Canadian Arctic Archipelago and Beaufort Sea, CIS has initiated two projects with the goal of improving their operational iceberg monitoring program. The first combines RCM imagery and in-situ observations to evaluate the applicability of existing automated detection and modelling methods for monitoring icebergs and ice islands drifting in open water in the western Arctic. The second explores the use of high-resolution RCM imagery (5M and 16M) for emergency response iceberg monitoring.

9:15am - 9:30am

Automatic Segmentation of SAR imagery Using Mixture Models

Zahra Jafari¹,², Pradeep Bobby¹,², Rocky Taylor¹, Ebrahim Karami¹

¹Memorial University of Newfoundland; ²C-CORE, Canada

Synthetic Aperture Radar (SAR) image segmentation underpins target detection, land cover classification, and environmental monitoring, yet remains challenging due to speckle, non-Gaussian backscatter statistics, and outliers. This paper presents a comparative evaluation of mixture-model-based segmentation tailored to SAR, with a focus on RADARSAT Constellation Mission (RCM) imagery. We propose a segmentation algorithm that selects one of three statistical mixture models (Rayleigh, Gamma, or Lognormal) to model SAR backscatter and produce soft (posterior) segmentations, followed by posterior thresholding and optional MRF-ICM post-processing to enhance spatial coherence and suppress speckle-induced errors. We compare against traditional threshold-based methods (CFAR, multi-threshold Otsu) and conventional mixture-model labeling that designates the largest-scale component as the target.

On RCM data, the Rayleigh Mixture Model (RMM) is the strongest: at target pixels, the posterior probability of the largest-mean component is typically very close to 1, allowing a single Rayleigh component to capture the main body of the iceberg reliably. Unlike threshold-based baselines that yield hard segmentations, our Mixture Model (MM) approach outputs soft posteriors, enabling principled HH/HV fusion and downstream machine learning (ML). These results underscore the promise of RMM for robust iceberg detection; future work will integrate Rayleigh-based posterior features with lightweight ML classifiers to further improve performance across sensors and conditions.
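
As an illustrative aside, the idea of soft (posterior) segmentation with a Rayleigh mixture can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the quantile-based initialisation, and the fixed iteration count are all hypothetical, and the MRF-ICM post-processing and HH/HV fusion mentioned in the abstract are not shown.

```python
import math
import random  # only needed to generate synthetic test amplitudes


def rayleigh_pdf(x, sigma):
    """Rayleigh density f(x; sigma) = x / sigma^2 * exp(-x^2 / (2 sigma^2))."""
    return (x / sigma ** 2) * math.exp(-x * x / (2.0 * sigma ** 2))


def posterior(x, weights, sigmas):
    """Posterior probability of each mixture component for amplitude x."""
    joint = [w * rayleigh_pdf(x, s) for w, s in zip(weights, sigmas)]
    total = sum(joint) or 1e-300  # guard against numerical underflow
    return [j / total for j in joint]


def fit_rayleigh_mixture(samples, n_components=2, n_iter=100):
    """Fit mixture weights and Rayleigh scales with plain EM.

    Initial scales are taken from sample quantiles (an assumption made
    for this sketch, not a prescribed initialisation).
    """
    ordered = sorted(samples)
    n = len(ordered)
    sigmas = [max(ordered[int((k + 0.5) * n / n_components)], 1e-6)
              for k in range(n_components)]
    weights = [1.0 / n_components] * n_components
    for _ in range(n_iter):
        # E-step: responsibilities of each component for each sample.
        resp = [posterior(x, weights, sigmas) for x in samples]
        # M-step: closed-form updates for the Rayleigh scale and the weight.
        for k in range(n_components):
            nk = max(sum(r[k] for r in resp), 1e-12)
            weights[k] = nk / n
            second_moment = sum(r[k] * x * x for r, x in zip(resp, samples))
            sigmas[k] = math.sqrt(second_moment / (2.0 * nk))
    return weights, sigmas
```

In this sketch, the component with the largest fitted scale plays the role of the target class, and `posterior(x, weights, sigmas)` returns the soft label that a later thresholding step would turn into a mask.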

9:30am - 9:45am

Cross-Sectional Morphology of Sea Ice features from IPS observations across the Newfoundland and Labrador shelf

Alka Dash¹, Rocky Taylor¹, Ian D. Turnbull²

¹Memorial University of Newfoundland, Canada; ²C-CORE, St. John's, Canada

Sea ice on the Newfoundland and Labrador shelf can create major risks for ships and offshore structures. This study uses Ice Profiling Sonar and upward-looking ADCP data from three moorings on the Northeast Newfoundland Shelf to examine the cross-sectional morphology of important sea ice features. The data were converted from time series to spatial draft profiles using measured ice drift. From these profiles, level ice, keel features, and floes were extracted and compared across the three locations.

The results show that level ice and keels form clearly different morphological populations. Keels are generally deeper, narrower, rougher, and more peaked, while level ice is wider, smoother, lower in relief, and more rectangular in cross section. Maximum draft, mean draft, width, relief range, aspect ratio, rectangularity, and roughness provide the clearest separation between the two classes. The study also examines floe size to better understand how local ice features form. Small floes contain a higher proportion of keel features, while medium, big, and vast floes are more strongly dominated by level ice, although this pattern varies by site. NENS3 shows a higher keel fraction across floe size classes than NENS2, suggesting stronger and more persistent deformation. These findings provide new regional information for sea ice characterization and ice interaction studies.
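
The conversion from a draft time series to a spatial draft profile mentioned in this abstract can be illustrated with a short sketch. It assumes that drift speed along the drift direction is available at every draft timestamp and integrates it with the trapezoidal rule; the function name and argument layout are hypothetical and not taken from the study.

```python
def time_series_to_profile(times_s, drafts_m, drift_speeds_m_s):
    """Map draft samples from time to along-drift distance.

    times_s          -- sample times in seconds (increasing)
    drafts_m         -- ice draft in metres at each sample time
    drift_speeds_m_s -- ice drift speed in m/s at each sample time

    Returns a list of (distance_m, draft_m) pairs, with the first sample
    placed at distance 0. Distance is accumulated as mean speed times the
    time step (trapezoidal integration).
    """
    if not (len(times_s) == len(drafts_m) == len(drift_speeds_m_s)):
        raise ValueError("all inputs must have the same length")
    distances = [0.0]
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        mean_speed = 0.5 * (drift_speeds_m_s[i] + drift_speeds_m_s[i - 1])
        distances.append(distances[-1] + mean_speed * dt)
    return list(zip(distances, drafts_m))
```

Because the ice rarely drifts at constant speed, the resulting profile is unevenly spaced in distance; resampling to a regular spacing before extracting widths or roughness would be a separate step not shown here.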

8:30am - 10:00am

Forum2A: The Future of Space-based Earth Observation
Location: 716B

8:30am - 10:00am

ICWG III/IVb: Remote Sensing Data Quality
Location: 717A

8:30am - 8:45am

MAPSRNet: Task-Oriented Super-Resolution Network for Building Detection in Urban Area

Yuwei Cai, Zhimeng He, Meiliu Wu, Brian Barrett

University of Glasgow, United Kingdom

High-resolution (HR) satellite imagery is essential for urban monitoring and disaster management, but its use is constrained by high cost and limited accessibility. Super-resolution (SR) offers an efficient alternative by reconstructing high-quality images from low-resolution (LR) inputs, making large-scale geospatial analysis more feasible. We propose the Multi-Attention Pyramid Super-Resolution Network (MAPSRNet), which delivers two main innovations:

1. A multi-attention model that integrates a Pyramid Vision Transformer for long-range spatial dependencies with a cross-channel Involution+ module to enhance feature interactions, generating SR images with superior structural preservation and sharper boundaries.

2. The first SR network to surpass the performance of original HR images in downstream tasks, demonstrated through building detection with a ConvNeXtV2 backbone and U-Net decoder. MAPSRNet reduces false positives and negatives and, across multiple datasets, exceeds HR performance in IoU, F1-score, and overall accuracy.

Extensive experiments on the Massachusetts building dataset, the Wuhan University building dataset, and the Waterloo building dataset confirm that MAPSRNet consistently outperforms representative SR methods in both image fidelity (PSNR, SSIM) and task-level metrics. Its ability to preserve fine structural details, suppress background noise, and learn resolution-invariant features through multi-resolution training makes the reconstructed images more task-aware than raw HR data. Beyond buildings, this flexibility suggests strong potential for generalization to other land-cover classes such as roads, vegetation, and water bodies.

These results establish MAPSRNet as a cost-effective alternative to HR acquisitions and a milestone in task-driven SR research, advancing both image reconstruction and downstream geospatial analysis.

8:45am - 9:00am

Automated Monitoring of Geolocation Consistency in Micro-satellite SAR Imagery

Angel Caroline Johnsy¹, Eyrin Kim², Qiaoping Zhang¹, Valentyn Tolpekin¹, Michael Wollersheim¹

¹ICEYE, Finland; ²Stanford University, USA

High revisit-rate SAR constellations generate large volumes of imagery that require consistent geolocation accuracy to support applications such as change detection and interferometry. However, variations in orbit determination, attitude knowledge, and external factors such as Global Navigation Satellite System (GNSS) interference can introduce geolocation errors that vary across acquisitions, making large-scale validation challenging. This study presents an automated approach to detect and quantify geolocation offsets in ICEYE SAR imagery by aligning orthorectified scenes with reference images using feature-based matching and correlation-based refinement. The method is validated against independently derived absolute geolocation measurements from corner reflector calibration sites in the United States, Canada, Australia, and Poland. Evaluation across 726 acquisitions demonstrates strong agreement with reference measurements, achieving an overall root-mean-square error (RMSE) of 1.39 m, with RMSE values of 1.18 m for Spotlight mode and 1.93 m for Stripmap mode. Operational applicability is demonstrated through large-scale acquisition campaigns, including nationwide Stripmap coverage over Japan and coherent image stack analysis. The results show that the proposed method can reliably estimate geolocation offsets, detect anomalies, and monitor geometric consistency across large SAR archives, providing a practical and scalable solution for automated geolocation quality control in micro-satellite SAR constellations.

9:00am - 9:15am

Calibrated U-Net with HELIX-Based Label Enrichment for Ageing-Aware Spatio-Temporal Urban Change Detection

Sarah Hauser¹,²,³, Stephanie Dachsberger², Andreas Schmitt²,³, Stefan Hinz¹

¹Karlsruher Institut für Technologie (KIT), Germany; ²Geoinformatics Department, Munich University of Applied Sciences (HM); ³Institute for Applications of Machine Learning and Intelligent Systems (IAMLIS)

Urbanisation and land-use change increase the demand for temporally consistent urban maps from high-resolution Earth observation imagery. A key obstacle is label ageing: benchmark annotations are often years older than current true orthophotos (TOP), causing semantic and geometric mismatches (e.g., demolished/new buildings, shifted vegetation boundaries) that degrade supervised learning, calibration, and transfer. This paper presents a probabilistic, quality-aware segmentation framework based on a compact U-Net. Legacy annotations are converted into edge-adaptive soft labels to encode boundary uncertainty. A HELIX-derived per-pixel supervision quality score Q is computed and integrated as a weight in a Q-weighted Kullback-Leibler objective with an agreement-focal component, reducing the influence of unreliable or outdated regions. Global temperature scaling is then applied to obtain calibrated per-class probability fields with comparable confidence magnitudes. Experiments on ISPRS Potsdam and Vaihingen combined with recent (2024) TOPs evaluate temporal transfer (archival supervision vs. updated imagery of the same area) and spatial transfer (cross-city application). Finally, calibrated probability fields are used to derive probabilistic semantic transitions and temporal reliability scores, supporting uncertainty-aware mapping of urban change such as construction, sealing, and vegetation loss.
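
Two ingredients named in this abstract, a quality-weighted Kullback-Leibler objective and global temperature scaling, can be sketched generically as follows. This is a simplified illustration under assumptions: the per-pixel quality values are treated as given numbers in [0, 1], the agreement-focal component is omitted, and the function names are hypothetical rather than taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax with a global temperature; T > 1 softens the distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(target, pred, eps=1e-12):
    """KL(target || pred) for two discrete class distributions."""
    return sum(t * math.log((t + eps) / (p + eps))
               for t, p in zip(target, pred) if t > 0)


def q_weighted_kl(soft_labels, logits, quality, temperature=1.0):
    """Average per-pixel KL divergence, weighted by supervision quality.

    soft_labels -- list of per-pixel target distributions
    logits      -- list of per-pixel raw class scores
    quality     -- list of per-pixel weights; 0 removes a pixel's influence
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for target, z, q in zip(soft_labels, logits, quality):
        weighted_sum += q * kl_divergence(target, softmax(z, temperature))
        weight_total += q
    return weighted_sum / weight_total if weight_total > 0 else 0.0
```

In the usual temperature-scaling recipe, the single scalar `temperature` is fitted on held-out data after training, which changes confidence magnitudes without changing the predicted class.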

9:15am - 9:30am

The survivorship bias in remote sensing

Laurent Polidori¹, Saeid Pirasteh²

¹UFPA, Brazil; ²Shaoxing University, China

Survivorship bias refers to the fact that conclusions are drawn from a non-representative sample limited to cases that have survived a selection process. This article shows that this bias affects scientific literature, which tends to select successful experiments and hide failures. Remote sensing, like other data-driven sciences, is affected by survivorship bias, making it difficult to have a clear idea of the data's and methods' actual potential and limitations. A typology of failure causes is proposed to encourage critical reading of the bibliography, and perspectives are outlined to overcome survivorship bias by improving practices within the academic and industrial remote sensing communities.

9:30am - 9:45am

A dynamically weighted framework for adaptive reference-based super-resolution

Chae-Eun Kim¹, Junhwa Chi²

¹Department of Data Engineering, Pukyong National University, Busan, Republic of Korea; ²Major of Big Data Convergence, Division of Data Information Sciences, Pukyong National University, Busan, Republic of Korea

Satellite remote sensing is inherently constrained by a fundamental spatio-temporal trade-off imposed by physical sensor limitations. Super-Resolution (SR) techniques are required to overcome these constraints and obtain high-resolution time-series data. However, Single Image Super-Resolution (SISR) provides insufficient information for robust restoration. To address this, Reference-Based Super-Resolution (Ref-SR), which utilizes a high-resolution (HR) reference (Ref) image, has been investigated. Nonetheless, Ref-SR introduces the challenge of reference misuse, stemming from the temporal mismatch (or inconsistency) between the target low-resolution (LR) image (e.g., clouds, seasonal changes) and the Ref image (often a long-term median composite). To address this reference misuse problem, this study proposes an adaptive Ref-SR framework that incorporates a similarity weight map derived from the LR and Ref information. This weight map is computed solely from the pixel-wise similarity between the LR and Ref inputs, requiring no ground truth HR, and functions as a gating mechanism. This allows the network to dynamically control Ref reliability, guiding it to suppress Ref influence in mismatched regions and leverage its textures in similar ones. Validation experiments using Sentinel-2 data (LR 240 m, Ref/HR 60 m) demonstrate that the proposed method achieves significant performance improvements over SISR in both spatial (Peak Signal-to-Noise Ratio, Structural Similarity Index) and spectral (Spectral Angle Mapper, Error Relative Global Dimensionless Synthesis) metrics. Furthermore, qualitative analysis confirms that the framework effectively suppresses artifacts caused by the blind injection of Ref textures in inconsistent areas. This framework could contribute to the future fusion and quality enhancement of heterogeneous LR sensor data, such as GOCI-II.
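
The gating role of a similarity weight map can be illustrated at pixel level with a small sketch. It is only an analogy under assumptions: the exponential similarity measure, the `tau` parameter, and the function names are hypothetical, and in the framework described above the gating acts inside a network rather than on raw pixel values.

```python
import math


def upsample_nearest(image, factor):
    """Nearest-neighbour upsampling of a 2D list by an integer factor."""
    out = []
    for row in image:
        wide = [value for value in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def similarity_weight_map(lr_up, ref, tau=0.1):
    """Per-pixel weight in (0, 1]: 1 where LR and Ref agree, near 0 where not.

    Uses exp(-|lr - ref| / tau); this choice of similarity is an assumption
    made for illustration only.
    """
    return [[math.exp(-abs(a - b) / tau) for a, b in zip(row_lr, row_ref)]
            for row_lr, row_ref in zip(lr_up, ref)]


def gated_fusion(lr_up, ref, weights):
    """Blend Ref into the upsampled LR image in proportion to the weight."""
    return [[w * r + (1.0 - w) * l for l, r, w in zip(row_lr, row_ref, row_w)]
            for row_lr, row_ref, row_w in zip(lr_up, ref, weights)]
```

A factor of 4 in `upsample_nearest` would correspond to the 240 m to 60 m setting mentioned in the abstract. Note that the weight map needs only the LR and Ref inputs, never the ground-truth HR image.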

9:45am - 10:00am

Ground Based Observation for Validation (GBOV): Extension Of The Analysis Ready Validation Data Service

Christophe Lerebourg¹, Rémi Grousset¹, Jean-Sébastien Carrière¹, Jadu Dash², Somnath Paramanik², Finn James², Zaib Un Nisa², Ana Pérez-Hoyos³, Darren Ghent⁴, Jasdeep Anand⁴, Ritika Shukla⁴, Jan-Peter Muller⁵, Rui Song⁶, Marco Clerici⁷, Nadine Gobron⁷

¹ACRI-ST, France; ²University of Southampton; ³Albavalor; ⁴University of Leicester; ⁵Blue Sky Imaging; ⁶EarthRayView; ⁷EC-JRC

The Copernicus Land Monitoring Service (https://land.copernicus.eu) has been providing geophysical data derived from Earth Observation (EO) at a global scale for several decades. This global dataset includes temperature and reflectance, vegetation, soil moisture, snow and water bodies variables. To ensure the quality of these datasets, a yearly validation assessment is performed. The collection and processing of ground data for the purpose of validating Copernicus products is in itself a huge task. In 2018, the European Commission (EC) established a new service to ensure the independent production of these data: Ground-Based Observations for Validation (GBOV; https://gbov.land.copernicus.eu). For the last 8 years, the prime objective of GBOV has been to provide high-quality validation data for seven Copernicus Land Monitoring Service core products:

• Top Of Canopy Reflectance (TOC-R),

• Albedo (ALB),

• Leaf Area Index (LAI),

• Fraction of Absorbed Photosynthetically Available Radiation (FAPAR),

• Fraction of Vegetation Cover (FCOVER),

• Surface Soil Moisture (SSM) and

• Land Surface Temperature (LST).

In its third phase, new products have been included to support the growing Copernicus land products portfolio, namely:

• GPP and NPP

• Phenology

• Evapotranspiration

GBOV includes three components in the service:

• Component 1: consists of using data from existing in situ networks to generate EO validation datasets. Multi-year ground-based observations of high relevance for EO are collected from these global networks.

• Component 2: consists of upgrading existing monitoring sites with new instrumentation or establishing entirely new monitoring sites to close thematic or geographic gaps.

• Component 3: deals with the distribution of the validation datasets to the user community.

10:00am - 10:30am

Morning Coffee Break
Location: Exhibition Hall "E"

10:00am - 5:30pm

Exhibition
Location: Exhibition Hall "F"

Showcase Theatre

ISPRS Summer School Design Competition Winners
IEEE GRSS IDEA Three-Minute Thesis (3MT®) Competition

10:30am - 12:00pm

Plenary Session 2
Location: Exhibition Hall "G"

Keynote 1: Jean-Claude Piedboeuf (CSA)
Awards Ceremony:

The Jack Dangermond Award
The Gottfried Konecny Award

Keynote 2: Dr. Eleni Paliouras (ESA)

12:00pm - 1:30pm

Lunch
Location: Exhibition Hall "E"

1:30pm - 3:00pm

WG III/1B: Remote Sensing Data Processing and Understanding
Location: 713A

1:30pm - 1:45pm

Multi-modal semantic segmentation for open vocabulary interactions with remote sensing images

Jinkun Dai, Tao Peng, Yuhang Xue, Xianping Ma, Yuanxin Ye

Southwest Jiaotong University, Chengdu 611756, China

Semantic segmentation of multi-modal remote sensing imagery plays a pivotal role in land use/land cover (LULC) mapping, environmental monitoring, and precision earth observation. Current multi-modal approaches mainly focus on integrating complementary visual modalities (e.g., optical and synthetic aperture radar (SAR) imagery), yet neglect the incorporation of non-visual textual data, a rich source of knowledge that can bridge semantic gaps between visual patterns and real-world concepts. To address this limitation, we propose TSMNet, a text-supervised multi-modal network that synergistically integrates textual supervision with visual representation for open-vocabulary semantic segmentation. Unlike conventional multi-modal segmentation frameworks, TSMNet introduces a dual-branch text encoder to extract both scene-level semantic and object-level label information from various textual data, enabling dynamic cross-modal fusion. These text-derived features dynamically interact with visual embeddings through the proposed text-guided visual semantic fusion module, enabling domain-aware feature refinement and human-interpretable decision-making. Moreover, integrating text opens pathways for open-vocabulary semantic segmentation, enabling systems to recognize and classify unseen categories through natural language descriptions, thereby overcoming the rigid constraints of predefined class taxonomies. To verify our method, we construct two new multi-modal datasets and carry out extensive experiments to comprehensively compare the proposed method with other state-of-the-art (SOTA) semantic segmentation models. Results demonstrate that TSMNet achieves superior segmentation accuracy while exhibiting robust generalization capabilities across diverse geographical and sensor-specific scenarios. This work establishes a new paradigm for explainable remote sensing analysis, demonstrating that textual knowledge integration significantly enhances model generalizability.

1:45pm - 2:00pm

Meta-Prompting with Open-Source Language Models for Zero-Shot Scene Classification in Remote Sensing

Antonis Promponas¹, Eirini Baltzi¹, Valsamis Ntouskos², Konstantinos Karantzalos¹

¹Remote Sensing Lab, National Technical University of Athens, Greece; ²Department of Engineering and Sciences, Universitas Mercatorum, Rome, Italy

Zero-shot visual recognition with vision-language models (VLMs) has shown strong generalization to unseen categories in natural-image benchmarks, yet its effectiveness in remote-sensing (RS) imagery remains less explored. In this paper, we investigate whether meta-prompting with large language models (LLMs) can improve zero-shot scene classification in RS by automatically generating semantically rich class descriptions. Building on the Meta-Prompting for Visual Recognition (MPVR) framework, we evaluate three open-source LLMs, Mixtral-8x7B, Qwen 2.5 7B, and LLaMA 3.1 8B, as prompt generators across five RS benchmark datasets. The resulting descriptions are encoded with several VLMs, including CLIP, MetaCLIP, RemoteCLIP, and CLIP-LAION-RS, and compared against generic single-template and handcrafted domain-specific prompting baselines. Our results show that LLM-generated prompts are competitive with, and in several cases improve upon, manually designed templates, while revealing that the gains depend on both the dataset and the visual backbone. Overall, the study highlights the potential of open-source LLMs as scalable prompt generators for zero-shot remote-sensing recognition and provides insight into the transferability of meta-prompting beyond natural-image domains.

2:00pm - 2:15pm

Knowledge graph enhanced for zero-shot semantic segmentation in remote sensing imagery

Wubiao Huang¹, Huchen Li¹, Shuai Zhang¹, Haibing Liu¹, Zizhen Chen¹, Shihan Chen¹, Fei Deng¹,²

¹School of Geodesy and Geomatics, Wuhan University, Wuhan 430079, China; ²Hubei Luojia Laboratory, Wuhan 430079, China

Zero-shot semantic segmentation (ZSSS) is a crucial task in remote sensing image understanding, yet existing methods still suffer from limited generalization to unseen classes. To address this issue, we propose a Knowledge Graph (KG) enhanced ZSSS framework, which introduces explicit hierarchical and relational information into class embeddings to achieve more structured and semantically consistent representations. Specifically, a KG class encoder is designed, consisting of the class enhanced query (CEQ) and class enhanced embedding (CEE) modules, which extract class-relevant subgraphs from a self-constructed Remote Sensing Semantic Class Knowledge Graph (RSSCKG) and generate knowledge-enriched embeddings through a text encoder. Experiments on three public remote sensing datasets demonstrate that the proposed method consistently improves performance across seven state-of-the-art ZSSS frameworks. The integration of KG-based embeddings yields significant gains in the evaluation metrics, with particularly strong improvements on unseen classes, while maintaining accuracy on seen classes. Compared with enhancement strategies based on large language model (LLM) generated descriptions, the proposed KG class encoder exhibits superior semantic separability and stability. These results validate the effectiveness, generalization, and scalability of the proposed framework for ZSSS in remote sensing imagery.

2:15pm - 2:30pm

Segmentation-driven statistics-aware workflow for detailed scene description of UAV images using Mistral and LORA fused model

Bhargav Parulekar, Anandakumar M Ramiya

Indian Institute of Space Science and Technology, Thiruvananthapuram, Kerala, India

In the era of explainable AI, rapid data processing, analysis, and generation have become essential. Over the past few years, many approaches have been developed to process large volumes of data and present them in an explainable manner, including in the field of remote sensing. One such application is remote sensing scene description. Many established workflows and models exist, but these models either fail to incorporate essential geospatial information or suffer from hallucination. We present a hybrid multimodal captioning methodology that tightly couples semantic segmentation outputs (via a LoRA-adapted Segment Anything Model) with a small, high-quality LLM, Mistral, to produce descriptive, interpretable, and data-grounded scene captions. Rather than relying on direct image-to-text pipelines, our approach first extracts structured scene statistics (class proportions), spatial context (quadrant dominance and object localization), and color fingerprints (dominant colors per semantic class). These structured signals are converted into compact, factual prompts that the LLM consumes to generate coherent, informative, and verifiable captions. A comparison with the established Florence-2 model in terms of quantitative description demonstrates a significant improvement, with the Precision Vocabulary Index increasing from 0.077 to 0.232 due to the proposed workflow.
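
The step of turning a segmentation mask into a compact, factual prompt can be sketched as follows. This is an illustration under assumptions, not the authors' code: the function names and the prompt wording are hypothetical, the mask is a plain 2D list of integer class labels of at least 2 x 2 pixels, and the color fingerprints and object localization mentioned in the abstract are left out.

```python
from collections import Counter


def class_proportions(mask):
    """Fraction of pixels belonging to each class label in the mask."""
    counts = Counter(label for row in mask for label in row)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


def quadrant_dominance(mask):
    """Most frequent class label in each image quadrant."""
    height, width = len(mask), len(mask[0])
    mid_r, mid_c = height // 2, width // 2
    quadrants = {
        "top-left": (0, mid_r, 0, mid_c),
        "top-right": (0, mid_r, mid_c, width),
        "bottom-left": (mid_r, height, 0, mid_c),
        "bottom-right": (mid_r, height, mid_c, width),
    }
    dominant = {}
    for name, (r0, r1, c0, c1) in quadrants.items():
        counts = Counter(mask[r][c] for r in range(r0, r1) for c in range(c0, c1))
        dominant[name] = counts.most_common(1)[0][0]
    return dominant


def build_prompt(mask, class_names):
    """Assemble a text prompt that states only the measured scene statistics."""
    shares = class_proportions(mask)
    # Largest share first; ties are broken alphabetically by class name.
    ranked = sorted(shares.items(), key=lambda item: (-item[1], class_names[item[0]]))
    stats = ", ".join(f"{class_names[label]} {share * 100:.1f}%" for label, share in ranked)
    layout = "; ".join(f"{quad}: {class_names[label]}"
                       for quad, label in quadrant_dominance(mask).items())
    return ("Describe the scene using only these facts. "
            f"Class coverage: {stats}. Dominant class per quadrant: {layout}.")
```

Because every number in the prompt is computed from the mask, a generated caption can be checked against the same statistics, which is the sense in which such captions are verifiable.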

2:30pm - 2:45pm

Evaluating the Adaptation Potential of SAM2 for Glacier Segmentation in Severe Weather

Bindusara Nagathihalli Lokesh, Laura Camila Duran Vergara, Hans-Gerd Maas, Anette Eltner

Dresden University of Technology, Germany

Ground-based time-lapse cameras provide continuous, high-frequency observations of glacier dynamics; however, automated analysis of these image streams remains challenging due to fog, snowfall, lens contamination, and variable illumination. This study investigates the potential of adapting the foundation segmentation model Segment Anything Model 2 (SAM2) for glacier segmentation from ground-based monitoring. To enable integration into automated pipelines, SAM2 is configured in image mode with a learned prompt generation strategy, while fine-tuning is restricted to the prompt encoder and mask decoder. In addition, the internal Intersection over Union (IoU) prediction head is utilized as a confidence estimator to assess segmentation reliability. Experimental results demonstrate that the adapted model achieves stable segmentation under moderate environmental variability, while degrading under severe visibility loss. This stability is consistent across model scales and input resolutions. The confidence estimation further provides a meaningful signal for identifying uncertain predictions, supporting reliability-aware processing in downstream workflows.
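
To make the confidence idea concrete, the sketch below computes the IoU between two binary masks and flags frames whose predicted IoU falls below a threshold. It does not call SAM2; the predicted scores are plain numbers standing in for the output of an IoU prediction head, and the function names and the 0.8 threshold are hypothetical choices for illustration.

```python
def iou(mask_a, mask_b):
    """Intersection over Union of two binary masks given as 2D lists."""
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            if a and b:
                intersection += 1
            if a or b:
                union += 1
    # Two empty masks are treated as a perfect match.
    return intersection / union if union else 1.0


def flag_uncertain(predicted_ious, threshold=0.8):
    """Indices of frames whose predicted IoU is below the threshold."""
    return [i for i, score in enumerate(predicted_ious) if score < threshold]
```

Comparing the predicted scores with the true `iou` on labelled frames is one way to check whether such a confidence signal is meaningful before using it to filter a time-lapse sequence.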

2:45pm - 3:00pm

Reasoning-guided ego-path segmentation for autonomous trains using vision–language models

Mohammadjavad Ghorbanalivakili, Ashley Varghese, Gunho Sohn

York University, Canada

Autonomous train perception must identify the train’s valid path under complex railway geometry, particularly at merging and diverging switches where multiple candidate tracks coexist. Existing approaches are primarily trained as purely visual predictors and typically do not provide justification for route selection, despite the fact that valid paths depend on structured cues such as blade–stock contact, rail gaps, and track continuity. In this work, we adapt the Large Language Instructed Segmentation Assistant (LISA) to railway ego-path perception and formulate the task as reasoning-guided segmentation: given a forward-facing railway image and a natural-language query, the model predicts the valid ego-path mask and, when prompted, generates a textual explanation grounded in visible switch geometry. Our approach integrates railway-specific prompting, a tailored annotation scheme, and efficient finetuning, along with semantic segmentation supervision to support general scene understanding. Experiments on a RailSem19-based evaluation set show improved ego-path segmentation performance over the original LISA checkpoint and increased robustness to prompt variation, while qualitative results indicate that the model can produce plausible, though not always consistent, reasoning. Notably, these capabilities emerge despite the reasoning-specific dataset consisting of only 54 samples, highlighting the data efficiency of the approach. These results highlight the potential of vision-language models for more interpretable railway perception, while also underscoring the need for stronger supervision and evaluation in safety-critical settings. Code and reasoning segmentation data are available at https://github.com/mvakili96/Railway_Perception_FoundationModel.

1:30pm - 3:00pm

WG II/9B: Vision Metrology
Location: 713B

1:30pm - 1:45pm

Quantization-Aware Training for Efficient Object Detection on FPGAs: Case Studies

Xuanshu Luo, Gabor Fogarasi, Alan Syrgak, Paul Walther, Martin Werner

Technical University of Munich, Germany

Deploying object detection models for resource-constrained remote sensing applications necessitates on-board model inference capabilities. While Field Programmable Gate Arrays (FPGAs) offer massive parallelism as energy-efficient hardware platforms, model quantization remains essential to further balance computational efficiency with detection accuracy. Compared to post-training quantization methods that involve multiple-stage development with consistent dependency on domain datasets, quantization-aware training (QAT) integrates quantization constraints into training, providing a simpler pipeline for model compression. However, QAT introduces quantization errors to which smaller objects are more vulnerable. To address this issue, we propose object-scale-aware (OSA) regularization that amplifies quantization error penalties for smaller targets. Our approach is validated through two case studies: bird detection at airports and aerial-view building detection. We perform 8-bit QAT on YOLOX series models using the MVA2023 dataset and the Bavarian Building Dataset for the respective studies. Our method achieves up to 50.2 times inference acceleration with minimal accuracy loss on Xilinx Kria KV260 FPGAs compared to full-precision models. The ablation study and detection examples further demonstrate the effectiveness of OSA regularization in small object detection.

1:45pm - 2:00pm

Evaluation of Visual Place Recognition Methods for Image Pair Retrieval in 3D Vision and Robotics

Dennis Haitz¹, Athradi Shritish Shetty¹, Michael Weinmann², Markus Ulrich¹

¹Karlsruhe Institute of Technology, Germany; ²Delft University of Technology, Netherlands

A broad evaluation of state-of-the-art Visual Place Recognition methods is presented. The evaluation focuses on tasks where fast image pair retrieval is of high importance, such as image-driven scene registration, SLAM, or Structure-from-Motion correspondence search. The study therefore shifts its focus away from typical Visual Place Recognition and towards scenarios of interest in computer vision and robotics. An evaluation pipeline for retrieval and runtime performance is presented. Prepared datasets based on widely used benchmarks from different domains are used, covering indoor SLAM, outdoor object-centric scenes, and autonomous navigation in urban and suburban areas.

2:00pm - 2:15pm

MVM-IOD: An Industrial Object-Centric Benchmark Dataset for the Evaluation of 3D Reconstruction Methods

Robert Langendörfer, Markus Hillemann, Markus Ulrich

KIT, Germany

3D object reconstruction, camera pose estimation, and novel view synthesis in industrial applications are challenging tasks, as errors are costly while the time window for solving these tasks is often limited. The complexity of typical industrial objects further complicates these tasks. Several datasets exist for evaluating current methods on these tasks; however, most of them do not depict realistic industrial scenarios. We introduce the Machine Vision Metrology Industrial Object Dataset (MVM-IOD), which addresses this lack of datasets. The hardware setup used to acquire the dataset consists of a camera, mounted upside down due to space restrictions, at the end effector of an industrial robot arm. Images of typical industrial objects are captured systematically by moving the camera on a hemisphere around the objects. MVM-IOD contains the camera poses, the acquired RGB images, and the 3D point cloud of 9 objects and 2 background choices, resulting in 18 scenes, which allows the evaluation of all image-based methods that compute a 3D reconstruction, camera poses, and/or novel views. Based on our dataset, we extensively evaluate current state-of-the-art 3D reconstruction and camera pose estimation methods, such as Structure from Motion, Multi-View Stereo, Visual Geometry Grounded Transformer (VGGT), π3, and 2D Gaussian Splatting, and report our findings to create a baseline for future research.

2:15pm - 2:30pm

A Critical Synthesis of Uncertainty Quantification and Foundation Models for Semantic Segmentation

Steven Landgraf, Joceline Hinz, Markus Ulrich

Karlsruhe Institute of Technology, Germany

Foundation models are increasingly breaking what seemed to be impossible not long ago by enabling unprecedented accuracy and cross-domain generalization. Yet their lack of interpretability, tendency to be overconfident, and sensitivity to real-world domain shifts pose critical challenges for safety- and mission-critical applications. Uncertainty quantification (UQ) offers a principled way to address these issues, but its integration into segmentation foundation models has yet to be explored. In this paper we present the first systematic evaluation of UQ methods applied to a foundation model for semantic segmentation. We fine-tune a lightweight DPT decoder on top of the pretrained SAM2 encoder to establish a simple yet competitive baseline and benchmark four representative UQ approaches – Monte Carlo Dropout, Deep Sub-Ensemble, Test-Time Augmentation, and Evidential Deep Learning – across Cityscapes, NYUv2, and two challenging out-of-domain settings. Our analysis compares segmentation accuracy, calibration, uncertainty quality, and inference time, revealing clear trade-offs between predictive performance, reliability, and computational cost. These results highlight both the promise and the current limitations of uncertainty-aware foundation models, pointing to the need for future work that jointly optimizes accuracy, robustness, and efficiency for real-world deployment.
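For readers unfamiliar with how calibration is typically scored, the sketch below computes the expected calibration error (ECE), a widely used calibration metric. The abstract does not state which calibration metric the authors report, so ECE, the equal-width binning, and the bin count of 10 are assumptions made purely for illustration.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Bin-weighted mean gap between confidence and accuracy.

    confidences: predicted probabilities in [0, 1], one per pixel or sample.
    correct: booleans, True where the prediction matched the label.
    n_bins: number of equal-width confidence bins (assumed value).
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 is placed in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(mean_conf - accuracy)
    return ece


if __name__ == "__main__":
    # Four predictions at 90% confidence, three of them correct.
    print(expected_calibration_error([0.9] * 4, [True, True, True, False]))
```

A perfectly calibrated model scores 0; overconfident models, as described in the abstract, show a positive gap because mean confidence exceeds accuracy.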

2:30pm - 2:45pm

The Impact of CutMix on Reliability and Robustness in Semantic Segmentation

Steven Landgraf, Markus Ulrich

Karlsruhe Institute of Technology, Germany

Ensuring not only high accuracy but also reliable and robust predictions is critical for the deployment of semantic segmentation models in safety-critical applications such as autonomous driving. Despite the widespread use of CutMix – a simple yet powerful data augmentation strategy – its effect on reliability and robustness in dense prediction tasks remains unexplored. Motivated by recent findings that semi-supervised segmentation methods, where CutMix is a core component, can severely degrade reliability, this study isolates and systematically analyzes the influence of CutMix on segmentation accuracy, calibration, and uncertainty quality. We evaluate two representative architectures, the CNN-based DeepLabV3+ and the transformer-based SegFormer, across both in-domain and out-of-domain scenarios. Our results show that CutMix has only a minor impact on segmentation accuracy but consistently improves reliability, particularly under distribution shifts. These improvements indicate that CutMix primarily enhances the trustworthiness of the model’s calibration and uncertainty rather than the raw segmentation prediction itself. This distinction is crucial for safety-critical deployment, where reliable confidence estimates are as important as raw performance.
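To make the augmentation concrete, here is a minimal sketch of CutMix applied to a segmentation sample: a random rectangle is copied from a second image into the first, and the same rectangle is copied in the label map so that pixels and labels stay aligned. The nested-list image representation, the function name, and the Beta(1, 1) sampling of the mixing ratio are assumptions for this example; the abstract does not describe the exact configuration used in the study.

```python
import random


def cutmix(img_a, label_a, img_b, label_b, rng=None):
    """Paste a random box from sample B into sample A (image and label map).

    Images and label maps are nested lists of equal height and width.
    """
    rng = rng or random.Random()
    height, width = len(img_a), len(img_a[0])

    # Mixing ratio lam ~ Beta(1, 1); the box covers roughly (1 - lam) of the area.
    lam = rng.betavariate(1.0, 1.0)
    cut_h = int(height * (1.0 - lam) ** 0.5)
    cut_w = int(width * (1.0 - lam) ** 0.5)

    # Random box centre, clipped to the image bounds.
    cy, cx = rng.randrange(height), rng.randrange(width)
    y0, y1 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, height)
    x0, x1 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, width)

    # Copy the inputs so the originals are left untouched.
    out_img = [row[:] for row in img_a]
    out_label = [row[:] for row in label_a]
    for y in range(y0, y1):
        out_img[y][x0:x1] = img_b[y][x0:x1]
        out_label[y][x0:x1] = label_b[y][x0:x1]
    return out_img, out_label


if __name__ == "__main__":
    zeros = [[0] * 8 for _ in range(8)]
    ones = [[1] * 8 for _ in range(8)]
    mixed_img, mixed_label = cutmix(zeros, zeros, ones, ones, rng=random.Random(0))
    for row in mixed_img:
        print(row)
```

Unlike classification, where CutMix blends the two class labels by area, the dense label map here is simply pasted along with the pixels.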

2:45pm - 3:00pm

Uncertainty Quality of VGGT: An Analysis on the DTU Benchmark Dataset

Markus Hillemann, Robert Langendörfer, Steven Landgraf, Markus Ulrich

Karlsruhe Institute of Technology, Germany

Visual Geometry Grounded Transformer (VGGT) has already attracted a great deal of attention in a short period of time, not least due to the Best Paper Award at CVPR-2025. Similar to DUSt3R and MASt3R, VGGT aims to bring about a paradigm shift by replacing established methods like bundle adjustment and feature matching with a simple, unified, feed-forward neural network that predicts camera poses, depth maps, and dense 3D structure directly from multiple images of a scene in a few seconds. A key aspect is its ability to process an arbitrary number of views consistently in a single forward pass without any post-processing or iterative optimization. For photogrammetry, this opens new possibilities for real-time, scalable, and accessible 3D reconstruction. In this context, not only high reconstruction accuracy but also high-quality uncertainty estimates are crucial, as they foster trust and enable robust quality assurance. This paper therefore investigates the quality of VGGT’s uncertainty predictions. The analysis identifies an effective confidence threshold for filtering VGGT’s raw output and demonstrates that enhancing uncertainty quality holds strong potential for improving the accuracy of its 3D reconstructions.

1:30pm - 3:00pm

WG I/3: Multispectral, Hyperspectral and Thermal Sensors
Location: 714A

1:30pm - 1:45pm

First Field Validation of a New VNIR/SWIR-Based Six-Band Multi-Camera System for UAVs over Winter Wheat

Alexander Jenal¹, Fabian Reddig², Andreas Bolten², Leon Vehlken², Hubert Hüging³, Thuy Huu Nguyen³, Jens Bongartz¹, Georg Bareth²

¹Application Center for Machine Learning and Sensor Technology (AMLS), University of Applied Sciences Koblenz, Germany; ²Institute of Geography, GIS & Remote Sensing Group, University of Cologne, Germany; ³Institute of Crop Science and Resource Conservation (INRES), University of Bonn, Germany

Shortwave infrared (SWIR) imaging from uncrewed aerial vehicles (UAVs) remains rare despite strong sensitivity to canopy water and protein. We present the first field validation of a six-band VNIR/SWIR multi-camera system designed for plot-scale monitoring of winter wheat using mid-sized UAVs. The payload utilized narrow bandpass filters (910, 980, 1100, 1200, 1510, and 1650 nm; FWHM 10–12 nm) and was operated at an altitude of approximately 30 meters above ground level, achieving a ground sampling distance of approximately 4 cm. Empirical line calibration, employing in-scene gray panels, was validated against material-distinct panels and spectroradiometer measurements. The spectral response functions were approximated using Gaussian convolution due to the narrow passbands. Five bands (980–1650 nm) exhibited excellent performance: empirical line model fits achieved R² values approaching 1.000 (RMSE = 0.003–0.009), independent panel validation demonstrated near-unity slopes (R² = 0.998–0.999; RMSE = 0.005–0.013), and plot-level canopy measurements (n=36) maintained strong agreement between camera and spectroradiometer (slopes = 0.943–1.079; R² = 0.58–0.85; RMSE = 0.010–0.023). Two SWIR normalized ratio indices exhibited robust cross-sensor agreement: NRI[1100,1200] (R² ≈ 0.93) and NRI[1650,1510] (R² ≈ 0.90). The 910 nm channel displayed systematic errors (slope = 0.442±0.040 for plots; MAPE ≈ 33%) due to identified out-of-band leakage from incomplete long-wave blocking, leading to its exclusion from accuracy claims. Mitigation strategies include higher optical density short-pass blocking and system-level spectral response function verification. The filter-reconfigurable payload provides quantitative reflectance and robust SWIR indices at the plot scale by integrating panel-anchored empirical line modeling with bandpass-aware harmonization, thereby advancing operational SWIR monitoring capabilities for precision agriculture applications.
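The two calculations named in this abstract can be sketched as follows: an empirical line model fitted by ordinary least squares from panel digital numbers to known panel reflectances, and a normalized ratio index between two bands. The function names, the sample values, and the band order in the index (first band minus second, over their sum) are assumptions for illustration; the abstract does not spell out the authors' exact formulation.

```python
def fit_empirical_line(panel_dn, panel_reflectance):
    """Least-squares fit of reflectance = gain * DN + offset for one band."""
    n = len(panel_dn)
    mean_dn = sum(panel_dn) / n
    mean_refl = sum(panel_reflectance) / n
    sxx = sum((x - mean_dn) ** 2 for x in panel_dn)
    sxy = sum((x - mean_dn) * (y - mean_refl)
              for x, y in zip(panel_dn, panel_reflectance))
    gain = sxy / sxx
    offset = mean_refl - gain * mean_dn
    return gain, offset


def normalized_ratio_index(refl_a, refl_b):
    """NRI[a, b] = (R_a - R_b) / (R_a + R_b); band order is an assumed convention."""
    return (refl_a - refl_b) / (refl_a + refl_b)


if __name__ == "__main__":
    # Hypothetical gray-panel readings for a single band.
    gain, offset = fit_empirical_line([100.0, 200.0, 300.0], [0.1, 0.2, 0.3])
    canopy_reflectance = gain * 250.0 + offset
    print(gain, offset, canopy_reflectance)
    print(normalized_ratio_index(0.4, 0.2))
```

Because the index is a ratio of a difference to a sum, a multiplicative error common to both bands cancels, which may help explain why such indices agree well across sensors.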

1:45pm - 2:00pm

PanX.4: A Gyrocopter‑Borne Six‑Band VNIR Multicamera System for Sentinel-2‑Aligned Multitemporal Vegetation Monitoring

Alexander Jenal¹, Felix Kröber²,⁵, Christopher Frank³, Lina Krisztian⁴, Markus Metz⁴, Ribana Roscher⁵, Jens Bongartz¹

¹Application Center for Machine Learning and Sensor Technology (AMLS), University of Applied Sciences Koblenz, Germany; ²Institute of Bio- and Geosciences, Forschungszentrum Jülich, Germany; ³CISS TDI GmbH, Germany; ⁴mundialis GmbH & Co. KG, Germany; ⁵Institute of Geodesy and Geoinformation, University of Bonn, Germany

This contribution presents PanX.4, a gyrocopter-borne six-band VNIR multicamera system developed within the KIBI project on AI-based identification and classification of protected plant communities (mFUND, FKZ 19F2276) to support cross-scale monitoring at Natura 2000 sites. The system is designed for spectral alignment with Sentinel-2 MSI bands B02–B06 and B08 and is integrated into a tri-sensor airborne suite on the FlugKit carrier platform together with a high-resolution RGB camera and a complementary six-band VNIR–SWIR imaging system. Using system-level spectral response characterization and spectral band adjustment factor (SBAF) analysis based on 1,057 ECOSTRESS spectra, the study quantifies the harmonization quality between PanX.4 and Sentinel-2A, S2B, and S2C. All bands achieved R² > 0.99, while comparative screening of alternative spectral configurations showed that careful band design is critical, particularly in the red-edge region. An additional inter-satellite sensitivity analysis further indicates that harmonization should account for band-dependent differences between Sentinel-2 units when multitemporal airborne and satellite observations are combined. To support multitemporal habitat monitoring, the paper also analyzes 86,947 first-mowing observations from 2017 to 2024 and derives a three-window acquisition concept synchronized with pre-mowing, post-regrowth, and senescence phases. This creates an operationally relevant framework for planning repeated airborne campaigns that can support validation, boundary refinement, and future machine-learning workflows for habitat classification. The contribution therefore establishes the sensor-design, spectral-harmonization, and temporal-planning basis for Sentinel-2-consistent airborne monitoring at sub-meter resolution. Operational airborne image products and in-flight validation are beyond the present contribution and form the next step for future deployment.

2:00pm - 2:15pm

Atmospheric correction of aerial imagery using satellite-derived reflectance data

Alexane Nghien, Manchun Lei, Mathieu Brédif

Univ Gustave Eiffel, Géodata Paris, IGN, LASTIG

Atmospheric correction of large-scale aerial imagery remains a major challenge, mainly due to the difficulty of accurately estimating atmospheric parameters within the images. This study proposes a novel atmospheric correction method based on satellite-derived Surface Reflectance (SR). The method is a semi-empirical linear correction approach that leverages Pseudo-Invariant Features (PIFs) as reference points. Experimental results show that the proposed method achieves performance comparable to radiative transfer model approaches when accurate atmospheric parameters are available, and provides more reliable corrections when such parameters are uncertain or unavailable.

2:15pm - 2:30pm

Abundance Estimation Methods in Spectral Unmixing for Real Data

Daniele Cerra, Miguel Pato, Emiliano Carmona

German Aerospace Center (DLR), Germany

Spectral unmixing estimates the fractional abundances of materials, whose associated spectra are called endmembers, in pixels acquired by imaging spectrometers. Validation of abundance estimation methods typically relies on synthetic data or on comparisons with results obtained by other algorithms. This study considers the results of typical abundance estimation algorithms on the DLR HySU (HyperSpectral Unmixing) benchmark dataset, which contains actual imaging spectrometer data acquired over several arrangements of known-size material patches for physically traceable validation. Abundance estimates are compared against measured target areas in pixels with different degrees of mixing. We evaluate least squares and sparse unmixing methods on real data across different noise scenarios, and after contaminating the library by adding non-relevant endmembers. Additionally, as a way to approximate hard sparsity constraints, we enforce cardinality constraints on endmember subsets, identifying those that minimize abundance errors relative to the full library. Results suggest that fully constrained least squares usually yields the best results but struggles in cases of highly mixed pixels. Finally, we test quantization of abundance values as a way to enforce sparsity in non-negative least squares, with limited but encouraging results. Overall, the increase in accuracy obtained by enforcing sparse solutions supports the use of computationally efficient sparse unmixing methods in practical scenarios, part of which may become feasible if quantum computing capabilities improve in the future.

2:30pm - 2:45pm

Operational Band-to-Band Correction and Attitude Refinement of Pelican-2: dual-panchromatic Attitude Restitution and selective Bundle Adjustment with preliminary Application to Earthquake Displacement and DEM Generation

Saif Aati, Antonio Martos, Eric L. Peters, Frank Warmerdam, Graham Mills, Adam Weber, Luna Gray, Minh Radel

Planet Labs PBC

The Pelican satellite constellation, first launched by Planet Labs in 2025, continues the high-resolution imaging capability established by the SkySat program. The change to a pushbroom sensor in Pelican presents new geometric challenges: satellite attitude variations and platform instabilities during acquisitions can produce band misregistration and geolocation errors that degrade downstream products. This paper presents an operational workflow developed for Pelican imagery, validated on Pelican-2, a technology demonstration satellite. The approach exploits the dual-panchromatic focal plane configuration to independently measure satellite wobble to greater accuracy than the onboard attitude sensors, combined with selective bundle adjustment and B-spline spatial correction to achieve sub-pixel band alignment without dense ground control points. Validation on 963 Pelican-2 scenes demonstrates sub-pixel band-to-band registration accuracy (RMSE < 0.12 px) and 4 m CE90 geolocation accuracy. Initial applications illustrate the potential for operational geoscience workflows: earthquake surface displacement mapping of the March 2025 Myanmar M7.7 rupture detects 4.0 m co-seismic offsets on the Sagaing Fault with minimal post-processing, and digital surface model generation from an opportunistic multi-view acquisition yields preliminary elevation products free of jitter artifacts. These results establish feasibility for constellation-scale processing and inform next-generation Pelican development.

1:30pm - 3:00pm

WG III/8B: Remote Sensing for Agricultural and Natural Ecosystems
Location: 714B

1:30pm - 1:45pm

Estimating the leaf area index of urban trees using terrestrial LiDAR and the PATH method: sensitivity analysis and comparison with optical and direct methods

Camille Taufflieb¹, Nathalie Breda², Ronghai Hu³, Tania Landes¹, Vincent Lecomte¹, Françoise Nerry⁴, Georges Najjar⁴

¹Université de Strasbourg, CNRS, INSA Strasbourg, ICube Laboratory UMR 7357, Photogrammetry and Geomatics Group, 67000, Strasbourg, France; ²Université de Lorraine, AgroParisTech, INRAE, UMR Silva, Nancy, France; ³College of Resources and Environment, University of Chinese Academy of Sciences, Beijing, 100049, China; ⁴ICube Laboratory (UMR 7357), University of Strasbourg, Strasbourg, France

Urban trees play a crucial role in mitigating urban heat islands through shading and transpiration, processes directly linked to Leaf Area Index (LAI). However, estimating LAI for individual urban trees remains challenging due to their geometric and temporal heterogeneity.

This study evaluates the PATH (Path length distribution) method, a terrestrial laser scanning (TLS) based approach, to estimate LAI for three urban tree species in Strasbourg, France. The PATH method models foliage area volume density from point clouds, accounting for non-random foliage arrangements and woody structure contributions, unlike traditional optical methods.

TLS campaigns were conducted in three streets at three phenological stages. The sensitivity of PATH to geometric reconstruction parameters was assessed to optimize LAI estimation. Results show that envelope geometry significantly influences PAI estimates, with concave shapes (of at least 3000 facets) yielding more accurate values, while leaf angle distribution has minimal impact.

The obtained LAI estimates varied by species, reflecting species-specific crown densities. PATH-derived PAI was compared to LAI-2000 optical sensor measurements and direct LAI obtained by leaf collection. PATH estimates aligned more closely with true LAI than LAI-2000, especially during early leaf expansion, though discrepancies arose due to branch pruning and polycyclic flushing. The study highlights the importance of adapting scanning protocols and PATH parameters to species-specific morphology.

In conclusion, this work highlights the potential of TLS-based methods for providing robust PAI estimates for urban trees. Future research will link these species-specific estimates to urban microclimate benefits.

1:45pm - 2:00pm

Evaluation of Machine Learning Methods for Estimation of Leaf Chlorophyll Content (LCC) Across 15 Soybean Cultivars During Early Reproductive Stage

Carli Kriek¹, Philemon Tsele¹, George Chirima², Adolph Nyamugama²

¹Department of Geography, Geoinformatics and Meteorology, University of Pretoria, Pretoria 0002, South Africa; ²Agriculture Research Council Natural Resource & Engineering (NRE), Pretoria, 0001, South Africa

South Africa is the leading soybean producer in Africa, contributing approximately 35% of the continent’s total production. Soybean is important for national food security and agricultural sustainability, serving as a key nitrogen-fixing crop that supports soil fertility and economic growth. Whilst monitoring biochemical parameters such as leaf chlorophyll content (LCC) is essential for assessing soybean health, cultivar-level variability can complicate the use of remote sensing–based approaches. This study evaluates the performance of four machine-learning algorithms, XGBoost, Random Forest, Partial Least Squares Regression (PLSR), and Artificial Neural Network (ANN), using unmanned aerial vehicle (UAV) based data across 15 soybean cultivars during the early reproductive phase. Results show that model performance is strongly cultivar dependent. Tree-based models achieved the highest accuracy, with XGBoost and Random Forest reaching RMSE values as low as 2.9 µmol m⁻² for PHIP62T16R and R² values up to 0.96 for RA655R, while ANN and PLSR performed substantially worse for cultivars with more complex spectral responses, such as PAN1555R. Residuals from generalised models revealed systematic over- and under-prediction in several cultivars, indicating that models developed using pooled data are unable to fully account for cultivar-specific spectral differences. Variable-importance analyses identified red-edge, NIR, and greenness-enhancing indices as the most influential predictors of LCC, highlighting their strong sensitivity to canopy structure and chlorophyll variation. Overall, the study shows that cultivar-specific, ensemble-based modelling delivers stronger predictions of chlorophyll in soybean. Incorporating cultivar information and using stratified model calibration improves the reliability of UAV-based chlorophyll monitoring in heterogeneous soybean canopies.

2:00pm - 2:15pm

Potential of very high Resolution Pléiades Neo Satellite Data to monitor Crop Traits

Georg Bareth¹, Christoph Hütt¹, Alexander Jenal², Andreas Bolten¹, Fabian Reddig¹, Leon Vehlken¹, Annika Klee¹, Jan Wolf¹, Hannah Firl¹, Hubert Hüging³

¹Institute of Geography, GIS & Remote Sensing Group, University of Cologne, Germany; ²AMLS, University of Applied Sciences Koblenz, Remagen, Germany; ³INRES - Crop Sciences, University of Bonn, Germany

The monitoring of crop traits on a landscape scale is of key interest in the context of precision farming and food production. Many studies use moderate-resolution satellite data such as Sentinel-2 and Landsat for crop monitoring. However, enhanced spatial resolution improves monitoring quality significantly. In this context, commercial but expensive very high resolution (VHR) satellite data from Ikonos, Quickbird, Formosat-2, and WorldView-2 have been successfully applied for crop monitoring over the last two decades.

The focus is on the research question “Can Pléiades Neo data quantify plot-scale variation in dry biomass and N uptake?” and on developing an analysis workflow which could support precision farming on a landscape scale using VHR satellite data.

In this contribution, we propose the application of pansharpened Pléiades Neo satellite data for the monitoring of crop traits such as dry biomass and N uptake, in our study for winter wheat. The very high spatial resolution of 0.3 m even allows the investigation of field experiments with plot sizes of several m² and would therefore be suitable for crop phenotyping.

2:15pm - 2:30pm

Development of a transferrable hybrid retrieval model for mapping sweet potato chlorophyll at matured growth stage using ultra high-resolution UAV data

Philemon Tsele¹, Abel Ramoelo²,¹, Lucy Moleleki¹, Sunette Laurie³, Whelma Mphela¹, Natasha Tshuma¹

¹University of Pretoria, South Africa; ²South African National Space Agency, South Africa; ³Agricultural Research Council, South Africa

Smallholder farmers play a critical role in the growing of underutilized crops, such as sweet potato. Obtaining accurate maps of sweet potato biophysical variables is essential for farmers to assess and monitor crop health at different growth stages. Integrating radiative transfer model (RTM) data with vegetation indices (VIs) based on unmanned aerial vehicle (UAV) data may have the potential to accurately estimate leaf chlorophyll concentration (LCC) across multiple crop varieties. Firstly, in this paper we developed and tested varying hybrid retrieval models by combining PROSAIL RTMs with broadband, narrowband and leaf-pigment VIs applied to 2-cm resolution UAV imagery, to retrieve LCC over 20 sweet potato varieties at 120 days, i.e., the mature growth stage. Secondly, the best hybrid retrieval model was transferred to a different site containing similar sweet potato varieties at the mature growth stage for the estimation of sweet potato LCC. Results show that the most accurate retrievals of LCC were achieved by integrating a larger database containing 11,000 PROSAIL simulated reflectance samples with broadband indices, particularly the enhanced vegetation index (EVI), with a coefficient of determination (R²) of 0.85, root mean squared error (RMSE) of 5.93 µg/cm², and relative RMSE (RRMSE) of 9.87%. Furthermore, when transferred to a different site containing similar sweet potato varieties at the mature growth stage, this model achieved 60% agreement with field LCC measurements and responded fairly well by capturing LCC variability. These findings have significant implications for sweet potato breeding programmes developing new cultivars.
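The three accuracy figures quoted in this abstract can be computed as in the minimal sketch below. It assumes that RRMSE is the RMSE divided by the mean of the field measurements and that R² is the coefficient of determination 1 − SS_res/SS_tot; the abstract does not state which definitions the authors used (R² is sometimes reported as the squared correlation instead), and the sample values are invented for illustration.

```python
import math


def regression_metrics(observed, predicted):
    """Return (rmse, rrmse_percent, r2) for paired observations and predictions."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)

    rmse = math.sqrt(ss_res / n)
    # Relative RMSE: normalisation by the observed mean is an assumed convention.
    rrmse_percent = 100.0 * rmse / mean_obs
    r2 = 1.0 - ss_res / ss_tot
    return rmse, rrmse_percent, r2


if __name__ == "__main__":
    # Invented LCC values in µg/cm², for illustration only.
    field_lcc = [50.0, 60.0, 70.0]
    model_lcc = [52.0, 58.0, 71.0]
    print(regression_metrics(field_lcc, model_lcc))
```

Under the mean-normalised definition, an RMSE of 5.93 µg/cm² together with an RRMSE of 9.87% would correspond to a mean measured LCC of roughly 60 µg/cm².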

2:30pm - 2:45pm

Principal component analysis of UAV-derived vegetation indices and laboratory tissue nutrients for crop health assessment

Oluibukun Gbenga Ajayi¹,², Fatima Abiola Ogunlesi³

¹Namibia University of Science and Technology, Namibia; ²University of Pretoria, South Africa; ³Federal University of Technology, Minna

Remote sensing and laboratory assays can improve field-scale crop assessment and management. This exploratory pilot study analyses relationships between leaf tissue nutrients and UAV-derived normalised difference vegetation index (NDVI) using seventeen paired samples collected across a mixed crop trial. Tissue measures for nitrogen, phosphorus and potassium were standardised and entered into principal component analysis to reduce pairwise correlation and extract orthogonal nutrient axes. The first principal component explained 54.79% of variance, the second explained 34.10%, together accounting for 88.9%. Principal component scores for the first two axes were used in linear and polynomial regression models to predict NDVI. Model skill was assessed on training data and with leave-one-out cross-validation, and bootstrap resampling produced empirical confidence intervals for component loadings. Linear models built on principal components provided the most stable cross-validated performance, while polynomial expansions improved training fit but generalised poorly. These findings indicate that a low-dimensional nutrient representation can predict NDVI with reasonable stability and that combining spectral and biochemical data supports spatially explicit nutrient assessment. The study recommends expanded and stratified sampling, reflectance calibration and targeted spectral bands for follow-up studies, and external validation before wider applications.

2:45pm - 3:00pm

Multiscale Multispectral–Hyperspectral Data for Estimating Coffee Yield Using Machine Learning Algorithms

George Deroco Martins, Lucas Carvalho, Filipe Silva, Rayssa Barbosa, Maria Cecília Santos

Federal University of Uberlândia, Brazil

This study integrates multispectral (UAV) and hyperspectral (ground-based) remote sensing data to estimate coffee (Coffea arabica) yield using machine learning algorithms. Forty field plots were analyzed with multispectral Mavic 3M imagery and hyperspectral Blue Wave spectroradiometer data. Spectral indices such as NDVI, NDRE, GNDVI, CIRE, and PRI were correlated with yield, revealing distinct responses across spectral domains. Neural networks achieved the best predictive performance (R = 0.93; RMSE = 7.9%), followed by SVM models (R = 0.90). The Red Edge and Green bands were most sensitive to productivity variations in multispectral data, while hyperspectral narrowband indices provided superior correlations with canopy physiological traits. The integration of both datasets highlights the complementary strengths of spatially extensive multispectral imagery and the spectral precision of hyperspectral sensing. This multiscale approach enables more accurate and operational yield estimation for perennial crops and supports the development of precision agriculture protocols for coffee production systems.

1:30pm - 3:00pm

ICWG III/IVa-B: Disaster Management
Location: 715A

1:30pm - 1:45pm

Mapping flood footprints: a review of remote sensing approaches for quantifying physical asset information extraction

Wei Jiang¹, Quan Liu¹, Xiaohui Ding², Zhiguo Pang¹, Denghua Yan¹, Yizi Shang¹, Rong Li¹, Akiyuki Kawasaki³

¹China Institute of Water Resources and Hydropower Research, Beijing, 100038, China; ²School of Software and Internet of Things Engineering, Jiangxi University of Finance and Economics, Nanchang, 330032, China; ³Institute for Future Initiatives, The University of Tokyo, Tokyo, Japan

Flooding is one of the world's most prominent natural hazards and poses a severe threat to sustainable socioeconomic development. Physical asset information in flood disasters refers to the location, quantity, and damage severity of exposed elements within the affected area. Rapid and accurate extraction of such information is crucial for flood disaster emergency management. To achieve this goal, a remote sensing-based framework for extracting physical asset information in flood disasters is proposed in this paper. This framework summarizes extraction methods for flood damage to typical asset types, including cropland, buildings, and roads, and comparatively analyzes the advantages and limitations of multi-source remote sensing data, geographic data, and social media data in physical asset information extraction. This study further investigates the differences between statistical analysis, shallow learning, deep learning, and transfer learning approaches with respect to three key dimensions: extraction accuracy, scenario applicability, and computational efficiency. Future research should focus on: (1) development of operational technologies for flood emergency response and disaster mitigation; (2) multi-source data fusion and dynamic simulation based on digital twin technology; (3) intelligent mining of multi-modal information and development of generalized extraction models driven by foundation models, with the aim of providing technical support for rapid flood emergency response.

1:45pm - 2:00pm

Rapid flood damage assessment in detention basins using multi-source remote sensing: a case study of the 2023 Dongdian flood event in China

Wei Jiang¹, Quan Liu¹, Denghua Yan¹, Wenlong Song¹, Zhiguo Pang¹, Xiaohui Ding²

¹China Institute of Water Resources and Hydropower Research, China, People's Republic of; ²School of Software and Internet of Things Engineering, Jiangxi University of Finance and Economics, Nanchang, 330032, China

Rapid flood damage assessment is essential for emergency response and post-disaster recovery. Following catastrophic flooding in the Haihe River Basin on July 28, 2023, the Dongdian flood detention basin was activated on August 1, with inundation persisting until early October. This study integrates satellite remote sensing, UAV imagery, and field surveys to develop a rapid multi-source approach for comprehensive flood loss assessment. The methodology comprises: (1) extraction of inundation characteristics (spatial extent, depth, duration); (2) classification of exposed assets (agricultural land, forests, residential and industrial areas); (3) comprehensive damage and economic loss evaluation. Results show that 301.49 km² (79.55% of the basin) was inundated from August 1 to October 5, 2023, with an average depth of 2.64 m. The central-western zone sustained the most severe damage, with prolonged residential inundation. Complete corn crop failure occurred, and agricultural-forestry production suffered near-total losses. Direct economic losses exceeded 17.5 billion yuan. Compared to traditional field methods, this approach demonstrates superior efficiency and accuracy, providing scientific support for flood risk management in detention basins.

2:00pm - 2:15pm

Shoreline extraction and coastal change detection from satellite SAR using thresholding-based methods

Corienne Erasmus¹,³, Shelley Haupt², Bolelang Sibolla¹,²

¹Department of Geography, Geoinformatics and Meteorology, University of Pretoria, Pretoria, South Africa; ²Next Generation Enterprises and Institutions, Council for Scientific and Industrial Research, Pretoria, South Africa; ³AOS-SAMOS, Department of Oceanography, University of Cape Town, Rondebosch 7700, South Africa

Coastal environments provide various economic, ecological and societal benefits. Coastal erosion, which is the gradual loss of sediment over time, poses a significant threat to South Africa’s coastline. The monitoring and detection of coastal erosion is essential for the effective management of coastal environments. One way to quantify coastal erosion is the delineation of coastal boundaries. Remote sensing techniques such as Synthetic Aperture Radar (SAR) offer a unique opportunity to extract shoreline positions over large areas of the coast. Furthermore, thresholding and edge detection methods have been successfully used to extract land-water boundaries. In this study, C-band SAR data was used to derive backscatter coefficients for three different areas of interest in the Eastern Cape province in South Africa over a ten-year period. Coastal erosion and accretion trends were calculated, and the results indicated that the Linear Regression Rate (LRR) for the three study areas showed varying seasonal erosion trends. The shoreline LRR ranged between -0.01 and -3.28 m/year for the Cape Recife area and between -0.17 and -4.78 m/year for the Nelson Mandela Bay beach front. The overall pattern was erosion during the winter months and accretion during the summer months. In contrast, the Kings Beach area showed a consistent accretion trend, with LRR values ranging between 0.94 and 1.68 m/year. The findings confirm that SAR remote sensing is suitable for detecting and monitoring coastal changes in three different coastal environments.
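As a rough illustration of how a Linear Regression Rate can be obtained, the sketch below fits an ordinary least-squares line to shoreline positions measured along a single transect; the slope is the rate in m/year. It assumes positions are distances from a fixed onshore baseline, so a negative slope means retreat (erosion). The function name `linear_regression_rate` and all sample values are hypothetical and are not taken from the study.

```python
def linear_regression_rate(years, positions):
    """Return the least-squares slope of shoreline position vs. time (m/year).

    years      -- acquisition times as decimal years
    positions  -- shoreline distance from an onshore baseline, in metres
    """
    n = len(years)
    if n < 2 or n != len(positions):
        raise ValueError("need at least two (year, position) pairs of equal length")
    mean_t = sum(years) / n
    mean_x = sum(positions) / n
    # Sum of squared time deviations and of time/position cross-deviations.
    s_tt = sum((t - mean_t) ** 2 for t in years)
    s_tx = sum((t - mean_t) * (x - mean_x) for t, x in zip(years, positions))
    if s_tt == 0:
        raise ValueError("all observations share the same date")
    return s_tx / s_tt


# Hypothetical transect: the shoreline retreats 2 m every 2 years.
sample_years = [2013.0, 2015.0, 2017.0, 2019.0, 2021.0, 2023.0]
sample_positions = [50.0, 48.0, 46.0, 44.0, 42.0, 40.0]
sample_rate = linear_regression_rate(sample_years, sample_positions)
print(f"LRR = {sample_rate:.2f} m/year")
```

Applying the same fit separately to winter and summer acquisitions would be one way to expose a seasonal pattern of the kind reported here, although the abstract does not state how the seasonal rates were actually computed.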

2:15pm - 2:30pm

Enhancing Oil Spill Interpretation Through Multisensor Fusion and Temporal Reconstruction: A Case Study Near the Strait of Gibraltar

Tom Avikasis Cohen, Dror Angel, Anna Brook

University of Haifa, Israel

Oil spills in confined maritime corridors often evolve faster than any single satellite mission can observe. This complicates the interpretation of individual images and creates gaps in understanding how a spill progresses between satellite overpasses. This study examines whether combining Sentinel-1 and Sentinel-2 observations can provide a more coherent picture of the development of a spill event, using the case of an oil spill that occurred near the Strait of Gibraltar in late August 2022 after a collision between the OS35 and the Adam LNG.

The preliminary analysis evaluated each sensor separately. Sentinel-1 highlighted changes in surface roughness, while Sentinel-2 revealed reflectance anomalies linked to modified optical properties of the water. Since neither dataset on its own offered a complete account of the surface conditions, a fusion procedure was applied to the closest pair of post-event images. The fused map displayed sharper boundaries and more spatial detail than the radar scene alone, offering a clearer outline of the affected area. To address the temporal mismatch between acquisitions, intermediate surfaces were also reconstructed for both sensors, producing estimated representations of the marine conditions at dates not directly observed.

Taken together, the fused and reconstructed products formed a more continuous sequence of the spill’s evolution, capturing both its fragmentation and its short-term reorganisation. Although the approach does not replace dedicated operational monitoring, it demonstrates that combining complementary satellite data can reduce ambiguity in single-sensor interpretation and strengthen situational awareness in regions where surface conditions change quickly and unpredictably.

2:30pm - 2:45pm

Windstorm hazard index development for Malaysia

Nur Hidayah Zakaria¹, Nur Atiqah Hazali², Siti Aekbal Salleh³,⁴, Nurul Amirah Isa¹, Nini Nurdiana Johari¹, Mohd Badrul Hafiz Che Omar¹, Arnis Asmat⁴, Andy Chan⁵

¹Faculty of Asia Built Environment and Surveying, Universiti Geomatika Malaysia (UGM), Malaysia; ²Geospatial Science & Technology College (GSTC), Malaysia; ³Institute for Biodiversity and Sustainable Development (IBSD), Universiti Teknologi MARA; ⁴Center of Studies Surveying Science and Geomatics, Faculty of Built Environment, Universiti Teknologi MARA (UiTM), Malaysia; ⁵Southampton Solent University, England

Windstorms in Peninsular Malaysia have increased in both frequency and severity, posing growing risks to communities, infrastructure, and the national economy. Despite these escalating threats, the region currently lacks a comprehensive, location-specific index capable of evaluating and categorizing windstorm hazards for effective planning and mitigation. This study develops a Windstorm Hazard Index (WHI) tailored to Peninsular Malaysia to assess spatial patterns of windstorm risk and support evidence-based decision-making. Four objectives were addressed: (1) identifying key environmental and geographical factors influencing windstorm occurrences; (2) quantifying these parameters using windstorm records from 2008–2018, numerical simulations generated via WRF-ARW, and urban morphology modelling using Envi-MET; (3) formulating the WHI through the integration of Analytic Hierarchy Process (AHP) and Principal Component Analysis (PCA); and (4) validating the index using documented windstorm events from 2020–2024. The WHI categorizes the peninsula into six hazard levels ranging from very low (0.1–0.5) to extreme (0.901–1.0). Southern and central states, including Negeri Sembilan and Pahang, generally exhibited very low hazard levels, while Kelantan and Terengganu showed moderate risk. High-risk zones were concentrated in northern and coastal regions such as Penang, Kedah, and Perlis, with extreme-risk areas detected in parts of Kedah and Perlis. Results indicate that wind speed, temperature, humidity, precipitation, land use, and urban density strongly influence windstorm intensity, particularly in coastal and densely built environments. Validation confirmed the WHI’s reliability, as extreme-risk classifications aligned with recorded damage patterns. Overall, the WHI serves as a robust framework for regional hazard assessment and disaster-resilient infrastructure development across Peninsular Malaysia.
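The AHP step mentioned above can be sketched generically as follows. The example derives priority weights from a pairwise comparison matrix using the row geometric-mean approximation and checks Saaty's consistency index. The three factors and the comparison values are invented for illustration; the abstract does not give the authors' matrix, and the PCA part of the index is not covered here.

```python
import math


def ahp_weights(matrix):
    """Approximate AHP priority weights via normalised row geometric means."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_index(matrix, weights):
    """Saaty's consistency index CI = (lambda_max - n) / (n - 1)."""
    n = len(matrix)
    # (A w)_i / w_i estimates the principal eigenvalue for each row.
    ratios = [
        sum(matrix[i][j] * weights[j] for j in range(n)) / weights[i]
        for i in range(n)
    ]
    lambda_max = sum(ratios) / n
    return (lambda_max - n) / (n - 1)


# Hypothetical comparison of three factors: wind speed, land use, urban density.
# Entry [i][j] states how much more important factor i is than factor j.
comparison = [
    [1.0, 2.0, 6.0],
    [1.0 / 2.0, 1.0, 3.0],
    [1.0 / 6.0, 1.0 / 3.0, 1.0],
]
factor_weights = ahp_weights(comparison)
ci = consistency_index(comparison, factor_weights)
print([round(w, 3) for w in factor_weights], round(ci, 6))
```

A weighted sum of normalised factor layers using such weights would yield an index value per location, which could then be binned into hazard classes; the class boundaries used in the study are only partly stated in the abstract.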

2:45pm - 3:00pm

FRI-R: A Data Driven Flood Risk Index for Resilience Decision-Making

Bandana Kar¹, Margaret Glasscoe²

¹ResIntSoft LLS, United States of America; ²University of Colorado, Boulder, United States of America

Flooding is one of the most frequent and costliest hydro-meteorological hazards, impacting every nation and causing significant societal and economic disruption. Despite the abundance of Earth Observation (EO) datasets and hydrodynamic models available to map, monitor, and forecast flood events, decision-makers and first responders often struggle to translate these resources into actionable insights. To bridge this gap, we have developed the Flood Risk Index for Resilience (FRI-R), a data-driven machine learning model designed to support resource planning, emergency response, and downstream analytics. FRI-R is powered by the Model of Models (MoM), an operational, open-source ensemble framework that integrates outputs from hydrologic models and EO data from optical imagery. Leveraging historical MoM outputs, FRI-R analyzes the spatial and temporal patterns of past flood events and classifies sub-watersheds from high to low risk based on flood frequency and duration, offering a dynamic lens into vulnerability hotspots. MoM has proven effective in disseminating early flood warnings. Building on this success, FRI-R is designed to enable targeted interventions for at-risk populations and critical infrastructures, thereby empowering communities and decision-makers to proactively mitigate flood risk and improve long-term resilience.

1:30pm - 3:00pm

WG II/3C: 3D Scene Reconstruction for Modeling & Mapping
Location: 715B

1:30pm - 1:45pm

CityLangSplat: Integrating CityGML Semantics into 3D Language Gaussian Splatting for Urban Scene Understanding

Qilin Zhang¹,²,³, Jinyu Zhu¹, Olaf Wysocki⁴, Boris Jutzi¹,³

¹Technical University of Munich; ²Munich Center for Machine Learning; ³Karlsruhe Institute of Technology; ⁴University of Cambridge

Combining visual semantics with language representations has made 3D interpretation more flexible and intuitive. Recent advances in Gaussian Splatting extend this to efficient 3D language fields supporting open-vocabulary queries. However, existing approaches show limited generalization in large urban scenes, especially for detailed building segmentation. Semantic 3D city models such as CityGML, by contrast, provide hierarchical and geometry-aligned structural semantics that complement appearance driven visual cues. We introduce CityLangSplat, which integrates CityGML semantics into 3D Language Gaussian Splatting for urban environments. CityLangSplat rasterizes CityGML into pixel-aligned semantic maps, extracts vision-language features from SAM-derived segments and CityGML regions, and compresses both sources into a shared latent space via a lightweight autoencoder. 3D Gaussians are then optimized with a coverage-aware loss that balances accurate, building-focused CityGML supervision with broader SAM supervision, enabling geometry-aligned open-vocabulary reasoning in urban scenes. Experiments on TUM2TWIN and ZAHA datasets show consistent gains over LangSplat, with relative improvements of 22.9% in 2D and 15.1% in 3D evaluation while preserving real-time rendering. CityLangSplat provides a practical framework for combining semantic city models with language-embedded 3D Gaussian Splatting for geometry-aligned urban scene interpretation. Code will be released at https://github.com/zqlin0521/CityLangSplat.

1:45pm - 2:00pm

RoofVIP benchmark dataset: 2D roof planar polygons and very high-resolution digital orthophoto pairs

Chaikal Amrullah, Daniel Panangian, Guneet Mutreja, Youssef Abdelhedi, Ksenia Bittner

German Aerospace Center (DLR), Germany

Accurate building roof modeling is fundamental to urban analytics, digital twins, and 3D city reconstruction; however, progress in deep learning–based reconstruction is constrained by the limited availability of diverse, high-resolution datasets with detailed geometric annotations. This study introduces the RoofVIP dataset, a large-scale benchmark featuring very high-resolution RGB orthophotos paired with 2D roof vectors that capture diverse urban morphologies across Munich, Germany. Following Level of Detail (LoD) 2.0 principles, RoofVIP encompasses a wide range of roof geometries and architectural complexities, enabling evaluation of both segmentation- and vectorization-based reconstruction methods. Two paradigms are examined: a two-step segmentation-based approach (Cascade Mask R-CNN, Mask R-CNN, SOLOV2, YOLACT) and a one-step direct vector prediction approach (HEAT, PolyRoof). ImageNet-pretrained region-based models, particularly Mask R-CNN and Cascade Mask R-CNN, achieve the highest segmentation accuracy, effectively delineating complex roof boundaries while revealing limitations on small or irregular structures. Geometry-based models show complementary strengths, with HEAT emphasizing topological regularity and PolyRoof focusing on geometric precision. Although performance is lower than on simpler datasets such as HEAT and Roof Intuitive, RoofVIP exposes challenges related to geometric diversity and scale variation, serving as a rigorous benchmark. The dataset includes predefined training, validation, and test splits, enabling consistent benchmarking across methods. By providing a challenging and diverse geometric landscape, RoofVIP aims to advance geometry-aware deep learning approaches and support scalable, high-fidelity 3D urban modeling. The dataset is publicly available through the project page at https://chaikalamrullah.github.io/RoofVIP/.

2:00pm - 2:15pm

Evaluating 3D Scene Representations for Aerial Photogrammetry across Diverse Cityscapes

Shihan Chen¹, Zhaojin Li², Qingsong Yan¹, Haibing Liu¹, Huchen Li¹, Wubiao Huang¹, Fei Deng¹,³

¹School of Geodesy and Geomatics, Wuhan University, Wuhan, China; ²Technology and Engineering Center for Space Utilization, University of Chinese Academy of Sciences, Beijing, China; ³Hubei Luojia Laboratory, Wuhan, China

The proliferation of continuous Neural Radiance Field (NeRF) and 3D Gaussian Splatting (3DGS) has shifted the paradigm of 3D aerial reconstruction from relying solely on geometric stereo matching to inverse rendering optimization. However, while these emerging rendering-based frameworks excel in synthesizing photo-realistic novel views, their capability to extract accurate surfaces in complex aerial scenarios remains ambiguous compared to traditional methods. To establish a clearer understanding, this study presents a comprehensive evaluation of five representative frameworks spanning traditional Structure from Motion (SfM), purely Signed Distance Field (SDF) representations, unstructured 3D Gaussians, hybrid voxel-Gaussians, and strictly explicit sparse voxels. By systematically standardizing identical computational environments, inputs, and unified mesh-extraction pipelines on both real-world airborne LiDAR datasets and synthetic cityscapes, we assess their performance regarding visual fidelity, geometric accuracy, and resource efficiency. The experimental results reveal that while traditional MVS produces the highest overall geometric precision by strictly enforcing multi-view rigid geometry, it is prone to failures in texture-less regions. Among rendering-based representations, a fundamental trade-off exists: highly flexible, unstructured 3DGS achieves the highest visual scores but degrades the underlying geometric surfaces; conversely, explicitly structured techniques demonstrate distinct superiority in regularizing topological coherence and suppressing floating artifacts. Furthermore, we observe that integrating structured voxels avoids the severe memory bottlenecks associated with extracting geometries from chaotic unorganized primitives.
These empirical findings emphasize that for large-scale aerial photogrammetry, integrating explicit spatial structuralization into differentiable rendering pipelines is imperative for achieving scalable operations and bridging the geometric accuracy gap with traditional methods.

2:15pm - 2:30pm

Development of a 3D City Model-Based System for Pre-Flight Evaluation and Optimization of Aerial Image Acquisition Plans

Lixian Zhao, Saki Kato, Kenta Imai, Maki Itazu, Aya Matsui

Kokusai Kogyo Co., Ltd., Japan

In dense urban environments, aerial image acquisition often suffers from occlusions and redundant data due to the lack of quantitative evaluation tools at the flight-planning stage. To address this issue, this study develops a flight-planning support system that enables pre-acquisition visibility analysis for both terrain and building surfaces using existing 3D city models. The system performs ray-casting simulations based on user-defined flight parameters to quantify and visualize occluded and visible regions before flight, allowing planners to evaluate data quality and optimize image acquisition efficiency. Experiments were conducted using real flight plans with two representative aerial cameras: the Leica CityMapper-2 for multi-directional texture mapping and the Vexcel UltraCam Eagle 4.1 for nadir-based topographic mapping. The results show that the system effectively visualizes occlusions on roofs and walls, predicts building lean in nadir imagery, and assesses the influence of overlap ratios on ground visibility. These analyses enable users to design more cost-effective and geometrically consistent flight plans by identifying redundant overlaps and ensuring sufficient coverage for DSM and true-orthophoto generation. The proposed framework provides a quantitative and objective approach to improving the transparency and reliability of aerial survey planning, and it offers a foundation for integrating visibility simulation with subsequent photogrammetric workflows such as surface reconstruction and texture mapping.
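The visibility analysis described above can be pictured with a much simplified line-of-sight test. The sketch below works on a one-dimensional height profile (ground plus one building) instead of a full 3D city model, and checks for each cell whether the straight ray from the camera to that cell clears every cell in between. The profile, camera position, and function name are hypothetical; the actual system's ray-casting implementation is not described in the abstract.

```python
def visible_cells(heights, cam_index, cam_height):
    """Return a list of booleans: True where the camera sees the cell surface.

    heights    -- surface height of each profile cell (ground or roof), in metres
    cam_index  -- profile cell directly below the camera
    cam_height -- camera height above the datum, in metres
    """
    result = []
    for target, target_h in enumerate(heights):
        clear = True
        lo, hi = sorted((cam_index, target))
        for k in range(lo + 1, hi):
            # Fraction of the way from the camera (0) to the target cell (1).
            t = (k - cam_index) / (target - cam_index)
            ray_h = cam_height + t * (target_h - cam_height)
            if heights[k] > ray_h:
                clear = False  # an intermediate surface blocks the ray
                break
        result.append(clear)
    return result


# Hypothetical profile: flat ground with a 40 m building in cell 2,
# viewed obliquely from a camera 60 m above cell 0.
profile = [0.0, 0.0, 40.0, 0.0, 0.0, 0.0]
visibility = visible_cells(profile, cam_index=0, cam_height=60.0)
occluded_fraction = visibility.count(False) / len(visibility)
print(visibility, occluded_fraction)
```

Repeating such a test for every planned exposure position and merging the results per surface cell would give a coverage map of the kind a planner could use to compare overlap settings.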

2:30pm - 2:45pm

Image LiDAR based change detection and updating for urban 3D reconstruction

Teng Wu, Bruno Vallet

Univ Gustave Eiffel, Géodata Paris, IGN, LASTIG, F-77454 Marne-la-Vallée, France

There is a high demand for accurate and up-to-date territorial digital twins for human activities, but their production and updating costs remain prohibitive for many applications. Their generation relies on acquiring LiDAR and/or image data over the territory of interest. Each modality has its advantages: LiDAR is more accurate but more costly, while images are noisier but less costly and more easily accessible. Combining these two technologies to produce and update digital twins is thus a promising avenue. In this paper, we propose a pipeline based on 3D change detection to update a LiDAR point cloud using newer aerial imagery. First, triangle meshes are generated from LiDAR data and image-based dense matching. Then, 3D ray tracing is used to detect changes. After removing the changed parts, the point clouds are fused to update the scene. The proposed method is demonstrated on two datasets in France. The code will be open source on GitHub: https://github.com/whuwuteng/ChangeUpdateJN.

2:45pm - 3:00pm

SF-Recon: Simplification-Free Lightweight Building Reconstruction via 3D Gaussian Splatting

Zihan Li, Tengfei Wang, Wentian Gan, Hao Zhan, Xin Wang, Zongqian Zhan

School of Geodesy and Geomatics, Wuhan University, China PR.

Lightweight building surface models are crucial for digital city, navigation, and fast geospatial analytics, yet conventional multi-view geometry pipelines remain cumbersome and quality-sensitive due to their reliance on dense reconstruction, meshing, and subsequent simplification. This work presents SF-Recon, a method that directly reconstructs lightweight building surfaces from multi-view images without post-hoc mesh simplification. We first train an initial 3D Gaussian Splatting (3DGS) field to obtain a view-consistent representation. Building structure is then distilled by a normal-gradient–guided Gaussian optimization that selects primitives aligned with roof and wall boundaries, followed by multi-view edge-consistency pruning to enhance structural sharpness and suppress non-structural artifacts without external supervision. Finally, a multi-view depth–constrained Delaunay triangulation converts the structured Gaussian field into a lightweight, structurally faithful building mesh. Based on a proposed SF dataset, the experimental results demonstrate that our SF-Recon can directly reconstruct lightweight building models from multi-view imagery, achieving substantially fewer faces and vertices while maintaining computational efficiency.

1:30pm - 3:00pm

IvS2: Canadian Advances in Geospatial AI for Intelligent and Resilient Mobility
Location: 716A

1:30pm - 1:45pm

Toward a Unified Geospatial Intelligence Framework Utilizing Edge Computing, IoT, and Multimodal Generative AI for Climate Risk Mitigation and Adaptive Evacuation Planning

Truong Thanh Hung Nguyen, Hung Cao

Analytics Everywhere Lab - University of New Brunswick, Canada

Climate-induced hazards are increasing in frequency and complexity, creating a pressing need for real-time, adaptive, and spatially aware decision-support systems. Existing climate monitoring and evacuation planning approaches often rely on centralized analytics and static geospatial products, which limit their ability to respond to rapidly evolving conditions. This research introduces a Unified Geospatial Intelligence Framework that integrates Edge Computing, Internet of Things (IoT) sensor networks, and Multi-Generative AI (GenAI) models to enhance climate risk mitigation and adaptive evacuation planning. The framework is conceptualized as an extension of the Intelligence Everywhere paradigm, which promotes pervasive, context-aware intelligence across distributed sensing and computational environments.

The framework fuses satellite imagery, UAV data, environmental IoT streams, mobility traces, and other geospatial sources into a multi-layer analytics ecosystem. IoT and edge nodes perform decentralized, low-latency inference for early hazard detection, ensuring resilience even under degraded network conditions. Multi-GenAI models—including generative geospatial models, large language models, and graph neural networks—provide predictive hazard analytics, uncertainty quantification, and scenario simulation to support proactive decision-making.

An adaptive evacuation module integrates real-time transportation data, connected vehicles, and mobility models to dynamically optimize evacuation routes as conditions evolve. Mobile platforms, such as drones and emergency vehicles, act as intelligent edge nodes, enriching situational awareness and enabling distributed coordination.

The proposed framework advances geospatial AI and disaster informatics by demonstrating how pervasive intelligence can significantly improve hazard detection, evacuation efficiency, and climate resilience.

1:45pm - 2:00pm

A Theoretical Framework for Environmental Similarity and Vessel Mobility as Coupled Predictors of Marine Invasive Species Pathways

Gabriel Spadon¹, Vaishnav Vaidheeswaran¹, Claudio DiBacco²

¹Faculty of Computer Science, Dalhousie University, Halifax - NS, Canada; ²Fisheries and Oceans Canada, Bedford Institute of Oceanography, Dartmouth - NS, Canada

Marine invasive species spread through global shipping and generate substantial ecological and economic impacts. Traditional risk assessments require detailed records of ballast water and traffic patterns, which are often incomplete, limiting global coverage. This work advances a theoretical framework that quantifies invasion risk by combining environmental similarity across ports with observed and forecasted maritime mobility. Climate-based feature representations characterize each port's marine conditions, while mobility networks derived from Automatic Identification System data capture vessel flows and potential transfer pathways. Clustering and metric learning reveal climate analogues and enable the estimation of species survival likelihood along shipping routes. A temporal link prediction model captures how traffic patterns may change under shifting environmental conditions. The resulting fusion of environmental similarity and predicted mobility provides exposure estimates at the port and voyage levels, supporting targeted monitoring, routing adjustments, and management interventions.

2:00pm - 2:15pm

Congestion-aware multi-agent reinforcement learning for wildfire evacuation routing

Bahareh Raei, Reza Safarzadeh, Xin Wang

University of Calgary, Canada

Wildfires are increasing in frequency and severity, placing growing pressure on communities and emergency management systems. When evacuations are ordered, large populations must move simultaneously over road networks never designed for such concentrated demand, particularly in small towns with only a few access corridors where delays or closures can sharply increase exposure to roadway hazards.

Evacuees often rely on everyday navigation apps that compute a fastest route for each driver. Although effective for routine travel, these systems optimise individual convenience rather than collective performance. When widely used during an emergency, they concentrate traffic onto the same nominally optimal links and offer little ability to reflect fire progression, road closures, or rapidly evolving congestion. As a result, standard navigation tools can unintentionally channel evacuees toward capacity-limited roads near advancing fire fronts.

This paper introduces a congestion-aware multi-agent reinforcement learning framework for wildfire evacuation. Operating on an OpenStreetMap-derived road graph and parcel-level building data for Lytton, British Columbia, each road junction hosts a Q-learning agent that learns exit-directed navigation policies and, during deployment, adjusts its decisions using penalties based on real-time edge usage and mapped fire zones. The framework formulates parcel-based evacuation as a distributed decision process and incorporates evolving congestion through traffic-aware batch routing. Through a detailed case study, we demonstrate substantial reductions in peak edge loading and fire-zone incursions compared with fastest-path routing while maintaining competitive travel distances.
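The idea of learning exit-directed policies per junction and then penalising busy or burning links at deployment can be sketched on a toy graph. The example below is a minimal illustration under several assumptions: the graph, the reward of -1 per edge, the penalty values, and the function names are hypothetical, and training uses deterministic sweeps over all state-action pairs instead of exploratory episodes so that the result is reproducible. It is not the authors' implementation.

```python
# Toy road graph: three equally long routes from A to EXIT; node F lies in a fire zone.
GRAPH = {"A": ["B", "C", "F"], "B": ["EXIT"], "C": ["EXIT"], "F": ["EXIT"], "EXIT": []}
FIRE_NODES = {"F"}


def train_q(graph, exit_node, alpha=0.5, gamma=1.0, sweeps=200):
    """Tabular Q-learning update applied in repeated sweeps over every edge."""
    q = {(s, a): 0.0 for s, nbrs in graph.items() for a in nbrs}
    for _ in range(sweeps):
        for (s, a) in q:
            # Value of the best onward move from the next junction (0 at the exit).
            future = 0.0 if a == exit_node else max(q[(a, b)] for b in graph[a])
            target = -1.0 + gamma * future  # reward of -1 per traversed edge
            q[(s, a)] += alpha * (target - q[(s, a)])
    return q


def route_batch(graph, q, start, exit_node, n_evacuees, fire_nodes,
                congestion_penalty=0.5, fire_penalty=10.0):
    """Route evacuees one by one, penalising used edges and fire-zone nodes."""
    usage = {edge: 0 for edge in q}
    routes = []
    for _ in range(n_evacuees):
        node, path = start, [start]
        while node != exit_node:
            def score(nxt, node=node):
                s = q[(node, nxt)] - congestion_penalty * usage[(node, nxt)]
                return s - fire_penalty if nxt in fire_nodes else s
            nxt = max(graph[node], key=score)  # ties go to the first listed edge
            usage[(node, nxt)] += 1
            node = nxt
            path.append(node)
        routes.append(path)
    return routes, usage


q_table = train_q(GRAPH, "EXIT")
aware_routes, aware_usage = route_batch(GRAPH, q_table, "A", "EXIT", 4, FIRE_NODES)
# Baseline: same policy with both penalties switched off (pure shortest path).
_, naive_usage = route_batch(GRAPH, q_table, "A", "EXIT", 4, FIRE_NODES, 0.0, 0.0)
print("peak load with penalties:", max(aware_usage.values()))
print("peak load without penalties:", max(naive_usage.values()))
```

In this toy case the penalised routing splits the four evacuees evenly over the two safe routes and keeps them out of the fire zone, while the unpenalised baseline sends all of them over the same edge.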

2:15pm - 2:30pm

Exploring Bus Stop Passenger Ridership Using Explainable Machine Learning

Ge Cui

University of New Brunswick, Canada

Over the past decade, promoting sustainable urban transportation has become increasingly important in North America due to growing populations and rising traffic congestion. Public transit, particularly bus systems, plays a critical role in reducing reliance on private vehicles. This study examines bus stop ridership in Fredericton, Canada, considering several explanatory variables, including public transit infrastructure, socio-economic factors, and local amenities. XGBoost was used to model the relationship between ridership and these variables, and SHAP was applied to quantify the contribution of each feature for enhancing interpretability. Results indicate that higher levels of bus service, specifically the number of bus routes and service frequency, are the most influential factors, showing strong positive associations with ridership. Other transportation infrastructure features, such as the availability of shelters, also have a positive impact. The findings suggest that strategically locating bus stops near high-amenity areas and well-planned bus transfer hubs can attract more passengers. Additionally, distributing bus hubs more evenly could help alleviate the exceptionally high volume at the current bus hub at Kings Place. By combining XGBoost and SHAP, this study provides both accurate predictions and transparent insights, supporting urban planners in optimizing public transit systems and promoting sustainable mobility.

2:30pm - 2:45pm

Advancing Geospatial Analysis with Foundation Models and LLMs in ArcGIS

Mohamed Ahmed

Esri Canada, Canada

Foundation models and large language models (LLMs) are rapidly transforming geospatial artificial intelligence, yet their effective use in operational remote sensing and GIS workflows remains insufficiently defined. Although these models offer strong generalization capabilities, a key challenge is translating them into robust, domain-relevant tools that support practical analysis and decision-making. This presentation addresses that gap by showing how foundation models and LLMs can be integrated into ArcGIS workflows to improve the extraction, interpretation, and use of information from Earth observation imagery and unstructured geospatial content.

Using examples based on models such as the Segment Anything Model (SAM), Prithvi, and other foundation models for image segmentation and Earth observation analysis, the session demonstrates how these architectures can support feature extraction, land-cover classification, hazard mapping, and related remote sensing tasks with reduced reliance on large labelled datasets. In parallel, the presentation examines how LLMs extend geospatial analysis beyond imagery through natural-language interaction, geospatial reasoning, entity extraction, and the synthesis of spatially relevant information from unstructured sources.

A central focus of the session is the adaptation of general-purpose models to geospatially specific problems. The presentation therefore highlights efficient fine-tuning strategies, including Low-Rank Adaptation (LoRA), as practical mechanisms for customizing foundation models to local environments, imagery characteristics, and application domains without the computational burden of full retraining. Through applied examples in ArcGIS, the session illustrates how these models can be combined into scalable workflows that reduce manual effort, accelerate analysis, and enhance the quality and usability of geospatial outputs for research and operational practice.

1:30pm - 3:00pm

Forum2B: The Future of Space-based Earth Observation
Location: 716B

1:30pm - 3:00pm

Forum7A: Entrepreneurship in the Industry 4.0 Geospatial Landscape
Location: 717A

1:30pm - 3:00pm

InS3: Industry Tech Session
Location: 717B

1:30pm - 5:00pm

General Assembly (tentative session)
Location: 701A

3:00pm - 3:30pm

Afternoon Coffee Break
Location: Exhibition Hall "E"

3:30pm - 5:15pm

ThS12: TLS-based Deformation Analysis
Location: 713A

3:30pm - 3:45pm

Complementing and validating uncertainty of terrestrial laser scanning via interval analysis

Reza Naeimaei, Steffen Schön

Institut für Erdmessung (IfE), Leibniz University Hannover, Hannover, Germany

Terrestrial laser scanning (TLS) enables dense spatial sampling; however, millimeter-level deformation analysis is limited by uncertainty rather than resolution, as inter-epoch differences can arise from actual change or residual systematic effects. Classical methods capture random variability under distributional assumptions but do not guarantee bounds for persistent systematic effects.

This paper presents a complementary interval-based framework that provides reliable, distribution-free bounds for TLS uncertainty and integrates seamlessly with least-squares workflows.

Starting from a measurement and instrumental correction model for high-end panoramic scanners, deviations of effective parameters are propagated to TLS observations and represented as interval radii at the observation level. We then extend the least-squares adjustment to map observation-level interval bounds linearly to residuals and parameter estimates, providing conservative first-order enclosures alongside stochastic covariances.
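A linear mapping of observation interval radii through a least-squares adjustment can be sketched as follows. With the estimator x = K l, where K = (AᵀA)⁻¹Aᵀ, an observation error bounded by |Δl_i| ≤ r_i gives the parameter bound |Δx_j| ≤ Σ_i |K_ji| r_i, and the residuals v = (AK − I) l are bounded in the same way. The example assumes unit weights, a two-parameter line fit, and invented radii of 1 mm; it is a generic illustration, not the authors' formulation.

```python
def lsq_interval_radii(design, radii):
    """Propagate observation interval radii through a 2-parameter LSQ fit.

    design -- n x 2 design matrix A (unit weights assumed)
    radii  -- interval radius of each observation
    Returns (parameter_radii, residual_radii).
    """
    n = len(design)
    # Normal matrix N = A^T A and its closed-form 2 x 2 inverse.
    n11 = sum(a[0] * a[0] for a in design)
    n12 = sum(a[0] * a[1] for a in design)
    n22 = sum(a[1] * a[1] for a in design)
    det = n11 * n22 - n12 * n12
    if det == 0:
        raise ValueError("design matrix is rank deficient")
    inv = [[n22 / det, -n12 / det], [-n12 / det, n11 / det]]
    # K = N^-1 A^T maps observations to parameter estimates.
    k = [[inv[j][0] * design[i][0] + inv[j][1] * design[i][1] for i in range(n)]
         for j in range(2)]
    param_radii = [sum(abs(k[j][i]) * radii[i] for i in range(n)) for j in range(2)]
    # Residuals v = (A K - I) l, bounded row by row with absolute values.
    resid_radii = []
    for i in range(n):
        row = [design[i][0] * k[0][m] + design[i][1] * k[1][m] - (1.0 if m == i else 0.0)
               for m in range(n)]
        resid_radii.append(sum(abs(row[m]) * radii[m] for m in range(n)))
    return param_radii, resid_radii


# Hypothetical line fit l = a + b * t at t = -1, 0, 1 with 1 mm interval radii.
A = [[1.0, -1.0], [1.0, 0.0], [1.0, 1.0]]
obs_radii = [0.001, 0.001, 0.001]
param_bounds, resid_bounds = lsq_interval_radii(A, obs_radii)
print(param_bounds, resid_bounds)
```

These bounds are worst-case enclosures that hold for any error within the stated radii, so they can be reported next to the covariance matrix, which describes random variability only.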

Validation without a trusted nominal is addressed via a residual-based strategy that exploits two-face (Face 1/Face 2) acquisitions. The paper outlines the challenges of validating intervals when no nominal values exist, gives guidance on addressing them to ensure fair validation, and tests the proposed method on real TLS data. Overall, the proposed framework provides guaranteed bounds for remaining effects, improves discrimination between actual deformation and systematic effects, and offers actionable diagnostics for TLS-based monitoring.
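As a rough illustration of how observation-level interval radii can be mapped linearly to parameters and residuals, the following Python sketch uses a generic unit-weight least-squares estimator for a two-parameter line model. The design matrix, the 1 mm radii and all function names are illustrative assumptions; this is not the authors' scanner model or implementation.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def interval_radii(design, obs_radii):
    """Map observation interval radii to parameter and residual radii.

    Assumes unit weights and exactly two parameters (hypothetical setup).
    With x_hat = K l and v = (A K - I) l, a first-order enclosure is
    |K| r for the parameters and |A K - I| r for the residuals.
    """
    at = transpose(design)
    k = matmul(inverse_2x2(matmul(at, design)), at)
    ak = matmul(design, k)
    n = len(design)
    m = [[ak[i][j] - (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    param_r = [sum(abs(c) * r for c, r in zip(row, obs_radii)) for row in k]
    resid_r = [sum(abs(c) * r for c, r in zip(row, obs_radii)) for row in m]
    return param_r, resid_r

# Line y = a + b*t observed at t = 0, 1, 2, each with an assumed radius of 1 mm.
design = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
param_r, resid_r = interval_radii(design, [0.001, 0.001, 0.001])
print(param_r, resid_r)
```

The radii grow with the absolute values of the estimator's coefficients, so the enclosure is conservative: it bounds the worst-case combination of systematic deviations rather than their statistical spread.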

3:45pm - 4:00pm

Point-based, profile-based and 3D point cloud-based vibration monitoring of structures: comparisons based on a lab experiment

Oliver Geißendörfer¹, Victoria Rosa², Hans-Berndt Neuner², Christoph Holst¹

¹Technical University of Munich, Germany; ²Technical University of Vienna, Austria

The safety and longevity of civil infrastructure rely on robust structural health monitoring (SHM), yet conventional methods are often constrained by the high cost and impracticality of contact-based sensors. On the other hand, existing non-contact technologies typically specialize in either static geometric mapping or spatially limited dynamic vibration analysis, leading to fragmented data and complex post-processing. This research presents a unified non-contact methodology that addresses this challenge by simultaneously acquiring high-resolution 3D geometry and time-series vibrational data using a single Light Detection and Ranging (LiDAR) device. For this purpose, we compare point-based measurements using a total station and an iPhone, along with a profile-based LiDAR and 3D LiDAR point clouds, in an experimental analysis. Sensor observations are recorded and analyzed at the same location on the experimental surface, showing flexibility in input dimensionality as well as robustness in the resulting scalograms. The core of the analysis is our developed method, a directional wavelet transform, a signal processing technique uniquely suited to handling non-stationary signals as multidimensional unstructured data. This method enables the characterization of oscillations across the unstructured 3D surface, a capability beyond traditional modal analysis with one-dimensional time-frequency localization, but using LiDAR point cloud time series. The result is a richer and more integrated understanding of structural behavior, capable of revealing vibration behavior in high spatial detail. The study demonstrates that spatio-temporal LiDAR data contains embedded dynamic information, offering a more comprehensive and efficient way to assess the health and integrity of a structure in the future.

4:00pm - 4:15pm

From tensor-product to truncated hierarchical B-splines: Enhancing spatial Resolution in space-continuous Deformation Analysis based on 3D point clouds

Elisabeth Ötsch, Hans Neuner

TU Wien, Department of Geodesy and Geoinformation, Austria

The quasi-continuous capturing of our environment by terrestrial laser scanning (TLS) in the form of 3D point clouds provides the basis for numerous spatial analyses, including space-continuous deformation analysis. In times of aging infrastructure and climate change-induced, cumulative mass movements, statistically sound methods for determining areal deformations are becoming increasingly important. However, the lack of reproducibility of absolute point positions between consecutive scans and the presence of measurement noise demand approaches that retrieve credible comparison statements. The representation of point clouds by geometric surfaces supports noise reduction and serves as a basis for successive analysis. Tensor-product B-spline surfaces have proven to be particularly versatile geometric representations to derive spatially consistent deformation estimates. This paper extends this concept by investigating the use of truncated hierarchical B-splines for statistically sound deformation analysis. We show that deformation is detectable when partition of unity is preserved through truncation. In a simulated environment, significant deformations between two point clouds were successfully detected. Results indicate that coarse surface representations lead to type-1 errors and underestimated deformation magnitudes, whereas more refined surface representations yield consistent deformation estimates, providing a potential termination criterion for adaptive model refinement.
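The partition-of-unity property mentioned in this abstract can be checked numerically for the ordinary tensor-product case. The Python sketch below evaluates cubic B-spline basis functions with the standard Cox-de Boor recursion and sums their products at one parameter location. The knot vector and evaluation point are arbitrary assumptions, and the sketch covers only the tensor-product basis, not the truncation step of hierarchical B-splines.

```python
def bspline_basis(i, p, knots, u):
    """Cox-de Boor recursion for the i-th B-spline basis function of degree p."""
    if p == 0:
        return 1.0 if knots[i] <= u < knots[i + 1] else 0.0
    left = right = 0.0
    d1 = knots[i + p] - knots[i]
    if d1 > 0:
        left = (u - knots[i]) / d1 * bspline_basis(i, p - 1, knots, u)
    d2 = knots[i + p + 1] - knots[i + 1]
    if d2 > 0:
        right = (knots[i + p + 1] - u) / d2 * bspline_basis(i + 1, p - 1, knots, u)
    return left + right

def tensor_product_sum(p, knots_u, knots_v, u, v):
    """Sum of all tensor-product basis functions N_i(u) * N_j(v) at (u, v)."""
    n_u = len(knots_u) - p - 1
    n_v = len(knots_v) - p - 1
    return sum(
        bspline_basis(i, p, knots_u, u) * bspline_basis(j, p, knots_v, v)
        for i in range(n_u)
        for j in range(n_v)
    )

# Clamped cubic knot vector on [0, 1) with one interior knot (assumed values).
KNOTS = [0.0, 0.0, 0.0, 0.0, 0.5, 1.0, 1.0, 1.0, 1.0]
total = tensor_product_sum(3, KNOTS, KNOTS, 0.3, 0.7)
print(total)
```

Because the basis functions sum to one, a fitted surface is an affine combination of its control points, which is the property the truncation mechanism is designed to retain across refinement levels.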

4:15pm - 4:30pm

Towards a Framework for Benchmarking Dense 3D Displacement Estimation Approaches for Geomonitoring Using Long-Range TLS Data

Nicholas Meyer, Tomislav Medic, Andreas Wieser

Institute of Geodesy and Photogrammetry, ETH Zurich, Switzerland

Accurate and spatially dense 3D displacement estimation can contribute to a better understanding of geomorphological processes, while long-range terrestrial laser scanning (LR-TLS) has emerged as a promising technique for generating such observations. However, selecting the most effective algorithms for dense 3D displacement estimation remains challenging due to the lack of benchmarking. This study introduces an open and extensible benchmarking framework for 3D displacement estimation and provides an initial validation through a systematic comparison of representative 2D projection-based and 3D point cloud--based methods for estimating 3D displacements from LR-TLS scans. The evaluation includes 252 combinations of algorithmic and hyperparameter configurations, covering cross-correlation, optical flow, and salient feature tracking approaches, as well as the 3D displacement estimation method F2S3. All methods were benchmarked on a single common LR-TLS dataset, using sparse GNSS and manually derived displacements as ground truth. Results show that F2S3 achieves the highest agreement with the ground truth, while the top-performing configurations of the 2D approaches reach comparable accuracy, albeit slightly lower than that of F2S3. Our findings further highlight key sensitivities of current methods to parameter choices and data characteristics. The presented open and extensible evaluation framework enables reproducible performance assessment and could provide a foundation for future large-scale benchmarking and further development of 3D displacement estimation techniques for LR-TLS data.

4:30pm - 4:45pm

Joint Stone Segmentation and Feature Driven Deformation Analysis at Water Dams

Annika Tobies, Judith Foth, André Cornelißen, Eike Koller, Lasse Klingbeil, Heiner Kuhlmann

Institute of Geodesy and Geoinformation, University of Bonn, Germany

Structural health monitoring of water dams is crucial to ensure their long-term safety and operational reliability. Traditional geodetic techniques, although precise, are limited to sparse observation points and cannot capture spatially heterogeneous deformations. Laser scanning enables comprehensive, area-wide acquisition, overcoming this limitation. Subsequent deformation analysis often relies on comparisons along the local surface normal, which are limited in detecting in-plane movements. To address this, this study presents an approach that combines image-based stone segmentation with point-cloud-based deformation analysis to estimate both in-plane and out-of-plane displacements across masonry dam surfaces. Individual stones are detected in unmanned aerial vehicle (UAV) imagery using a deep learning segmentation model (Mask R-CNN) and subsequently projected into corresponding point clouds acquired by terrestrial laser scanning (TLS) and UAV laser scanning. By establishing consistent stone correspondences across multi-epoch point clouds via centroid-based matching and local iterative closest point (ICP) alignment, the proposed method enables deformation analysis on a stone-by-stone level. Simulated deformations were applied to TLS- and UAV-based point clouds of a dam to evaluate the method. Results demonstrate that the approach achieves sub-centimeter accuracy for the TLS and low-centimeter accuracy for the UAV point cloud, as measured by the RMSE between the estimated and true deformation. Our approach outperforms conventional model-to-model comparison methods, such as Multiscale Model to Model Cloud Comparison (M3C2), for in-plane displacements. The integration of image segmentation and geometric analysis provides a powerful framework for full-field deformation monitoring of masonry structures, supporting the detection of instabilities and improving dam safety.
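The centroid-based correspondence step can be pictured with a small Python sketch. It greedily pairs each stone in one epoch with the nearest unused stone centroid in the other epoch and reports the centroid shift. The data layout (dictionaries of stone IDs to 3D points), the distance threshold and the greedy strategy are assumptions made for illustration. The local ICP refinement described in the abstract is omitted.

```python
import math

def centroid(points):
    """Mean position of a list of (x, y, z) tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def match_stones(epoch_a, epoch_b, max_dist):
    """Greedy nearest-centroid matching between two epochs.

    epoch_a, epoch_b: dict mapping a stone ID to its list of 3D points.
    Returns {id_a: (id_b, centroid displacement vector)}.
    """
    cents_b = {sid: centroid(pts) for sid, pts in epoch_b.items()}
    used = set()
    matches = {}
    for sid_a, pts in epoch_a.items():
        ca = centroid(pts)
        best, best_d = None, max_dist
        for sid_b, cb in cents_b.items():
            if sid_b in used:
                continue
            d = math.dist(ca, cb)
            if d <= best_d:
                best, best_d = sid_b, d
        if best is not None:
            used.add(best)
            shift = tuple(b - a for a, b in zip(ca, cents_b[best]))
            matches[sid_a] = (best, shift)
    return matches

# Two synthetic square "stones"; the second epoch shifts them slightly in-plane.
epoch_a = {
    "s1": [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)],
    "s2": [(5, 0, 0), (6, 0, 0), (5, 1, 0), (6, 1, 0)],
}
epoch_b = {
    "t1": [(5.02, 0, 0), (6.02, 0, 0), (5.02, 1, 0), (6.02, 1, 0)],
    "t2": [(0, 0.01, 0), (1, 0.01, 0), (0, 1.01, 0), (1, 1.01, 0)],
}
matches = match_stones(epoch_a, epoch_b, max_dist=0.5)
print(matches)
```

Because each stone is tracked as an object, the recovered shift includes motion within the wall plane, which a comparison along the local surface normal would largely miss.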

4:45pm - 5:00pm

Reducing Non-rigidity in TLS Point Clouds Induced by Inhomogeneous Systematic Errors Using Free-form Surface Modeling

Yihui Yang¹, Corinna Harmening², Daniel Czerwonka-Schröder³, Christoph Holst¹

¹Chair of Engineering Geodesy, TUM School of Engineering and Design, Technical University of Munich, Germany; ²Geodetic Institute, Karlsruhe Institute of Technology, Germany; ³Department of Geodesy, Bochum University of Applied Sciences, Germany

In geodetic monitoring, terrestrial laser scanning (TLS) point clouds are typically assumed to be accurate and true-to-scale, implying that data acquired from different epochs or stations differ only by rigid transformations. Consequently, systematic errors related to scanner or platform variations can be mitigated through rigid point cloud registration. However, variations in the propagation speed and path of laser beams due to atmospheric refraction, as well as ranging biases induced by surface properties, can introduce non-rigid distortions in the generated point clouds. These effects are particularly pronounced under complex meteorological and topographic conditions, such as in mountainous areas. As a result, the acquired point clouds exhibit inhomogeneous and non-linear deviations that cannot be effectively compensated by simple distance corrections or rigid transformations. In this study, robust rigid registration is first performed to minimize the effects of platform offsets. A data-driven approach is then employed to generate sparse stable points, providing distance deviations that incorporate spatially varying systematic errors. Finally, a free-form surface is fitted to these sparse point-wise distance deviations, thereby establishing a 3D correction field for the entire point cloud. For a dataset collected by a permanent TLS monitoring system in the Vals Valley (Tyrol, Austria), the proposed method effectively reduces the registration residuals in TLS point clouds caused by inhomogeneous systematic errors.

5:00pm - 5:15pm

Calibration of Panoramic Terrestrial Laser Scanners using Planar Patches

Eike Koller¹, Lasse Klingbeil², Heiner Kuhlmann³

¹University of Bonn, Germany; ²University of Bonn, Germany; ³University of Bonn, Germany

Using point clouds captured by Terrestrial Laser Scanners for measurement tasks with high-quality requirements is well established in engineering geodesy. However, geometric imperfections within the scanners introduce systematic deviations into the captured point clouds. These deviations often reach several millimeters in magnitude, exceeding the impact of random measurement noise. Calibrating the scanners by estimating these internal imperfections allows these systematic errors to be corrected, thereby preventing misinterpretations of the measurement results. In this work, we develop a methodology that allows users of Terrestrial Laser Scanners to independently determine calibration parameters for panorama scanners and to correct the resulting point clouds using planar patches extracted directly from the captured data. This approach requires no additional hardware or specialized measurement equipment. We evaluate the methodology using an independent point cloud of a water dam and demonstrate that it achieves a substantial reduction in systematic deviations. Furthermore, by estimating calibration parameters in a dedicated state-of-the-art calibration field, we show that our method delivers results comparable to these established calibration procedures—yet without the need for such specialized calibration environments.

5:15pm - 5:30pm

Methodological framework for determining vertical angular variances of terrestrial laser scanners

Jakob Hummelsberger¹, Omar AbdelGafar¹, Derek Lichti², Christoph Holst¹

¹Chair of Engineering Geodesy, TUM School of Engineering and Design, Technical University of Munich, Munich, Germany; ²Department of Geomatics Engineering, Schulich School of Engineering, University of Calgary, Calgary, Canada

Information on the precision of TLS observables is limited. While the range measurement precision can be modeled with respect to the intensity measurement nowadays, the precision of the angular observations still relies on the claims of the manufacturer. This contribution proposes a method to determine the vertical angular variance of a TLS using profile measurements. Supported by a simulation, which serves as proof-of-concept, the methodology is laid out. In the end, measurements with a Z+F IMAGER® 5016A are evaluated. A dependency of the angular standard deviation on the rotational speed of the beam deflection unit is observed. The estimation precision of the angular standard deviation is high, with consistent values for differing ranges. The estimated angular standard deviations are much lower than the claims of the manufacturer, starting with roughly 2" for the slowest rotating settings, up to 4" for the fastest. All this can be achieved by scanning a reflectivity target with at least two adjacent fields of different homogeneous reflectivity. The target needs to be aligned to the scanner to reduce or eliminate as many contributing error sources as possible. The target itself provides the fields and the transitions needed to perform the in-situ estimation of the angular precision.

3:30pm - 5:15pm

WG IV/2B: Artificial Intelligence and Uncertainty Modeling in Spatial Analysis
Location: 713B

3:30pm - 3:45pm

Chat2Map: A ReAct-based Agent Framework for Automated Web Map Generation from Natural Language Instructions

Hongping Zhang¹, Peilong Ma², Cong Wang¹, Lei Ding¹, Zhen Wang¹, Heng Li¹

¹National Geomatics Center of China, China, People's Republic of; ²Nanjing Normal University, School of Geography, Nanjing, Jiangsu, China

WebGIS platforms have revolutionized geospatial data dissemination, yet their adoption remains constrained by the steep learning curve of mapping library APIs. Frontend libraries like Leaflet, OpenLayers, and platforms such as Tianditu contain hundreds of classes and methods, requiring substantial programming expertise. This technical barrier prevents domain experts (urban planners, environmental scientists, public health officials) from independently creating the visualizations they need for analysis and decision-making. While Large Language Models (LLMs) have revolutionized code generation, they struggle with domain-specific, low-resource APIs common in geospatial applications. When applied to specialized geospatial APIs, these models exhibit critical failures: they frequently "hallucinate" non-existent functions, misuse parameters, or generate syntactically plausible but semantically incorrect code. This unreliability stems from the underrepresentation of domain-specific libraries in LLMs' training corpora, creating a "last mile" problem that renders them unsuitable for professional geospatial development. This study proposes a ReAct-based agent framework for automated web map generation from natural language instructions. The framework constructs a stateful, cyclic workflow and enables human–AI interactive WebGIS code generation based on the Tianditu JavaScript API. Its effectiveness and generality are validated through multi-model evaluation (GPT-4, Claude 3, Llama 3, Qwen-Max), demonstrating robust performance across diverse application scenarios. Experimental results show that the framework achieves professional-grade quality in both directive-driven and data-driven geospatial visualization tasks.

3:45pm - 4:00pm

Bridging Human Intent and Geospatial Services: A Conceptual Framework and Feasibility Study for Text2GeoAPI

Lei Ding, Heng Li

National Geomatics Center of China, 100830 Beijing, China

With the proliferation of online geospatial services, Geospatial Application Programming Interfaces (GeoAPIs) have become the backbone of modern spatial data interoperability. However, the high technical barriers of GeoAPIs, characterized by complex RESTful syntax and deterministic parameter requirements, create a significant "digital divide" for non-expert users. To bridge the gap between intuitive human spatial intent and technical service execution, this study proposes Text2GeoAPI, a novel conceptual framework for the automatic invocation and composition of geospatial services via natural language. We introduce the Intent-Entity-Operation (IEO) model to formalize spatial tasks, decoupling high-level semantic goals from atomic technical operations. We developed a modular prototype leveraging Large Language Models (LLMs) as cognitive engines to perform structured intent parsing, dynamic workflow planning, and multi-source result synthesis. Experimental evaluations using 100 diverse spatial queries demonstrate an overall task success rate of 86%, with the system effectively orchestrating multi-hop service chains (e.g., Geocoding → Isochrone Analysis → POI Search). The results confirm that Text2GeoAPI significantly lowers the threshold for accessing professional geospatial analysis, shifting the GIS paradigm from "tool-centric" to "intent-centric" intelligence.

4:00pm - 4:15pm

AI for Inclusive Winter Mobility: Multimodal Integration for Detecting Barriers Affecting People with Disabilities

Sara Shahsavarani, Mir Abolfazl Mostafavi

¹Center for Research in Geospatial Data and Intelligence (CRDIG), Department of Geomatics Sciences, Université Laval, 1055, Avenue du Séminaire, Quebec City, QC G1V 0A6, Canada; ²Center for Interdisciplinary Research in Rehabilitation and Social Integration (Cirris), Quebec City, QC G1M 2S8, Canada

Winter accessibility poses critical challenges in cold-climate cities such as Québec, where snow and ice accumulation restrict the mobility of people with disabilities. This study presents an AI-driven multimodal framework designed to detect, classify, and map winter barriers affecting pedestrian accessibility in Québec City. Building upon the SNOWMAN project, synthetic image and textual datasets were developed to represent seven major snow- and ice-related obstacle categories, including icy ruts, deep snow, and uncleared sidewalks. The visual modality employed a self-supervised SimCLR model for snow-barrier classification (F1-score = 0.93), while the textual modality used a fine-tuned BERT classifier, achieving a perfect F1-score = 1.00 on validated synthetic descriptions. Canonical Correlation Analysis (CCA) aligned the two modalities into a shared latent space, enabling spatial fusion of visual and semantic embeddings for integrated analysis within the MobiliSIG Winter Mobility platform. The fused data produced dynamic accessibility maps revealing clusters of recurring winter hazards in known high-risk zones. The results confirm the feasibility of using synthetic multimodal data to simulate pedestrian-scale winter conditions and demonstrate the potential of multimodal AI for inclusive, data-driven mobility management in cold-climate cities.

4:15pm - 4:30pm

Assessing residential Land Efficiency with spatial–contextual GMM and human Activity big Data: a Case Study of Shenzhen

Shihao Liang¹, Yixin Liu¹, Renzhong Guo¹, Weixi Wang¹, Ding Ma¹, Ye Zheng², Shengjun Tang¹, Linfu Xie¹, Xiaoming Li¹

¹Research Institute for Smart Cities & MNR Key Laboratory of Urban Land Resources Monitoring and Simulation, School of Architecture and Urban Planning, Shenzhen University; ²Faculty of Electrical Engineering and Computer Science, Ningbo University, Ningbo, 315211, China

As China’s urban development shifts toward stock-based optimisation, identifying inefficient residential land has become important for urban regeneration. Existing approaches often rely on subjective weighting, linear analytical structures, or homogeneous treatment of different residential types, which weakens robustness and transferability. To address these limitations, this study proposes a data-driven framework that integrates mobile-phone signaling and other multi-source spatiotemporal big data in Shenzhen. Two dominant residential forms (formal residential communities and urban villages) are evaluated separately through a four-dimensional framework covering built form, activity vitality, economic efficiency, and environmental livability. Principal component analysis is used to estimate intrinsic dimensionality and initialize a parametric autoencoder. A spatially constrained Gaussian mixture model is then employed to identify inefficient residential clusters while preserving local coherence. The clustering results are interpreted using a random forest model and TreeSHAP, and externally validated by street-view imagery interpretation and limited field surveys. PCA retained five components for urban villages and six for formal residential communities, and the BIC selected six and five clusters for the two residential types, respectively. The results indicate that inefficient formal residential communities show scattered and island-like spatial patterns, whereas inefficient urban villages tend to form more continuous clusters along the edges of larger village agglomerations. Random forest and TreeSHAP further reveal that inefficient urban villages are more strongly associated with deficiencies in service accessibility and local socioeconomic conditions, whereas inefficient formal residential communities are more closely associated with lower residential vitality and relatively high development intensity. External validation indicates acceptable agreement with observed residential conditions.

4:30pm - 4:45pm

Reproducing Geospatial Crowdsourcing: How Consistent Is the Crowd?

David Collmar, Volker Walter, Uwe Sörgel, Roland Ullmann

University of Stuttgart, Germany

This paper investigates the long-term consistency and reliability of paid geospatial crowdsourcing on the online platform Microworkers.com. Over a five-month period, we conducted three crowdsourcing campaigns, each representing a task typical for remote sensing, i.e., pixel classification, point selection, and geometric outline acquisition, to assess whether repeated worker participation enhances data quality and reproducibility. Beyond individual task performance, we examine the broader question of whether crowdsourcing campaigns can yield reproducible results over extended periods. Despite the large and heterogeneous workforce of Microworkers.com, a substantial share of tasks was completed by recurring workers who consistently outperformed one-time participants. Furthermore, across all campaigns, data quality remained largely stable, with only minor variability between epochs. Additionally performed statistical analyses confirm that reproducible outcomes are achievable, highlighting the potential of reliable and reproducible crowdsourcing results for geospatial data acquisition.

4:45pm - 5:00pm

Shaping the Colonial Port: Urban Networks and Spatial Form in the Early Modern Era

Chaoqun Wang, Jie He

Harbin Institute of Technology, Shenzhen, China, People's Republic of

This abstract presents a comprehensive research framework examining the interplay between colonial trade networks and the spatial form of port cities during the early modern era. Firstly, the study constructs a geographic database of nearly 300 colonial port cities, using intercity trade data from East India Company archives as network edges to analyze their structural and morphological evolution. Secondly, it processes historical maps of colonial ports through a fine-tuned multimodal large language model to extract and classify spatial morphological features, establishing a systematic typology of urban form patterns. Thirdly, the research develops regression models to reveal correlations between network status and morphological patterns. Preliminary findings highlight Batavia's dominant yet volatile role within the network and reveal a trend toward decentralization over the 18th century. The research contributes to both urban historical studies and digital humanities by offering a scalable, comparative approach to interpreting colonial port cities as spatial manifestations of global economic and political forces, while establishing empirical relationships between network status and urban form characteristics. It further provides a refined framework for contextualizing their cultural heritage significance within trans-colonial networks.

5:00pm - 5:15pm

Vector generalization of the drainage network

Thalles Aquino¹, Edilson Bias¹, Maurício Paulo², Raul Feitosa³, Felipe Ferrari²

¹University of Brasília, Brazil; ²Institute of Engineering, Rio de Janeiro, Brazil; ³Pontifical Catholic University, Rio de Janeiro, Brazil

This study explores the application of Graph Convolutional Networks (GCNs), specifically the GraphSAGE model, to the cartographic generalization of hydrographic networks in the state of Santa Catarina, Brazil. The generalization of river segments is critical for transitioning from detailed (1:25,000) to generalized (1:100,000) scales and is traditionally a manual, rule-based process. By modeling drainage systems as graphs and training deep learning models with data from the Brazilian Army's Geospatial Database (BDGEx), this research evaluates how geometric and semantic attributes influence generalization outcomes. These data follow the Brazilian Technical Specifications of the Geospatial Vector Data Structure (ET-EDGV) and therefore count as systematic data from Brazilian institutions.

The GraphSAGE model was trained four times, each time with a different combination of attributes such as segment length, sinuosity, polygon containment, and river flow regime. The model trained with all attributes achieved the highest accuracy (99.98%). Even models using geometric features surpassed 93% accuracy. These results highlight the effectiveness of GCNs in capturing structural patterns.

This study compares GraphSAGE model outputs to those generated by the GeoData Loader for Mapserver (GDLMS), the current operational system for generalization, developed and used by the Geographic Service of the Brazilian Army. It also compares these generalizations to reference data acquired by manual generalization using the same 1:25,000 scale input. Visual analysis in GIS environments reveals that GCNs can be an alternative for generalization tasks.

This research demonstrates the viability of using GeoAI methods for automating complex cartographic processes, offering a scalable and data-driven solution aligned with national geospatial data standards.
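Two of the geometric attributes named in this abstract, segment length and sinuosity, are simple to compute from a polyline. The Python sketch below uses the common definition of sinuosity as path length divided by the straight-line distance between the endpoints. The abstract does not state which definition was used, so this choice and the function names are assumptions.

```python
import math

def segment_length(coords):
    """Total polyline length for a list of (x, y) vertices."""
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))

def sinuosity(coords):
    """Path length divided by endpoint-to-endpoint distance (assumed definition)."""
    chord = math.dist(coords[0], coords[-1])
    if chord == 0:
        return math.inf  # closed loop: the ratio is undefined, reported as infinity
    return segment_length(coords) / chord

straight = [(0.0, 0.0), (3.0, 4.0)]
bent = [(0.0, 0.0), (3.0, 4.0), (6.0, 0.0)]
print(sinuosity(straight), sinuosity(bent))
```

A value of 1 indicates a straight segment, and larger values indicate a more winding course. Such per-segment values could serve as node features in a graph model.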

3:30pm - 5:15pm

WG III/4A: Landuse and Landcover Change Detection
Location: 714A

3:30pm - 3:45pm

ChangeDINO: DINOv3-Driven Building Change Detection in Optical Remote Sensing Imagery

Ching-Heng Cheng¹, Chih-Chung Hsu²

¹National Cheng Kung University, Tainan, Taiwan; ²National Yang Ming Chiao Tung University, Hsinchu, Taiwan

Remote sensing change detection (RSCD) aims to identify pixel-wise surface changes from co-registered bi-temporal images. However, many deep learning–based RSCD methods rely solely on change-map annotations and underuse the semantic information in non-changing regions, which limits robustness under illumination variation, off-nadir views, and scarce labels.

This paper presents ChangeDINO, an end-to-end multiscale Siamese framework for optical building change detection. The model fuses a lightweight backbone stream with features transferred from a frozen DINOv3, yielding semantic- and context-rich pyramids even on small datasets. A spatial–spectral differential transformer decoder then exploits multi-scale absolute differences as change priors to highlight true building changes and suppress irrelevant responses. Finally, a learnable morphology module refines the upsampled logits to recover clean boundaries. Experiments on four public benchmarks demonstrate that ChangeDINO achieves strong accuracy and robustness under cross-temporal appearance variations, yielding cleaner building boundaries with improved data efficiency.

3:45pm - 4:00pm

Hie-DinoMamba: Hierarchical DINOv3 and Mamba Architecture for Multi-Class Building Change Detection

Youngwoong Yoon¹, Jangwoo Cheon¹, Hwiyoung Kim¹, Impyeong Lee²

¹Geospatial Team, Innopam, Seoul, Republic of Korea; ²Department of Geoinformatics, University of Seoul, Seoul, Republic of Korea

Multi-class building change detection in high-resolution aerial imagery is essential for urban monitoring, yet remains challenging due to severe class imbalance and the limited representational capacity of encoders trained from scratch. We propose Hie-DinoMamba, a novel architecture that integrates a frozen 1.1B-parameter DINOv3-L encoder—pre-trained on the SAT-493M satellite dataset—with a newly designed Hierarchical Mamba FPN decoder. To bridge the domain gap between satellite pre-training and aerial imagery without incurring prohibitive computational costs, we adapt the encoder using parameter-efficient Low-Rank Adaptation (LoRA), updating only a small fraction of parameters while preserving the encoder's rich pre-trained knowledge. The decoder fuses multi-scale feature pairs from both time points via channel-wise concatenation and 1×1 projection, then refines them in a top-down manner using Visual State Space Model (VSSM) blocks that capture long-range spatial context with linear complexity. A dual-loss strategy decouples semantic classification (Focal Loss) from boundary delineation (Focal Tversky + Dice Loss), optimizing each objective at a different hierarchical level. On a 4-class aerial building change detection benchmark (41,548 image pairs, 0.1 m resolution, Seoul), Hie-DinoMamba achieves a state-of-the-art mIoU of 65.12% and Kappa of 75.77%, improving over the strongest baseline by 2.1 percentage points. An ablation study confirms that LoRA adaptation is the most critical component. Qualitative analysis further demonstrates robust generalization to geographically unseen regions.
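For readers unfamiliar with Low-Rank Adaptation, the general LoRA formulation is given below. The frozen weight matrix is left untouched and only a low-rank update is trained. The symbols follow the usual LoRA notation. The abstract does not state the rank, the scaling factor, or the encoder layers that were adapted, so no specific values are implied.

```latex
% Generic LoRA update for one frozen weight matrix W_0 (notation assumed, not from the abstract)
W' = W_0 + \frac{\alpha}{r}\, B A,
\qquad W_0 \in \mathbb{R}^{d \times k},\quad
B \in \mathbb{R}^{d \times r},\quad
A \in \mathbb{R}^{r \times k},\quad
r \ll \min(d, k)

% Trainable parameters per adapted matrix, compared with full fine-tuning
\underbrace{r\,(d + k)}_{\text{LoRA}} \quad \text{versus} \quad \underbrace{d\,k}_{\text{full fine-tuning}}
```

With a small rank r, the number of trainable parameters is a small fraction of the full matrix, which matches the abstract's description of updating only a small fraction of the encoder's parameters.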

4:00pm - 4:15pm

Stepwise Optimization and Ensemble Pipeline for Building Change Detection in High Resolution Satellite Imagery Using Mamba-Based Model

DongHyuk Jin¹, Junhwa Chi²

¹Department of Data Engineering, Pukyong National University, Busan, Republic of Korea; ²Division of Data Information Sciences, Pukyong National University, Busan, Republic of Korea

This study presents a stepwise optimization pipeline for high-resolution building change detection in dense urban environments using imagery from CAS500-1, Korea’s national land observation satellite. A dataset of 3,816 bi-temporal patch pairs from 29 urban regions was constructed to support model development and evaluation. A Mamba-based architecture, incorporating efficient global context modeling, was adopted as the baseline for binary change detection.

To enhance performance, the pipeline introduced three sequential optimization stages. First, normalization techniques suited for 12-bit radiometric imagery were compared, including percentile-based scaling, gamma adjustment, and logarithmic transformation. Second, augmentation strategies were evaluated, contrasting standard geometric augmentation with extended optical and temporal augmentation designed to improve generalization in structurally complex urban environments. Third, multiple ensemble strategies, ranging from simple averaging to confidence-weighted and hierarchical aggregation, were examined to overcome the limitations of individual model sizes.

Model performance was assessed using a comprehensive set of pixel-level, change-pixel-level, contour-based, and object-based metrics to ensure robust evaluation of both spatial precision and structural consistency. Experimental results showed that gamma-based normalization, comprehensive augmentation, and selected ensemble strategies each contributed measurable improvements. Combining these optimized components yielded a final hierarchical ensemble that improved the F1-Score from 0.7629 to 0.8070, representing a substantial gain over the baseline model.

Overall, this work provides a validated and extensible optimization strategy for high-resolution satellite-based change detection, offering practical guidance for operational applications and adaptability to future ensemble configurations across diverse architectures.
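The three normalization families compared in the first optimization stage can be sketched as follows. This is a minimal illustration, not the authors' implementation: the 2nd/98th percentile cut-offs, the gamma value of 0.5, and all function names are assumptions chosen for the example; only the 12-bit value range comes from the abstract.

```python
import math

MAX_12BIT = 4095  # largest digital number representable in 12-bit imagery


def percentile(values, q):
    """Linearly interpolated percentile (q in [0, 100]) of a non-empty sequence."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def percentile_scale(values, low_q=2.0, high_q=98.0):
    """Stretch values between two percentiles and clip to [0, 1] (cut-offs are assumed)."""
    lo, hi = percentile(values, low_q), percentile(values, high_q)
    span = max(hi - lo, 1e-12)  # guard against a constant patch
    return [min(max((v - lo) / span, 0.0), 1.0) for v in values]


def gamma_scale(values, gamma=0.5):
    """Scale to [0, 1] by the 12-bit maximum, then apply a power law (gamma is assumed)."""
    return [(v / MAX_12BIT) ** gamma for v in values]


def log_scale(values):
    """Logarithmic compression mapped so that 0 -> 0 and 4095 -> 1."""
    return [math.log1p(v) / math.log1p(MAX_12BIT) for v in values]


pixels = [0, 120, 480, 900, 2048, 4095]  # hypothetical 12-bit digital numbers
for name, fn in [("percentile", percentile_scale), ("gamma", gamma_scale), ("log", log_scale)]:
    print(name, [round(x, 3) for x in fn(pixels)])
```

All three map the input to the unit interval but distribute contrast differently: with a gamma below 1, the power law and the logarithm both brighten dark values, whereas the percentile stretch is linear between its cut-offs and clips the tails.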

4:15pm - 4:30pm

Leveraging Geospatial Foundation Models for Bi-Temporal Land-Cover Change Detection

Mozhdeh Shahbazi, Mikhail Sokolov, Charles Authier, Marjan Asgari

Canada Centre for Mapping and Earth Observation, Natural Resources Canada, Canada

Recent advances in geospatial foundation models have enabled scalable and transferable solutions for Earth observation (EO) tasks, making them good candidates for operational land-cover monitoring. Foundation models are large-scale artificial intelligence (AI) models trained on massive and diverse datasets. In the EO domain, these datasets may include imagery, elevation models, geographic coordinates, temporal tags, sensor spectral information, and descriptive metadata. These models excel at representation learning through self-supervised training, enabling them to capture rich descriptive features without requiring labelled data. Consequently, they can serve as powerful backbones for downstream tasks such as land-cover change monitoring.

Accordingly, this paper provides an overview of the development process of a geospatial foundation model, Planaura. It demonstrates how this model is best adapted to Canadian landscapes and how it is used to achieve the task of land-cover change detection. Planaura is now accessible publicly via the model hub at HuggingFace: [Link hidden for blind review process]

4:30pm - 4:45pm

A Transformer-Based Framework for Spatiotemporal Unmixing of Land–Water Mixtures in Multispectral Satellite Data

An Bao Nguyen¹, Andreas Schenk², Stefan Hinz²

¹KU Leuven, Leuven, Belgium; ²Karlsruhe Institute of Technology, Karlsruhe, Germany

This paper presents a novel transformer-based framework for spatiotemporally dynamic spectral unmixing of multispectral satellite imagery. Spectral unmixing is essential for analyzing mixed pixels in remote sensing, especially for small objects such as narrow rivers observed at coarse resolution, as in Sentinel-2 data. Most deep-learning-based unmixing models account for a single scene and ignore the tempo-spatial variation of spectra and land-cover proportions.

To address this challenge, we introduce a unified deep learning architecture that leverages transformer attention mechanisms to exploit both spectral and auxiliary information causing spectral variations. The framework models the temporal and spatial evolution of abundances while simultaneously learning representative endmember spectra. By integrating cross-attention between spectral inputs, auxiliary variables, and temporal embeddings, the model can adapt to seasonal changes, illumination conditions, and scene-specific variability. The method is trained using synthetic mixtures derived from Sentinel-2 surface reflectance data.

Applied to monitoring small rivers with strong temporal, spatial, and intrinsic variability, the proposed approach demonstrates improved accuracy in estimating water abundances and extracting water spectra in highly mixed river pixels (mixtures of water and riverbank). The model effectively captures tempo-spatial transitions in water extent and sediment-laden river inflows, offering a more consistent representation than conventional unmixing techniques.

This work contributes a generalizable and end-to-end framework for handling dynamic unmixing scenarios in multispectral remote sensing. It provides new insights into the use of transformers for modeling spatiotemporal interactions and supports applications in environmental monitoring and water resource assessment.

4:45pm - 5:00pm

Land Cover Classification of Optical–SAR Imagery via Cross-Modal Interaction and Feature Alignment

Junqi Zhao, Min Chen, Wei Guo, Jinbo Zhang, Zelan Fu, Xuming Ge, Han Hu, Bo Xu, Qing Zhu

Faculty of Geosciences and Engineering, Southwest Jiaotong University, Chengdu, 611756, China

Land cover classification (LCC) plays a crucial role in geoscientific research and resource monitoring applications. Compared with traditional single-modal classification methods, multimodal fusion models can more effectively leverage the complementary information of optical and synthetic aperture radar (SAR) imagery, thereby improving classification performance in complex scenarios. However, due to the significant differences in the imaging mechanisms of the two sensors, inconsistencies in radiometric properties and spatial structures arise between optical and SAR images, posing challenges for cross-modal feature interaction and fusion. To address this issue, we propose a multimodal optical–SAR fusion network (MOSFNet) for high-precision LCC, which incorporates two core modules: the Feature Interaction Module (FIM) and the Feature Fusion Module (FFM). The FIM achieves complementary feature interaction between optical and SAR images through channel splitting and cross concatenation, while incorporating a coordinate attention mechanism to enhance the responsiveness of key land cover regions. The FFM leverages a 2D selective scan (SS2D) mechanism to implement bidirectional cross-modal feature alignment and gated fusion in the hidden state space, enabling deep correlation and adaptive integration of optical and SAR features. Experiments on the WHU-OPT-SAR dataset demonstrate that MOSFNet significantly outperforms existing methods in terms of classification accuracy and model generalization, providing an efficient and robust solution for high-precision land cover mapping with multi-source remote sensing imagery.

5:00pm - 5:15pm

Seasonal-Aware Scale-Semantic Consistency Alignment Change Detection Network

Bing Shao¹, Hanchao Zhang¹, Mingzhu Li², Yunkun Zou³, Ruiqian Zhang¹, Xiaogang Ning¹, Hao Wang¹

¹Chinese Academy of Surveying and Mapping, Beijing, China; ²Liaoning Technical University Geomatics and Geographical Sciences, Fuxin, China; ³Joint Laboratory of Spatial Intelligent Perception and Large Model Application, Nanjing, China

Change detection in remote sensing imagery is a crucial method for obtaining dynamic information about land cover. However, pseudo-changes caused by seasonal variations pose a significant challenge to detection accuracy. Seasonal variations, such as vegetation phenology and snow cover, introduce global appearance differences that are often mistaken for actual land cover changes. This phenomenon is particularly prominent in long-term monitoring tasks, where pseudo-changes dominate the detection results. To address the global appearance differences and multi-scale feature fusion problems induced by seasonal changes, we propose a novel Seasonal-Aware Scale-Semantic Consistency Alignment Change Detection Network (SSCANet) for remote sensing image change detection. This approach incorporates a Seasonal-Aware Scale Alignment (ASA) module and a Seasonal-Aware Semantic Guided Fusion (SGF) module. By employing spatial scale transformation and semantic alignment, it reduces information mismatch in multi-scale feature fusion and enhances the perception of details in change regions. Experiments conducted on the GZ-CD and CDD datasets demonstrate that SSCANet achieves F1 scores of 89.21% and 97.82% and precision of 89.02% and 98.37%, respectively. These results represent significant improvements over other methods, demonstrating that SSCANet outperforms its counterparts in both overall accuracy and seasonal robustness. The findings confirm that this approach effectively suppresses seasonal false changes, enhancing the accuracy and reliability of change detection.

3:30pm - 5:15pm

WG I/2B: Mobile Mapping Technology
Location: 714B

3:30pm - 3:45pm

Mitigating trajectory drift in tunnel mapping: evaluation of conventional and novel approaches applied to SLAM-based mobile mapping solution

Antonio Gualtiero Mainardi¹, Simone Marmaglio², Luca Perfetti¹, Giorgio Paolo Maria Vassena¹

¹Università degli Studi di Brescia, Dept. of Civil Engineering, Architecture, Territory, Environment and Mathematics (DICATAM), Italy; ²Università degli Studi di Brescia, Dept. of Information Engineering (DII), Italy

In Indoor Mobile Mapping Systems (iMMS), trajectory estimation is performed by a SLAM (Simultaneous Localization and Mapping) algorithm. Assuming a static environment around the instrument, the algorithm relies on stable geometries to establish the trajectory. Drift is the main source of error in the trajectory estimate. It can be magnified in feature-deficient or degenerate environments such as tunnels, where the geometry varies little along the path. Such difficult environments are therefore suitable test cases for alternative trajectory estimation algorithms. This contribution has a twofold objective: to evaluate two trajectory estimation methods, in terms of trajectory drift, for an indoor SLAM-based MMS, and to establish a repeatable methodology for doing so. A novel trajectory estimation algorithm is considered that relies not only on geometric SLAM but also on reflectance images from the LiDAR sensors mounted on the system.

The case study is a 200 m long branch of a motorway tunnel with a diameter of 15 m. The test is further subdivided by computing all trajectories with different constraining strategies: first without any constraints, then with global optimisation, loop closure and static control scans, to replicate typical realistic scenarios in tunnel mapping. The results highlight how the novel reflectance-aided SLAM algorithm reduces drift in the estimated trajectories.

3:45pm - 4:00pm

Range Error Detection and Evaluation for retroreflective Road Signs in Phase-Shift MMS Point Clouds

Saori Fukushi¹, Yoji Takahashi¹, Hirofumi Chikatsu²

¹Aero Toyota Corporation; ²Tokyo Denki University

This presentation addresses the challenge of range errors in point clouds of road signs captured by Mobile Mapping Systems (MMS) equipped with phase-shift laser scanners.

Under certain conditions, retroreflective materials cause range errors in point clouds. Previous studies have proposed mitigation techniques for range errors caused by sensor saturation in TOF systems, but similar studies on phase-shift systems are scarce. In addition, existing road sign detection methods assume accurate point representation, making them ineffective when sign points are displaced.

To overcome this limitation, we developed a detection method that first extracts road signs through point cloud visualization and then identifies range errors based on the standard deviation of relative distances from reference emission points.

The proposed approach was validated using 5 km of driving data collected on general roads. Results show that 32 road signs were extracted, and 26 were correctly detected as exhibiting range errors, achieving 100% agreement with manual visual assessment.

This study demonstrates the effectiveness of the proposed detection method and its potential for improving the reliability of identifying range errors of road signs on general roads.

4:00pm - 4:15pm

An RTK-SLAM Dataset for Absolute Accuracy Evaluation in GNSS-Degraded Environments

Wei Zhang, Vincent Ress, David Skuddis, Uwe Soergel, Norbert Haala

University of Stuttgart, Germany

RTK-SLAM systems integrate simultaneous localization and mapping (SLAM) with real-time kinematic (RTK) GNSS positioning, promising both relative consistency and globally referenced coordinates for efficient georeferenced surveying. A critical and underappreciated issue is that the standard evaluation metric, Absolute Trajectory Error (ATE), first fits an optimal rigid-body transformation between the estimated trajectory and the reference before computing errors. This so-called SE(3) alignment absorbs global drift and systematic errors, making trajectories appear more accurate than they are in practice. We present a geodetically referenced dataset and evaluation methodology that expose this gap. A key design principle is that the RTK receiver is used solely as a system input, while ground truth is established independently via a geodetic total station. This separation is absent from all existing datasets, where GNSS typically serves as (part of) the ground truth. The dataset is collected with a handheld RTK-SLAM device and comprises two scenes. We evaluate LiDAR-inertial, visual-inertial, and LiDAR-visual-inertial RTK-SLAM systems alongside standalone RTK, reporting direct global accuracy and SE(3)-aligned relative accuracy to make the gap explicit. Results show that SE(3) alignment can underestimate absolute positioning error by up to 76%. RTK-SLAM achieves centimeter-level absolute accuracy in open-sky conditions and maintains decimeter-level global accuracy indoors, where standalone RTK degrades to tens of meters. The dataset, calibration files, and evaluation scripts are publicly available at https://rtk-slam-dataset.github.io/
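The effect of alignment on the reported error can be illustrated with a small sketch. To stay dependency-free it uses a planar rigid alignment (rotation plus translation in 2D) in place of the full SE(3) fit, and the trajectories are made up; it is not the dataset's evaluation script. The estimated trajectory is the reference with a 2° heading error and a constant offset, that is, a purely global error.

```python
import math


def rmse(est, ref):
    """Root-mean-square point-to-point distance between two equally long 2D tracks."""
    sq = [(ex - rx) ** 2 + (ey - ry) ** 2 for (ex, ey), (rx, ry) in zip(est, ref)]
    return math.sqrt(sum(sq) / len(sq))


def align_rigid_2d(est, ref):
    """Least-squares rotation + translation of `est` onto `ref` (closed form in 2D)."""
    n = len(est)
    mex, mey = sum(p[0] for p in est) / n, sum(p[1] for p in est) / n
    mrx, mry = sum(p[0] for p in ref) / n, sum(p[1] for p in ref) / n
    dot = cross = 0.0
    for (ex, ey), (rx, ry) in zip(est, ref):
        ax, ay = ex - mex, ey - mey  # centred estimate
        bx, by = rx - mrx, ry - mry  # centred reference
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)  # rotation that maximises the correlation
    c, s = math.cos(theta), math.sin(theta)
    return [(c * (ex - mex) - s * (ey - mey) + mrx,
             s * (ex - mex) + c * (ey - mey) + mry) for ex, ey in est]


# Hypothetical reference track and an estimate with a global heading error and offset.
ref = [(0.0, 0.0), (10.0, 0.0), (10.0, 5.0), (0.0, 5.0)]
ang = math.radians(2.0)
est = [(math.cos(ang) * x - math.sin(ang) * y + 0.5,
        math.sin(ang) * x + math.cos(ang) * y - 0.3) for x, y in ref]

direct_error = rmse(est, ref)                        # error in the global frame
aligned_error = rmse(align_rigid_2d(est, ref), ref)  # error after alignment
print(f"direct: {direct_error:.3f} m, aligned: {aligned_error:.2e} m")
```

Because the simulated error is entirely a rigid misplacement, the aligned error collapses to numerical zero while the direct error stays at the decimeter level, which is the gap the abstract describes.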

4:15pm - 4:30pm

Novel View Synthesis Under Rainy Conditions with Neural Radiance Fields and Gaussian Splatting

Ivana Petrovska, Boris Jutzi

Karlsruhe Institute of Technology, Germany

Scene reconstruction and novel view synthesis from calibrated multi-view images still attract a lot of attention in computer vision and graphics. However, the assumption that images are noise-free rarely holds in real-world scenarios where adverse weather conditions are inevitable. We are particularly interested in rain, a part of our environment, as a dynamic semi-transparent occlusion that makes it challenging to recover complete and accurate geometry of the underlying features. More precisely, we qualitatively and quantitatively analyze the photometric image quality under rainy conditions generated by radiance field methods with different geometric representations, namely Neural Radiance Fields (NeRFs), 3D Gaussian Splatting (3DGS) and 2D Gaussian Splatting (2DGS). To assess the impact of rain on the scene reconstruction, we consider raindrops and streaks captured with illumination variation as well as occlusion masks with different coverage. The evaluation is based on comparing 2D image metrics of the rendered novel views without and with masks. The experiments and results show that 3DGS achieves the highest rendering fidelity in all scenarios without and with masks, with SSIM of 0.724 and LPIPS of 0.291, followed by 2DGS with slightly lower scores, while NeRF exhibits the lowest correspondence with the input images, with SSIM of 0.584 and LPIPS of 0.384. We demonstrate the effectiveness of using masks to handle rain as a transient element and the ability of radiance field methods to reliably approximate the geometry behind rain occlusions.

4:30pm - 4:45pm

Toward Seawall Monitoring via Tracking Model-Derived Feature Points of Tetrapods from 3D Point Clouds

Ting On Chan¹, Derek Lichti²

¹School of Geography and Planning, Sun Yat-sen University, People's Republic of China; ²Department of Geomatics Engineering, University of Calgary, Canada

In recent years, many coastlines worldwide have retreated under the influence of storm surges and other extreme events, exacerbated by intensifying wave conditions in certain regions and seasons. Consequently, wave-dissipating units (e.g., tetrapods) have been widely deployed for coastal protection. In this paper, we propose a novel three-dimensional geometric method for extracting robust feature points from 3D point clouds to track tetrapod displacements and assess seawall safety. The model represents a tetrapod as four cylinders sharing a common center. By fitting this geometric model to the point cloud, we obtain parameters that allow us to derive multiple feature points—such as the intersections of conical surfaces—which can also be verified through alternative measurement techniques. These feature points serve as stable references for position comparison and displacement estimation. As this research is at an early stage, we have not yet collected field data from full-scale tetrapods. Instead, we conducted indoor experiments using a 3D depth camera (Microsoft Azure) in place of LiDAR, utilizing several high-fidelity resin tetrapod scale models (approximately 10 cm in height) as test subjects. The results demonstrate the feasibility of our method: when compared against total-station measurements, our approach yields highly accurate displacement estimates (averaging approximately 3 mm). This provides a solid foundation for the future deployment of 3D laser scanning in seawall monitoring.
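The idea of deriving trackable feature points from a fitted geometric model can be sketched as below. The sketch assumes, beyond what the abstract states, that the four cylinder axes follow the directions of a regular tetrahedron and that the tetrapod keeps a fixed orientation between epochs; the leg length, centre coordinates, and function names are hypothetical. The leg tips on each axis serve as feature points, and displacement is the distance between corresponding tips in two epochs.

```python
import math

_S = 1.0 / math.sqrt(3.0)
# Unit axis directions of a regular tetrahedron (assumed leg arrangement).
AXES = [(_S, _S, _S), (_S, -_S, -_S), (-_S, _S, -_S), (-_S, -_S, _S)]


def axis_angle_deg(u, v):
    """Angle in degrees between two unit vectors."""
    return math.degrees(math.acos(sum(a * b for a, b in zip(u, v))))


def leg_tips(center, leg_length):
    """Feature points: the end point of each leg axis, measured from the common centre."""
    cx, cy, cz = center
    return [(cx + leg_length * ax, cy + leg_length * ay, cz + leg_length * az)
            for ax, ay, az in AXES]


def tip_displacements(tips_a, tips_b):
    """Euclidean distance between corresponding feature points of two epochs."""
    return [math.dist(a, b) for a, b in zip(tips_a, tips_b)]


# Hypothetical fitted centres of a 10 cm scale model at two epochs (metres).
epoch_1 = leg_tips((0.000, 0.0, 0.0), leg_length=0.05)
epoch_2 = leg_tips((0.003, 0.0, 0.0), leg_length=0.05)
print(round(axis_angle_deg(AXES[0], AXES[1]), 2), "degrees between legs")
print([round(d * 1000, 3) for d in tip_displacements(epoch_1, epoch_2)], "mm")
```

A fitted model that also estimates orientation would rotate the axis set before computing the tips; that step is omitted here for brevity.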

4:45pm - 5:00pm

Application of Side-Scan Sonar and Multibeam Echosounder for the Investigation of Underwater Cultural Heritage – A Case Study of a Wreck in the Baltic Sea

Klaudia Pasternak, Paulina Jaczewska, Patryk Wróblewski

Military University of Technology in Warsaw, Poland

As hydroacoustic sensor technology advances, sonar images and point clouds are increasingly used to analyse the seabed and objects of anthropogenic origin in water bodies. Obtaining data on sunken ships is of particular importance, both for research and for practical purposes. Their geometric features can be extracted from hydroacoustic sensor data. This makes it possible to develop digital repositories of wrecks, based on sonar and bathymetric data, among others, which in the future may enable the construction of integrated knowledge bases on underwater heritage. The purpose of the work was to extract the geometric features of the wreck of the Zawiszaczek, located in the Puck Bay of the Baltic Sea. As part of the work, bathymetric measurements were planned, and side-scan sonar and multibeam echosounder data were collected. Based on the acquired data, the geometric features of the wreck were extracted. The differences in the wreck's dimensions, as determined from sonar images obtained along different routes, did not exceed 0.25 m.

3:30pm - 5:15pm

WG II/6: Cultural Heritage Data Acquisition and Processing
Location: 715A

3:30pm - 3:45pm

Open Technologies for the 3D Cultural Heritage Digitisation Pipeline

Fotis Arnaoutoglou¹, Peter Bonsma², Elisa Mariarosaria Farella³, Anestis Koutsoudis¹, Niki Kyriakou⁴, Marco Medici⁵, Vangelis Nomikos⁴, Anthony Pamart⁶, Fabio Remondino³

¹ATHENA Research Centre, Greece; ²RDF Ltd, Bulgaria; ³3D Optical Metrology (3DOM) unit, Bruno Kessler Foundation (FBK), Trento, Italy; ⁴Talent S.A., Greece; ⁵INCEPTION, Spin-off of the University of Ferrara, Italy; ⁶MAP CNRS, Marseille, France

This paper introduces the 3D-4CH project and its open framework, i.e. a sustainable ecosystem of tools designed to overcome the fragmentation and limited maintainability of previous EU-funded 3D heritage initiatives. Aligned with the European Collaborative Cloud for Cultural Heritage (ECCCH), the framework integrates an end-to-end pipeline for 3D data generation and processing, semantic enrichment and long-term dissemination, including metadata and paradata inclusion. The 3D-4CH initiative bridges the gap between ICT research and operational heritage practices, ensuring the scalability and reproducibility of 3D digital assets for cross-institutional data sharing and preservation. All software components, including GitHub repositories and online processing frameworks, are openly available, in accordance with open science principles and FAIR data practices. Further information is available at https://www.3d4ch-competencecentre.eu/en/tools/.

3:45pm - 4:00pm

Metric Reliability and Operational Adaptability in the context of the Integrated 3D Metric Survey of the Genete Leul Palace (Addis Ababa, Ethiopia)

Elisabetta Colucci, Giacomo Patrucco, Mahtab Fallah, Andrea Demartis

Department of Architecture and Design (DAD), Laboratory of Geomatics for Cultural Heritage, Politecnico di Torino, Italy

The paper presents the integrated 3D metric survey of the Genete Leul Palace in Addis Ababa, demonstrating how metric reliability and operational speed can coexist through an adaptive hybrid TLS–MMS workflow that supported the restoration project and heritage documentation in a low-infrastructure context.

4:00pm - 4:15pm

Photogrammetry Laser Scanning and Reverse Engineering Conrad’s Jewel

Adam Weigert, Mario Santana, Stephen Fai, Sena Kurcenli Koyunlu

Carleton Immersive Media Studio, Canada

Laser scanning, photogrammetry, and other technical tools are staples for cultural heritage documentation and reverse engineering projects. However, manufacturers and even researchers often conflate the data capture process with reverse engineering itself, even though the data alone cannot provide the insight needed for a full reverse engineering or understanding of the historic site. This paper illustrates how laser scanning and photogrammetric applications were used in reverse engineering the construction and details of Conrad’s Jewel, a 1908 Silver/Gold mill in the Yukon, Canada. Analogous to systems and software engineering fields, the reverse engineering process is framed by considering related designs, existing documentation, personal experience, and general external knowledge.

4:15pm - 4:30pm

Modelling Transparent Surfaces in Heritage Artifacts with Gaussian Splatting

Marco Medici¹, Andrea Sterpin¹,², Stefano Settimo¹, Matteo Bevilacqua¹, Gianluca Bertolasi³, Simone Rigon³, Elisa Mariarosaria Farella³, Fabio Remondino³

¹INCEPTION s.r.l., Spin-off of the University of Ferrara, Italy; ²Department of Architecture, University of Ferrara, Italy; ³3D Optical Metrology (3DOM) unit, Bruno Kessler Foundation (FBK), Trento, Italy

The 3D reconstruction of cultural heritage artefacts plays a crucial role in documentation, conservation and dissemination. While recent advances in photogrammetry, laser scanning and neural rendering techniques have significantly improved the geometric accuracy and visual realism of digitised assets, the reconstruction of transparent and reflective materials, typical in museum collections, remains a major challenge. Materials such as glass, glazes and varnishes exhibit complex optical behaviours, leading to incomplete or inaccurate 3D models. Recent developments in Gaussian Splatting (GS) offer a potential alternative by enabling efficient, high-fidelity scene representation without explicit surface modelling. However, their application to non-Lambertian and transparent heritage objects remains largely unexplored. This paper presents a study on GS methods for the 3D digitisation of transparent cultural heritage artefacts. Through a series of experimental reconstructions, the work investigates the potential and limitations of GS, highlighting the opportunities of hybrid pipelines for addressing long-standing challenges in the digitisation of non-collaborative materials.

4:30pm - 4:45pm

Evaluating generative AI for museum artifacts documentation

Elisa Mariarosaria Farella, Simone Rigon, Gianluca Bertolasi, Fabio Remondino

3D Optical Metrology (3DOM) unit, Bruno Kessler Foundation (FBK)

In recent years, the European Commission (EC) identified the 3D digitization of cultural heritage sites and artifacts as one of its priorities and promoted numerous initiatives and recommendations to accelerate documentation campaigns. However, current digitization targets remain far from being achieved, and heritage institutions have been increasingly encouraged to explore faster and more cost-effective 3D documentation solutions. Moreover, traditional image- and range-based 3D surveying techniques frequently struggle when reconstructing objects featuring non-collaborative surfaces (such as reflective or transparent objects), are time-consuming, and require specialized knowledge. Generative AI methods, able to generate 3D models even from a single input image, have recently emerged as a potentially faster alternative, yet their performance on heritage assets remains mostly unexplored. This paper evaluates three recent state-of-the-art single-image GenAI frameworks (SAM3D, Tripo3D and Trellis2) on several museum artifacts featuring diffuse, reflective, transparent, and mixed-material surfaces of varying scale and geometric complexity, for which accurate ground truth is available. The aim is to analyze whether these frameworks can act as complementary or alternative solutions for fast heritage documentation.

4:45pm - 5:00pm

LiDAR-Guided Illumination-Aware 3D Gaussian Splatting for Cultural Heritage

Xiao LIU¹, Xinyi LI¹, Wan LI², Tao LIU¹, Wei SUN¹, Sheng Zhang³

¹Wuhan Geomatics Institute; ²Hubei Surveying and Mapping Quality Supervision and Inspection Station; ³Langfang Natural Resources Comprehensive Survey Center, CGS

To address the issues of geometric distortion and loss of details in 3D modeling for complex cultural heritage scenes, this paper proposes an improved 3D Gaussian Splatting (3DGS) reconstruction method that integrates LiDAR and illumination-awareness. First, high-precision 3D coordinates from LiDAR point clouds are utilized to guide the initialization of Gaussian Primitives, establishing a precise geometric foundation and effectively overcoming deformation on weakly textured surfaces. Second, an illumination-aware network is constructed to dynamically adjust appearance parameters by combining global illumination from images with LiDAR reflectance intensity. This decouples complex lighting from material properties, accurately reproducing the unique textures of artifacts. Finally, a multi-dimensional joint loss function incorporating photometric, scale, and appearance smoothness constraints is introduced to collaboratively optimize scene geometry, appearance, and camera poses. Experimental results on indoor and outdoor cultural heritage preservation scenarios demonstrate that the proposed method significantly outperforms various comparative algorithms in terms of both visual fidelity and geometric accuracy. The quantitative and qualitative evaluations confirm that our approach effectively eliminates geometric distortions and recovers fine texture details, providing an efficient and reliable technical solution for the digital preservation of cultural heritage.

5:00pm - 5:15pm

Usability and Potential of Historical Glass Plate Images for 3D Object Reconstruction and Comparison to current Monitoring Data

Heidi Hastedt¹, Ferdinand Maiwald², Silke Wiedmann³, Till Sieberth¹

¹Jade University of Applied Sciences, Institute for Applied Photogrammetry and Geoinformatics, Oldenburg, Germany; ²Chair of Optical 3D-Metrology, Dresden University of Technology, Germany; ³German Maritime Museum – Leibniz Institute for Maritime History, Bremerhaven, Germany

Cultural heritage assets such as the Bremen Cog at the German Maritime Museum are often subject to long-term preservation processes and are monitored over time. The Bremen Cog, a clinker-built vessel from 1380, was found in the River Weser in 1962 and thereafter salvaged and reconstructed until 1981. Prior to conservation efforts (1981 to 1999), a photogrammetric 3D measurement campaign was conducted using an SMK 120 stereometric camera. Due to deformation, a permanent support system was installed in 2003, including local corrections using pressure plates to restore the hull to its shape as measured in 1980. Since 2020, long-term geometric monitoring of the cog has been carried out in order to detect deformation. Given the analyses of the monitoring data and the measurement conditions, it is of high interest whether the cog in its current shape corresponds to the one estimated in 1980. Historic SMK 120 stereo image pairs on glass plates are analysed in order to estimate their usability and potential for 3D object reconstruction, and the results are subsequently compared to the current monitoring data. The proposed workflow includes an optimized digitization process for the glass plates and reconstruction of the interior and exterior orientations. Feature detection and matching methods as well as robust orientation tasks are analysed in order to allow for a 3D hull reconstruction. A reconstruction, even if only of parts of the cog and with lower precision, is desirable and promising for evaluating changes of the hull over time.

5:15pm - 5:30pm

Full Object Photogrammetry for Architectural Artefacts using the “Mask Model Method”

Adam Weigert¹, Miquel Reina Ortiz², Chloe Dennis³, Lauren Daniels¹, Yesmine Bennani⁴, Mario Santana Quintero¹, Stephen Fai¹

¹Carleton Immersive Media Studio (CIMS), Carleton University, Ottawa, Canada; ²Université de Montréal, Montréal, Canada; ³Bytown Museum, Ottawa, Canada; ⁴University of Hong Kong, Pok Fu Lam, Hong Kong

Photogrammetry and laser scanning are widespread tools for documenting movable and immovable cultural heritage assets. Documenting the entire surface of an object presents a set of specific challenges, with various solutions currently available. Complete object documentation relies on established capture techniques that utilize the registration method for different model orientations. This paper presents the “Mask Model Method,” a semi-automatic approach for seamlessly documenting entire objects while seeking high-quality results. This workflow works well for most objects that would be considered viable for general photogrammetric capture. The advantages are also in capturing small and large objects (with and without a turntable) with hinge-type moving parts. This method of documenting full architectural artefacts is useful in heritage conservation, repairs, and restoration; specifically, digital patternmaking, virtual reconstruction, digital annotation of historic materials & geometry, and applied experimental archaeology.

3:30pm - 5:15pm

WG II/3D: 3D Scene Reconstruction for Modeling & Mapping
Location: 715B

3:30pm - 3:45pm

CARS: A Photogrammetric Pipeline for Global 3D Reconstruction using Satellite Imagery

David Youssefi¹, Valentine Bellet¹, Yoann Steux², Mathis Roux², Cédric Traizet², Marian Rassat², Tommy Calendini²

¹CNES, France; ²CS GROUP, France

We present CARS, a multiview stereo pipeline developed by CNES. This pipeline will be integrated into the CO3D mission processing chain, a mission whose goal is to generate a 3D model of the Earth in less than four years. Because this is an operational mission involving massive production, particular attention has been paid to ensuring that the software is robust, efficient and includes a set of advanced automatic processing features. The paper will provide a comprehensive overview of all the features developed since its creation to achieve this goal.

3:45pm - 4:00pm

SatGeo-NeRF: Geometrically Regularized NeRF for Satellite Imagery

Valentin Wagner¹, Sebastian Bullinger¹, Michael Arens¹, Rainer Stiefelhagen²

¹Fraunhofer Institute of Optronics, System Technologies and Image Exploitation (IOSB); ²Karlsruhe Institute of Technology (KIT)

We present SatGeo-NeRF, a geometrically regularized NeRF for satellite imagery that mitigates overfitting-induced geometric artifacts observed in current state-of-the-art models using three model-agnostic regularizers. Gravity-Aligned Planarity Regularization aligns depth-inferred, approximated surface normals with the gravity axis to promote local planarity, coupling adjacent rays via a corresponding surface approximation to facilitate cross-ray gradient flow. Granularity Regularization enforces a coarse-to-fine geometry-learning scheme, and Depth-Supervised Regularization stabilizes early training for improved geometric accuracy. On the DFC2019 satellite reconstruction benchmark, SatGeo-NeRF improves the Mean Altitude Error by 13.9% and 11.7% relative to state-of-the-art baselines such as EO-NeRF and EO-GS.

4:00pm - 4:15pm

HDR Radiance Learning and Shadow Regularization for Satellite NeRF 3D Reconstruction

Yongjun Song, Pablo d’Angelo

German Aerospace Center (DLR), Germany

High dynamic range (HDR) variations in satellite optical imagery arise from extreme differences in surface reflectance and illumination conditions. Conventional satellite NeRF frameworks are typically trained on tone-mapped or radiometrically enhanced images, where nonlinear preprocessing alters the physical relationship between measured pixel values and true scene radiance. This leads to biased photometric optimization and loss of geometric fidelity, especially under strong illumination contrasts. To address these limitations, we propose an HDR-consistent learning framework that integrates RawNeRF-style radiance supervision with shadow regularization. The method trains directly on raw satellite imagery using a logarithmic, tone mapping–aware loss that preserves linear radiance and stabilizes optimization under high dynamic range conditions. In parallel, a soft shadow regularization constrains network-predicted shadows using geometric cues derived from solar ray casting, promoting physically consistent irradiance decomposition. Experiments on four AOIs from the DFC2019 dataset demonstrate that HDR-aware radiance learning significantly improves DSM accuracy by maintaining linear radiometric consistency. The proposed shadow regularization also improves geometric consistency in structure-dominated urban scenes, although its effect is limited in vegetation-dominant areas where shadow cues are less informative. Despite these smaller gains, the results confirm that combining HDR radiance learning with geometric shadow regularization yields more radiometrically consistent and geometrically accurate 3D reconstruction from satellite imagery.
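The abstract does not give the exact form of its loss. As an illustration only, the sketch below uses the relative squared-error form commonly associated with RawNeRF-style supervision, in which each residual is scaled by the predicted radiance so that dark and bright pixels contribute comparably. The function name, the `eps` value, and the sample radiances are assumptions, not the authors' implementation.

```python
def rawnerf_style_loss(pred, target, eps=1e-3):
    """Mean relative squared error between predicted and observed linear radiance.

    Assumption: in an autodiff framework the denominator would be wrapped in a
    stop-gradient; in plain Python we only evaluate the loss value.
    """
    if len(pred) != len(target) or not pred:
        raise ValueError("pred and target must be non-empty and of equal length")
    total = 0.0
    for p, t in zip(pred, target):
        # Scale the residual by the prediction to approximate a log-domain error.
        total += ((p - t) / (p + eps)) ** 2
    return total / len(pred)


# Hypothetical radiances: the same 2x relative error in a dark and a bright pixel.
dark = rawnerf_style_loss([0.01], [0.02])
bright = rawnerf_style_loss([1.0], [2.0])
```

With a plain squared error the dark pixel would contribute about four orders of magnitude less than the bright one; with the relative form both values are of order one.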

4:15pm - 4:30pm

EOGS++: Earth Observation Gaussian Splatting with Internal Camera Refinement and Direct Panchromatic Rendering

Pierrick Bournez¹, Luca Savant Aira², Thibaud Ehret³, Gabriele Facciolo¹

¹Universite Paris-Saclay, CNRS, ENS Paris-Saclay, Centre Borelli, 91190, Gif-sur-Yvette, France; ²Politecnico di Torino, Corso Duca degli Abruzzi, 24, 10129 Torino TO, Italia; ³AMIAD, Pôle Recherche, France

Recently, 3D Gaussian Splatting has been introduced as a compelling alternative to NeRF for Earth observation, offering competitive reconstruction quality with significantly reduced training times.

In this work, we extend the EOGS framework to propose EOGS++, a novel method tailored for satellite imagery that directly operates on raw high-resolution panchromatic data without requiring external preprocessing.

Furthermore, we embed bundle adjustment directly within the training process with optical flow techniques, avoiding reliance on external optimization tools while improving camera pose estimation.

We also introduce several improvements to the original implementation, including early stopping and TSDF post-processing, all contributing to sharper reconstructions and better geometric accuracy.

Experiments on the IARPA 2016 and DFC2019 datasets demonstrate that EOGS++ achieves state-of-the-art performance in terms of reconstruction quality and efficiency, outperforming the original EOGS method and other NeRF-based methods while maintaining the computational advantages of Gaussian Splatting. Compared with the original EOGS model, our model reduces the mean MAE on buildings from 1.33 to 1.19.

4:30pm - 4:45pm

Evaluating multi-view geometry for satellite-based 3D city modeling: towards 1+N constellation configurations

Xu Cheng, Xianfeng Huang, Yingdong Pi, Xinsheng Wang, Mi Wang

State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan, 430079, China

The emergence of satellite constellations enables near-synchronous multi-view optical imaging, offering new opportunities for large-scale 3D city modeling. Yet a practically promising configuration, in which a primary near-nadir view is complemented by multiple oblique side-looking viewpoints, remains under-examined. This study develops a controlled semi-simulation framework to analyze how multi-view imaging geometry affects the recoverability of urban 3D structures. Under idealized conditions with imaging perturbations removed, e.g., radiometric, illumination, and sensor model errors, the experiments focus on three practical factors: the number of side-looking views, view obliqueness, and the constellation’s azimuthal orientation relative to the scene. A parameter sweep analysis reveals an asymmetric U-shaped trend between reconstruction performance and both the view count and the obliqueness: moderate angular diversity markedly strengthens urban scene recoverability. In contrast, large obliqueness reduces inter-view overlap and destabilizes matching, while excessive redundancy introduces consistency issues that ultimately degrade reconstruction performance. Furthermore, the results show that geometric accuracy, completeness, and texture appearance each peak at different parameter combinations, revealing intrinsic trade-offs in multi-view urban reconstruction, as different evaluation criteria favor distinct optimal configurations. The study provides practical guidance for the geometric design and mission planning of multi-satellite constellations aimed at improving satellite-based 3D modeling in urban areas.

4:45pm - 5:00pm

Illumination-prior-based high-resolution DEM reconstruction from single-view lunar image constrained with initial DEM

Siyi Qiu¹, Zhen Ye¹,², Rong Huang¹,², Yusheng Xu¹,², Xiaohua Tong¹,²

¹College of Surveying and Geoinformatics, Tongji University, Shanghai, China; ²The Shanghai Key Laboratory of Space Mapping and Remote Sensing for Planetary Exploration, Shanghai, China

This work presents an illumination-prior-based reconstruction model for high-resolution DEM generation from single-view lunar imagery, developed for the extreme illumination conditions and rugged terrain of the lunar south pole. The model integrates an initial DEM prior with multi-scale monocular image features and incorporates illumination priors derived from solar geometry to enhance stability in shadowed, low-texture, and terrain-transition regions. Through cross-modal feature fusion, it effectively aligns geometric structure with shading and photometric cues, enabling accurate recovery of fine-scale topography even when visual information is severely degraded. Experimental evaluations across multiple south-polar regions show that the proposed reconstruction model outperforms existing deep learning approaches and the classical Shape-from-Shading method in elevation, slope, and aspect accuracy, with independent validation using LOLA laser altimetry points confirming its improved geometric reliability. Visual comparisons demonstrate clear advantages in reconstructing crater rims, steep slopes, and permanently shadowed areas where conventional methods often fail or produce blurred terrain structures. The model also maintains robust performance under varying solar azimuths, highlighting the effectiveness of incorporating illumination priors to improve generalization in challenging environments. Overall, the proposed reconstruction model provides a reliable and effective solution for detailed lunar terrain recovery from monocular images and offers valuable support for scientific investigation, resource assessment, landing-site evaluation, and mission planning in the lunar south polar region.

5:00pm - 5:15pm

Construction of Control Network for Multi-temporal LRO NAC Images Based on Matching of Lunar Impact Craters

Pengying Liu¹,², Jiayao Wang¹,², Xun Geng¹,², Zhen Peng¹,², Jin Wang¹,², Haoyu Zhang¹,²

¹State Key Laboratory of Spatial Datum, Faculty of Geographical Science and Engineering, Henan University, Zhengzhou, China; ²College of Geographic Sciences, Henan University, Zhengzhou, China

To address the critical demand for high-precision mapping of the Lunar South Pole (LSP)—a region pivotal for deep space resource utilization yet plagued by extreme illumination variations, extensive permanent shadow regions (PSRs), and weak texture—this study proposes a control network construction method for multi-temporal Lunar Reconnaissance Orbiter (LRO) Narrow Angle Camera (NAC) images, anchored in lunar impact crater matching. Leveraging the morphological stability and spatial consistency of impact craters, we first created a dedicated dataset: 94 multi-temporal LSP orthophotos (1 meter/pixel resolution) with manual annotations, allocating 70% for YOLOv8 model training and 30% for validation to ensure accurate crater detection (extracting center coordinates and semi-major/semi-minor axes). For virtual feature point matching, we integrated crater geometric attributes (coordinates, aspect ratio) and inter-crater topological relationships (distance, azimuth angle) to build local descriptors, enhanced by KD-tree indexing for efficient neighborhood queries, multi-attribute similarity measurement, and bidirectional voting to eliminate mismatches. For large craters, normalized cross-correlation (NCC) was used for secondary matching to refine accuracy. Post-matching, tie points were back-projected from orthophoto to original image space via ground coordinates. Experiments on 1,208 LRO NAC images showed the method outperforms SIFT and SuperPoint: it generated 938,029 tie points (even in dark shadows) with 2,347,629 measurements, and bundle adjustment achieved a sigma naught of 0.68. This work enables automatic high-quality control network construction, supporting reliable LSP topographic mapping for deep space exploration.
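The secondary matching step for large craters relies on normalized cross-correlation (NCC). A minimal sketch of the standard zero-mean NCC score between two equally sized image patches is given below; the function name and the sample patches are hypothetical, and the authors' windowing and thresholds are not stated in the abstract.

```python
import math


def ncc(patch_a, patch_b):
    """Zero-mean normalized cross-correlation of two equally sized 2D patches."""
    a = [v for row in patch_a for v in row]
    b = [v for row in patch_b for v in row]
    if len(a) != len(b) or not a:
        raise ValueError("patches must be non-empty and of equal size")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        # A flat patch carries no texture; report no correlation.
        return 0.0
    return sum(x * y for x, y in zip(da, db)) / denom


# Hypothetical patches: the second is a brightness/contrast-scaled copy of the first.
reference = [[1, 2], [3, 4]]
scaled = [[5, 7], [9, 11]]
score = ncc(reference, scaled)
```

Because the score is invariant to linear changes in brightness and contrast, it remains usable when the same crater is imaged under different illumination, although it does not compensate for shifted shadows.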

3:30pm - 5:15pm

ApS: Applied Session
Location: 716A

3:30pm - 3:45pm

A Multi-Stage Framework for Remote Sensing-Based Detection of Mining Disturbances Across British Columbia to Inform Salmon Habitat Conservation

Chen Shang¹, Olivier Tsui¹, Marc Porter², François-Nicolas Robinne³

¹Hatfield Consultants, 200-850 Harbourside Drive, North Vancouver, BC, V7P 0A3, Canada; ²Salmon Watersheds Program, Pacific Salmon Foundation, 300-1682 West 7th Avenue, Vancouver, BC, V6J 4S6, Canada; ³Forest Operations Branch, Alberta Forestry and Parks, J.G. O’Donoghue Building, 7000-113 Street, Edmonton, AB, T6H 5T6, Canada

Mining activities constitute a major source of land disturbance in British Columbia and pose long-lasting risks to salmon-bearing watersheds through sedimentation, habitat fragmentation, and water quality degradation. However, existing mining inventories often lack spatial precision and consistency, limiting their usefulness for cumulative effects assessment. This study presents a new multi-stage remote sensing framework designed to systematically detect and map mining disturbances across the province using Landsat time series (1984–2023), Sentinel-2 imagery, and provincial mining databases.

The workflow integrates spectral–temporal change detection (LandTrendr), land cover and disturbance history from the Satellite-Based Forest Inventory, Sentinel-2 spectral clustering, and final visual interpretation using very high-resolution imagery. This approach effectively distinguishes mining disturbances from wildfires, harvesting, and other land surface changes common in BC’s diverse landscapes.

Applied province-wide, the framework identified 1,037 mining sites with a 92% thematic accuracy, producing the most spatially explicit and consistent inventory of mining disturbances currently available for British Columbia. Results highlight persistent mining hotspots and reveal that mineral mines—especially coal, gold, and silver—dominate the cumulative disturbance footprint, with peak activity occurring between 1970 and 1990.

The resulting dataset provides a critical foundation for evaluating the cumulative impacts of mining on salmon habitats and supports ongoing efforts toward transparent, data-driven land-use planning. The framework is scalable, updateable, and transferable to other regions where large-area monitoring of mining activity is needed.

3:45pm - 4:00pm

Compact Polarimetry Data for Estimation of Relative Oil Thickness

Gordon Staples, Ji Chen

MDA Space, Canada

The objective of this study was to investigate the application of RADARSAT Constellation Mission (RCM) compact polarimetry (CP) data for the estimation of relative oil thickness. On July 25, 2020, the bulk carrier MV Wakashio ran aground off the coast of Mauritius, and an estimated 1000 tonnes of oil spilled into the Indian Ocean. RCM CP data were acquired on August 9, 12, and 13, 2020. CP data entails the acquisition of two phase-preserving channels, CH and CV. A 5x5 polarimetric filter was applied and CP discriminators, Degree of Linear Polarization (DLP), Degree of Polarization (DOP), and Entropy (H), were extracted. For the three images, the DLP, DOP, and H were calculated for “thick” and “thin” oil, and oil-free regions. The performance of the DLP, DOP, and H was consistent with the expected results for both thin and thick oil and oil-free regions. The correlation between the thick, thin, and oil-free regions was calculated based on an Area-based Classification-by-Histogram (ACH). The results for H (August 13) show a strong negative correlation between thick oil/oil-free, a small positive correlation between thin oil/oil-free, and a negative correlation between thick/thin oil. The results of the CP discriminators were consistent with theoretical expectations, with H providing the best overall performance. The results suggest that CP data is a viable option for the estimation of relative oil thickness.
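For readers unfamiliar with the discriminators, the sketch below computes DOP and DLP from a Stokes vector using the textbook definitions. How the study derives the Stokes parameters from the filtered CH and CV channels is not stated in the abstract, so the function names and input values here are hypothetical.

```python
import math


def degree_of_polarization(s0, s1, s2, s3):
    """DOP = sqrt(S1^2 + S2^2 + S3^2) / S0 (textbook definition)."""
    if s0 <= 0:
        raise ValueError("S0 (total power) must be positive")
    return math.sqrt(s1 * s1 + s2 * s2 + s3 * s3) / s0


def degree_of_linear_polarization(s0, s1, s2, s3):
    """DLP = sqrt(S1^2 + S2^2) / S0; S3 is accepted only for a uniform signature."""
    if s0 <= 0:
        raise ValueError("S0 (total power) must be positive")
    return math.sqrt(s1 * s1 + s2 * s2) / s0


# Hypothetical averaged Stokes vector for one region.
stokes = (2.0, 1.0, 0.0, 1.0)
dop = degree_of_polarization(*stokes)
dlp = degree_of_linear_polarization(*stokes)
```

Both quantities lie between 0 and 1 for a physically valid Stokes vector, which makes them convenient to compare across regions and acquisition dates.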

4:00pm - 4:15pm

Automatic detection of eelgrass (Zostera marina) from multispectral satellite data along Canada’s Pacific coast to support conservation and restoration efforts

Anne Webster¹, Weigang Tang¹, Maycira Costa², Nic Dedeluk³, Olivier Tsui¹

¹Hatfield Consultants LLP, 200-850 Harbourside Dr, North Vancouver, Canada V7P 0A3; ²Spectral Lab, Geography, University of Victoria, Victoria, Canada; ³‘Namgis First Nation, 49 Atli St, Alert Bay, Canada

Eelgrass (Zostera marina) is the primary native seagrass species in intertidal areas across North America and plays an important role in marine ecosystems. Current eelgrass mapping is primarily limited to localized areas using various field and remotely piloted aerial systems (RPAS) methods, resulting in limited coverage and update frequency. To support more frequent, wide-area monitoring of eelgrass along Canada’s Pacific coast, we are developing Eelgrass Explorer (E2), an automated system to provide eelgrass distribution maps across British Columbia’s (BC) intertidal zones from either Sentinel-2 or Planet SuperDove multispectral data. The deep learning approach central to the system is based on a DenseNet architecture developed for seagrass detection elsewhere in the world, modified for BC conditions. Our proof of concept used training data across 6 sites along the BC coast and obtained 95% accuracy for test points within training sites, a 12% improvement over a Random Forest approach using the same data. Future work will include more rigorous validation in new sites, refining the model for even better generalization, and incorporating it into an automated processing pipeline. The resulting 10-meter eelgrass extent maps across BC’s intertidal zone will be made openly available to the research community.

4:15pm - 4:30pm

Autonomous Driving in a GNSS-Denied Environment using Real-Time Sensor Fusion

Elena Liang, Xue-Fen Zhang, Jason Kun, Benjamin Brunson, Vi Huynh, Mohamed Mostafa

Trimble Applanix, Canada

Ensuring robust and precise navigation in GNSS-denied or degraded environments remains a core challenge for autonomous systems. The demand for precise, real-time positioning is critical across various applications, including fleet management, automotive, rail, pavement, and airport safety, particularly within GNSS-limited operational settings. This paper presents a novel approach to integrating Visual Odometry (VO) and Map-Based Localization (MBL) as external aiding sources for inertially-aided navigation. This integrated solution is specifically designed for land mobile mapping applications and leverages a high-precision inertially-aided GNSS solution inherent to the mobile mapping system.

This paper is structured as follows:

• Overview of VO and MBL Techniques: A detailed review of the theoretical principles underpinning the Visual Odometry (VO) and Map-Based Localization (MBL) techniques.

• Real-Time Deployment Strategies: Examination of the specific strategies required for real-time operational deployment, including handling delayed measurements, managing out-of-sequence updates, and implementing dynamic uncertainty adaptation.

• Kalman Filter Framework Design: Development of the Kalman filter framework to accommodate the delta pose data (derived from VO) and absolute pose data (derived from MBL) as distinct aiding sources. This includes modelling specific measurement errors and introducing dedicated state components.

• Theoretical and Practical Accuracy Analysis: Evaluation of the integrated system's effectiveness through a rigorous theoretical and practical accuracy analysis under a wide range of operational conditions, including the quantification of positioning performance enhancement when utilizing low-cost IMUs.

4:30pm - 4:45pm

Integrated Multi-Sensor Data Fusion from Land, Air, and Marine Platforms for Enhanced Geospatial Mapping

Michael Koterba¹, Mohamed Mostafa²

¹MJ Engineering, Architecture, Landscape Architecture, and Land Surveying, P.C, 21 Corporate Drive, Clifton Park, NY, USA 12065; ²Trimble Applanix, 85 Leek Cr., Richmond Hill, Ontario, Canada L4B 3B3

Over the last three decades, advancements in sensor and positioning technology have fundamentally transformed geospatial data acquisition, processing, and quality control, enabling surveyors and professionals to collect, interact with, and produce mapping products with unprecedented accuracy and resolution. Sensor Fusion concepts started at the academic level in the early 1990s (cf. Schwarz et al., 1993; El-Sheimy, 1996; Mostafa and Schwarz, 1997; Ip et al., 2007; Ravi et al., 2018).

The fusion of LiDAR and photogrammetric sensors paired with GNSS and inertial positioning systems has effectively supplanted many traditional mapping methods that relied heavily on high-accuracy positioning combined with significant data interpolation (cf. Scherzinger et al., 2018).

Today, geospatial data acquisition is increasingly performed simultaneously using land mobile mapping systems, UAVs, and marine vessels, all equipped with multiple LiDARs and diverse imaging sensors (e.g., panoramic, RGB, NIR, thermal), an approach that is rapidly becoming the industry standard. These multi-stream datasets are now typically integrated and optimized within a post-processing environment. This paper will highlight the technology and workflows surrounding these synergistic systems, demonstrating how their fusion is yielding a level of speed and quality hitherto unseen in the industry.

4:45pm - 5:00pm

From Satellites to Grain Elevators: using NDVI-based Indices to reduce Price Discovery Gaps in non-Futures Prairie Crop Markets

Samuel Scott

Independent, Canada

This contribution examines whether satellite-derived crop condition signals can be translated into a practical market indicator for Prairie crops that do not trade on futures exchanges. In Canada, remote sensing programs such as the Crop Condition Assessment Program already provide in-season crop monitoring and support official yield and production estimation. This study builds on that foundation, but asks a different question: how crop condition information is incorporated into prices in decentralized cash markets for non-futures crops such as peas, lentils, and mustard.

Using Canada’s operational AVHRR and MODIS NDVI archives, the study outlines a simple method for aggregating weekly NDVI composites to key producing regions, deriving seasonal anomalies and phenological measures, and combining them into a normalized regional index for each week of the growing season. The purpose of this index is not to replace official crop condition or yield models, but to provide a transparent and interpretable signal that can be examined alongside observed cash market pricing behavior.
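One plausible reading of the anomaly step is a standardized anomaly per region and week, combined across regions with production weights. The sketch below illustrates that reading only; the z-score form, the weighting, the function names, and all numbers are assumptions, and the phenological measures mentioned in the abstract are not modelled.

```python
import statistics


def ndvi_anomaly(history, current):
    """Standardized anomaly of this week's regional NDVI against the same
    week in previous years (hypothetical formulation)."""
    if len(history) < 2:
        raise ValueError("need at least two historical values")
    mean = statistics.mean(history)
    spread = statistics.stdev(history)
    if spread == 0:
        return 0.0
    return (current - mean) / spread


def regional_index(anomalies, weights):
    """Weighted average of regional anomalies, e.g. weighted by seeded area."""
    total = sum(weights)
    if total <= 0 or len(anomalies) != len(weights):
        raise ValueError("weights must be positive and match the anomalies")
    return sum(a * w for a, w in zip(anomalies, weights)) / total


# Hypothetical values: mean NDVI for one region in the same week of three past years.
z = ndvi_anomaly([0.5, 0.6, 0.7], 0.7)
index = regional_index([1.0, -1.0], [3.0, 1.0])
```

A positive index indicates greener-than-usual conditions in the weighted producing regions for that week, which is the kind of signal that could then be compared against posted cash bids.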

The empirical focus is on market linkage rather than agronomic prediction alone. Specifically, the study compares the relationship between the NDVI-based index and weekly changes in benchmark futures prices with its relationship to posted bids for selected non-futures crops. The working hypothesis is that crop condition information is incorporated relatively quickly into futures-linked markets, while non-futures cash bids respond more slowly and less directly. If confirmed, the index could serve as a public benchmark for price discovery in thin and fragmented specialty crop markets.

5:00pm - 5:15pm

Simultaneous LiDAR & Trajectory Data Optimization for Mobile Mapping Systems in GNSS-Denied Environments

Mohamed Mostafa, Vi Huynh

Trimble Applanix, Canada

Accurate mobile mapping, a critical requirement for various applications, is frequently compromised in GNSS-denied environments, resulting in degraded final mapping products. This research investigates the efficacy of simultaneous optimization of mobile mapping system data, specifically encompassing the trajectory, system calibration, and LiDAR point cloud. The study explores the integration of inertially-aided GNSS data with LiDAR data to mitigate trajectory and point cloud errors and refine installation parameter calibration during GNSS outages. Utilizing datasets acquired with a Mobile Mapping System in a suburban setting in Richmond Hill, Ontario, Canada, the performance of this integrated approach was rigorously evaluated. The results demonstrate the capability of Simultaneous LiDAR & Trajectory Data Optimization to effectively and concurrently compensate for diverse error sources using LiDAR data, GNSS/Inertial measurements, and calibration parameters. This highlights the significant potential for achieving enhanced data accuracy in challenging land mobile mapping scenarios where GNSS availability is limited.

3:30pm - 5:15pm

Forum2C: The Future of Space-based Earth Observation
Location: 716B

3:30pm - 5:15pm

Forum7B: Entrepreneurship in the Industry 4.0 Geospatial Landscape
Location: 717A

3:30pm - 5:30pm

InS4: Industry Tech Session
Location: 717B

3:30pm - 5:30pm

P2: Poster Session 2
Location: Exhibition Hall "E"

Refractive Effects of Planar Protective Layers in Stereo Photogrammetry and Their Correction

Zhaoquan Liu¹, Binbin Xu², Wenxing Xu³, Shigang Liu⁴, Yongfeng Ma⁵, Guanqing Li⁶

¹CCCC First Harbor Engineering Company Ltd., 300461 Tianjin, China – liuzhaoquan@ccccltd.cn; ²No.3 Engineering Company Ltd. of CCCC First Harbor Engineering Company, 116011 Dalian, China; CCCC First Harbor Engineering Company Ltd., 300461 Tianjin, China; Key Laboratory of Geotechnical Engineering, CCCC, 300461 Tianjin, China; Key Laboratory of Port Geotechnical Engineering, Ministry of Transport, PRC, 300461 Tianjin, China; Key Laboratory of Port Geotechnical Engineering of Tianjin, Tianjin 300461, China – 2016046927@ccccltd.cn; ³No.3 Engineering Company Ltd. of CCCC First Harbor Engineering Company, 116011 Dalian, China – xuwenxing1@ccccltd.cn; ⁴No.3 Engineering Company Ltd. of CCCC First Harbor Engineering Company, 116011 Dalian, China – liushigang1@ccccltd.cn; ⁵No.3 Engineering Company Ltd. of CCCC First Harbor Engineering Company, 116011 Dalian, China – mayongfeng1@ccccltd.cn; ⁶School of Environment and Spatial Informatics, China University of Mining and Technology, 221116 Xuzhou, China – guanqing.li@cumt.edu.cn

This study addresses the impact of planar protective layers on stereo photogrammetry and introduces a rigorous refractive correction model based on multi-interface ray tracing. Conventional stereo reconstruction assumes a single viewpoint, but planar layers introduce refraction at two interfaces, causing systematic depth-dominated errors. Through simulations and field experiments using an Intel RealSense D455, the study evaluates the influence of target distance, layer thickness, orientation, and layer-to-camera spacing. Simulations with multiple target planes show that conventional stereo produces significant errors—up to several millimeters in depth—even for thin layers, while the refractive model consistently reconstructs points with sub-millimeter accuracy. Layer distance from the camera has negligible effect on the error magnitude, whereas tilts and thicknesses of the layer strongly influence the bias. Field experiments with a 10-mm acrylic plate confirm these findings: conventional reconstruction exhibits systematic lateral and depth errors, whereas the refractive model eliminates bias, achieving near-zero mean errors. The results highlight that even minimal protective layers induce measurable errors if refraction is ignored, emphasizing the necessity of refractive correction in high-precision applications. The study demonstrates that explicitly modeling refraction in stereo photogrammetry significantly improves reconstruction accuracy and robustness. Overall, this work provides a practical framework for accurate 3D measurement in hazardous environments where imaging through protective layers is unavoidable.
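To give a sense of scale, the sketch below evaluates the classical lateral displacement of a single ray crossing a plane-parallel plate, derived from Snell's law. It is not the multi-interface ray-tracing model of the study; the refractive index of 1.49 for acrylic and the 30° incidence angle are assumed values.

```python
import math


def plate_lateral_shift(thickness_mm, incidence_deg, n_layer, n_outside=1.0):
    """Lateral offset (mm) of a ray after passing through a plane-parallel plate.

    d = t * sin(theta) * (1 - cos(theta) / sqrt(n^2 - sin(theta)^2)),
    where n is the plate index relative to the surrounding medium.
    """
    theta = math.radians(incidence_deg)
    n = n_layer / n_outside
    sin_t = math.sin(theta)
    cos_t = math.cos(theta)
    return thickness_mm * sin_t * (1.0 - cos_t / math.sqrt(n * n - sin_t * sin_t))


# Assumed values: 10 mm plate, refractive index 1.49, ray inclined 30 degrees.
shift = plate_lateral_shift(10.0, 30.0, 1.49)
```

The expression depends on plate thickness and incidence angle but not on the distance between plate and camera, which is in line with the sensitivity pattern reported in the abstract.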

Augmenting City Models with Handheld LiDAR and 3D Gaussian Splatting for Inclusive Pedestrian Infrastructure Assessment

Deni Suwardhi¹,², Wahyunan Andika², Ratri Widyastuti¹, Widiatmoko Azis Fadilah³, Arnadi Murtiyoso³, Pierre Grussenmeyer³, Fabio Remondino⁴, Farhan Helmy⁵

¹Spatial System and Cadastral Research Group, Institut Teknologi Bandung (ITB), Indonesia; ²PT Inovasi Mandiri Pratama, Spatial Information Company, Indonesia; ³Université de Strasbourg, CNRS, INSA Strasbourg, ICube Laboratory UMR 7357, Photogrammetry and Geomatics Group, 67000, Strasbourg, France; ⁴3D Optical Metrology (3DOM) Unit, Bruno Kessler Foundation (FBK), Trento, Italy; ⁵Advanced System Computing, Design and Innovation (ASCODI) Laboratory, Indonesia

Urban digital twins increasingly require pedestrian-scale three-dimensional (3D) representations to support accessibility and inclusiveness assessment. However, existing approaches typically emphasize either geometric accuracy or visual realism, while lacking an integrated framework for analysing pedestrian-level conditions. This study proposes a hybrid workflow integrating handheld LiDAR and 3D Gaussian Splatting (3DGS) within a CityGML-based semantic framework for accessibility assessment. Handheld LiDAR provides centimetre-level geometric measurements, enabling the extraction of key indicators such as slope, surface roughness, and obstacle presence. In parallel, 3DGS reconstruction from 360° video imagery enhances visual realism and perceptual understanding. Both datasets are co-registered and structured within the CityGML 3.0 Transportation model to represent pedestrian environments in a unified spatial and semantic framework. Accessibility assessment was conducted using three approaches: LiDAR-based analysis, field survey observations, and immersive evaluation in a Virtual Reality (VR) environment. The LiDAR-based results were used as a reference. Comparative analysis shows the field survey assessment achieves an agreement of approximately 85.7%, while VR-based assessment reaches approximately 75.4%. The results indicate that while VR does not replace metric-based analysis, it enables perception-driven and participatory evaluation. In particular, VR-based assessment shows potential to involve users, including people with disabilities, in accessibility evaluation through immersive and remote interaction. The proposed approach contributes to the development of human-scale urban digital twins by integrating metric accuracy, semantic structure, and participatory evaluation for more inclusive accessibility analysis.

AI-driven extraction of road geometry and asset inventory from mobile LiDAR point clouds

Divya Priya Balasubramani, Zaffar Sadiq Mohamed-Ghouse, Sanjay Khanna D, Ravichandran N, Muthu Kumara Samy S

Institute of Remote Sensing, Department of Civil Engineering, College of Engineering Guindy, Anna University Chennai, India

Rapid urbanization and rising traffic demand are placing significant pressure on transportation infrastructure, necessitating more efficient and accurate approaches to road design auditing and asset management. Traditional survey methods are labor-intensive, time-consuming, and lack comprehensive three-dimensional context. This study presents an end-to-end framework integrating Mobile Light Detection and Ranging (LiDAR) with Artificial Intelligence (AI) for automated extraction of road geometric parameters and asset inventory. Mobile LiDAR data were collected along an urban corridor in Bengaluru, India, and preprocessed using Trimble Business Center. Preprocessing involved statistical outlier removal and progressive morphological ground segmentation. A deep learning model based on the PointNet++ architecture with hierarchical set abstraction layers was developed to classify point cloud data into five categories: road, pole, vehicle, tree, and building. The dataset comprised approximately 45 million points, with 10% manually annotated for training. The trained model enabled large-scale semantic segmentation, achieving a mean Intersection-over-Union (mIoU) of 0.86 and an overall accuracy of 92.4%. Using the classified outputs, key road design parameters—including lane width (8.099 m), road segment length (44.383 m), zebra crossing width (7.336 m), and pole height (7.890 m)—were accurately derived. The proposed workflow reduced manual processing time by approximately 85% (from 40 hours to 6 hours per km) while enhancing measurement consistency and scalability. The results highlight the effectiveness of integrating mobile LiDAR and AI for high-accuracy, data-driven infrastructure assessment, offering a scalable solution for improved planning and management of urban transportation systems.
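The two reported quality figures, mIoU and overall accuracy, are both derived from a confusion matrix. The sketch below shows the usual definitions on a small two-class example; the matrix values are hypothetical and unrelated to the study's five-class results.

```python
def iou_per_class(confusion):
    """Per-class IoU = TP / (TP + FP + FN); rows are reference, columns predicted."""
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp
        fp = sum(confusion[r][c] for r in range(n)) - tp
        denom = tp + fp + fn
        ious.append(tp / denom if denom else float("nan"))
    return ious


def mean_iou(confusion):
    """Average IoU over classes that occur in the reference or the prediction."""
    valid = [v for v in iou_per_class(confusion) if v == v]  # drop NaN entries
    return sum(valid) / len(valid)


def overall_accuracy(confusion):
    """Fraction of all points whose predicted class matches the reference."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


# Hypothetical two-class confusion matrix (counts of points).
cm = [[8, 2],
      [1, 9]]
miou = mean_iou(cm)
oa = overall_accuracy(cm)
```

Overall accuracy is dominated by the largest classes, whereas mIoU weights every class equally, which is why the two figures are usually reported together for point cloud segmentation.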

Rigorous Projection for Image Stitching: a 3D-Informed Approach for Accurate Panoramic Photogrammetry

Riccardo Roncella¹, Luca Perfetti²

¹University of Parma, Department of Engineering and Architecture, 43124, Parma, Italy; ²University of Brescia, Department of Civil Engineering, Architecture, Territory, Environment and Mathematics, 25123, Brescia, Italy

Panoramic image stitching traditionally relies on the assumption that all input images share a single projection centre, a condition rarely satisfied by modern multi-camera rigs composed of multiple fisheye sensors mounted with non-negligible baselines. In confined or close-range environments, these geometric discrepancies introduce significant parallax, limiting the reliability of both classical and “parallax-tolerant” stitching techniques based on local warping. Although such methods are simple and efficient, they cannot account for the true camera geometry and therefore degrade the metric quality of the final panorama. At the same time, recent photogrammetric software has begun to accept panoramic imagery directly, yet the literature demonstrates that optimal accuracy is still obtained when processing the raw multi-camera images.

This work presents a new 3D-informed approach for generating panoramic images that fully respects the underlying geometry of the acquisition system. Assuming the availability of a 3D model, derived either from photogrammetric reconstruction or from an external sensor such as LiDAR, the method reprojects each pixel of the desired panorama onto the original multi-camera frames using collinearity equations, mirroring the workflow of precision orthophoto generation. This allows the production of parallax-free panoramas with consistent geometric fidelity even in challenging scenarios.
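The core of such a reprojection is the collinearity condition that maps a 3D model point into one of the source frames. The sketch below shows it for an ideal pinhole camera in the computer-vision sign convention (camera looking along +z); the fisheye projection and distortion terms a real rig would need are omitted, and all names and values are hypothetical.

```python
def project_point(point, camera_center, rotation, focal_px, principal_point):
    """Project a world point into an ideal pinhole image via collinearity.

    rotation is a 3x3 world-to-camera matrix given as nested lists.
    Returns (column, row) in pixels, or None if the point is behind the camera.
    """
    d = [p - c for p, c in zip(point, camera_center)]
    xc, yc, zc = (sum(rotation[i][j] * d[j] for j in range(3)) for i in range(3))
    if zc <= 0:
        return None
    u = principal_point[0] + focal_px * xc / zc
    v = principal_point[1] + focal_px * yc / zc
    return (u, v)


# Hypothetical camera at the origin with identity orientation.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
pixel = project_point((1.0, 2.0, 10.0), (0.0, 0.0, 0.0), identity, 1000.0, (500.0, 400.0))
```

In a panorama generator, each output pixel would first be converted to a viewing ray and intersected with the 3D model, and the resulting point would then be projected into every candidate frame in this manner to sample its colour.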

The method is evaluated on several case studies using both compact panoramic cameras and multi-camera systems with larger baselines. Results demonstrate improvements in stitching accuracy, SfM orientation quality, and final 3D reconstruction, including robustness to varying scene complexity and supporting 3D-model resolution.
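The reprojection idea described above can be sketched as follows. This is only an illustration under stated assumptions: the equirectangular pixel convention, the function names, and the ideal pinhole model (camera looking along +Z) are hypothetical choices; the rigs discussed in the abstract use fisheye sensors, which would need a fisheye projection instead, and the depth along each ray would come from the 3D model rather than being passed in by hand.

```python
import math


def pano_pixel_to_ray(col, row, width, height):
    """Unit viewing ray of an equirectangular panorama pixel.

    Assumed convention: longitude 0 / latitude 0 is the +Z axis.
    """
    lon = col / width * 2.0 * math.pi - math.pi
    lat = math.pi / 2.0 - row / height * math.pi
    return (math.cos(lat) * math.sin(lon),
            math.sin(lat),
            math.cos(lat) * math.cos(lon))


def project_collinearity(X, C, R, f, cx, cy):
    """Collinearity projection of object point X into a pinhole camera.

    C is the projection centre, R the world-to-camera rotation (3x3 nested
    list), f the focal length in pixels, (cx, cy) the principal point.
    Returns None when the point lies behind the camera.
    """
    d = [X[i] - C[i] for i in range(3)]
    xc, yc, zc = (sum(R[r][i] * d[i] for i in range(3)) for r in range(3))
    if zc <= 0:
        return None
    return (cx + f * xc / zc, cy + f * yc / zc)


# Hypothetical set-up: panorama centre at the origin, one camera 10 cm away.
ray = pano_pixel_to_ray(500, 250, 1000, 500)
depth = 10.0  # would be read from the 3D model along this ray
X = tuple(depth * c for c in ray)
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
uv = project_collinearity(X, (0.1, 0.0, 0.0), identity, 1000.0, 960.0, 540.0)
print(uv)  # source-image position to sample for this panorama pixel
```

Sampling each source image at the projected position, instead of warping images onto a shared sphere, is what removes the dependence on a single projection centre.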

Extrusion Segmentation Strategy to improve CAD Reconstruction from Point Cloud

Said Harb, Mehdi Maboudi, Markus Gerke

Technische Universität Braunschweig; Institute of Geodesy and Photogrammetry, Germany

Recovering editable CAD models from point cloud scans is a key challenge in reverse engineering and quality control, where the ability to reconstruct the original modeling history of a physical object enables precise deviation analysis and systematic process optimization. While deep learning has driven significant progress in this area, existing models struggle to generalize to complex CAD models, which feature multiple extrusions and intricate geometric structures.

This paper presents an end-to-end deep learning pipeline that reconstructs CAD models from point clouds as structured CAD sequences, which are series of sketch-and-extrude operations that encode the full modeling history. The model demonstrates high-fidelity reconstruction for non-complex objects, including primitive shapes such as cubes and cylinders, as well as their assemblies.

To address the performance gap on complex shapes, we introduce an extrusion-based segmentation strategy that decomposes CAD models into their constituent extrusions. These partial shapes are incorporated into the training set, increasing data diversity without requiring new data collection. The resulting primitive models feature partially occluded point clouds (surfaces hidden in the original assembly are absent), which forces the model to infer missing regions and learn richer point cloud representations. This increases the complexity of the reconstruction problem and thereby improves generalization.

The strategy is model-agnostic and can be applied to any deep learning approach that reconstructs CAD sequences, making it a broadly applicable tool for the community.

Controlled Multi-source Mapping of Lunar South Polar Regions via Combined Bundle Adjustment

Qionghua You¹, Zhen Ye¹,², Yusheng Xu¹,², Rong Huang¹,², Huan Xie¹,², Xiaohua Tong¹,²

¹College of Surveying and Geoinformatics, Tongji University, Shanghai, China; ²The Shanghai Key Laboratory of Space Mapping and Remote Sensing for Planetary Exploration, Shanghai, China

Integration of LROC NAC and ShadowCam imagery is essential for meter-scale controlled mapping of the entire lunar south pole including Permanently Shadowed Regions (PSRs), but remains challenging due to extreme radiometric differences, sparse overlap across illumination boundaries, and ill-conditioned bundle adjustment networks. This paper proposes a LOLA DEM-mediated multi-source bundle adjustment framework for controlled lunar polar mapping. A hierarchical cross-modality matching strategy is developed using first- and second-order Gaussian steerable gradient features with multi-scale fusion and phase-correlation-based subpixel refinement. Sensor-specific geometric models are established using second-order polynomial transformations for NAC orthoimages and rational polynomial models for ShadowCam map-projected images. Five types of geometric constraints are formulated to integrate intra-sensor, limited cross-sensor, and image-to-DEM observations, with the LOLA DEM acting as a common geometric mediator. To stabilize the heterogeneous network, a hybrid L1-L2 regularization model with adaptive two-stage weighting is optimized using the ADMM algorithm. Experiments in the lunar south polar region demonstrate substantial improvements in intra-sensor, cross-sensor, and image-to-reference positioning accuracy. The final seamless 1 m/pixel orthorectified mosaics achieve approximately 5 m absolute accuracy, validating the proposed framework for geometrically unifying illuminated and permanently shadowed terrain in lunar polar controlled mapping.

Automated and Comprehensive Quality Assessment of Nationwide Aerial LiDAR Data: Insights from the LiDAR-ITA Project

Vittorio Casella, Marica Franzini, Davide Lodigiani

University of Pavia, Italy

National LiDAR programs are increasingly adopted worldwide to support land management, infrastructure planning, and environmental monitoring. Following the examples of large-scale initiatives in the United States and Europe, Italy launched its first nationwide LiDAR survey in July 2025 within the Integrated Monitoring System (SIM) project funded by the National Recovery and Resilience Plan (PNRR). This effort represents the most extensive airborne LiDAR campaign ever conducted in the country, covering over 302,000 km², including coastal zones and major islands. The acquisition plan is designed to ensure a minimum point density of 10 points/m² and produce high-resolution DTMs and DSMs at a 0.25 m grid spacing.

Given the unprecedented spatial extent and data volume, a robust, standardised, and fully automated quality assurance framework is essential. This paper presents the methodology used to evaluate geometric consistency and spatial accuracy across the national dataset. Congruence between overlapping flight strips is assessed by automatically extracting 100 × 100 m patches at regular intervals and computing point-to-point distances and cross-section profiles to detect horizontal and vertical discrepancies. Plano-altimetric accuracy is further evaluated through comparisons with terrestrial laser scanning (TLS) data collected in dedicated control areas, where robust plane fitting enables rigorous three-dimensional error estimation.

Results from two control areas acquired with different sensors demonstrate the effectiveness, scalability, and reproducibility of the proposed automated workflows. The presented approach provides a reliable foundation for delivering high-precision national LiDAR products and offers a framework applicable to future large-scale geospatial acquisition programs.

Synergy of photogrammetric and ULS data for forestry application through the fusion of bundle adjustment and ICP algorithms

Łukasz Wilk¹, Magdalena Pilarska-Mazurek¹, Wojciech Ostrowski¹,²

¹Warsaw University of Technology, Faculty of Geodesy and Cartography, Department of Photogrammetry, Remote Sensing and Spatial Information Systems, Warsaw, Poland; ²Jagiellonian University, Institute of Archaeology, Krakow, Poland

The study explores a workflow for integrating photogrammetric image blocks with LiDAR point clouds acquired via Unmanned Laser Scanning (ULS) in forestry applications. Hybrid datasets combining UAV imagery and LiDAR data are increasingly used for 3D mapping, yet discrepancies often arise due to independent orientation processes and systematic errors. Traditional solutions rely on numerous ground control points (GCPs), which can be impractical in dense forest environments. To address this, the proposed method fuses Bundle Adjustment and Iterative Closest Point (ICP) algorithms in a joint optimization process, aligning multispectral images with ULS point clouds without additional observations or GCPs. The workflow includes a GPU-accelerated filtering step to extract representative canopy points, reducing computational load and improving correspondence selection. Implemented using Python and C++ extensions, the system leverages the Ceres Solver for non-linear optimization, minimizing reprojection, GNSS, IMU, and point-to-cloud errors iteratively. Tests conducted in Żednia Forest District, Poland, during leaf-on and leaf-off seasons demonstrated significant improvements in alignment accuracy: average horizontal errors decreased by over 50%, and maximum offsets were reduced by more than 1 meter. These results confirm that the proposed hybrid adjustment substantially enhances geometric consistency between photogrammetric and LiDAR datasets, offering a cost-effective solution for forestry mapping and monitoring.

Integrating High‑Fidelity 3D Documentation into Immersive Learning: A VR Serious Game for the Holy Aedicule

Margarita Skamantzari¹, Ioannis Georgoulas¹, Ioannis Rallis¹, Antonia Moropoulou², Anastasios Doulamis¹, Andreas Georgopoulos¹

¹Lab of Photogrammetry, School of Rural, Surveying & Geoinformatics Engineering, National Technical University of Athens, Athens, Greece; ²School of Chemical Engineering, National Technical University of Athens, Athens, Greece

This paper introduces an innovative Virtual Reality (VR) serious game designed to enhance immersive learning in cultural heritage education. The game offers an interactive exploration of the Holy Aedicule in Jerusalem, one of the most sacred monuments of Christianity, based on high-resolution 3D documentation captured before, during, and after its rehabilitation.

By integrating photogrammetric data, textured 3D models, and historical research, the application allows users to navigate the monument virtually, engage with embedded educational content, and participate in interactive learning scenarios. Structured as a multi-phase experience, including virtual tours, a digital classroom, and a quiz mode, the serious game aims to promote transdisciplinary knowledge transfer in a user-friendly, entertaining format.

This contribution outlines the game’s methodological framework, educational objectives, development pipeline, and user evaluation results, highlighting its role in redefining how cultural heritage can be communicated through immersive digital tools. Additionally, it addresses the broader challenge of translating complex heritage documentation into accessible and meaningful experiences for learners, researchers, and the wider audience.

GNSS–Camera Systems for Heritage Documentation. Accuracy assessment of measurements of inaccessible points and preliminary tests in photogrammetric applications.

Lorenzo Teppati Losè, Filiberto Chiabrando, Fabio Giulio Tonolo

LabG4CH, Department of Architecture and Design (DAD) - Politecnico di Torino, Viale Mattioli 39, 10125 Torino (Italy)

The contribution investigates the possibility of using a GNSS receiver equipped with a camera for documenting built heritage. In particular, the possibility of measuring GCPs on vertical surfaces thanks to the combination of satellite observations and digital photogrammetric algorithms will be analysed and metrically validated. Moreover, the use of the acquired images in SfM approaches will be tested and discussed.

Generating Synthetic Image Data with Blender to Address Data Scarcity in Military Applications: Leveraging the RF-DETR Model

Julian Cornel Berndt, Tobias Frisenborg Christensen, Lars Würtz Jochumsen

Systematic A/S, Denmark

Military vehicle recognition faces critical data scarcity due to operational security constraints and prohibitive collection costs. Classification of vehicles demands extensive training data rarely available in defence contexts. We propose a hybrid approach combining limited real-world data with scalable synthetic generation. Our methodology comprises: (1) a Blender-based pipeline generating high-resolution synthetic images with domain randomization across 3D models, lighting, and camera angles; (2) training transformer-based RF-DETR detectors on real-world and synthetic data, respectively; (3) an in-depth evaluation of the trained networks to determine the effect of synthetic data. Our approach uses a baseline RF-DETR detector trained on real-world imagery as the point of comparison. We then use the custom-made synthetic data generation pipeline to create an equally large synthetic dataset. This generated data is added to real data subsets, creating mixed datasets containing varying percentages of real data. We created five datasets containing 5%, 10%, 25%, 50%, and 100% real data, respectively. With these mixed datasets we train another set of RF-DETR detectors. Afterwards, we evaluate the influence of the synthetic data by comparing the detectors across computer vision metrics.
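One possible reading of the dataset-mixing step is sketched below: a random subset of the real images is drawn at each stated percentage and the full synthetic set is appended. This interpretation, the function name, and the file names are assumptions for illustration; the abstract does not specify how the subsets were drawn or whether the total size was held constant.

```python
import random


def build_mixed_dataset(real_items, synthetic_items, real_fraction, seed=0):
    """Return a real-data subset of the given fraction plus all synthetic items.

    Hypothetical helper: sampling strategy and seed are illustrative choices.
    """
    rng = random.Random(seed)
    n_real = round(len(real_items) * real_fraction)
    real_subset = rng.sample(real_items, n_real)
    return real_subset + list(synthetic_items)


# Placeholder file names standing in for annotated images.
real = [f"real_{i:04d}.png" for i in range(200)]
synthetic = [f"synth_{i:04d}.png" for i in range(200)]  # "equally large"

mixed = {
    fraction: build_mixed_dataset(real, synthetic, fraction)
    for fraction in (0.05, 0.10, 0.25, 0.50, 1.00)
}
for fraction, items in mixed.items():
    n_real = sum(name.startswith("real_") for name in items)
    print(f"{fraction:>4.0%} real subset: {n_real} real + {len(items) - n_real} synthetic")
```

Fixing the seed keeps each subset reproducible, which matters when detectors trained on different mixes are compared against the same baseline.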

GDC: Geometric diffusion consistency for weather-robust 3D point cloud segmentation

Jing Du¹, John Zelek¹, Michael A. Chapman², Jonathan Li³

¹Department of Systems Design Engineering, University of Waterloo; ²Department of Civil Engineering, Toronto Metropolitan University; ³Department of Geography and Environmental Management, University of Waterloo

Semantic segmentation of outdoor 3D point clouds degrades significantly under adverse weather, as rain, fog, and snow corrupt the geometric structure of LiDAR returns through backscatter insertion, range-dependent attenuation, and volumetric scattering. Existing domain generalization methods constrain feature values directly, which becomes less effective when weather-induced perturbations alter the local neighborhood topology that underlies feature aggregation. This work proposes Geometric Diffusion Consistency (GDC), a training-time regularizer that enforces consistent feature propagation behavior across geometrically divergent views of the same point cloud. A dual-view augmentation pipeline generates training pairs through weak and strong perturbations, where the strong branch incorporates dual-mode atmospheric extinction modeling, semantic-aware geometric corruption, and weather-coordinated structural perturbation. A lightweight learnable diffusion operator, implemented via sparse convolutions with a gated residual connection, propagates encoder bottleneck features through local voxel neighborhoods. The consistency loss aligns diffused representations at corresponding points across views, preserving topological relationships essential for dense prediction while allowing feature values to adapt to altered geometry. On the SemanticKITTI to SemanticSTF domain generalization benchmark, GDC achieves 38.6% mIoU, exceeding the previous best method by 3.8%, with consistent improvements across dense fog, light fog, rain, and snow conditions.

Integrated workflow for 3D documentation and spatial analysis of Jewish sepulchral heritage – Project "Stone Witnesses Digital: Space, Form, Inscription".

Lea Puglisi, Michael Groh, Patrizia Hanika, Mona Hess

Digital Technologies in Heritage Conservation, Institute of Archaeology, Heritage Conservation Studies and Art History/ Centre for Heritage Conservation Studies and Technologies (KDWT), University of Bamberg

The project 'Stone Witnesses Digital' ensures the exemplary documentation of a selected number of German Jewish graveyards. This paper presents the results from the first years of the project’s geomatics work, including the development of an integrated multi-sensor workflow for 3D imaging—ranging from geographic-scale documentation of entire graveyards (1:200 scale) to detailed feature imaging of individual gravestones (1:20 scale). The workflow supports the long-term research project on Jewish sepulchral culture 'Stone Witnesses Digital'. The project brings together expertise from Jewish Studies, Digital Technologies in Heritage Conservation, and Historic Building Research.

The overarching scope is to document the location and context of gravestones, their materiality, decorative elements, inscriptions, and the meanings embedded within them—summarized under the guiding concept 'Space, Form, Inscription.' The aim of the project is to create a comprehensive digital dataset that documents inscriptions as well as the spatial and structural characteristics of gravestones, thereby ensuring their long-term preservation and making them accessible for further academic research.

To achieve this, the workflow must integrate various sensing and 3D imaging techniques, ensure reliable and sustainable data storage, and support reproducible dataset creation for spatio-temporal analyses and long-term monitoring of graveyards throughout the 24-year project period. It also enables the combination of advanced sensing technologies with semantic web standards and facilitates the creation of informative Open Access outputs compliant with FAIR data principles.

3D Reconstruction of reindeer antlers using a low-cost optical camera system and Gaussian splatting

Julian Robert Stevenson Cramb¹, Derek Lichti¹, John Matyas¹, Shabnam Jabari²

¹University of Calgary, Canada; ²University of New Brunswick, Canada

This abstract presents a novel, low-cost pipeline for the semi-automated 3D reconstruction of reindeer antlers using an optical camera array and Gaussian Splatting (GS). Traditional antler measurement methods are manual, invasive, and prone to errors, while existing 3D scanning techniques struggle with subject motion. Point clouds derived from photogrammetric bundle adjustment require well-defined points, which are generally lacking on antlers. To overcome this, a system of 16 synchronized Raspberry Pi cameras was used to capture instantaneous imagery within an animal enclosure. A sparse point cloud, along with the oriented network of imagery from a bundle adjustment, is fed into a GS algorithm, producing an optimized reconstruction of the scene.

The system was initially validated in a controlled lab environment against a terrestrial laser scanner ground-truth point cloud. Sub-centimeter accuracy, with a mean cloud-to-cloud distance of 4.0 mm, was achieved. Preliminary live-animal testing demonstrates the system's ability to produce a qualitatively accurate reconstruction under various lighting conditions. This work establishes a non-invasive method for high-quality 3D reconstruction of complex reindeer antlers, with applications in wildlife biology, environmental monitoring, and biomechanics. Further work will involve rigorous network and camera calibration along with a comprehensive analysis of live-animal data.
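A mean cloud-to-cloud distance of the kind reported above can be illustrated with a brute-force nearest-neighbour sketch. The point sets are synthetic and the function name is hypothetical; the abstract does not state which software or distance variant was used, and real clouds would need a spatial index (for example a k-d tree) rather than an exhaustive search.

```python
import math


def mean_cloud_to_cloud(test_cloud, reference_cloud):
    """Mean distance from each test point to its nearest reference point."""
    total = 0.0
    for p in test_cloud:
        total += min(math.dist(p, q) for q in reference_cloud)
    return total / len(test_cloud)


# Synthetic example in metres: a 5 x 5 reference grid on the plane z = 0
# and a reconstructed copy lifted by 4 mm.
reference = [(0.01 * i, 0.01 * j, 0.0) for i in range(5) for j in range(5)]
reconstructed = [(x, y, z + 0.004) for (x, y, z) in reference]

distance_mm = 1000.0 * mean_cloud_to_cloud(reconstructed, reference)
print(f"mean cloud-to-cloud distance: {distance_mm:.1f} mm")
```

The measure is directional: swapping the two arguments generally gives a different value when one cloud covers regions the other does not.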

A semi-automated pipeline for extracting architectural plans from 3D LiDAR data of ancient heritage sites

Marianna Bartrick-Krana, Roberto de Lima, Aziliz Vandesande, Maarten Bassier

KU Leuven, Belgium

Automatically generating architectural plans from archaeological sites poses a persistent challenge, particularly when dealing with ancient structures that have experienced severe deterioration. Many heritage contexts—especially those involving rock-cut monuments—present highly irregular geometries, collapsed features, eroded walls, and surfaces obscured by sediment or plaster detachment. These conditions make the extraction of reliable 2D plans or cross-sections from 3D data exceptionally difficult using conventional modeling tools.

In this study, we propose a semi-automated processing workflow tailored to the architectural characteristics of the Sheikh Said tombs. The pipeline converts 3D LiDAR datasets into structured 2D plans and vertical cross-sections, with particular emphasis on documenting deep, narrow shafts and multi-chambered tomb layouts.

Spherical Vision meets 3D Semantics: towards efficient LOD3 Model Generation for Smart Cities

Mohammad Saadatseresht¹,², Hossein Arefi², Qazaleh Askari¹

¹School of Surveying and Geospatial Engineering, University of Tehran, Tehran, Iran; ²i3mainz - Institute for Spatial Information and Surveying Technology, Mainz University of Applied Sciences, Mainz, Germany

The generation of Level of Detail 3 (LoD3) building models is essential for applications such as urban digital twins, energy analysis, and smart city planning. However, conventional approaches based on terrestrial LiDAR or UAV photogrammetry remain costly, labor-intensive, and difficult to scale. This paper presents a scalable framework for transforming LoD1 building models into LoD3 façade representations using openly available urban data, including OpenStreetMap footprints, street-level spherical imagery, and weak point-cloud priors. The proposed method formulates the reconstruction problem as a facet-based modeling task, where each façade is processed independently in a local coordinate system derived from LoD1 geometry. A rectification strategy is introduced to generate fronto-parallel façade images directly from spherical panoramas, avoiding perspective distortions and facilitating image analysis. To address the challenges of unstructured data acquisition, a visibility-driven view selection scheme and a multi-view fusion framework are developed to construct robust façade evidence maps. The 3D geometry is estimated as a depth field through a multi-resolution optimization framework integrating ray consistency, appearance cues, point-cloud support, and structural regularization. Planar segmentation, polygonization, and geometric regularization are subsequently applied to derive structured façade elements. Openings such as windows and doors are detected using combined geometric and image-based evidence and further refined through architectural constraints. Experimental results demonstrate that the proposed framework enables reliable reconstruction of façade geometry and structural details using only open and low-cost data sources, providing a practical pathway for large-scale LoD3 generation in real urban environments.

LiDAR Point Cloud Oversegmentation via SAM-based Knowledge Distillation

Dening Lu¹, Michael Chapman², Jonathan Li¹,³

¹Department of Systems Design Engineering, University of Waterloo; ²Department of Civil Engineering, Toronto Metropolitan University; ³Department of Geography and Environmental Management, University of Waterloo

Large-scale LiDAR point clouds provide rich geometric information, yet learning effective structural representations remains challenging due to the misalignment between semantic categories and geometric structures. To address this issue, we propose a SAM-guided framework for point cloud oversegmentation. We transfer grouping knowledge from 2D vision by constructing a large-scale oversegmentation dataset using the Segment Anything Model (SAM) on bird’s-eye-view projections.

Based on these grouping priors, a structure-aware point cloud encoder is learned via a distillation objective that enforces intra-region compactness and inter-region separation in the embedding space. The proposed approach does not rely on semantic supervision and directly learns generalizable structural representations.

Experiments on various benchmark datasets (STPLS3D, Toronto-3D, DALES, and S3DIS) demonstrate that the proposed method achieves competitive performance.

In particular, it significantly improves boundary recall (e.g., 92.21% on STPLS3D and 93.47% on Toronto-3D) while maintaining high oracle accuracy (up to 97.62%).

Moreover, the model generalizes well to unseen datasets without retraining, showing strong cross-dataset inference capability.

Shape Representation using Gaussian Process mixture models

Panagiotis Sapoutzoglou, George Terzakis, Georgios Floros, Maria Pateraki

National Technical University of Athens, Greece

In this work we propose an object-specific implicit representation: Functional modeling of surface geometry using Gaussian Processes (GPs). In contrast to neural models, our method leverages the ability of GPs to model continuous functions from irregularly and sparsely sampled data and applies this concept in the context of a probabilistic model that learns the shape of an object as a mixture of multiple directional distance fields anchored at reference points specially placed in the object’s skeletal outline. The resulting mixture model provides continuity, sparsity, and finer shape detail while avoiding the heavy training burden associated with deep implicit methods.

A Deep Learning Model for Tree Species Classification Using Ground-Level RGB Imagery and Automated Annotations

Hristina Hristova, Clemens Blattert, Sunni K.P. Kushwaha, Janine Schweier

Swiss Federal Research Institute for Forest, Snow and Landscape Research WSL, Switzerland

Accurate tree species identification is essential for effective forest management, biodiversity monitoring, and resource estimation. While automated methods relying on aerial and canopy-level remote sensing have become prevalent, they often struggle in dense, multi-layered forest stands, where critical lower-stem and bark features are obscured. To address this limitation, we present a Deep Learning (DL) framework for tree species classification utilizing ground-level RGB imagery. Because manual annotation of terrestrial images in forest environments is labor-intensive and complicated by occlusions, we introduce a new "in-situ" forest image dataset alongside an automated labeling pipeline. This pipeline generates training annotations by projecting tree-species data derived from Mobile Laser Scanning (MLS) onto 2D images based on photogrammetric reconstruction. The proposed DL model leverages these automatically labeled images to effectively recognize tree species based on structural and bark characteristics. The model achieves overall F1-scores of 0.78 and 0.75 for object detection and instance segmentation, respectively. Ultimately, our approach complements existing methods for detecting tree positions and diameters, facilitating the creation of a holistic, cost-effective, and scalable forest inventory dataset.

Pattern recognition approaches for the detection of alteration and degradation phenomena in hyperspectral and UAV multispectral imagery: the case study of a historical masonry water bridge

Alessandra Spadaro, Francesca Matrone, Andrea Maria Lingua, Ramin Rashidi Alavijeh

Geomatics Lab, Department of Environment, Land and Infrastructure Engineering, Politecnico di Torino, Corso Duca degli Abruzzi 24, 10129 Turin, Italy

Historical masonry hydraulic infrastructures are affected by complex degradation processes, including vegetation growth, moisture-related anomalies, and salt efflorescence, whose detection requires non-invasive, repeatable, and scalable diagnostic approaches. This study proposes a multi-scale workflow for the detection and classification of degradation phenomena affecting the Cavour Canal water bridge, a nineteenth-century masonry structure in northern Italy. The methodology integrates UAV-based multispectral orthophotos and close-range hyperspectral imagery within a common Object-Based Image Analysis (OBIA) framework. The multispectral workflow was designed for façade-scale screening, whereas the hyperspectral workflow was used to refine the interpretation of selected sectors through detailed spectral characterisation. Multiple supervised classifiers, including Support Vector Machine (SVM), k-Nearest Neighbours (kNN), Decision Tree (DT), Random Trees (RT), and Naïve Bayes (NB), were tested on both datasets. The results show that the multispectral workflow is effective for the identification of vegetation and broad water-related anomalies, with kNN providing the best overall performance, while the hyperspectral workflow improves the discrimination of subtle surface alterations, particularly efflorescence, with SVM yielding the most stable results across the tested configurations. Overall, the proposed methodology demonstrates the value of integrating multispectral and hyperspectral data within a hierarchical workflow for non-invasive degradation mapping of historical masonry infrastructures.

A Framework for Individual Tree Segmentation from Multi-Resolution LiDAR Data in Complex Tropical Forests

Hazem Hanafy¹, Sangyoon Park¹, Songlin Fei², Ayman Habib¹

¹Lyles School of Civil and Construction Engineering, Purdue University, West Lafayette, USA; ²Department of Forestry and Natural Resources, Purdue University, West Lafayette, USA

The increasing demand for accurate forest inventory in tropical ecosystems requires robust, scalable methods for individual tree segmentation. Tropical forests pose particular challenges due to dense understory, high species diversity, and complex multi-layered canopies, which often lead to tree under- and over-segmentation in LiDAR-based workflows. This study presents a general framework for individual tree segmentation from dense, multi-resolution LiDAR point clouds acquired by a Backpack LiDAR system over a 15-year-old palm stand in Belém, Brazil. After trajectory enhancement and mapping, an adaptive cloth simulation filter is used to derive a Digital Terrain Model and height-normalized points. Woody components are then isolated using Otsu-based intensity thresholding, eigenvalue-derived linearity, and statistical outlier removal. Trunk detection combines DBSCAN clustering on lower-stem points with a dual tree-localization strategy based on sum-of-elevation heat maps and RANSAC circle fitting. A segmentation quality-control module addresses over- and under-segmentation before reattaching canopy and foliage via voxel-based KD-tree retrieval to generate final per-tree segments. Compared with 3DFIN and TreeLearn using point cloud–derived reference tree locations, the proposed framework achieves a precision of 92.85%, recall of 95.97%, and F1-score of 94.38%, substantially outperforming 3DFIN (75.97%) and TreeLearn (15.14%). These results demonstrate the potential of the proposed framework to deliver reliable tree-level inventories in complex tropical forests.
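As a quick consistency check, the reported F1-score follows from the reported precision and recall, since F1 is their harmonic mean. The sketch below uses only the two figures given in the abstract; the function name is illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


# Precision and recall as reported for the proposed framework.
f1 = f1_score(0.9285, 0.9597)
print(f"F1 = {100.0 * f1:.2f}%")  # matches the reported 94.38%
```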

Digital Preservation and Augmented Reality for Historical Surveying Instruments: A Photogrammetric Approach to Cultural Heritage Documentation

Clóvis Andrade, Juyara Bezerra, Simone Sato, Karoline Jamur

Universidade Federal de Pernambuco, Brazil

Historical surveying instruments embody centuries of innovation in cartography and engineering, serving as crucial scientific and pedagogical artifacts. Their fragility, risk of damage, and limited exhibition space restrict access and highlight the need for effective preservation strategies (Duester, 2023). Traditional conservation methods protect material integrity but do not address broader challenges related to accessibility and engagement. Digital technologies now offer transformative alternatives capable of creating accurate and interactive representations of these instruments (Farella et al., 2022).

This study proposes a low-cost, replicable digital preservation pipeline integrating close-range photogrammetry and augmented reality (AR). Photogrammetry provides a non-contact method for generating detailed 3D models using consumer-grade smartphones, democratizing access to advanced documentation techniques (Icardi et al., 2018; Förstner & Wrobel, 2016). AR enables users to interact with these digital surrogates in real environments, fostering deeper engagement and overcoming limitations imposed by fragile originals (Spallone, 2022; Gong et al., 2022).

Image acquisition was conducted with a Xiaomi Poco F5 Pro under controlled lighting, maintaining 30–60% overlap. Processing in Agisoft Metashape included alignment, dense cloud generation, mesh reconstruction, and texturing. Post-processing in Blender optimized the models for real-time visualization. Integration into AR was achieved using Unity and the Vuforia Engine SDK.

Results demonstrate high-fidelity 3D models that preserve fine details and offer immersive AR interaction. This pipeline provides durable digital records, enhances educational experiences, and expands public access. The approach aligns with ISPRS Working Group II/6 objectives and offers a scalable model for cultural heritage institutions seeking accessible and effective preservation strategies.

Synthetic Dataset Generation for Partially Observed Indoor Objects

Jelle Vermandere, Maarten Bassier, Maarten Vergauwen

KU Leuven, Belgium

Learning-based methods for 3D scene reconstruction and object completion require large datasets containing partial scans paired with complete ground-truth geometry. However, acquiring such datasets using real-world scanning systems is costly and time-consuming, particularly when accurate ground truth for occluded regions is required.

In this work, we present a virtual scanning framework implemented in Unity for generating realistic synthetic 3D scan datasets. The proposed system simulates the behaviour of real-world scanners using configurable parameters such as scan resolution, measurement range, and distance-dependent noise. Instead of directly sampling mesh surfaces, the framework performs ray-based scanning from virtual viewpoints, enabling realistic modelling of sensor visibility and occlusion effects. In addition, panoramic images captured at the scanner location are used to assign colours to the resulting point clouds.

To support scalable dataset creation, the scanner is integrated with a procedural indoor scene generation pipeline that automatically produces diverse room layouts and furniture arrangements. Using this system, we introduce the V-Scan dataset, which contains synthetic indoor scans together with object-level partial point clouds, voxel-based occlusion grids, and complete ground-truth geometry. The resulting dataset provides valuable supervision for training and evaluating learning-based methods for scene reconstruction and object completion.
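The ray-based scanning idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' Unity implementation: the scene is reduced to spheres, and the names `ray_sphere_hit`, `virtual_scan` and the parameter `noise_per_m` are hypothetical. The noise model (Gaussian range noise whose standard deviation grows linearly with distance) is an assumption made for illustration only.

```python
import math
import random


def ray_sphere_hit(origin, direction, center, radius):
    """Distance along a unit-length ray to the nearest sphere hit, or None.

    Hits are ignored when the ray origin lies inside the sphere.
    """
    ox, oy, oz = (origin[i] - center[i] for i in range(3))
    b = ox * direction[0] + oy * direction[1] + oz * direction[2]
    c = ox * ox + oy * oy + oz * oz - radius * radius
    disc = b * b - c
    if disc < 0.0:
        return None
    t = -b - math.sqrt(disc)
    return t if t > 0.0 else None


def virtual_scan(origin, spheres, n_az=36, n_el=18, max_range=10.0,
                 noise_per_m=0.002, seed=0):
    """Cast a regular grid of rays and return the noisy 3D hit points.

    n_az and n_el set the scan resolution, max_range the measurement range,
    and noise_per_m the (assumed) distance-dependent range noise.
    """
    rng = random.Random(seed)
    points = []
    for i in range(n_az):
        az = 2.0 * math.pi * i / n_az
        for j in range(n_el):
            el = -math.pi / 2.0 + math.pi * (j + 0.5) / n_el
            d = (math.cos(el) * math.cos(az),
                 math.cos(el) * math.sin(az),
                 math.sin(el))
            hits = [ray_sphere_hit(origin, d, c, r) for c, r in spheres]
            hits = [t for t in hits if t is not None]
            if not hits:
                continue
            t = min(hits)  # only the nearest surface is seen: occlusion
            if t > max_range:
                continue
            t_noisy = t + rng.gauss(0.0, noise_per_m * t)
            points.append(tuple(origin[k] + t_noisy * d[k] for k in range(3)))
    return points
```

Because each ray keeps only its nearest hit, an object standing behind another one receives no points, which is the visibility behaviour that direct surface sampling of a mesh would not reproduce.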

Automatic Segmentation of 3D Gaussian Splatting for Urban Cultural Heritage Sites

Widiatmoko Azis Fadilah, Virgile Gauthier, Arnadi Murtiyoso, Tania Landes, Pierre Grussenmeyer

Université de Strasbourg, CNRS, INSA Strasbourg, ICube Laboratory UMR 7357, Photogrammetry and Geomatics Group, 67000, Strasbourg, France

3D Gaussian Splatting (3DGS) has emerged as a promising method for photorealistic scene reconstruction, yet its application to semantic segmentation in real-world heritage documentation remains underexplored. This study proposes and evaluates an automated semantic 3DGS segmentation pipeline integrating the Segment Anything Model 3 (SAM 3) with per-class prompting for Gaussian reconstruction, applied to a nadir UAV dataset of the Siti Inggil heritage complex in Cirebon, Indonesia. Segmentation performance for four semantic classes (ground, roofs, vegetation, and water bodies) was assessed against manually segmented 2D and 3D reference data, supplemented by a geometric accuracy assessment via M3C2 analysis. Results reveal both the promise and the inherent challenges of applying 3DGS segmentation to complex real-world heritage scenes, where the effects of acquisition geometry, surface characteristics, and foundation model limitations can be observed.

Collaborative Multimodal Drone-Based Remote Sensing for Levee Piping Detection

Tu Hu, Tongqi Wang, Shan Su, Changjun Chen, Haoxiang Liu

Wuhan University, China, People's Republic of

This paper addresses the critical challenge of early and accurate detection of piping, a major failure mode in levee systems. Traditional methods are limited, and even advanced techniques such as infrared thermography struggle to capture weak thermal anomaly signals under complex environmental interference. To overcome these limitations, we propose an innovative intelligent algorithm that achieves breakthroughs by synergistically integrating drone-based infrared imagery and point cloud data.

The methodology follows a rigorous two-stage pipeline. First, potential piping zones are coarsely extracted from thermal infrared images using an enhanced saliency detection model. This involves superpixel segmentation and multi-scale (global and local saliency) analysis to highlight temperature anomalies, followed by adaptive thresholding based on Gaussian distribution fitting for automatic segmentation. Second, a fine discrimination step is introduced, which integrates multimodal prior information from point clouds to significantly reduce false alarms. This is achieved by applying a series of physical constraints: area filtering, temperature variance filtering, terrain-based filtering, and overlap analysis between the infrared and point cloud data.

Validation with field data collected during the flood season demonstrates that this method achieves high-precision localization of piping zones. Its key advantage lies in its ability to effectively suppress false positives caused by environmental clutter while ensuring that the detection results align with physical principles. This study provides a practical and reliable technical solution for enhancing the safety inspection and early warning systems of levee structures.
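As a rough illustration of the thresholding and area-filtering steps mentioned in this abstract, the sketch below fits a Gaussian (mean and standard deviation) to the pixel temperatures, flags pixels that deviate by more than `k` standard deviations, and keeps only connected regions above a minimum area. The `k`-sigma rule, the 4-connectivity and all function names are assumptions made for illustration; the paper's saliency model and its other filters are not reproduced here.

```python
import statistics


def anomaly_mask(grid, k=2.0):
    """Flag pixels deviating more than k sigma from the fitted Gaussian mean."""
    flat = [t for row in grid for t in row]
    mu = statistics.fmean(flat)
    sigma = statistics.pstdev(flat)
    return [[abs(t - mu) > k * sigma for t in row] for row in grid]


def connected_regions(mask):
    """Group flagged pixels into 4-connected regions (lists of (row, col))."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            stack, region = [(r, c)], []
            seen[r][c] = True
            while stack:
                y, x = stack.pop()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            regions.append(region)
    return regions


def area_filter(regions, min_area):
    """Drop regions smaller than min_area pixels (likely clutter)."""
    return [reg for reg in regions if len(reg) >= min_area]
```

In this toy form, isolated single-pixel anomalies are discarded as clutter while compact anomalous patches survive, which mirrors the coarse-to-fine logic of the two stages.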

An Open-Source Pipeline for Runtime-Optimized Heritage Photogrammetry in Game Engines

Arkoun Merchant¹, Adam Weigert¹, Chloe Dennis², Stephen Fai¹

¹Carleton Immersive Media Studios, 1125 Colonel By Dr, Ottawa, Canada; ²Bytown Museum, Ottawa, Canada

This paper presents Mesh2Tile, an open-source pipeline that converts photogrammetric meshes into runtime-optimized 3D Tiles for interactive visualization in game engines. Photogrammetry produces high-polygon meshes that remain difficult to deliver at scale through interactive platforms. Cloud-based conversion services like Cesium Ion provide a path to the OGC 3D Tiles format but impose cost barriers and raise data sovereignty concerns for confidential heritage projects. Existing open-source converters rely on uniform spatial partitioning, export redundant textures with every tile, and offer limited control over LOD generation. Mesh2Tile leverages Blender's Python API to perform adaptive octree tiling driven by triangle density, per-tile texture baking that eliminates texture redundancy, and parallel processing to generate georeferenced 3D Tiles from OBJ meshes. The pipeline is validated through a case study of the Bytown Museum Commissariat Building on the Rideau Canal UNESCO World Heritage Site, processed at three scales from 900 thousand to 90 million triangles. Results demonstrate linear scaling of processing time, up to 62% file size reduction for larger models, and successful runtime streaming in Unreal Engine 5 through the Cesium for Unreal plugin at 120 FPS, with tile balance comparable to Cesium Ion's commercial output. The pipeline enables institutions to maintain full control over sensitive heritage data while achieving performance suitable for interactive visualization.
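The contrast between uniform partitioning and density-driven tiling can be sketched in a few lines. This is not Mesh2Tile's code and does not use Blender's API: it is a plain-Python illustration in which triangle centroids stand in for triangles, and the names `build_octree`, `leaves`, `max_tris` and `max_depth` are hypothetical.

```python
def build_octree(centroids, bounds, max_tris=1000, max_depth=8, depth=0):
    """Recursively split a cubic cell until it holds at most max_tris centroids.

    centroids: list of (x, y, z) triangle centroids inside `bounds`
    bounds:    ((min_x, min_y, min_z), (max_x, max_y, max_z))
    Empty octants are not created, so sparse regions stay coarse.
    """
    node = {"bounds": bounds, "count": len(centroids),
            "depth": depth, "children": []}
    if len(centroids) <= max_tris or depth >= max_depth:
        return node  # leaf tile
    lo, hi = bounds
    mid = tuple((lo[i] + hi[i]) / 2.0 for i in range(3))
    buckets = {}
    for c in centroids:
        key = tuple(int(c[i] >= mid[i]) for i in range(3))
        buckets.setdefault(key, []).append(c)
    for key, pts in buckets.items():
        c_lo = tuple(mid[i] if key[i] else lo[i] for i in range(3))
        c_hi = tuple(hi[i] if key[i] else mid[i] for i in range(3))
        node["children"].append(
            build_octree(pts, (c_lo, c_hi), max_tris, max_depth, depth + 1))
    return node


def leaves(node):
    """Collect all leaf tiles of the octree."""
    if not node["children"]:
        return [node]
    out = []
    for child in node["children"]:
        out.extend(leaves(child))
    return out
```

With a rule of this kind, densely triangulated parts of a model end up in many small tiles while sparse parts remain in a few large ones, so that tiles carry a roughly similar triangle load.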

Location determination of dynamic objects using a single CCTV with monocular depth estimation

JiHeon Jung¹, Junhee Youn², Jieun Kim¹, Junho Gong¹, Phillip Kim¹, Sunwoong Paik¹

¹Dept. of Future & Smart Construction Research, Korea Institute of Civil Engineering and Building Technology, 10223 Goyang-Si, Gyeonggi-Do, Republic of Korea; ²Corresponding author: Dept. of Future & Smart Construction Research, Korea Institute of Civil Engineering and Building Technology

This contribution presents a method to determine ground coordinates of pedestrians from a single CCTV frame using monocular depth estimation and orthophoto-based ground control points. Urban crowd monitoring requires pedestrian location information, but many CCTV-based approaches rely on accurate camera calibration or multi-view configurations, which are often unavailable in real deployments. In this study, we exploit relative depth values from a monocular depth estimation model (Depth Anything V2) and ground control points (GCPs) jointly identifiable in both the CCTV frame and an orthophoto in EPSG:5186. For each frame, depth-based distance ratios between the pedestrian and ground control point pairs are used to construct Apollonius circles in the orthophoto plane, and the pedestrian position is estimated by a weighted least-squares adjustment of their intersections. The method is evaluated on 180 frames from two scenes of an urban testbed, with camera–target distances within approximately 50 m and three GCP placement scenarios. For the optimal configuration (Scenario A), a mean RMSE of 1.989 m was achieved, excluding frames in which GCPs were temporarily occluded by moving objects. This demonstrates that single-frame CCTV imagery combined with an orthophoto can achieve an accuracy of approximately 2 m without any exterior or interior orientation parameters (EOP/IOP), which is practically useful for urban crowd monitoring and dynamic thematic mapping. The influence of GCP placement geometry and occlusion conditions on estimation accuracy is also analyzed.
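The geometric core of this abstract can be illustrated as follows. For two GCPs A and B and a distance ratio k = |PA| / |PB| (k ≠ 1), the locus of P is the Apollonius circle with centre (A − k²B) / (1 − k²) and radius k·|AB| / |1 − k²|. The sketch below is a simplified stand-in for the authors' adjustment, not their method: it pairs one reference GCP with at least three others, subtracts the circle equations to obtain linear radical-axis equations, and solves them by weighted least squares. Function names and the weighting are hypothetical.

```python
import math


def apollonius_circle(a, b, k):
    """Centre and radius of the circle of points X with |XA| / |XB| = k."""
    k2 = k * k
    if abs(1.0 - k2) < 1e-9:
        raise ValueError("k = 1 gives a perpendicular bisector, not a circle")
    cx = (a[0] - k2 * b[0]) / (1.0 - k2)
    cy = (a[1] - k2 * b[1]) / (1.0 - k2)
    r = k * math.dist(a, b) / abs(1.0 - k2)
    return (cx, cy), r


def locate(circles, weights=None):
    """Weighted least-squares point common to all circles.

    Subtracting the first circle equation from each other one removes the
    quadratic term and leaves a line (the radical axis) through the
    intersections; the lines are then solved via 2x2 normal equations.
    """
    (c0, r0) = circles[0]
    if weights is None:
        weights = [1.0] * (len(circles) - 1)  # assumed: equal weights
    n11 = n12 = n22 = t1 = t2 = 0.0
    for ((c, r), w) in zip(circles[1:], weights):
        ax = 2.0 * (c0[0] - c[0])
        ay = 2.0 * (c0[1] - c[1])
        b = (c0[0] ** 2 + c0[1] ** 2) - (c[0] ** 2 + c[1] ** 2) + r * r - r0 * r0
        n11 += w * ax * ax
        n12 += w * ax * ay
        n22 += w * ay * ay
        t1 += w * ax * b
        t2 += w * ay * b
    det = n11 * n22 - n12 * n12
    if abs(det) < 1e-12:
        raise ValueError("degenerate geometry: circle centres are collinear")
    return (n22 * t1 - n12 * t2) / det, (n11 * t2 - n12 * t1) / det
```

Note that the three circles obtained from only three GCPs share the same pair of intersection points, so their radical axes coincide and this linear form becomes degenerate; that is why the sketch assumes at least four GCPs.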

ML-MIFD: Multi-Level Multimodal Invariant Feature Descriptor

Zening Wang, Haoyu Guo, Yongxiang Yao, Yongjun Zhang, Peihao Wu, Yi Wan

School of Remote Sensing and Information Engineering, 430079, Wuhan, Hubei, China

With the rapid advancement of multi-sensor technology, cross-modal image matching has become a key research focus. However, significant challenges persist, primarily caused by differences in imaging mechanisms that lead to nonlinear radiation variations and feature heterogeneity. Coupled with complex geometric distortions, these differences mean that traditional feature description methods struggle to represent the feature information shared across modalities, resulting in matching failures. Thus, effectively mitigating noise and radiation distortions to enable robust cross-modal matching remains an open and critical problem, compounded by the intrinsic difficulty of balancing descriptor parameters such as patch size and histogram partitioning. To address these issues, this paper proposes a novel Multi-Level Multimodal Invariant Feature Descriptor (ML-MIFD), designed to enhance resistance to nonlinear radiometric differences and multi-source noise while maintaining rotation invariance. The proposed algorithm consists of three stages: feature detection, ML-MIFD descriptor construction, and image matching. Comparative experiments with various state-of-the-art methods are conducted on typical cross-modal image datasets. The results demonstrate that the ML-MIFD method exhibits significant advantages in both registration accuracy and matching stability.

Geomorphological Monitoring of Erosion on Restored Slopes Through the Integration of Drones, GIS, and LiDAR

Mónica López Moncada¹,²,³, Joan-Cristian Padró¹,⁴, Vicenç Carabassa⁵, Paulo Escandón-Panchana⁶,⁷, Andrés Velastegui-Montoya²,³

¹Departamento de Geografía, Universitat Autònoma de Barcelona (UAB); ²Faculty of Engineering in Earth Sciences, ESPOL Polytechnic University; ³Laboratory of Geoinformation and Remote Sensing, Faculty of Engineering in Earth Sciences, ESPOL Polytechnic University; ⁴Institut Cartogràfic i Geològic de Catalunya (ICGC), Parc de Montjuïc; ⁵CREAF, Universitat Autònoma de Barcelona (UAB); ⁶Departamento de Ingeniería Cartográfica y Topografía, Universidad Politécnica de Madrid (UPM); ⁷Escuela de Ciencias Ambientales, Universidad Espíritu Santo

Mining represents a strategic activity for economic development; however, it causes significant impacts on the landscape, soil, and water resources. During the restoration phase, slope erosion represents a challenge for ensuring the geomorphological stability and ecological functionality of the affected areas. This study aims to evaluate the erosion dynamics of restored mining slopes by integrating Geographic Information Systems (GIS) and data obtained from Unmanned Aerial Systems (UAS) for geomorphological monitoring and quantification of soil loss on slopes. The research was carried out at the Lázaro quarry, Tarragona, Spain, using a fixed-wing UAS equipped with a multispectral camera to generate high-resolution orthophotos and Digital Elevation Models (DEMs), which were compared with historical LiDAR data. Height Difference Models (HDMs) and volumetric analysis were applied to quantify erosion and deposition processes. Three modelling approaches were compared: ridge-derived DEM (DEMp), filtered DEM (DEMf), and LiDAR DEM (DEMl), considering their accuracy, spatial detail, and ability to represent erosional microtopography. The findings revealed that the DEMp provides the most consistent estimates of volume loss and most faithfully reproduces pre-erosion morphologies. At the same time, the DEMf tends to smooth relief, while the DEMl provides a lower-resolution overview. These results confirm the effectiveness of integrating UAS data, photogrammetry, and geospatial analysis for monitoring restored slopes, enabling the accurate quantification of eroded volumes and the detailed characterisation of morphological processes. This study contributes to the optimisation of the geomorphological and environmental management of restored mining areas, promoting their long-term stability and sustainability.
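The volumetric analysis behind a Height Difference Model can be sketched as a cell-by-cell difference of two co-registered DEM grids, multiplied by the cell area. The function below is a generic illustration under stated assumptions: both grids share the same extent and resolution, missing cells are stored as `None`, and the optional `lod` threshold (a level of detection below which differences are ignored) is a hypothetical parameter not taken from the abstract.

```python
def volume_change(dem_before, dem_after, cell_size, lod=0.0):
    """Return (erosion, deposition, net) volumes from two DEM grids.

    dem_before, dem_after: equally sized lists of rows of heights in metres
    cell_size:             ground size of one square cell in metres
    lod:                   ignore height changes with |dz| <= lod (assumed)
    """
    area = cell_size * cell_size
    erosion = 0.0
    deposition = 0.0
    for row_before, row_after in zip(dem_before, dem_after):
        for z_before, z_after in zip(row_before, row_after):
            if z_before is None or z_after is None:
                continue  # no data in one of the models
            dz = z_after - z_before
            if abs(dz) <= lod:
                continue
            if dz < 0.0:
                erosion += -dz * area
            else:
                deposition += dz * area
    return erosion, deposition, deposition - erosion
```

For example, with 2 m cells, a single cell lowered by 0.5 m contributes 2.0 m³ of erosion, and a cell raised by 0.2 m contributes 0.8 m³ of deposition.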

Application of SfM Methods for the Photogrammetric Processing of Historical Aerial VHS Videos

Grzegorz Jóźków, Maurycy Hechmann

Wroclaw University of Environmental and Life Sciences, Poland

This submission presents the results and analysis of applying SfM to the processing of historical aerial VHS videos. The test data was collected during the 1997 Central European Flood and poses significant challenges due to the low quality of the data, the manner of data acquisition (corridor mapping from different altitudes), and the object (a significant part of the images shows water). The SfM processing was executed in commercial software and allowed for successful bundle adjustment of the image block and the creation of subsequent products, such as dense point clouds and orthomosaics. Among the challenges during processing were the extraction of approximate image positions and the selection of processing parameters.

Global Block Adjustment for Mosaicked Stereoscopic Satellite Imagery

Michaël Erblang¹, Emelyne Saulnier¹, Guillaume Laurent¹, Nicolas Delaygue², Fabrice Buffe², Alice Latourte², Mathilde Jassaud³, Noémie Bricout³

¹Thales Services Numériques (TSN), 290 Allée du Lac, 31670 Labège, France; ²Centre National d’Etudes Spatiales (CNES), 18 avenue E. Belin, 31400 Toulouse cedex 9, France; ³Institut national de l'information géographique et forestière (IGN), 18 avenue E. Belin, 31400 Toulouse cedex 9, France

Satellite imagery acquired over large areas from multiple viewpoints introduces subtle geometric misalignments that degrade the quality of derived products such as Digital Surface Models (DSMs). This paper presents a global block adjustment workflow designed to correct these errors across overlapping stereo acquisitions from the CO3D constellation, which captures Earth's surface at 50 cm resolution.

The proposed pipeline operates in three stages: individual acquisition refinement using Space Reference Points (SRPs) as Ground Control Points; tie point extraction between overlapping scenes through two-pass image correlation; and a weighted global spatio-triangulation simultaneously optimizing attitude biases, attitude drifts, and per-satellite magnification parameters.

Applied to a large stereo acquisition dataset over the Aorounga crater, Chad, the method demonstrates strong geometric performance. The results highlight that careful parameterization, combining observation weighting, n-tuple point filtering, and per-satellite sensor refinement, is key to producing accurate, geometrically consistent large-scale mosaics from bi-satellite stereo imagery. This paper does not include in-orbit performance figures due to a confidentiality agreement.

Learning-Based Semantic Segmentation and Context-based Quality Control of Bike-Pack LiDAR data for Tree Mapping in Semi-Urban Environments

Sungwoong Hyung¹, Hazem Hanafy¹, Chunxi Zhao¹, Sangyoon Park¹, Songlin Fei², Ayman Habib¹

¹Lyles School of Civil and Construction Engineering, Purdue University, West Lafayette, IN, 47907, USA; ²Department of Forestry and Natural Resources, Purdue University, West Lafayette, IN, 47907, USA

Accurate tree mapping in semi-urban areas is essential for ecological monitoring and infrastructure maintenance, but is challenged by complex structures and clutter in LiDAR data. This study proposes a learning-based framework using a Superpoint Transformer (SPT) for semantic segmentation. The model is pretrained on the KITTI-360 dataset and then fine-tuned using transfer learning on a high-resolution dataset captured by our in-house Bike-Pack LiDAR system. A key contribution of this work is a context-based quality control process applied after the initial segmentation. This quality control process refines the results by removing building artifacts, correcting misclassifications between vegetation and poles using geometric and intensity analysis, and refining building boundaries. Experiments demonstrate that this QC process significantly improves segmentation accuracy, especially for the critical vegetation and pole classes.

Multitemporal Monitoring of Posidonia Oceanica Banquettes using UAV Photogrammetry

Valeria Longhi¹, Andrea Lingua², Francesca Gallitto², Filiberto Chiabrando³

¹DIST – Interuniversity Department of Regional and Urban Studies and Planning, Politecnico di Torino, Italy; ²DIATI – Department of Environment, Land and Infrastructure Engineering, Politecnico di Torino, Italy; ³DAD – Department of Architecture and Design, Politecnico di Torino, Italy

Posidonia oceanica (PO) meadows represent one of the most valuable coastal ecosystems in the Mediterranean Sea, providing key ecological functions and ecosystem services (Vassallo et al., 2013). Even after detachment, PO leaves and rhizome fragments accumulate along the shoreline forming thick deposits known as banquettes (Rotini et al., 2020). These natural structures play a crucial role in protecting beaches from erosion, buffering wave energy, and contributing to the nutrient cycling of coastal systems (Fonseca and Cahalan, 1992).

Despite their ecological importance, banquette dynamics are not consistently monitored, standardized monitoring procedures are lacking, and their spatial and temporal variability remains poorly understood. Within the framework of the POSEIDON project, funded by the Italian National Recovery and Resilience Plan (PNRR), innovative high-resolution mapping techniques are being developed to monitor PO ecosystems both underwater and on the coast. This contribution presents a methodology based on UAV RGB photogrammetry for the multitemporal analysis of banquette morphodynamics, demonstrating its potential for quantitative assessment of seasonal and interannual changes. UAV photogrammetry has become a widely adopted tool for high-resolution coastal monitoring and topographic mapping, providing centimeter-scale DEMs when combined with RTK positioning and well-distributed ground control points (Zannutta et al., 2020; Vecchi et al., 2021; Yoo and Oh, 2016).

Photogrammetry and 3D Gaussian Splatting for Cultural Heritage. Pro Cons and Main Differences

Xinchen Li, Alessio Martino, Filiberto Chiabrando, Xiang Li

Department of Architecture and Design (DAD), Politecnico di Torino, Italy

This paper presents a comparative analysis of traditional photogrammetric methods and 3D Gaussian Splatting (3DGS) technology in the digitisation of Cultural Heritage (CH). Two representative datasets, differing in scale and image acquisition conditions, were selected to systematically evaluate the performance of both methods in terms of visual quality, geometric accuracy, computational efficiency and stability. The results indicate that 3DGS significantly outperforms traditional photogrammetry methods in terms of rendering quality and real-time visualisation capabilities, generating more realistic and immersive visual effects. However, its geometric accuracy is generally slightly lower than that of traditional methods, a difference that is particularly pronounced in small-scale datasets or under low-resolution input conditions. Among the various implementations, Postshot and LichtFeld Studio demonstrated higher stability and robustness, whilst the original GraphDeco implementation exhibited greater sensitivity to data scale and parameter settings. Photogrammetry offers reliability in high-precision geometric reconstruction, whilst 3DGS demonstrates significant potential for complementing this with a high-fidelity visual experience. The findings aim to provide practical guidance for selecting 3D reconstruction methods across different cultural heritage application scenarios.

Prediction of Understorey Vegetation using Remote Sensing in Fennoscandian Forests

Ritwika Mukhopadhyay, Ruben Valbuena, Inka Bohlin

Dept. of Forest Resource Management, Swedish University of Agricultural Sciences (SLU), 90183 Umeå, Sweden

Understorey vegetation (USV) contributes to forest structure, nutrient cycling, species diversity, habitat functions, and disturbance processes in Fennoscandian forests. It also provides non‑wood forest products such as wild berries. Mapping USV is important for understanding ecosystem functioning and its links to overstorey conditions. Although remote sensing (RS) enables large‑scale forest monitoring, its use for USV mapping remains limited because the layer is often obscured by upper‑canopy foliage. This study assesses the accuracy of USV cover prediction (i.e., the ground area covered by USV) using multiple RS data sources, identifies key predictors, and evaluates how canopy cover influences model performance. Field data were collected in 2024 from 487 plots in the Krycklan catchment. Sentinel‑2 summer and autumn imagery provided spectral reflectance, spectral indices, and grey‑level co‑occurrence matrix (GLCM) texture variables. Additional texture variables were derived from canopy height models (CHMs) generated using airborne laser scanning (ALS; 1–2 points/m²) and Pléiades tri‑stereo image matching (0.5 m; 1.5 points/m²). Beta regression and random forest regression (RFR) models were trained on 70% of plots and validated on 30%. Important predictors included seasonal red‑edge differences, greenness‑based indices, CHM texture variables, and ALS‑based canopy cover. Model performance indicated that obstruction by the overstorey canopy remains a limitation for USV cover prediction. Across all plots, beta regression with Sentinel‑2 data performed slightly better (RMSE = 21.7 m², variance explained = 5%) than RFR. However, the best results occurred in low‑canopy plots (≤40%) using RFR with Sentinel‑2 and Pléiades‑derived CHM texture variables (RMSE = 14.6 m², variance explained = 32%).

Sequence-based decoupling Encoder for Well Log Interpretation

Ning Qian, Yiming Xu, Monica Sester

Institute of Cartography and Geoinformatics, Leibniz University Hannover, Germany

Well logging curves play a crucial role in oil and gas exploration and geological engineering, as they provide essential information about subsurface formations and reservoir properties. In recent years, with the growing adoption of deep learning techniques in geoscientific data analysis, well logging data have increasingly been modeled as depth-dependent sequences, enabling the application of sequential neural networks for their analysis. Among these approaches, attention mechanisms have been adopted in log interpretation tasks due to their ability to capture long-range dependencies within sequences. However, directly applying attention mechanisms without considering the intrinsic structure of logging data may introduce model redundancy and increase learning complexity, which can ultimately degrade predictive performance. To address this issue, this study proposes a Sequence-based Decoupling Encoder (SDE). The proposed encoder explicitly disentangles the interactions between logging curves and across depth, enabling the model to learn relationships along different dimensions separately, which allows more effective feature extraction and mapping into a latent space. The decoupling strategy also reduces the learning complexity of the attention mechanism and provides clearer learning objectives for the model. The proposed method is evaluated on the public dataset FORCE2020 and applied to two common well log interpretation tasks: missing log reconstruction and lithology prediction. We compare SDE against several representative sequential baselines. Experimental results demonstrate that SDE achieves superior predictive performance in both tasks.

Exploring the Potential of the Mandeye Handheld LiDAR System for Ecosystem Characterization

Cosme Hernanz-Gilbert¹, Carlos Cabo³, Álvaro Moreno-Martínez², Mónica Herrero-Huerta¹

¹Desertification Research Centre (CIDE) - CSIC, Spain; ²Image Processing Laboratory (IPL), Universitat de Valencia, Paterna, Valencia, Spain; ³Department of Mining Exploitation, University of Oviedo, Spain

Handheld LiDAR systems are emerging as a promising alternative to traditional terrestrial and airborne laser scanning for environmental research, yet their performance and applicability remain insufficiently explored. The Mandeye LiDAR device, developed between 2022 and 2024, stands out for its lightweight design, portability, integrability with other sensing platforms, and notably low cost. These characteristics make it especially attractive for ecological monitoring, enabling high-resolution structural data collection even in projects with limited resources. Despite this potential, very few studies have evaluated the device’s performance or its capacity to support ecosystem characterization.

This research presents a comprehensive review and experimental assessment of the Mandeye LiDAR system to determine its suitability for environmental applications. Field data are being collected in Mediterranean forest and riparian environments using three acquisition modes, on foot, bicycle, and kayak, to test how platform mobility and scanning geometry influence point cloud quality. The study evaluates point density, coverage, structural accuracy, and noise sensitivity while integrating ground-truth measurements and independent LiDAR references.

Preliminary findings show that the Mandeye performs robustly across diverse environments, with kayak-based acquisitions offering particularly detailed representations of the vegetation-water interface. Walking and cycling configurations provide efficient alternatives for forest structure assessment. Overall, the results demonstrate the value of handheld LiDAR as a flexible, accessible complement to conventional remote sensing methods. The project also aims to establish methodological guidelines for Mandeye deployment, contributing to the broader adoption and standardization of low-cost LiDAR tools in ecosystem monitoring.

VISTA-GS: MVS-Guided Virtual View Augmentation for Sparse-View 3D Gaussian Splatting

Hongsheng Huang¹, Yaxin Li¹,⁴, Shengjun Tang², Siqi Du³, Mahmoud Mostafa¹, Mahmoud Adham¹, Wu Chen¹

¹Department of Land Surveying and Geo-Informatics, The Hong Kong Polytechnic University, Hong Kong, P.R. China; ²Research Institute for Smart Cities, School of Architecture and Urban Planning, Shenzhen University, Shenzhen, P.R. China; ³College of Urban and Environmental Sciences, Peking University, Beijing, P.R. China; ⁴Micro Dimension Technology Limited, Hong Kong, P.R. China

3D Gaussian Splatting (3DGS) has achieved remarkable success in novel view synthesis with dense input views. However, its performance deteriorates rapidly in sparse-view scenarios, particularly for viewpoints distant from training cameras. This degradation stems from two fundamental limitations: sparse initial point clouds from limited input views and insufficient viewing angle constraints for robust optimization.

To address these challenges, we propose VISTA-GS (Virtual Image Synthesis and Training Augmentation), a novel framework that leverages Multi-View Stereo (MVS) reconstruction for point cloud densification and generates virtual training views through alpha-blending rendering of MVS-reconstructed dense colored point clouds. Unlike existing approaches relying on generative models or learned priors, our method exploits the geometric consistency inherent in MVS point clouds to create physically-grounded virtual views. By rendering dense point clouds from strategically positioned virtual camera viewpoints, we generate additional training images that preserve accurate geometric relationships while providing crucial angular constraints, effectively regularizing 3DGS training without synthesis-induced artifacts.

Our main contributions are twofold. First, we address sparse SfM initialization by employing MVS for dense point cloud generation with adaptive depth-weighted ellipsoid scaling. Second, we introduce a rendering-based virtual view generation strategy that creates geometrically consistent training images around original viewpoints using the same alpha blending principle as 3DGS. This approach enables robust reconstruction from minimal input views (3-12 images), substantially improving novel view synthesis performance while maintaining geometric fidelity that generative approaches often compromise.
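The alpha blending principle referred to in this abstract is, in its standard form, front-to-back compositing: each sample contributes its colour weighted by its opacity and by the transmittance left over from everything in front of it. The sketch below shows this for a single pixel; it is a generic illustration rather than the authors' renderer, and the function name, the sample layout and the early-termination threshold are assumptions.

```python
def composite(samples, min_transmittance=1e-4):
    """Front-to-back alpha compositing of samples along one pixel ray.

    samples: iterable of (depth, (r, g, b), alpha) in any order
    Returns the blended colour and the accumulated opacity.
    """
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light not yet absorbed
    for _, rgb, alpha in sorted(samples, key=lambda s: s[0]):
        weight = transmittance * alpha
        for i in range(3):
            color[i] += weight * rgb[i]
        transmittance *= 1.0 - alpha
        if transmittance < min_transmittance:
            break  # everything behind is effectively hidden
    return tuple(color), 1.0 - transmittance
```

A half-transparent red sample in front of an opaque blue one therefore yields an even mix of the two colours with full accumulated opacity.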

An Approach to 3D Digitisation and Segmentation of the Interior and Exterior of a complex Museum Object

Simon Albers, Thomas Luhmann, Till Sieberth

Institute for Applied Photogrammetry and Geoinformatics, Jade University of Applied Sciences, Oldenburg, Germany

The digitisation of cultural heritage objects is an important procedure to conserve, share and analyse artefacts from the past. Nowadays, it is common practice to digitise artefacts using DSLR cameras and Structure from Motion. For most objects, this is a suitable procedure, but in some cases, objects have narrow interiors which cannot be reached with common camera equipment. Our case study is a small kayak model (~ 1 x 0.1 x 0.15 m) from the 19th century with an interior that can only be documented through small openings (0.1 m radius). We developed a method using a modified webcam to safely digitise the interior of the kayak. By comparing three datasets of a test object, we describe advantages and disadvantages of the usage of integrated autofocus and colour balance of the webcam. Furthermore, we extended our approach for segmentation of 3D models to consider the interior and prepare the models for future analysis. There were no major differences between the models of the three datasets, and all of them could reduce the data gaps in the 3D model based on the DSLR images noticeably.

Three-dimensional Reconstruction and Crack Measurement of Cultural Monuments using UAV-based Photogrammetry

Wei-Che Huang¹, Wen-Cheng Liu¹, Yi-Shan Luo¹, Po-Yu Chen², Kuei-Luo Lin³

¹National United University, Taiwan; ²Shin-Mag Industrial Co., Ltd., Taiwan; ³Fullai Construction Co., Ltd., Taiwan

Three-dimensional (3D) modeling is indispensable for the documentation, preservation, and management of cultural heritage. To this end, a low-cost unmanned aerial vehicle (UAV) combined with the Structure from Motion (SfM) photogrammetric technique was utilized to build a 3D model and conduct surface crack measurements of cultural monuments. The results showed that, under simple conditions, non-specialists can easily generate accurate 3D models from UAV-acquired imagery. In this study, the statistical errors of checkpoints between 3D reconstruction and field measurements, expressed as total RMSE, ranged from 0.103 m to 0.848 m. The mean absolute errors of surface crack measurements between tape-based methods and 3D reconstruction were smaller, ranging from 0.002 m to 0.099 m. Furthermore, UAV-SfM was applied to measure surface crack lengths on an inaccessible cultural monument. The findings demonstrated that employing the UAV-SfM photogrammetric technique for 3D reconstruction of cultural monuments is both feasible and reliable.
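The two error measures quoted in this abstract can be computed as in the sketch below. The abstract does not define "total RMSE"; the code assumes it is the root mean square of the 3D distances between model and field checkpoints, and the function names are hypothetical.

```python
import math


def total_rmse(model_points, field_points):
    """RMSE of 3D checkpoint residuals (assumed definition of total RMSE)."""
    squared = [math.dist(m, f) ** 2 for m, f in zip(model_points, field_points)]
    return math.sqrt(sum(squared) / len(squared))


def mean_absolute_error(model_lengths, tape_lengths):
    """Mean absolute difference between modelled and taped crack lengths."""
    diffs = [abs(m - t) for m, t in zip(model_lengths, tape_lengths)]
    return sum(diffs) / len(diffs)
```

Because the RMSE squares each residual, a single poorly reconstructed checkpoint raises it more strongly than it would raise a mean absolute error.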

Towards transparent geohazard model: XAI for ground deformation susceptibility in Rhenish Coalfields, Germany

Dibakar Kamalini Ritushree¹,², Marzieh Baes¹, Mahdi Motagh¹,²

¹GFZ Helmholtz Centre for Geosciences, Germany; ²LUH Leibniz Universität Hannover, Germany

Satellite remote sensing has become a vital tool for monitoring environmental change and supporting disaster management, offering consistent and wide-area observations of the Earth’s surface. Combined with the rapid growth of Earth observation data, machine learning (ML) enables the detection of complex spatial patterns and improves the prediction of geohazards. One significant hazard is ground deformation caused by coal mining, which threatens infrastructure, ecosystems and local communities. This study presents an interpretable ML framework that integrates multi-source geospatial datasets with eXplainable Artificial Intelligence (XAI) techniques to map deformation susceptibility in open-pit coal mining regions. Beyond achieving high predictive performance, the approach reveals the key factors controlling ground instability, including proximity to mining operations and faults, groundwater variation and topographic conditions. The results support enhanced monitoring strategies for reducing disaster risks in mining-affected areas.

Comparative Accuracy Assessment of two Low-Cost Devices for Underwater Structure-from-Motion 3D Reconstruction

Lukas Quirin, Gunnar Lelle-Neumann, Ferdinand Maiwald

Chair of Optical 3D-Metrology, TUD Dresden University of Technology, Germany

Accurate three-dimensional (3D) documentation of underwater environments is essential for evaluating the structural integrity of submerged infrastructure such as dams, pipelines or offshore platforms, as well as for repair operations or monitoring sites affected by potential pollution hazards including underwater chemical or ammunition residues. Automatic 3D surveying plays a key role in fulfilling these tasks remotely with a spectrum of uncrewed systems, such as remotely operated (underwater) vehicles (ROV), autonomous underwater vehicles (AUV) or robots. Conventional underwater surveying methods, including high-resolution imaging sonars and laser-based techniques, often require expensive instrumentation. Advances in photogrammetry and Structure-from-Motion (SfM) techniques enable detailed 3D reconstructions from standard imagery. This study presents a comparative accuracy assessment of two imaging devices for underwater SfM-based 3D reconstruction, giving practical workflow recommendations for low-budget underwater inspection and survey tasks.

UAV Photogrammetry and Laser Pointer Targeting for High-Precision Mapping of Inaccessible Surfaces

Dobromir Filipov¹, Stefan Vlaykov²

¹UACG, Faculty of Geodesy, Sofia; ²ESO PROEKT EOOD, Sofia

Accurate georeferencing is a fundamental requirement in UAV-based photogrammetry, directly influencing the spatial precision, reliability, and analytical value of the derived 3D models. However, achieving high accuracy in areas such as rockslides or steep geological formations presents considerable challenges, primarily due to the difficulty or danger associated with placing conventional Ground Control Points (GCPs) on-site. This study introduces a novel hybrid methodology that leverages laser pointer indication and total station surveying to establish high-precision reference points that can be safely and effectively integrated into UAV photogrammetric workflows. The proposed approach aims to improve the absolute and relative accuracy of photogrammetric models without the need for physical GCP placement in inaccessible or hazardous areas.

A mixed reality generator for real-world environments in real-time

Devrim Akca¹, Çağın Torkut², Gerhard Kemper³, Armin Grün⁴

¹Faculty of Engineering and Natural Sciences, Işık Üniversitesi; ²RedHorizon Technology, Inc.; ³GGs GmbH; ⁴4DiXplorer AG

By integrating computer vision, photogrammetry, UAV technology, and Extended Reality (XR) solutions, the presented innovative Mixed-Reality (MR) photogrammetry system enables real-time 3D visualization, interaction and measurement of real-world environments. By eliminating the need for physical presence, the system enhances safety, efficiency and accuracy in tasks like assessing structural integrity, tracking construction progress, and observing environmental changes over time. At the core of the system is a UAV equipped with a stereo camera rig and onboard processing capabilities. Operated on-site by an operator, the UAV captures high-resolution stereo imagery, which is processed in real time through a centralized REST API running on cloud infrastructure. Experts located anywhere in the world connect to the system using VR headsets or a web-based application, gaining immersive access to a 3D stereoscopic view with full photogrammetric measurement functionality. The system supports multi-user collaboration, enabling synchronized analysis and data sharing across different locations. This seamless integration of hardware and software components represents a significant advancement in real-time stereoscopic visualization.

CityZen: LOD2 building reconstruction with point cloud-free model-driven approach

Mehmet Büyükdemircioğlu¹, Ibrahim Sall¹,², Simone Rigon¹, Fabio Remondino¹

¹3D Optical Metrology (3DOM) Unit, Bruno Kessler Foundation (FBK), Trento, Italy; ²Ecole Nationale des Sciences Geographiques (ENSG), Institut National de l’Information Geographique et Forestiere (IGN), France

Accurate building footprints and 3D models are nowadays essential for a wide range of urban applications, yet the generation of Level of Detail 2 (LOD2) models remains constrained by the availability of dense 3D data such as LiDAR or image matching products. While these sources provide high geometric accuracy, they are costly to acquire and update, creating a gap between data availability and the increasing demand for city-scale 3D modelling. Recent advances in deep learning enable monocular height estimation from aerial imagery, offering a potential alternative to traditional 3D data sources. However, integrated workflows that combine image-based inference with structured 3D reconstruction are still limited. This paper presents CityZen, a point cloud-free workflow for LOD2 building reconstruction from only RGB orthophotos. The proposed approach integrates monocular height estimation (evaluating DSMNet, HTC-DC-Net and TSE-Net), roof type classification and model-driven reconstruction within a unified pipeline. Building footprints are used as geometric constraints, while learned height and semantic cues guide the generation of consistent 3D structures. The proposed framework enables scalable and practical LOD2 city modelling using widely available aerial orthophotos, reducing dependency on costly 3D data acquisition.

Fast acquisition for modelling heritage-related complex scenes based on TLS and spherical photogrammetry

Antonio Tomás Mozas-Calvache, José Luis Pérez-García, José Miguel Gómez-López, Diego Vico-García

University of Jaén, Spain

Documenting complex heritage sites, such as the QH36 Egyptian rock-cut tomb and La Lobera cave (Iberian sanctuary), often faces severe time and logistical constraints (e.g., concurrent activity, limited access). This necessitates a methodology that ensures fast data acquisition while maintaining high geometric and radiometric quality.

This study proposes a data fusion methodology combining Terrestrial Laser Scanning (TLS) and Spherical Photogrammetry (SP). TLS is prioritized for rapid, high-accuracy geometry acquisition, while SP, using a pre-calibrated 360-degree multi-camera, is utilized primarily for detailed texture mapping and supporting geometry in occluded areas.

A key element of this approach is leveraging the TLS point cloud to extract Ground Control Points (GCPs) and Checkpoints (CPs) directly, significantly reducing the need for time-consuming total station surveying and greatly improving field work efficiency.

Results demonstrate that the methodology achieves the core objective:

• Speed: Static capture time is reduced to approximately 5 minutes per station (TLS), less in the case of static spherical photographs, and even less using SP with video.

• Accuracy: Geometric registration errors given by TLS are less than 0.5 cm.

• Efficiency: Texture acquisition is improved at least 6-fold compared to conventional photogrammetry.

This validated approach offers a viable, efficient, and reliable solution for the high-quality 3D documentation of geometrically complex and time-constrained cultural heritage scenes.

Large-Field Binocular Vision Attitude Determination Method for Rocket Recovery

Yuqi Zhang, Xianglei Liu, Runjie Wang, Haibo Shi, Zhao Lu, Haiqian Wu

Beijing University of Civil Engineering and Architecture, China, People's Republic of

High-precision attitude measurement in rocket recovery is critical for reusable launch vehicles (RLVs) and aerospace sustainability, but existing technologies have key flaws. Inertial Measurement Units (IMUs) accumulate drift, misaligning control commands with actual states; high-precision gyroscopes are costly and hard to integrate; Visual-Inertial Fusion (VINS) is light-sensitive, failing in dynamic re-entry—all risking recovery failure.

To address this, a large-field binocular vision method is proposed via four stages. First, camera calibration uses Zhang’s method for intrinsic parameters (left/right reprojection errors: 0.056/0.066 px) and control-point stitching for extrinsics, solving the large-field coverage issue and achieving 33.42 mm 3D positioning error. Next, image preprocessing applies bilateral filtering for denoising, Roberts operator for edge extraction, morphological closing for contour continuity, and multi-threshold Canny fusion to suppress spurious edges, ensuring stable input. Then, total least squares fits the midline, and left/right camera plane intersection extracts the rocket’s spatial central axis, avoiding noise from point-by-point triangulation. Finally, phase correlation resolves roll ambiguity from cylindrical symmetry, and the spatial axis calculates pitch/yaw to build a Z-Y-X Tait-Bryan angle matrix for attitude determination.

Experiments on a 1:20 scale model (1 m long, 0.3 m diameter) used µs-synced high-speed cameras (6 m height, 3 m baseline). Results show roll/pitch/yaw RMSEs of 1.58°/1.54°/1.41°, with 93% mean absolute errors ≤±2°—outperforming ORB+PnP (2.11° roll RMSE), SGBM (2.50°), and Chamfer (3.00°). Ablation experiments confirm key modules’ necessity—removing line support score filtering raises roll RMSE to 1.85°—verifying robustness in dynamic re-entry.
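
The axis-extraction and attitude steps described above can be illustrated with a minimal sketch. It assumes that each camera contributes one plane (through its projection centre and the fitted image midline) with a known normal in a common frame, and that the body x-axis is the rocket's central axis. The function names are hypothetical and the code is not taken from the paper; roll has to come from a separate cue, such as the phase correlation mentioned in the abstract.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    """Scale a 3-vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def axis_from_planes(n_left, n_right):
    """Direction of the line in which two planes intersect.

    The line lies in both planes, so it is perpendicular to both normals.
    The sign of the result is arbitrary and must be fixed by other means.
    """
    return normalize(cross(n_left, n_right))

def yaw_pitch_from_axis(d):
    """Yaw and pitch (radians) that map the body x-axis onto unit vector d
    under the Z-Y-X convention R = Rz(yaw) @ Ry(pitch) @ Rx(roll)."""
    x, y, z = d
    yaw = math.atan2(y, x)
    pitch = -math.asin(max(-1.0, min(1.0, z)))  # first column of R ends in -sin(pitch)
    return yaw, pitch

def rot_zyx(yaw, pitch, roll):
    """Z-Y-X Tait-Bryan rotation matrix as a nested list."""
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cr, sr = math.cos(roll), math.sin(roll)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]
```

Calling `yaw_pitch_from_axis(axis_from_planes(n_left, n_right))` and passing the result to `rot_zyx` together with an independently estimated roll yields the full attitude matrix under these assumptions.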

Low-cost stereo vision and deep learning for river water level measurement

Pedro Zamboni¹, Robert Krüger¹, László Bertalan², Xabier Blanch³, Paul Hindorf¹, Anette Eltner¹

¹Dresden University of Technology, Germany; ²University of Debrecen, Hungary; ³Universitat Politécnica de Catalunya, Spain

This study presents a low-cost, non-contact stereo vision system for automated river water level monitoring, addressing the growing need for dense and scalable hydrological observation networks under increasing climate-driven flood risks. The proposed system uses paired consumer-grade cameras combined with deep learning–based image segmentation to estimate water levels without requiring physical reference markers or pre-existing 3D models.

Two processing strategies are evaluated: a standard stereo workflow and an enhanced approach incorporating semantic masking to exclude dynamic regions such as water and sky. Camera pose estimation is assessed using both global and epoch-based optimization methods. Results show that unmasked configurations provide more stable and robust camera pose estimates, while masking improves geometric accuracy but introduces temporal instability.

Water level estimates derived from stereo reconstruction demonstrate strong agreement with reference gauge data, achieving correlation coefficients between 0.70 and 0.77. Both approaches successfully capture overall hydrological trends, including flood dynamics, although accuracy decreases under high water levels and challenging imaging conditions. Masking introduces a systematic offset in absolute values but does not significantly improve correlation performance.

Research on Cloud Control photogrammetry based on Time-series Archived Aerial Photos and Its Application in Urban Governance in Beijing

Xiaokun Zhu¹, Yingchun Tao¹, Huimin Tian¹, Mingce Xu², Yutao Guo¹

¹Beijing Institute of Surveying and Mapping, China, People's Republic of; ²Beijing SmartSpatio Technology, China, People's Republic of

This study applies cloud control photogrammetry to time-series archived aerial photos to support urban governance in Beijing. Addressing challenges such as missing ground control points, heterogeneous coordinate references, and non-digitized aerial triangulation results, the proposed method leverages existing basic geographic products (e.g., DOM, DEM) as dense control sources, enabling automated aerial triangulation and 3D reconstruction without field control points. The workflow includes control source selection and organization, image preprocessing, cloud control point and tie point matching, block adjustment, and time-series product generation. Three experimental applications are presented: (1) reconstruction of river course changes in the Beijing Municipal Administrative Center using KH satellite images (1961–1974) and 1996 DOM, yielding time-series DOM products meeting 1:50,000 scale accuracy; (2) detection of illegal self-built building additions via DSM differencing from ADS80 images (2016–2017), identifying one-to-three-story structures; (3) 3D real-scene modeling of the Grand Canal’s Tonghui River section from 1975 film photos and 2015 control data, revealing 40 years of urban transformation. Results demonstrate that cloud control photogrammetry ensures spatiotemporal consistency and enables quantifiable, multi-temporal 3D analysis for urban change detection, illegal construction monitoring, and cultural heritage preservation.

UrbanVGGT: Scalable Sidewalk Width Estimation from Street View Images

Kaizhen Tan¹, Fan Zhang²

¹Heinz College of Information Systems and Public Policy, Carnegie Mellon University, United States of America; ²Institute of Remote Sensing and Geographical Information System, Peking University, China

Sidewalk width is an important indicator of pedestrian accessibility, comfort, and network quality, yet large-scale width data remain scarce in most cities. Existing approaches typically rely on costly field surveys, high-resolution overhead imagery, or simplified geometric assumptions that limit scalability or introduce systematic error. To address this gap, we present UrbanVGGT, a measurement pipeline for estimating metric sidewalk width from a single street-view image. The method combines semantic segmentation, feed-forward 3D reconstruction, adaptive ground-plane fitting, camera-height-based scale calibration, and directional width measurement on the recovered plane. On a ground-truth benchmark from Washington, D.C., UrbanVGGT achieves a mean absolute error of 0.252 m, with 95.5% of estimates within 0.50 m of the reference width. Ablation experiments show that metric scale calibration is the most critical component, and controlled comparisons with alternative geometry backbones support the effectiveness of the overall design. As a feasibility demonstration, we further apply the pipeline to three cities and generate SV-SideWidth, a prototype sidewalk-width dataset covering 527 OpenStreetMap street segments. The results indicate that street-view imagery can support scalable generation of candidate sidewalk-width attributes, while broader cross-city validation and local ground-truth auditing remain necessary before deployment as authoritative planning data.
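
As an illustration of camera-height-based scale calibration, the sketch below assumes that the reconstruction is only defined up to scale, that the camera sits at the origin of the reconstruction frame, and that the fitted ground plane is given as `normal . x + offset = 0`. The function names and the camera height used in the usage note are hypothetical and not taken from the paper.

```python
import math

def plane_distance_from_origin(normal, offset):
    """Distance from the camera (at the origin) to the plane normal . x + offset = 0."""
    return abs(offset) / math.sqrt(sum(c * c for c in normal))

def metric_scale(normal, offset, camera_height_m):
    """Factor that converts reconstruction units to metres, given a known camera height."""
    return camera_height_m / plane_distance_from_origin(normal, offset)

def sidewalk_width_m(p_left, p_right, scale):
    """Metric distance between two sidewalk edge points lying on the ground plane."""
    return scale * math.dist(p_left, p_right)
```

For example, with an assumed camera height of 2.5 m and a plane that lies 1.25 reconstruction units below the camera, `metric_scale` returns a factor of 2, so every distance measured on the recovered plane is doubled to obtain metres.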

Pompeii. From the measurement of small indentations to the calculation of the terminal ballista.

Monil Mihirbhai Thakkar¹, Amir Ardeshiri Lordejani¹, Mario Guagliano¹, Silvia Bertacchi², Sara Gonizzi Barsanti², Adriana Rossi²

¹Department of Mechanical Engineering, Politecnico di Milano, via la Masa 1, 20156, Milan, Italy; ²Department of Engineering, Università degli Studi della Campania Luigi Vanvitelli, Via Roma 29, 81031, Aversa (CE),Italy

During Sulla’s siege of Pompeii in 89 BC, Roman artillery projectiles struck the city’s fortified walls, leaving visible impact craters. The subsequent eruption in AD 79 buried the site, preserving both its architectural layout and the damaged wall surfaces, which were later excavated in the early 20th century. By analysing the visible damage found on the fortified walls of Pompeii, reverse engineering techniques were used to decipher the engineering principles behind Roman military technology. This study simulates the impact of metal projectiles on grey tuff to estimate the impact velocities and the energy required to cause the observed damage, providing insights into the destructive capabilities of Roman weapons. It develops material models and applies finite element analysis, including mesh convergence, velocity calibration, and angular impact studies for both ballista stones and darts to better understand impact mechanics and crater formation.

metal darts on the city walls, along with the simulation of forces and trajectories. Among the objectives is to verify the calculated data against experimental relationships developed in antiquity and applied to the detection of small pyramidal indentations.

BEV-LOC: Real-Time and Lightweight Cross-View Localization via Online BEV Mapping

Jiyong Kwag, Charles Toth, Alper Yilmaz

Ohio State University, United States of America

This abstract presents a framework that combines deep learning and classical computer vision for cross-view geolocalization using 360-degree multi-perspective view (PV) images and an offline global map. Recent studies on cross-view geolocalization typically rely on deep learning models to localize panoramic PV images by matching them with reference satellite imagery. However, such approaches face practical limitations in real-world deployments, due to their dependence on large-scale GPU resources and the need to store extensive satellite image datasets. To address these challenges, we propose BEV-LOC, a lightweight and real-time cross-view geolocalization method. BEV-LOC employs a Bird’s Eye View (BEV) encoder that learns to transform 360-degree multi-PV images into a local high-definition (HD) BEV map. The localization is then performed using Intersection over Union (IoU)-based template matching with an offline global map. Our architecture achieves real-time performance at 30 FPS without the need for high-end GPU hardware and delivers a high positioning accuracy of 1.2 meters.
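
IoU-based template matching can be sketched as follows, assuming both the local BEV map and the global map are binary occupancy grids at the same resolution and orientation. The sketch is a brute-force search over translations only; the function names are hypothetical and the code is not the authors' implementation.

```python
def iou(a, b):
    """Intersection over Union of two equally long binary sequences."""
    inter = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    return inter / union if union else 0.0

def match_template_iou(global_map, local_map):
    """Slide local_map over global_map and return (best_iou, row, col).

    row and col are the top-left offsets of the best-scoring window.
    Ties are resolved in favour of the first window found in scan order.
    """
    H, W = len(global_map), len(global_map[0])
    h, w = len(local_map), len(local_map[0])
    flat_local = [v for row in local_map for v in row]
    best = (-1.0, 0, 0)
    for r in range(H - h + 1):
        for c in range(W - w + 1):
            window = [global_map[r + i][c + j] for i in range(h) for j in range(w)]
            score = iou(window, flat_local)
            if score > best[0]:
                best = (score, r, c)
    return best
```

A practical system would also have to search over heading and would use a vectorised or coarse-to-fine search; this version only shows the scoring principle.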

Remote Pipe Diameter Measurement from a single Image using Laser Scale Projection with a Depth Compensation Model

Alice Bilbáo¹, Leonardo Galvão¹, João Andrade¹, Daniel Regner¹, Moacir Wendhausen¹, Gierri Waltrich¹, Carla Marinho², Tiago Pinto¹

¹Federal University of Santa Catarina, Brazil; ²CENPES/Petrobras, Brazil

Monitoring the geometric integrity of risers and pipelines is critical in offshore oil and gas operations, where swelling, collapse or torsion often manifest as diametral changes that must be detected safely and efficiently. Historically, this kind of inspection has been performed by industrial climbing, a time-consuming, dangerous and costly operation. Increasing efforts are directed at remote riser inspection using drones, primarily aimed at qualitative assessment through visual analysis, as well as at photogrammetry, which offers accurate inspection but requires many images, image acquisition network design and well-trained drone pilots. To overcome the limitations of a qualitative image inspection and the complexity of photogrammetry, we propose a simple, low-cost method to estimate the pipe diameter from a single image by projecting two laser points of known spacing, building a scale directly in the scene and correcting depth differences between the laser projection plane and the pipe silhouette plane. This work evaluates the proposed method in laboratory conditions for nominal and calibrated focal lengths, distances from 2 m to 10 m and four pipe diameters, demonstrating the improvement of remote pipe diameter measurement by modelling and compensating for this depth difference. The improvement becomes more evident for longer focal lengths, shorter distances, and larger pipe diameters. It has an important effect in minimizing errors, e.g., from 3.5% to less than 0.2% at a 2 m distance for a 165 mm diameter pipe. The next steps include the construction of a lightweight projector to be integrated into a drone camera gimbal.
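
The depth-compensation idea can be illustrated with a simple pinhole sketch. This is not the authors' model. It assumes a camera looking perpendicular to the pipe axis, two parallel laser beams whose known spacing lies along the pipe and which hit the front of the pipe, and a silhouette formed by rays tangent to the cylinder. All function and parameter names are hypothetical.

```python
import math

def simulate_pixels(f_px, surface_dist, radius, spacing):
    """Forward model: laser dot spacing and silhouette width, both in pixels.

    surface_dist is the distance from the camera to the front of the pipe,
    where the laser dots land; the pipe axis is one radius further away.
    """
    axis_dist = surface_dist + radius
    laser_px = f_px * spacing / surface_dist
    # Tangent rays: half-width = f * tan(theta) with sin(theta) = radius / axis_dist
    width_px = 2.0 * f_px * radius / math.sqrt(axis_dist ** 2 - radius ** 2)
    return laser_px, width_px

def naive_diameter(laser_px, width_px, spacing):
    """Scale the silhouette with the laser spacing, ignoring the depth difference."""
    return width_px * spacing / laser_px

def compensated_diameter(laser_px, width_px, spacing, f_px):
    """Invert the forward model, using the focal length to recover the laser depth."""
    z = f_px * spacing / laser_px          # distance to the laser dots
    dn = naive_diameter(laser_px, width_px, spacing)
    # From dn = 2 r sqrt(z / (z + 2 r)): solve 4 z r^2 - 2 dn^2 r - dn^2 z = 0 for r > 0
    radius = (dn * dn + dn * math.sqrt(dn * dn + 4.0 * z * z)) / (4.0 * z)
    return 2.0 * radius
```

Under these assumptions the uncorrected estimate is biased low, and the bias grows as the pipe gets larger relative to the distance, which is qualitatively consistent with the trend reported in the abstract.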

Evaluating the synergy of hand-crafted and AI-driven feature matching in structure-from-motion 3D reconstruction

Min-Lung Cheng, Yasutaka Kuramoto

SkymatiX Inc., Japan

This study evaluates the effects of hand-crafted and AI-driven feature extraction and matching approaches on 3D scene reconstruction. While hand-crafted methods remain widely adopted in structure-from-motion (SfM), their performance often deteriorates when repetitive or uniform textures occur across multiple images, leading to alignment failures and incomplete reconstructions due to insufficient or erroneous feature correspondences. Recent advances in artificial intelligence have introduced robust pipelines capable of addressing these challenges by improving feature detection and matching in texture-repetitive imagery. In this study, hand-crafted and AI-driven feature extraction and matching techniques are integrated and assessed on challenging datasets to examine their performance in SfM-based 3D reconstruction. Experimental results demonstrate that combining hand-crafted feature points with AI-driven matching significantly enhances the robustness and reconstruction success rate across diverse challenging scenarios. This hybrid approach offers a promising alternative for reliable SfM 3D reconstruction when dealing with images dominated by repetitive or uniform textures.

The Emerging Role of Vision-Language Models in the Automation of Railway Asset Management: A Review and Future Perspective

Ashley Varghese, Mohammadjavad Ghorbanalivakili, Gunho Sohn

York University, Canada

Automated railway inspection is critical for safety, but current deep learning models are limited by a "closed-world" assumption, failing to identify novel or rare assets without costly retraining. This review explores a transformative solution: Vision-Language Models (VLMs). We introduce the concept of "reasoning-powered detection," where a model’s linguistic intelligence is used to guide the identification process.

Multi-Modal LoD2 Building Reconstruction Benchmark for Urban Modeling

Mohammad Moein Sheikholeslami¹, Youssef Korny¹, Andreas Wichmann², Ksenia Bittner³, Gunho Sohn¹

¹York University, Canada; ²Jade University of Applied Sciences, Germany; ³German Aerospace Center (DLR), Weßling, Germany

Accurate 3D building modeling at level of detail 2 (LoD2) is fundamental for urban analysis, supporting applications such as realistic city simulations, energy assessment, and infrastructure planning. While cadastral data is often freely accessible in many developed countries, existing publicly available 3D building benchmarks are typically limited either in scale or in the diversity of input modalities required for developing and evaluating modern deep learning methods.

We present a new large-scale, open, instance-wise dataset for LoD2 building modeling from aerial imagery and LiDAR. Through rigorous processing and validation, it bridges the gap between raw open geospatial data and structured research benchmarks. Its modular design supports both single- and multi-modal reconstruction workflows. The upcoming public release aims to enable reproducible research in 3D urban modeling, cross-modal learning, and digital-twin creation, advancing automated, reliable city-scale 3D reconstruction.

GeoRGMAE: Geospatially Guided Masked Autoencoders for Building Segmentation

Tugba Eraslanoglu¹, Guneet Mutreja², Martin Kada¹, Ksenia Bittner²

¹Technical University of Berlin, Germany; ²German Aerospace Center (DLR)

Accurate building segmentation from high-resolution aerial imagery is essential for various urban applications such as digital twins, geographic information systems, and flood risk modelling. However, conventional supervised deep learning approaches require large amounts of pixel-level annotations, which are costly and time-consuming to obtain for large remote sensing datasets. To address this limitation, self-supervised learning has recently emerged as an effective paradigm for learning visual representations from unlabeled data. In particular, masked autoencoders (MAE) have demonstrated strong performance by reconstructing masked image patches during pretraining. Nevertheless, conventional MAE frameworks rely on random masking strategies that do not consider the spatial structure and semantic importance of regions in high-resolution remote sensing imagery. In this study, we propose GeoRGMAE, a geospatially guided masked autoencoder for building segmentation. Unlike standard MAE, which relies on random masking, our approach leverages building footprint annotations available in the pretraining dataset to guide the masking process while preserving the original reconstruction objective. We introduce three masking strategies (core, balanced, and density-aware masking) that prioritize semantically relevant building regions under varying urban densities. The core strategy focuses on building interiors, the balanced strategy distributes masking between buildings and background, and the density-aware strategy adapts masking based on scene-level building density. Experiments on the Roof3D and WHU Building datasets demonstrate consistent, though modest, improvements over standard MAE pretraining, with the most effective masking strategy depending on dataset characteristics. These results indicate that incorporating geospatial priors into masked image modelling can improve representation learning for downstream building segmentation tasks.

Deep Learning-based Roof Detection from UAV Dense Point Cloud for Solar Panels Mapping

Aleksandra Sekrecka, Damian Wierzbicki, Kinga Karwowska, Agnieszka Myrcik

Military University of Technology in Warsaw, Poland

Photovoltaic panels are becoming increasingly popular, and finding a suitable location for them quickly and automatically is a current and practical problem. In our experiment, we test whether a point cloud from dense multi-image matching can be useful for the automatic detection of the best locations for installing photovoltaic panels. We propose a methodology for processing and analyzing UAV point clouds, where the use of deep learning in combination with the CANUPO algorithm results in high roof recognition efficiency. Two classes were selected: roofs and non-roof objects. This made it possible to filter the detected roofs and remove erroneous objects. The resulting model detected buildings with an accuracy of approximately 80% and an effectiveness of 100% (there were no false detections). The following factors were taken into account in the insolation calculations: roof angles, roof slope exposure, changes in the angle of sunlight throughout the year, and atmospheric transmittance. The roof angles and exposure were determined using a Digital Surface Model (DSM) generated from multi-image UAV data. In our research, we took into account the average angle of incidence of sunlight throughout the year and at quarterly intervals. The use of DSM for roofs and the SVC algorithm combined with CANUPO made it possible to eliminate false detections and significantly increase the effectiveness of location detection. Research conducted for the entire year and quarters enabled the analysis of changes in roof insolation throughout the year, which is crucial when estimating the profitability of installing photovoltaic panels.
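
The role of roof angle and exposure in such insolation calculations can be illustrated with the standard expression for the angle of incidence of direct sunlight on a tilted plane. The sketch below is a generic textbook relation, not the authors' workflow; it assumes that roof and sun azimuths are measured in the same convention, and the function name is hypothetical.

```python
import math

def cos_incidence(tilt_deg, roof_azimuth_deg, sun_zenith_deg, sun_azimuth_deg):
    """Cosine of the angle between the sun direction and the roof normal.

    cos(theta) = cos(tilt) cos(zenith) + sin(tilt) sin(zenith) cos(sun_az - roof_az)
    Negative values mean the sun is behind the roof plane and are clamped to 0.
    """
    tilt = math.radians(tilt_deg)
    zenith = math.radians(sun_zenith_deg)
    d_az = math.radians(sun_azimuth_deg - roof_azimuth_deg)
    value = (math.cos(tilt) * math.cos(zenith)
             + math.sin(tilt) * math.sin(zenith) * math.cos(d_az))
    return max(0.0, value)
```

Multiplying this factor by the direct irradiance that remains after atmospheric attenuation gives the direct component received by a roof face; averaging it over sun positions for a quarter or a year corresponds to the kind of seasonal comparison described above.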

Comparison of Different Object Detection Methods for Automatic Facade Enrichment of Existing Building Models from Aerial Images

Johannes Otepka¹, Günter Sükar², Martin Kerschner², Gerald Forkert², Norbert Pfeifer¹

¹TU Wien, Austria; ²UVM Systems GmbH, Wien, Austria

This study investigates the enrichment of existing building models using deep learning-based window detection from oblique aerial imagery acquired by a high-end multi-camera sensor system. While many cities maintain building models at Level of Detail 2 (LOD2), higher levels of detail require the integration of facade elements such as windows. Three detection strategies are evaluated using 3D reference building models to assess accuracy and completeness. The test site is located in Vienna and consists of multiple large residential buildings with varying facade characteristics.

The evaluated methods include zero-shot object detection with Grounding DINO combined with Segment Anything Model 2, applied to both oblique images and facade orthophotos, as well as a SAM2-UNeXT network requiring minimal training. Results indicate that zero-shot detection on orthophotos achieves the best performance, with a precision of 0.95 and an F1 score of 0.85. In contrast, the SAM2-UNeXT approach shows lower precision and F1 scores but slightly higher recall.

The investigation shows that detection performance is influenced by facade viewing angles. Steeper viewing angles generally improve detection quality but increase susceptibility to occlusions, particularly in dense urban environments. The article concludes with a detailed outlook on future work, including the extension of the approach to more complex three-dimensional building structures.

Quality Restoration of Point-Cloud-Derived 2D Projections: A Comparative Study of Void-Filling Techniques

Md Rakibul Islam Chowdhury¹, Sang Hyeok Han¹,², Jong Won Ma³

¹Dept of Building, Civil and Environmental Engineering, Concordia University, Montréal, QC, Canada; ²Centre for Innovation in Construction and Infrastructure Engineering and Management (CICIEM), Gina Cody School of Engineering and Computer Science, Concordia University, Montréal, QC, Canada; ³School of Civil and Environmental Engineering, Yonsei University, Seoul, South Korea

Point-cloud-derived 2D projections make it possible to generate an unlimited number of virtual views for indoor scene analysis and dataset creation. However, projecting irregular 3D samples onto a dense image grid commonly produces void pixels due to sparsity, occlusions, and incomplete scan coverage. These projection-induced artifacts degrade the visual fidelity of rendered images and limit their usefulness in downstream image-based workflows. This study investigates void-filling strategies tailored to point-cloud-generated RGB projections and provides a comparative evaluation of three representative approaches: (i) K-nearest neighbor (KNN) interpolation with KD-Tree accelerated neighbor search, (ii) a rule-based neighborhood method (NNRule) that adapts filling behavior using local variability to preserve edges, and (iii) a mask-normalized Gaussian-weighted propagation method that diffuses valid color information into void regions. Experiments were conducted on multi-view perspective projections generated from the Stanford Large-Scale 3D Indoor Spaces Dataset (S3DIS) Area 3, totalling 5,520 images. Restoration quality was assessed using standard pixel-level metrics such as MAE, RMSE, PSNR, and SSIM. Quantitative results show that Gaussian-weighted propagation achieved the best overall performance, followed by NNRule, while KNN performed weakest numerically. Qualitative comparisons further indicate that KNN produces the most visually realistic texture appearance, whereas diffusion-based filling softens fine details. Finally, the study establishes a practical baseline that enables both academic researchers to advance point-cloud-to-image restoration without relying on paired RGB datasets and industrial practitioners to deploy lightweight void-filling pipelines in real-world applications such as digital twins, indoor robotics, facility management, and augmented reality.
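
The idea behind mask-normalized Gaussian-weighted filling can be sketched on a single-channel image. The version below is a one-pass simplification with hypothetical names and default parameters, not the method evaluated in the study: each void pixel receives the Gaussian-weighted mean of the valid pixels in its window, and the weights are normalized over valid pixels only so that voids do not darken the result.

```python
import math

def gaussian_fill(image, valid, sigma=1.0, radius=2):
    """Fill void pixels from valid neighbours using normalized Gaussian weights.

    image: 2D list of floats; valid: 2D list of booleans of the same shape.
    Valid pixels are copied unchanged. A void pixel with no valid neighbour
    inside the window keeps its original value.
    """
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for r in range(h):
        for c in range(w):
            if valid[r][c]:
                continue
            num = 0.0  # weighted sum of valid neighbour values
            den = 0.0  # sum of weights over valid neighbours (the mask normalization)
            for dr in range(-radius, radius + 1):
                for dc in range(-radius, radius + 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < h and 0 <= cc < w and valid[rr][cc]:
                        wgt = math.exp(-(dr * dr + dc * dc) / (2.0 * sigma * sigma))
                        num += wgt * image[rr][cc]
                        den += wgt
            if den > 0.0:
                out[r][c] = num / den
    return out
```

For RGB projections the same weights would be applied to each channel. Because every filled value is a weighted average, the result is smooth, which matches the softening of fine details noted in the qualitative comparison.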

Bridging the Gap: Improving handheld Laser Scanning Point Cloud Quality in Forests via RTK-GNSS integrated SLAM

Carolin Rünger, Stefan Binapfl, Sophia Böhme, Ferdinand Maiwald, Anette Eltner

Technical University Dresden, Germany

Accurate forest inventories are essential for sustainable forest management. Handheld personal laser scanning (H-PLS) enables efficient and flexible forest data acquisition. However, ensuring reliable point cloud quality in complex environments remains challenging. While Simultaneous Localization and Mapping (SLAM)-based H-PLS allows rapid data collection, trajectory drift and accumulated registration errors can reduce the accuracy of derived tree parameters and structural metrics. In contrast, Global Navigation Satellite System (GNSS)-based Real-Time Kinematic (RTK) positioning provides centimetre-level absolute accuracy and drift-free trajectories, although its application in forested environments is still emerging. This study evaluates the impact of RTK-GNSS integration on point cloud geometry compared to SLAM-based point clouds without GNSS across two Central European forest plots with contrasting canopy structures. Analyses focused on tree parameter accuracy, structural metrics based on quantitative structural models, point density and noise characteristics. To isolate the effect of GNSS integration, data from the RTK-GNSS enabled H-PLS device were additionally processed without GNSS information, and an open-trajectory scan without loop closure was included for comparison. Results show that RTK-GNSS improves point cloud consistency and especially enhances the estimation of volume- and branch-related metrics. In the dense canopy plot, RTK-GNSS information reduced mean errors in branch number (−6100 to −5369) and crown volume (−492.75 to −357.21 m³). However, overall performance in tree parameter estimation depends on point density. These findings highlight RTK-GNSS H-PLS as a promising approach for flexible and efficient forest data acquisition in inventory applications.

Semantically-Driven Adaptive Registration for Correcting Non-Constant Drift in Multi-Temporal MLS Data

Aimad El Issaoui^1,2, Veikka Taka¹, Harri Kaartinen¹, Antero Kukko^1,2, Juha Hyyppä^1,2

¹Finnish Geospatial Research Institute (FGI), the National Land Survey of Finland; ²Aalto University, School of Engineering, Department of Built Environment

Mobile Laser Scanning (MLS) provides high-accuracy 3D point clouds essential for road infrastructure monitoring. However, multi-temporal MLS analysis is often limited by non-constant, spatially varying trajectory drift caused by GNSS outages and IMU inaccuracies. These misalignments can exceed the magnitude of the changes being monitored, such as pavement deformation, making accurate change detection challenging. This paper presents a fully automatic, semantically driven registration pipeline designed to correct spatially varying drift in directly georeferenced MLS data. The method first applies Principal Component Analysis (PCA) and intensity-based filtering to classify points into stable geometric categories, including flat horizontal surfaces, flat vertical structures, and linear vertical features. A correspondence-based filtering step removes dynamic objects and temporal changes to ensure that registration is driven by stable geometry. The core of the method is an adaptive piecewise registration strategy, where the reference point cloud is divided into sequential 1-meter patches. Each patch is assigned a local rigid transformation estimated using an adaptively expanding registration window guided by the availability of stable vertical features. A final smoothing step ensures spatial continuity between adjacent transformations. The method was evaluated on two MLS datasets collected one year apart along a 3 km road corridor using the FGI Roamer-R4DW system. Validation using 30 independent ground signals showed that the 3D RMSE improved from 3.38 cm to 1.54 cm, with vertical RMSE improving from 2.54 cm to 0.67 cm. The results demonstrate that the proposed approach enables centimeter-level alignment suitable for high-precision multi-temporal road monitoring and change detection applications.
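The PCA-based classification step can be illustrated with a minimal sketch. The abstract does not state which features or thresholds are used, so the eigenvalue ratios below (linearity, planarity, scattering), the function names and the `threshold` value are all assumptions chosen for illustration, not the authors' implementation.

```python
import numpy as np

def eigen_features(points):
    """Return (linearity, planarity, scattering) for an (N, 3) neighbourhood."""
    pts = np.asarray(points, dtype=float)
    cov = np.cov(pts.T)                        # 3x3 covariance of x, y, z
    evals = np.linalg.eigvalsh(cov)[::-1]      # sorted so that l1 >= l2 >= l3
    evals = np.clip(evals, 0.0, None)          # guard against tiny negative values
    l1, l2, l3 = evals
    if l1 == 0.0:                              # all points coincide
        return 0.0, 0.0, 0.0
    return (l1 - l2) / l1, (l2 - l3) / l1, l3 / l1

def classify(points, threshold=0.6):
    """Label a neighbourhood; the threshold is a hypothetical value."""
    linearity, planarity, _ = eigen_features(points)
    if linearity > threshold:
        return "linear"
    if planarity > threshold:
        return "planar"
    return "scattered"
```

A pole-like neighbourhood yields high linearity and a road or wall patch yields high planarity. Separating horizontal from vertical surfaces would additionally require the direction of the smallest eigenvector, which is omitted here.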

3D Meshing of Challenging Surfaces using Gaussian Splatting

Dario Billi^1,2, Chaimaa Delasse^2,3, Arnadi Murtiyoso², Hélène Macher², Pierre Grussenmeyer², Gabriella Caroti¹, Andrea Piemonte¹

¹Department of Civil and Industrial Engineering, ASTRO Laboratory, University of Pisa, Largo Lucio Lazzarino, 56122 Pisa, Italy; ²Université de Strasbourg, INSA Strasbourg, CNRS, Laboratoire ICube UMR 7357, 67000 Strasbourg, France; ³Ecole des Sciences Géomatiques et de l’Ingénierie Topographique, Institut Agronomique et Vétérinaire Hassan II, Madinat Al Irfane, 6202 Rabat, Morocco

This work addresses the challenge of accurate 3D reconstruction of complex scenes containing vegetation or transparent or non-Lambertian surfaces, which often cause difficulties for traditional Multi-View Stereo (MVS) methods. This issue is particularly relevant in the field of Cultural Heritage (CH), where many objects and environments exhibit such characteristics. To overcome these limitations, the study proposes the use of the new MILo (Mesh-In-the-Loop Gaussian Splatting) approach (Guédon et al., 2025), comparing its results with conventional MVS techniques and Terrestrial Laser Scanner (TLS) data.

MILo builds upon the 3D Gaussian Splatting (3DGS) technique, introducing a differentiable mesh extraction during optimization of the Gaussian parameters. This enables gradient flow between the volumetric and surface representations, resulting in more accurate and lightweight meshes, suitable for downstream applications such as simulations or animations.

The study uses three datasets: a Tilia tomentosa tree (Strasbourg) for complex natural geometries, the winter garden of the Sarreguemines Museum for reflective surfaces, and woodcarvings from Kasepuhan Palace (Indonesia) for fine ornamental details. Preliminary results on the tree dataset show that MILo significantly improves reconstruction quality, preserving thin structures such as branches and leaves compared to traditional MVS methods.

The final analysis will include both qualitative and quantitative comparisons (RMSE, standard deviation, completeness, mesh complexity) against TLS data, to rigorously assess MILo’s performance across different geometric and material conditions.

Render-to-Real Image-Based Change Detection of Outdoor Infrastructure Using 3D Gaussian Splatting

Satoko Hattori-Nagao, Kazuo Oda, Tomoaki Eguchi, Takanobu Nagao, Satomi Kakuta

Asia Air Survey Co., Ltd., Japan

This study proposes a framework for detecting changes in outdoor civil infrastructure using bi-temporal images and validates its effectiveness through experiments on real-world datasets. The proposed method performs change detection by comparing a 3D Gaussian Splatting (3DGS) model reconstructed from multi-view images acquired before changes occur with a single real image captured from a new observation viewpoint after changes. The processing pipeline consists of: (1) construction of the 3DGS model, (2) generation of an initial rendered image corresponding to the post-change real image, (3) feature matching between the rendered image and the real image followed by camera pose estimation, and (4) change detection. Experiments conducted on a sediment control dam and a bridge dataset demonstrate that the proposed method achieves a maximum Intersection over Union (IoU) of 0.82 for change detection. Furthermore, compared to a baseline method based on bi-temporal real image pairs, the proposed method improves IoU by up to 24 percentage points. The results also indicate that even under limited acquisition conditions after changes, accurate change detection can be achieved when the 3DGS reconstruction quality and pose estimation are sufficiently reliable.
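For reference, the Intersection over Union (IoU) reported above can be computed from two binary change masks as in the following sketch. The function name and the nested-list mask representation are illustrative assumptions; the abstract does not describe how the masks are stored.

```python
def iou(pred, truth):
    """Intersection over Union of two binary masks given as nested lists."""
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Two empty masks agree perfectly by convention.
    return intersection / union if union else 1.0
```

An improvement of 24 percentage points then means, for example, a rise from an IoU of 0.58 to 0.82 (illustrative numbers only; the baseline value is not given in the abstract).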

Empirical assessment of geometric accuracy of underwater lidar in tropical shallow waters

Mentari Khoerunnisa Azzahra¹, Fickrie Muhammad², Arnadi Murtiyoso³, Annette Scheider⁴, Harald Sternberg⁴, Gabriella Alodia²

¹Institut Teknologi Bandung, Faculty of Earth Sciences and Technology, Geodesy and Geomatics Engineering Postgraduate Programme, Bandung, Indonesia; ²Institut Teknologi Bandung, Faculty of Earth Sciences and Technology, Hydrography Research Group, Bandung, Indonesia; ³Université de Strasbourg, CNRS, INSA Strasbourg, ICube Laboratory UMR 7357, Photogrammetry and Geomatics Group, Strasbourg, France; ⁴HafenCity University Hamburg, Department of Hydrography and Geodesy, Hamburg, Germany

Light detection and ranging (lidar) technology has been widely applied across various spatial domains. To meet the need for detailed underwater surveys, Fraunhofer IPM developed an underwater lidar known as ULi. The system has been tested under controlled laboratory conditions, and Fraunhofer IPM claims sub-millimetre range precision in clean water. However, no empirical study has yet addressed this aspect, as fieldwork in the Elbe River (Walter et al., 2025) did not obtain suitable data due to the river's naturally high turbidity. The present study will evaluate the geometric accuracy of ULi against terrestrial laser scanner (TLS) and photogrammetry data. An acoustic Doppler current profiler (ADCP) was chosen as the measurement target in the field experiment due to its rigidity and high reflectivity; its frame measures 75 × 75 × 65 cm. The data sets were georeferenced to the WGS 84/UTM Zone 48S coordinate system using control point targets affixed to the ADCP frame and measured with a total station applying the intersection method. Subsequently, the geometric accuracy assessment was performed through statistical evaluations, including root mean square error analysis and 3D point cloud deviation comparison among ULi, TLS, and photogrammetry data sets. The 3D model derived from the ULi data will be assessed against models derived from TLS and photogrammetry through statistical analyses of length discrepancies and spatial deviations. Additionally, intensity, point density, linearity, planarity, and scattering analyses will be performed to evaluate how well the point cloud represents the geometric characteristics.

Experimental Validation of Human-Readable Coded Targets for Cross-Platform Photogrammetry and 3D Laser Scanning

Miglena Raykovska^1,5, Milen Borisov², Stanislav Harizanov¹, Lyubka Pashova³, Nikolay Petkov¹, Kristen Jones⁴, Pavel Georgiev¹, Georgi Vasilev¹, Ivan Lirkov¹

¹Institute of Information and Communication Technologies, Bulgarian Academy of Sciences; ²Institute of Mathematics and Informatics, Bulgarian Academy of Sciences; ³National Institute of Geophysics, Geodesy and Geography, Bulgarian Academy of Sciences; ⁴Queens University, Canada; ⁵Centre of Excellence in Informatics and Information and Communication Technologies

Coded targets are widely used in close-range photogrammetry and 3D laser scanning for automated referencing and registration. However, most fiducial systems are optimized for specific software environments, limiting interoperability across processing pipelines. This study presents a cross-platform coded target framework for multi-sensor 3D acquisition that combines geometric redundancy, binary encoding, and human-readable elements to enhance robustness and reproducibility. An open-source implementation (PGT-Toolkit) supports marker generation, detection, and standardized coordinate export. Performance was evaluated using a controlled laboratory framework with systematically varied viewing angles, distances, and illumination conditions. Experiments were conducted using DSLR-based photogrammetry and terrestrial laser scanning. Detection rate, centroid repeatability, reprojection error, and cross-platform coordinate consistency were assessed and compared with those of established fiducial systems. Results demonstrate stable detection under oblique viewing geometries and consistent coordinate estimation across both commercial and open-source software environments. Laboratory studies confirm that Human Readable Coded Targets (HRCT) are reliable, accurate, and cross-platform compatible in both photogrammetric and 3D laser scanning workflows, although these findings remain to be verified by field studies. The proposed framework contributes a structured methodology for experimental validation of interoperable coded targets in multi-sensor 3D workflows.

Integrating Multi-View Stereo and Depth Foundation Models for Precise 3D Reconstruction of Thin Urban Structures

Hwiyoung Kim¹, Impyeong Lee², Kyoungah Choi³

¹Geospatial Team, InnoPAM, Korea, Republic of (South Korea); ²Dept. of Geoinformatics, University of Seoul, Korea, Republic of (South Korea); ³Geospatially Enabled Society Research Division, Korea Research Institute for Human Settlements, Korea, Republic of (South Korea)

Constructing high-fidelity 3D models for urban Digital Twins is challenging, particularly for thin, texture-less structures like power lines where traditional Multi-View Stereo (MVS) fails due to matching ambiguities. While recent Monocular Depth Foundation Models offer dense estimation, they lack absolute scale and often degrade when applied to large-scale aerial imagery. This paper proposes a hybrid depth estimation pipeline that synergizes the metric accuracy of MVS with the structural coherence of foundation models.

Our method follows a Coarse-to-Fine strategy. First, we generate a scale-aware initial depth map by injecting sparse MVS points into the "Depth Anything" model as geometric priors, compensating for the lack of absolute scale in monocular estimation. Subsequently, a structure-guided refinement stage employs edge-based contour grouping to rectify object boundaries and suppress noise. Experimental results demonstrate that our approach successfully reconstructs power lines as distinct, linear objects with absolute scale, effectively resolving the data voids inherent in MVS and the geometric distortions typical of monocular models. This research provides a robust workflow for enhancing the precision of urban 3D reconstruction.

Estimation of refraction in photogrammetry from airborne data in an alpine environment

Myrta Maria Macelloni, Nives Grasso, Alberto Cina

Politecnico di Torino, Italy

Valpelline is an unspoilt Alpine valley located in the northernmost part of the Aosta Valley, on the border between Italy and Switzerland. It is the region’s longest valley, shaped by glaciers and rivers, with elevations ranging from about 900 m to over 4000 m at peaks such as Mont Gelé (3518 m) and Dent d’Hérens (4171 m).

Since 2020, the glaciers have been monitored by the GlacierLAB group (Politecnico di Torino) and ARPA Valle d’Aosta. Because of the valley’s steep, inaccessible terrain, biannual aerial photogrammetric surveys are carried out with a GNSS antenna, a low-accuracy IMU, and a PhaseOne iXM-RS150F camera (151 MP, 50 mm lens).

Due to a lack of synchronization between the camera and GNSS, Ground Control Points (GCPs) are needed for georeferencing. However, their configuration is often insufficient. Camera calibration certificates (2019, 2022) are crucial to correct image distortions; when unavailable, calibration is estimated using Agisoft Metashape and Structure-from-Motion methods, dividing known points into GCPs and Control Points to evaluate residuals.

High-altitude flights require correction for atmospheric refraction, which affects image geometry independently of optical distortion. Tests were carried out to estimate refraction errors (via Saastamoinen formulas) and to separate them from optical effects, enabling more accurate 3D models of Valpelline’s complex alpine environment.

Learning-based Estimation of Surface Normals in Unstructured Airborne LiDAR Point Clouds

Max Hermann^1,2, Martin Weinmann²

¹Fraunhofer IOSB, Karlsruhe, Germany; ²Institute of Photogrammetry and Remote Sensing, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany

To produce suitable 3D models for downstream tasks, point clouds are often triangulated to reconstruct a triangle mesh, which first requires estimating normal vectors that define the local surface orientation. Because normals are not directly measured during laser scanning, they are often estimated in postprocessing using two steps: (1) selecting a neighborhood around each point and fitting a local surface function, and (2) orienting the resulting normal to distinguish inside from outside. Larger local neighborhoods often yield more consistent normals by averaging the surface, but can smooth out sharp discontinuities.

For orientation, various methods attempt to estimate the inside versus outside direction. In watertight scans, orientation can be determined by locally triangulating the points and propagating consistent normal orientations along the connected triangles. For surface scans containing holes and occlusions, typical for airborne LiDAR, this is more challenging, and heuristics like Minimum-Spanning-Trees or global flips towards one major coordinate axis are often used.

We propose a learning-based approach to estimate surface normals in unordered point clouds from airborne LiDAR scanning. Across multiple datasets, our approach consistently reduces artifacts and improves the quality of reconstructed triangle meshes compared to baseline methods, while achieving significantly faster runtime.
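The classical two-step baseline described above (local surface fit, then orientation) can be sketched as follows. This is not the learning-based method proposed in the abstract: it is a minimal illustration of PCA normal estimation with a global flip towards one coordinate axis, and the function name, the neighbourhood size `k` and the `up` direction are assumptions.

```python
import numpy as np

def estimate_normals(points, k=8, up=(0.0, 0.0, 1.0)):
    """PCA normals for an (N, 3) cloud, flipped to agree with the `up` axis."""
    pts = np.asarray(points, dtype=float)
    up = np.asarray(up, dtype=float)
    # Brute-force squared distances; only suitable for small clouds.
    d2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(axis=2)
    neighbours = np.argsort(d2, axis=1)[:, :k]   # each point includes itself
    normals = np.empty_like(pts)
    for i, idx in enumerate(neighbours):
        cov = np.cov(pts[idx].T)                 # step 1: fit a local plane
        _, evecs = np.linalg.eigh(cov)           # eigenvalues in ascending order
        normal = evecs[:, 0]                     # direction of least variance
        if np.dot(normal, up) < 0.0:             # step 2: heuristic orientation
            normal = -normal
        normals[i] = normal
    return normals
```

A larger `k` averages over more of the surface and gives smoother normals, at the cost of blurring sharp edges such as roof ridges. The global flip works for mostly upward-facing airborne data but fails on vertical walls, where the dot product with `up` is close to zero.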

Railway parameter extraction with high-precision UAV-photogrammetry: a feasibility study

Lucas De Burggrave¹, Pierre Prévost¹, Erkki Bartczak¹, Suzanna Cuypers¹, Jens Derdaele², Maarten Bassier¹

¹KU Leuven, Belgium; ²TUC RAIL, Brussels

This study investigates the feasibility of using UAV-based photogrammetry for the accurate extraction of railway geometry parameters such as gauge, alignment, and cant. The research explores whether aerial image-based reconstruction can meet the high precision requirements traditionally achieved through terrestrial survey methods. A series of experimental flights were carried out to evaluate how flight configuration, image quality, and processing strategy influence measurement accuracy and reliability. The results provide insight into the potential and current limitations of UAV photogrammetry for rail infrastructure documentation and quality control. Overall, the study contributes to advancing automated, efficient, and safe methods for railway inspection and geometric parameter extraction.

Sand Engine Beach State Assessment by applying Machine Learning on massive ARGUS Imagery

Alex De Jong, Roderik Lindenbergh, Sander Vos, Daan Hulskemper

Delft University of Technology, Netherlands, The

Dynamic beach locations worldwide are monitored by so-called Argus camera systems. Their automatic image capture results in large databases of coastal images acquired under different illumination conditions. We present a lightweight and efficient method to automatically extract meaningful sand and supporting classes from ~1 million Argus images of the Sand Engine, The Netherlands, a nature-based solution for beach erosion measuring 2 by 1 km. The method consists of two neural networks. First, a ResNet18 model selects images of sufficient quality. The second network, a shallow multi-layer perceptron, is fed RGB, intensity and texture features and classifies pixels into six classes: Water, Foam and Vegetation on the one hand, and Aeolian, Wet and Armoured Sand on the other. Initial results show good agreement with human interpretation. Final results will be used to assess the multi-year morphodynamic evolution of the Sand Engine at the hour scale.

Pixel-based vegetation mapping at class-level from UAV multispectral imagery: application in an alpine lake ecosystem

Mohammad Elahi¹, Alessandra Spadaro², Francesca Matrone², Andrea Maria Lingua², Chiara Graziani², Vittorio Fra¹

¹Interuniversity Department of Regional and Urban Studies and Planning (DIST), Politecnico di Torino, Corso Duca degli Abruzzi 24, Torino; ²Department of Environment, Land and Infrastructure Engineering (DIATI), Politecnico di Torino, Corso Duca degli Abruzzi 24, Torino

Vegetation mapping in alpine environments is essential for monitoring ecosystem dynamics and climate change impacts, yet remains challenging when using very high-resolution UAV imagery with limited labeled data. This study proposes a data-centric, pixel-based classification framework for species-level vegetation mapping using multispectral UAV data acquired in an alpine study area. The approach prioritizes improving data representation rather than increasing model complexity. To address label scarcity, a feature-rich dataset was constructed by integrating spectral information, vegetation indices, and lightweight spatial descriptors to enhance class separability. Classification was performed using XGBoost, which is well suited for multispectral tabular data and robust under imbalanced conditions. The results show consistent classification performance across vegetation types and demonstrate the effectiveness of dataset enrichment under limited supervision, highlighting the importance of feature representation in data-scarce scenarios.

A Lightweight CNN–Mamba Hybrid Architecture for Efficient Crack Segmentation

Masaya Shimasaki, Mitsuteru Sakamoto, Toshiaki Satoh

PASCO Corporation, Japan

Pavement crack segmentation is an important task in road infrastructure inspection. However, the practical deployment of deep learning-based methods remains challenging because many high-performance models require substantial computational resources. This limitation is particularly critical in large-scale Mobile Mapping System (MMS)-based workflows, where large volumes of road surface imagery must be processed efficiently. In this study, a lightweight CNN–Mamba hybrid architecture is proposed for efficient crack segmentation as a deployment-oriented redesign of CT-CrackSeg. The proposed model replaces the original MobileViT-based global modelling modules with EfficientViM-inspired blocks based on hidden-state mixer-based state space duality (HSM-SSD), while preserving the overall encoder–decoder structure. In addition, the boundary enhancement branch is refined by introducing DCNv2-based deformable convolution. Experiments were conducted on the publicly available GAPs384 and CamCrack789 datasets. The results show that the proposed model maintains competitive topology-aware segmentation performance while substantially improving computational efficiency. Compared with CT-CrackSeg, the proposed method improves inference speed from 1.49 to 4.44 FPS on GAPs384 and from 1.31 to 3.92 FPS on CamCrack789. At the same time, peak memory consumption is reduced from 2827 MB to 355 MB, while the clDice score remains comparable, changing from 0.760 to 0.758 on GAPs384 and from 0.921 to 0.922 on CamCrack789. These results indicate that the proposed architecture provides a favourable balance between segmentation quality and deployment efficiency, and is suitable for large-scale pavement inspection and related photogrammetric infrastructure monitoring applications.

A Multi-Sensor and Multi-Temporal Approach to 3D Documentation of Historic Gardens: A Case Study of Villa Burba, Italy

Fangming Li¹, Cristiana Achille¹, Raffaella Laviscio², Luca Perfetti³, Francesco Fassi¹

¹3D Survey Group, ABC Lab, Department of Architecture, Built Environment and Construction Engineering (DABC), Politecnico di Milano, Via Ponzio 31, 20133 Milano, Italy; ²PaRID, ABC Lab, Department of Architecture, Built Environment and Construction Engineering (DABC), Politecnico di Milano, Via Ponzio 31, 20133 Milano, Italy; ³DICATAM, Civil Engineering, Architecture, Territory, Environmental and Mathematics, Università degli Studi di Brescia, Italy

Historic gardens are dynamic Cultural Heritage, shaped by seasonal cycles, vegetation growth, and continual maintenance, and require documentation methods capable of capturing change over time. This study presents a multi-sensor, multi-temporal workflow applied to Villa Burba, a seventeenth-century garden near Milan, Italy. Two surveys conducted in 2023 (leaf-on) and 2025 (leaf-off) combined UAV photogrammetry with mobile laser scanning (MLS) to maximize completeness under contrasting environmental conditions. Both datasets were processed independently, harmonized within WGS84 / UTM Zone 32N, and evaluated through point density analysis, deviation modelling, MLS loop-closure checks, and GCP residual evaluation.

Multi-temporal point clouds were analyzed in QGIS using PDAL-enabled tools. Cloud-to-cloud differencing and canopy height modelling revealed key transformations, including the drying of a water channel, the loss of a historic tree, and spatial shifts in vegetation structure. These digital findings were confirmed through field inspection. The workflow demonstrates a practical approach for monitoring dynamic heritage gardens and supporting long-term conservation and management through accurate, repeatable 3D survey data.

Affine Invariant OpenCV Descriptors and the Effects on Aerial Photogrammetry

Evan Okeeffe², Debra Laefer¹, Eleni Mangina²

¹New York University, United States of America; ²University College Dublin

Robust feature descriptors are necessary for computer vision applications such as image matching, photogrammetric three-dimensional (3D) reconstructions, and simultaneous localisation and mapping (SLAM). While most state-of-the-art feature descriptors are invariant to image transformations (such as translation, rotation, and scale), the majority lack stability in tracking points over large 3D perspective transformations. One successful method for handling these large perspective changes is to simulate affine tilts along the latitude and longitude axes of an image. These simulated tilts create greater invariance to changes in 3D perspective. To demonstrate the widespread efficacy of this approach, this paper applies affine simulation to seven state-of-the-art descriptors in OpenCV and to two of the enhanced OpenCV descriptors in OpenMVG.

Evaluating ORB-SLAM 3 Performance using a Photogrammetry-based Reference Trajectory

Leonardo Galvão, Daniel Regner, Moacir Wendhausen, Tiago Pinto, Armando Albertazzi

Federal University of Santa Catarina, Brazil

The robust evaluation of Visual Simultaneous Localization and Mapping (vSLAM) systems is fundamental to their development and deployment. However, this process is often constrained by the reliance on expensive and complex external infrastructure, such as laser trackers or motion capture systems, to provide accurate ground-truth trajectories. This paper introduces a novel and self-contained methodology for the high-fidelity evaluation of stereo vSLAM and stereo-inertial algorithms. Our approach leverages the very same image sequence used by the SLAM algorithm to generate a dense, globally optimized photogrammetric model. The proposed methodology comprises two fundamental steps. The first step consisted of validating photogrammetry as a ground-truth method. For this purpose, the linear displacement measured by photogrammetry was compared with the displacement of a precision guide, which was benchmarked against a laser interferometer as the standard. Once the reference was validated, the second step assessed the performance of ORB-SLAM 3 on a free trajectory within a complex environment by directly comparing the SLAM result to the trajectory generated by photogrammetry. The accuracy was then quantified using standard metrics, including Absolute Trajectory Error (ATE) and Relative Pose Error (RPE). The results validate our approach as an accessible, low-cost, and reliable alternative for benchmarking vSLAM systems, enabling rigorous performance analysis using only the data from the sensor suite under evaluation.
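The two metrics named above can be sketched in simplified, translation-only form. The sketch assumes that the estimated and reference trajectories are already time-synchronised and expressed in the same coordinate frame; standard evaluations first align the trajectories with a rigid or similarity transformation and compute RPE on full 6-DoF relative poses, both of which are omitted here. The function names and the `delta` parameter are illustrative.

```python
import math

def ate_rmse(estimated, reference):
    """RMSE of position differences between matched trajectory samples."""
    squared = [
        sum((e - r) ** 2 for e, r in zip(p_est, p_ref))
        for p_est, p_ref in zip(estimated, reference)
    ]
    return math.sqrt(sum(squared) / len(squared))

def rpe_rmse(estimated, reference, delta=1):
    """RMSE of the difference in relative motion over `delta` samples."""
    squared = []
    for i in range(len(estimated) - delta):
        move_est = [b - a for a, b in zip(estimated[i], estimated[i + delta])]
        move_ref = [b - a for a, b in zip(reference[i], reference[i + delta])]
        squared.append(sum((x - y) ** 2 for x, y in zip(move_est, move_ref)))
    return math.sqrt(sum(squared) / len(squared))
```

ATE captures global consistency, so a constant offset shows up in full, whereas RPE captures local drift and is unaffected by a constant offset.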

Deriving Tree Stem Profile and Volume Using a Close-Range Remote Sensing and Machine Learning Approach

Basam Dahy¹, Dag Björnberg^1,2, Shafiullah Soomro¹, Johan E. S. Fransson¹

¹Linnaeus University, Sweden; ²Softwerk AB, Sweden

Accurate estimation of tree volume is essential for precision forestry and sustainable forest management. Traditional forest inventory methods rely on manual measurements of tree height and diameter, which are time-consuming and costly to conduct over large areas, and difficult to perform efficiently in dense forest stands. This study presents a data-driven approach for estimating tree volume from partial tree stem profiles derived from high-resolution datasets. While the study relies on harvester production data (Sweden) and field-measured tree stem profiles (Brazil), the framework is designed to support the estimation of tree volume from close-range remote sensing techniques, such as terrestrial photogrammetry using handheld cameras. Three modelling approaches were evaluated, including two machine learning models (XGBoost and Random Forest) using partial tree stem profile measurements as predictors, and one baseline model (XGBoost) using diameter at breast height and tree height as predictors. The models were developed using two independent datasets: harvester production data of Norway spruce (Picea abies (L.) H. Karst.) from Sweden and field-measured tree stem profiles of Slash pine (Pinus elliottii Engelm.) and Loblolly pine (Pinus taeda L.) plantations from Brazil. The results show that tree volume can be predicted with reasonable accuracy using partial tree stem profiles, although models incorporating tree height achieved the lowest prediction errors. The findings demonstrate that partial tree stem profiles provide valuable structural information for machine learning-based tree volume estimation. This framework supports the future integration of close-range remote sensing techniques into modern forest inventory systems.

Towards Open-Vocabulary ALS Point Clouds Semantic Segmentation: An Empirical Study

Yanghong Lin^1,2, Tianyu Li^1,3, Shudong Zhou¹, Jingru Zhang¹, Li Fang¹, Wei Yao¹

¹Institute of Urban Environment, Chinese Academy of Sciences, China, People's Republic of; ²University of Chinese Academy of Sciences, China, People's Republic of; ³School of Resource and Environmental Sciences, Wuhan University, China, People's Republic of

Semantic segmentation of Airborne Laser Scanning (ALS) point clouds is critical for numerous photogrammetric and remote-sensing applications. While deep learning has become the dominant approach for ALS semantic segmentation, most existing methods rely on predefined label sets and thus lack the ability to recognize arbitrary semantic categories. With recent advances in visual foundation models (VFM), zero-shot visual understanding has achieved notable progress in natural image domains. However, the potential of adapting 2D VFMs to 3D ALS point cloud segmentation remains underexplored.

This contribution develops three VFM-based approaches for zero-shot, open-vocabulary ALS semantic segmentation: Grounding DINO+SAM, CLIP+SAM, and GSNET. Grounding DINO+SAM identifies object regions using text prompts and employs SAM to refine segmentation masks. SAM+CLIP first generates instance masks via SAM and then assigns semantic labels using CLIP text and visual embeddings. GSNET integrates a remote-sensing-specific encoder with a CLIP-aligned encoder to alleviate the domain gap between natural and aerial imagery.

An empirical study conducted on the ISPRS Vaihingen dataset demonstrates that all three methods possess certain zero-shot open-vocabulary capabilities. Methods trained solely on natural images perform well on common classes (e.g., roof, tree) but struggle with rare categories such as powerline. GSNET improves performance across most categories, highlighting the importance of domain adaptation; however, rare-class segmentation remains challenging. These findings suggest that a substantial domain gap and the limited representation of rare classes are key obstacles to applying VFMs in remote sensing. Future research should focus on test-time adaptation and unsupervised domain adaptation to enhance VFM generalization for 3D ALS point clouds.

A Workflow for the automatic Extraction of Glacier Contours from 4D Point Clouds

Steffen Isfort¹, Melanie Elias², Hans-Gerd Maas¹

¹TUD - Dresden University of Technology, Germany; ²HTWD - University of Applied Sciences Dresden, Germany

A workflow for the automatic extraction of the outlines of debris-covered glaciers and rock glaciers is presented. As the outlines in these scenarios are not clearly discernible, our approach is based on identifying geomorphological changes in multi-temporal 3D point clouds. We assume that these changes are caused by changes of the glacier. Consequently, areas with significant changes can be used to map the outline of the glacier. Our workflow uses pairs of multi-temporal 3D point clouds, which are captured for example by UAV imagery and TLS. After applying a robust registration algorithm, the difference of both point clouds is calculated. Considering only the areas that show significant changes, the glacier areas are isolated, and the outlines are mapped in a 2D mapping plane.

For evaluation, we test our workflow on two data sets. The Bøverbreen glacier, with only little debris cover, allows for a manual assessment of the glacier margins using an orthophoto mosaic from UAV imagery. A comparison of our calculated glacier margins with the manually assessed ones shows good agreement. The results confirm the basic functionality of our proposed method. However, tests show that the most challenging task is filtering glacial and non-glacial points, which is currently done solely based on the point density. More robust solutions to this problem will be discussed.
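The differencing and thresholding steps can be illustrated with a simple gridded stand-in. The abstract does not specify how the point cloud difference is computed, so the sketch below swaps in a DEM-of-difference on a regular grid; it assumes both epochs are already registered, and the function names, cell size and threshold are hypothetical.

```python
import numpy as np

def mean_height_grid(points, cell, x0, y0, nx, ny):
    """Rasterise an (N, 3) cloud to mean height per cell (NaN where empty)."""
    pts = np.asarray(points, dtype=float)
    ix = np.floor((pts[:, 0] - x0) / cell).astype(int)
    iy = np.floor((pts[:, 1] - y0) / cell).astype(int)
    inside = (ix >= 0) & (ix < nx) & (iy >= 0) & (iy < ny)
    total = np.zeros((ny, nx))
    count = np.zeros((ny, nx))
    np.add.at(total, (iy[inside], ix[inside]), pts[inside, 2])
    np.add.at(count, (iy[inside], ix[inside]), 1.0)
    with np.errstate(invalid="ignore", divide="ignore"):
        return total / count

def change_mask(epoch_a, epoch_b, cell, x0, y0, nx, ny, threshold):
    """True where the height change between two epochs exceeds the threshold."""
    grid_a = mean_height_grid(epoch_a, cell, x0, y0, nx, ny)
    grid_b = mean_height_grid(epoch_b, cell, x0, y0, nx, ny)
    with np.errstate(invalid="ignore"):
        # Cells that are empty in either epoch are NaN and compare as False.
        return np.abs(grid_b - grid_a) > threshold
```

The boundary of the resulting mask would then be traced in the 2D mapping plane to obtain an outline. The threshold should reflect the registration uncertainty so that only changes above the noise level are kept.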

Automated detection of box-girder bridge deterioration using cylindrical projection from multi-camera 3D reconstruction and deep learning

OU Ming-Yun¹, Jhan Jyun-Ping¹, Lin Chen-Kuang³, Lin Shih-Syun¹, Tsai Hsin-Chu², Chou Tzu-Liang², Chang Chang-yu²

¹National Taiwan University of Science and Technology, Chinese Taipei; ²China Engineering Consultants, Inc., Chinese Taipei; ³Department of Mechanical and Materials Engineering, Tatung University, Taiwan

As large-scale infrastructure gradually ages, hundreds of existing bridges require regular inspections to ensure structural safety. While many researchers have proposed deterioration detection methods based on computer vision and deep learning—which can detect deterioration at the image level—no effective approach has yet been developed that integrates 3D reconstruction technology to achieve spatial localization and area quantification. To address this, this study proposes a two-part automated inspection workflow for the classification, localization, and measurement of internal deterioration in box-girder bridges. In the first part, the camera system is calibrated using an indoor calibration scene, and images are captured inside the box girder. A 3D model is constructed using Structure from Motion (SfM) algorithms, and a cylindrical projection unfolded map is generated. In the second part, a boundary-aware model—modified from DeepV3+—is used to perform pixel-level deterioration detection and classification on the unfolded map. Experimental results demonstrate that the system can generate scale-corrected cylindrical unfolded maps from 3D models with sub-millimeter scale accuracy (0.105 mm), effectively transforming complex 3D inspection tasks into measurable and analyzable 2D images. The model achieved an overall mean Intersection over Union (mIoU) of 65.11% across four categories of deterioration, representing a 7.54 percentage point improvement over the original DeepV3+. The research results validate the effectiveness of the proposed workflow in enhancing detection efficiency and objectivity for box-girder bridge maintenance.
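As a rough illustration of how a cylindrical projection turns 3D surface points into an unfolded 2D map, the sketch below assumes a cylinder whose axis coincides with the z-axis and a known radius. The function name and this parameterisation are assumptions for illustration, not the authors' implementation.

```python
import math


def unfold_cylinder(points, radius):
    """Map 3D points (x, y, z) to 2D (arc length, axial position).

    Assumes the cylinder axis is the z-axis. The angle around the axis
    is scaled by the radius so both map axes share the same metric unit.
    """
    unfolded = []
    for x, y, z in points:
        theta = math.atan2(y, x)  # angle in (-pi, pi]
        unfolded.append((radius * theta, z))
    return unfolded


# Hypothetical points on a cylinder of radius 2.
flat = unfold_cylinder([(2.0, 0.0, 5.0), (0.0, 2.0, 1.0)], radius=2.0)
```

Because arc length and axial position are both metric, areas measured on such a map can be related back to surface areas once the scale is known.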

Methodology and Practice of Hong Kong 3D Digital Map Construction Based on Multi-Source Data Fusion

Li Chen, Jun Li, Yaping Wang, Jing Wang, Weichen Yao

Shaanxi TIRAIN Science & Technology Co., Ltd., People's Republic of China

In response to Hong Kong's smart city development strategy, this paper takes the 3D digital map construction project in Kowloon as a practical case study and systematically presents a construction method for 3D digital mapping based on multi-source data fusion. To address the technical challenges of high-density urban environments (dense buildings, complex 3D traffic networks, and severe shadow occlusion), an "air-ground fusion" data acquisition strategy is proposed. By combining oblique aerial photography, a Vehicle Mobile Mapping System (VMMS), and a Portable Mobile Mapping Survey (PMMS), a high-precision and highly realistic urban 3D model has been constructed. The paper focuses on the principles of multi-source data fusion based on feature registration and combined adjustment, as well as the 3D modeling process and the quality control methods for the final results. The project's technical innovation and practical feasibility have been validated through international benchmarking. The research results have been applied to urban planning, traffic management, environmental studies and other fields, providing a solid data foundation and technical support for Hong Kong's smart city development.

Automatic Reconstruction of High-Accuracy 3D Roof Models from Orthophotos and Digital Surface Models

Yonghe Li¹, Masaya Shimasaki², Mitsuteru Sakamoto², Toshiaki Satoh², Tatsunori Sada¹

¹NIHON University, Chiba, Japan; ²PASCO Corporation, Tokyo, Japan

In recent years, the demand for 3D city model development has grown, as demonstrated by initiatives such as Project PLATEAU in Japan. In the construction of LoD2 building models, which are an essential component of 3D city models, the reconstruction of 3D roof models still heavily depends on manual work. To enhance productivity through automation, this study proposes a novel method for automatically reconstructing high-accuracy 3D roof models using orthophotos and Digital Surface Models (DSMs) derived from aerial imagery. In the proposed method, a deep-learning-based model is first applied to orthophotos and DSMs to extract 2D rooflines. Then, the extracted 2D rooflines are refined and polygonised to assemble 2D roof models. Finally, planar fitting is performed on the point cloud generated from the DSM within each 2D roof plane to reconstruct 3D roof models. In this process, the horizontal alignment of rooflines and the continuity between adjacent roof planes are preserved. In the experiments, 3D roof models manually digitized by stereoscopic measurement were used as the ground truth, and the automatically reconstructed 3D roof models were evaluated by comparison with this reference. The recall values for 2D and 3D roof planes were 0.686 and 0.430, respectively, and increased to 0.723 and 0.455 for roof planes larger than 4 m².
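The planar fitting step can be sketched as an ordinary least-squares fit of z = a·x + b·y + c to the DSM points inside one roof polygon. This is a minimal illustration that minimises vertical residuals only; the abstract does not specify the fitting criterion or the constraints used to keep adjacent planes continuous, so those are not reproduced here.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_plane(points):
    """Least-squares fit of z = a*x + b*y + c; returns (a, b, c)."""
    n = float(len(points))
    sx = sum(x for x, _, _ in points)
    sy = sum(y for _, y, _ in points)
    sz = sum(z for _, _, z in points)
    sxx = sum(x * x for x, _, _ in points)
    syy = sum(y * y for _, y, _ in points)
    sxy = sum(x * y for x, y, _ in points)
    sxz = sum(x * z for x, _, z in points)
    syz = sum(y * z for _, y, z in points)

    # Normal equations A * [a, b, c]^T = rhs, solved with Cramer's rule.
    A = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    rhs = [sxz, syz, sz]
    d = det3(A)
    if abs(d) < 1e-12:
        raise ValueError("points are degenerate (e.g. collinear)")
    coeffs = []
    for col in range(3):
        m = [row[:] for row in A]
        for r in range(3):
            m[r][col] = rhs[r]
        coeffs.append(det3(m) / d)
    return tuple(coeffs)


# Hypothetical roof points lying exactly on z = 0.5x - 0.25y + 3.
roof_points = [(0, 0, 3.0), (1, 0, 3.5), (0, 1, 2.75), (1, 1, 3.25), (2, 1, 3.75)]
a, b, c = fit_plane(roof_points)
```

Steep roof planes are poorly conditioned in this parameterisation; a total least-squares fit on the point covariance would be the usual alternative.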

LiDAR-aided neural Scene Representation using low-cost Sensors

Mohamed Negm, Ahmed Elamin, Ahmed El-Rabbany

Toronto Metropolitan University, Canada

Neural scene representations are increasingly explored as alternatives to classical SfM and MVS in civil and architectural mapping, yet their ability to satisfy survey-grade geometric tolerances remains contested. This contribution examines how LiDAR guidance may stabilize NeRF and 3D Gaussian Splatting reconstructions of building façades obtained from low-cost cameras.

Research on Adaptive Feature Band Extraction Technology Based on Fractional Order Differentiation and Machine Learning

Fang Liu, Fei Liu, Xian Guo, Yikang Ren

Beijing University of Civil Engineering and Architecture, People's Republic of China

The Dunhuang murals, a significant component of China's cultural heritage, are severely threatened by salt-induced deterioration. To address the limitations of traditional invasive detection methods, this study explores a non-destructive approach using hyperspectral remote sensing to monitor mural salinity. Focusing on phosphate content, a key salt damage indicator, we propose a multi-level optimization framework that integrates Fractional Order Differentiation (FOD) for spectral enhancement and various feature selection strategies (including LASSO, SiPLS, SPA, CARS, and Random Frog) to improve prediction accuracy. Partial Least Squares Regression (PLSR) models were constructed using optimized spectral features. Results demonstrate that FOD effectively amplifies subtle spectral responses related to salinity. The model combining 1.9-order FOD spectra with LASSO feature selection achieved the highest performance, with a cross-validated R² of 0.908—a 15.96% improvement over the best model using FOD-transformed spectra alone. This study confirms that integrating FOD with advanced feature selection significantly enhances the precision and reliability of hyperspectral inversion models for mural salt damage, providing a powerful, non-destructive tool for cultural heritage conservation.
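For readers unfamiliar with fractional order differentiation, the sketch below uses the Grünwald-Letnikov approximation, a common discrete form for spectral data. Whether the authors use this particular definition is an assumption; the function names and the sample reflectance values are hypothetical.

```python
def gl_weights(alpha, n):
    """First n Grünwald-Letnikov weights for order alpha.

    w_0 = 1 and w_k = w_{k-1} * (1 - (alpha + 1) / k).
    """
    weights = [1.0]
    for k in range(1, n):
        weights.append(weights[-1] * (1.0 - (alpha + 1.0) / k))
    return weights


def fractional_derivative(signal, alpha, step=1.0):
    """Order-alpha derivative of an evenly sampled signal (band spacing `step`)."""
    w = gl_weights(alpha, len(signal))
    out = []
    for i in range(len(signal)):
        acc = sum(w[k] * signal[i - k] for k in range(i + 1))
        out.append(acc / step ** alpha)
    return out


# Hypothetical reflectance samples; order 1.9 as reported in the abstract.
spectrum = [0.21, 0.24, 0.30, 0.33, 0.31, 0.28]
enhanced = fractional_derivative(spectrum, alpha=1.9)
```

With alpha = 1 the weights reduce to (1, -1, 0, ...), i.e. the ordinary backward difference, and with alpha = 0 the signal is returned unchanged, which gives a quick sanity check on the implementation.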

Assessing the sensibility of intervisibility on the quality of 3D geometry

Darshan Venkatarayappa, Bruno Vallet, Teng Wu

Univ Gustave Eiffel, Géodata Paris, IGN, LASTIG, F-77454 Marne-la-Vallée, France

This work explores a new evaluation framework for 3D Model Quality Assessment using 3D intervisibility, a critical concept in 3D spatial analysis. We consider a high-quality LiDAR ground-truth 3D model and lower-quality (dense matching and decimated) versions of it. We then run the same intervisibility analysis on all of them and compare the results. This allows us to evaluate the impact of geometric quality on intervisibility analysis. The analysis is useful for anyone using 3D data for simulations, as it indicates what data quality they actually need to purchase or produce for their specific use case. Ultimately, the goal of this work is to see how much the quality of the 3D model affects intervisibility results.

Neural Radiance Fields with Physically Based Reflectance for Satellite Images

Lulin Zhang¹, Ewelina Rupnik², Tri Dung Nguyen¹, Stephane Jacquemoud¹, Yann Klinger¹

¹Universite de Paris, Institut de Physique du Globe de Paris, CNRS; ²Univ. Gustave Eiffel, IGN-ENSG, LaSTIG

Recent adaptations of Neural Radiance Fields (NeRF) to remote sensing have shown strong potential for high-fidelity surface reconstruction from multi-view satellite imagery. NeRF represents a scene using multilayer perceptrons and optimizes a volumetric rendering objective to infer geometry and appearance. However, its performance declines sharply with the limited number of satellite viewpoints, and remote sensing imagery violates the simple reflection assumptions of natural scenes. Surface reflectance depends on material properties and illumination geometry, requiring explicit Bidirectional Reflectance Distribution Function (BRDF) modeling. In this work, a physically based NeRF formulation is proposed using the Hapke radiative transfer model, which efficiently describes surface–radiance interactions with a small set of parameters. This physically grounded approach is compared experimentally with empirical BRDF models, demonstrating its potential to enhance the physical realism and interpretability of NeRF reconstructions for Earth observation applications.

Mobile multi-camera system performance for photogrammetric road surface 3D measurements - assessment the effect of driving speed

Matti Tapio Vaaja¹, Markus Sarlin¹, Eino Waldén¹, Petri Rönnholm¹, Hannu Hyyppä¹, Juha Hyyppä², Mikko Vastaranta³, Matti Kurkela¹

¹Department of Built Environment, Aalto University, Finland; ²Department of Photogrammetry and Remote Sensing, Finnish Geospatial Research Institute, National Land Survey of Finland, FI-02150 Espoo, Finland; ³School of Forest Sciences, University of Eastern Finland, Joensuu, 80101, Finland

In this study, we built a mobile multi-camera system and investigated its use for photogrammetric 3D measurement of road surface geometry. More specifically, we tested the effect of driving speed on the quality of the 3D point cloud geometry of the road surface. Our conclusion was that, with a five-camera system at speeds of 3-20 km/h, we achieved 3D distance errors of less than 0.5 mm when the data was compared to reference data measured from road surface samples. The results show that the method has great potential for producing sub-millimetre resolution and precision data on road surface damage, road roughness, and other road parameters. The purpose is to use the system to collect reference data for verifying data from operational mobile laser scanning systems. The system can also be installed on other platforms and used in other applications.

Digital Analysis of Rock Art in Santa Olaya Canyon: Integrating Cultural Landscape and UAV Technologies for Conservation

Fabiola D. Yépez-Rincón¹,³, Glenda N. Requena Lara², Carlos C. Aguilar Treviño³, Jacinto Treviño-Carreón², Ma. Eugenia Calvillo-Villacaña⁴, Pablo A. Cerda-Luque⁴, Juan F. Morales-Pacheco⁵

¹Faculty of Civil Engineering, Universidad Autónoma de Nuevo León, San Nicolás de los Garza, Nuevo León, México; ²Faculty of Engineering and Sciences, Universidad Autónoma de Tamaulipas, Ciudad Victoria, Tamaulipas, México; ³Teebcon Servicios, Ingenierías y Proyectos, SA de CV, Monterrey, Nuevo León, México; ⁴Faculty of Architecture, Design and Urbanism, Universidad Autónoma de Tamaulipas, Tampico, Tamaulipas, México; ⁵Faculty of Law and Social Sciences Victoria, Universidad Autónoma de Tamaulipas, Ciudad Victoria, Tamaulipas, México

This research work presents the digital documentation of rock art found on a rock face in the Santa Olaya Canyon, in the municipality of Burgos, Tamaulipas. Unlike rock art found in caves, these open-air expressions are actively integrated with the natural and cultural landscape, functioning as symbolic markers of the territory. Through controlled flights with a DJI Mavic Air 2 drone and 3D photoreconstruction techniques, a difficult-to-access vertical rock face with rock paintings was recorded with high precision. The methodology employed responds to the need for conservation and study of these sites, which lack institutional protection mechanisms from the INAH (National Institute of Anthropology and History) or, as in this case, conservation and cultural research studies. It also contextualizes the value of rock art in Tamaulipas, particularly in the San Carlos and San Nicolás mountain ranges, where some of the most significant collections in northeastern Mexico are found. The application of non-invasive digital technologies is positioned as an effective tool for the documentation, analysis, and dissemination of archaeological heritage, especially in remote and limited-access regions. The generated orthomosaic and point clouds provide the opportunity to create a digital legacy of the area.

LiDAR Point Cloud Classification by 3D Sparse CNN for large-scale Mobile Laser Scanning

Nan Li¹, Florian Pöppl², Andreas Ullrich², Harald Teufelsbauer²

¹RIEGL Research & Defense GmbH; ²RIEGL Laser Measurement Systems GmbH

This work presents a deep learning-based framework for semantic classification of Mobile Laser Scanning (MLS) point clouds using a 3D Sparse Convolutional Neural Network (SparseCNN). The proposed approach addresses challenges specific to MLS data, such as varying point density, high data volume, and diverse urban or highway environments. A two-stage, coarse-to-fine classification pipeline is designed to ensure both scalability and high resolution: the first stage performs scene-wide semantic labeling, while the second refines ground-surface features such as road markings, sidewalks, and curbstones at finer spatial resolution.

To enhance robustness, the model is trained with tailored data augmentations including geometric transformations, density dropout, artificial noise injection, and local patch swapping. In addition to geometric input, radiometric features such as reflectance and echo information are incorporated to improve object differentiation, especially for materials like traffic signs and painted road surfaces.

Two sets of models are trained for different acquisition wavelengths (905 nm and 1550 nm), to account for the impact of laser wavelength on reflectance responses. Classification results on urban and highway scenes demonstrate the effectiveness of the method across a variety of environments and sensor platforms.

MUSF-SSA: Multi-scale Umbrella Feature with Spatial Self-Attention Model for Semantic Segmentation of Point Clouds

Linfu Xie, Rutao Zhang, Tianyi Xu, Weixi Wang, Xiaoming Li, Shengjun Tang, Renzhong Guo

Shenzhen University, People's Republic of China

Semantic segmentation of point clouds, a fundamental task in 3D scene understanding, faces two persistent challenges. First, it is difficult to efficiently extract discriminative features for complex and irregular surfaces; existing methods struggle with the trade-off between simple features, which are insufficient, and complex features, which are computationally expensive. Second, many deep learning models ignore the inherent spatial correlation of point cloud features during the training process, limiting segmentation accuracy. Optimizing the feature representation for complex surfaces while fully leveraging feature correlation is key to advancing segmentation performance.

To tackle these challenges, we propose the Multi-Scale Umbrella Feature model with Spatial Self-Attention (MUSF-SSA). This model introduces a novel Multi-Scale Umbrella Feature (MUSF) to efficiently represent irregular surfaces and integrates a spatial self-attention (SSA) mechanism in its backbone to explicitly learn the spatial correlation between features.

Through these improvements, while maintaining a low parameter count (1.088M), our model achieves 68.6% mIoU, 76.5% mAcc, and 90.4% OA on the S3DIS Area-5 test, a typical indoor point cloud dataset. Compared to the similar method RepSurf-U, this represents a gain of +3.6% mIoU, +4.0% mAcc, and +2.6% OA.
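The three reported scores (mIoU, mAcc, OA) are conventionally derived from a class confusion matrix. The sketch below shows those standard definitions on a small made-up matrix; it is not the authors' evaluation code, and the numbers are purely illustrative.

```python
def segmentation_scores(confusion):
    """Return (mIoU, mAcc, OA) from a square confusion matrix.

    confusion[i][j] counts points of true class i predicted as class j.
    Assumes every class occurs at least once in the ground truth.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n_classes))

    ious, accs = [], []
    for i in range(n_classes):
        tp = confusion[i][i]
        true_count = sum(confusion[i])                       # TP + FN
        pred_count = sum(confusion[r][i] for r in range(n_classes))  # TP + FP
        ious.append(tp / (true_count + pred_count - tp))
        accs.append(tp / true_count)

    return sum(ious) / n_classes, sum(accs) / n_classes, correct / total


# Hypothetical two-class example.
miou, macc, oa = segmentation_scores([[8, 2], [1, 9]])
```

Note that mIoU penalises both false positives and false negatives per class, which is why it is typically lower than OA on the same predictions.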

Evaluating the Efficiency of Machine Learning Algorithms in Identifying Geothermal Energy Potential Areas in Akita and Iwate Provinces, Japan

Majid Kiavarz, Mohammadreza Jelokhani Niaraki, Avin Meysami, Yasaman Ghorbani, Najmeh Neysani Samany

University of Tehran

The growing demand for clean and renewable energy sources has intensified the need to identify and exploit geothermal resources as a key solution for sustainable energy development. However, geothermal exploration faces significant challenges including geological complexity, high drilling costs, economic risks, and spatial data limitations. This study evaluates the efficiency of advanced machine learning algorithms, specifically Random Forest and Generative Adversarial Networks (GANs), in identifying geothermal energy potential areas in Akita and Iwate provinces, Japan. Using a limited dataset of 152 geothermal well locations, seven key parameters were analysed: volcanic activity, fault and fracture density, hot springs, surface thermal indices, fumaroles, mud volcanoes, and surface alteration evidence. Data were collected from geological and remote sensing sources and pre-processed for modelling. Results demonstrate that both algorithms effectively identify high-potential areas despite data scarcity. Random Forest achieved 94.08% accuracy in well identification with a C/S(C) index of 10.93, demonstrating robust performance and spatial correlation. The Generative Adversarial Network showed superior performance with 96.71% accuracy and a C/S(C) index of 4.36, indicating exceptional capability in identifying geothermal potential areas and detecting complex spatial patterns. These findings confirm that hybrid approaches combining machine learning and deep learning, particularly GANs, possess high capability for accurate geothermal prospectivity mapping and can effectively overcome limitations posed by data scarcity, providing valuable tools for exploration prioritization and investment decision-making.

Theoretical Comparison of Façade Texture Resolution for 3D Building Models Generated from Nadir and Oblique Aerial Imagery

Masato Ishikawa, Tomoaki Inazawa, Yoshihiko Nakanishi, Futa Kawamata, Masahito Takada, Takuya Danjo

Kokusai Kogyo Co., Ltd., Japan

Building models are one of the key features in 3D city models. To realistically represent building exteriors, texture images are often applied to these models. Such textures are important not only for visual appearance but also for practical applications, such as automated generation of higher-Level-of-Detail (LoD) models and various urban simulations. In large-scale urban modeling projects, façade textures are typically obtained through aerial photogrammetry conducted by manned aircraft, primarily due to operational efficiency. In many such surveys, image acquisition is mainly based on nadir-oriented cameras. However, nadir-only imaging inherently limits façade resolution due to viewing geometry. In this study, we compare the façade resolution attainable from nadir and oblique cameras to examine the effectiveness of multi-directional camera systems in producing high-resolution façade textures. A theoretical approach is adopted to estimate the attainable façade resolution under given imaging conditions. A comparative analysis using the camera parameters of UCE M3 (nadir-only) and CM-2 (multi-directional) indicates several advantages of oblique cameras for façade texture generation: (1) significant improvement in the lowest façade resolution compared to nadir photography, (2) more consistent façade resolution across the entire survey area, and (3) limited sensitivity of façade resolution to increased camera station interval. These findings suggest that incorporating oblique cameras into an aerial survey system can contribute to stabilizing and enhancing attainable façade resolution compared to nadir-only configurations.

Calibrating large-FOV stereo videogrammetric system using drone and epipolar geometry

Haibo Shi, Xianglei Liu, Runjie Wang

Beijing University of Civil Engineering and Architecture, China

Videogrammetry is widely used in fields such as structural health monitoring, surveillance, and aerospace, where accurate 3D measurements rely on precise calibration of stereo camera systems. Traditional planar target–based calibration provides high accuracy but becomes impractical for large-FOV setups due to the need for large, high-precision targets placed at long working distances. Control-field calibration, which uses spatially distributed artificial targets measured by total stations or GPS-RTK, similarly faces limitations in environments lacking accessible mounting locations. Other existing methods—such as rigid stereo-target calibration, close-range light-spot targets, and active phase targets—offer partial improvements but remain constrained by fabrication complexity, optimization instability, or limited depth-direction accuracy.

To address these challenges, this work proposes a flexible calibration method for large-FOV stereo videogrammetric systems using UAV trajectory imaging and epipolar geometry. A UAV carrying a rigid circular target flies through the measurement volume, while two synchronized cameras record its motion. Target centers are extracted using Circular-MarkNet, intrinsic parameters are obtained using an active-phase target, and scale-free extrinsic parameters are initialized from essential matrix estimation. The metric scale is introduced through static GPS measurements, and all parameters are refined via nonlinear optimization. Validation against a conventional circular-target control field shows that the proposed approach achieves comparable calibration accuracy within a 70–50–10 m volume while avoiding the need for large calibration targets.
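The role of the essential matrix in such a calibration can be illustrated with the epipolar constraint x2ᵀ E x1 = 0, where E = [t]× R. The sketch below builds E from a known, hypothetical relative pose and checks the constraint for one synthetic target position; the actual method estimates E from tracked target centres, which is not reproduced here.

```python
def skew(t):
    """Cross-product matrix [t]x of a 3-vector."""
    tx, ty, tz = t
    return [[0.0, -tz, ty], [tz, 0.0, -tx], [-ty, tx, 0.0]]


def matmul3(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


def essential_matrix(rotation, translation):
    """E = [t]x R for the relative pose X2 = R X1 + t."""
    return matmul3(skew(translation), rotation)


def epipolar_residual(E, x1, x2):
    """x2^T E x1 for normalised homogeneous image points."""
    Ex1 = [sum(E[r][c] * x1[c] for c in range(3)) for r in range(3)]
    return sum(x2[r] * Ex1[r] for r in range(3))


# Hypothetical stereo pair: no rotation, 1 m baseline along x.
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = (1.0, 0.0, 0.0)
E = essential_matrix(R, t)

# One target position in camera-1 coordinates and its two projections.
X1 = (0.5, 0.2, 4.0)
X2 = tuple(X1[i] + t[i] for i in range(3))
x1 = (X1[0] / X1[2], X1[1] / X1[2], 1.0)
x2 = (X2[0] / X2[2], X2[1] / X2[2], 1.0)
residual = epipolar_residual(E, x1, x2)
```

Because E is only defined up to scale, the translation recovered from it has unit length, which is why the abstract introduces the metric scale separately.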

A Hybrid Approach using Gaussian Splatting and Parametric Models based on 3D Renders for Real-Time Visualisation

Etienne Sommer, Arnadi Murtiyoso, Mathieu Koehl, Pierre Grussenmeyer

INSA Strasbourg, France

The valorisation and dissemination of built heritage to the public is a crucial objective, complementing conservation efforts. However, traditional 3D models, such as dense meshes, often present limitations for this purpose, proving too heavy and complex for easy sharing and real-time visualisation.

This paper presents a hybrid approach that addresses this challenge by leveraging 3D Gaussian Splatting (3DGS) for the real-time visualisation of complex parametric models. This method is particularly effective for visualising 4D reconstructions representing historical phases of edifices that may no longer exist.

The methodology employs synthetic images generated from the parametric model using 3D rendering software. To ensure compatibility with procedural textures, path-tracing is used, but photorealistic effects such as cast shadows and reflections are deliberately removed. These optimised 3D renders are then processed through a conventional photogrammetric pipeline to generate the necessary camera orientations and sparse point cloud for 3DGS training.

The resulting 3DGS representation enables real-time rendering. This technique successfully converts a model composed of multiple, distinct parametric components into a single, unified object. This approach also demonstrates a strong capability for reconstructing contextual elements, such as vegetation, which are often poorly handled by traditional meshing techniques. The method effectively transforms a complex, software-specific model into a lightweight representation ideal for applications where visualisation speed is essential.

Improving Head Pose Estimation in Radiation Therapy through photogrammetric Techniques for Machine Learning Applications

Cyrill Milkau¹, Sebastian Preußel², Sarah Guy³, Danilo Schneider¹

¹Faculty of Spatial Information, HTW Dresden – University of Applied Sciences, Germany; ²Institute of Photogrammetry and Remote Sensing, Dresden University of Technology, Germany; ³Department of Radiotherapy and Radiation Oncology, Dresden University of Technology, Germany

This study investigates the integration of photogrammetry and machine learning to enhance head pose estimation in radiation therapy. The primary objective is to improve the accuracy of patient positioning, which could reduce the reliance on immobilization masks, thereby enhancing patient comfort. The methodology involves the use of markers and cameras to track head movements, combined with machine learning algorithms to refine pose estimation. By merging deterministic photogrammetric techniques with advanced machine learning models, this approach aims to achieve more precise and reliable head pose estimation. The potential outcomes of this research could lead to more effective and comfortable radiation therapy treatments for patients with head-and-neck cancers.

A Comparative Study of Deep Learning and Unsupervised Segmentation Methods for Individual Tree Delineation from LiDAR point clouds

Jinhong Wang¹, Wei Yao²,³, Tiangang Yin¹

¹Department of Land Surveying and Geo-Informatics, The Hong Kong Polytechnic University; ²Institute of Urban Environment, Chinese Academy of Sciences, People's Republic of China; ³School of Engineering and Design, Technical University of Munich, Munich, 80333, Germany

This study aims to conduct a comparative analysis of individual tree segmentation (ITS) methods for forest LiDAR point clouds. Traditional ITS approaches have been predominantly based on unsupervised segmentation algorithms using geometric features. In recent years, research has progressively shifted toward supervised deep learning (DL) techniques. However, the performance of existing methods across diverse forest types has not yet been systematically assessed.

On solving exterior orientation of an image with particle swarm optimization

Petri Rönnholm, Matti Kurkela, Matti T. Vaaja, Hannu Hyyppä

Department of Built Environment, Aalto University, Finland

Solving the exterior orientation of images is a fundamental component in photogrammetric mapping and 3D restitution processes. Additionally, it is essential in photogrammetric tasks such as visual odometry, camera-based visual simultaneous localization and mapping, camera calibration, camera-based 3D tracking of movement, and change detection. The aim of this research was to evaluate whether particle swarm optimization is suitable for finding the exterior orientation parameters of a single image using image resection. In addition, we developed a robustified particle swarm optimization by adding an iteratively changing stochastic model to the optimization criteria by attaching a weight matrix with residual vectors. The method was compared to the solution from the least squares method using both simulated ideal and noisy data. Solving the exterior orientation parameters reliably with particle swarm optimization was possible after fine-tuning the algorithm's options. The non-robustified version of particle swarm optimization provided identical results to the non-robustified least squares method. However, in the case of the robustified particle swarm optimization, only 60% of attempts resulted in the same outcome as the corresponding robustified least squares method, with sub-millimeter accuracy. In 40% of cases, the results achieved millimeter accuracy. The sub-millimeter accuracy was achieved in every case with sequential robustified particle swarm optimization, where the algorithm was rerun using stricter bounds for unknown parameters if the evaluation criteria were too large. The implementation of particle swarm optimization is easier than that of the nonlinear least squares method. However, the computation time for particle swarm optimization was significantly longer.
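To make the optimisation scheme concrete, here is a minimal, generic particle swarm optimiser. It is a sketch under assumed hyperparameters (inertia 0.7, acceleration coefficients 1.5) and is demonstrated on a toy quadratic cost rather than on collinearity residuals from image resection; the robust weighting and sequential bound tightening described in the abstract are not included.

```python
import random


def pso_minimize(cost, bounds, n_particles=30, n_iter=200,
                 inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Minimise `cost` over box `bounds` [(lo, hi), ...]; returns (position, value)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]            # per-particle best positions
    best_val = [cost(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]  # swarm-wide best

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Clamp to the search box, analogous to bounds on the unknowns.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = cost(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


# Toy cost with its minimum at (1, -2); a stand-in for a reprojection error.
solution, value = pso_minimize(
    lambda p: (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2,
    bounds=[(-5.0, 5.0), (-5.0, 5.0)],
)
```

For exterior orientation, the position vector would hold the six pose parameters and the cost would sum squared image residuals, so each iteration costs many more function evaluations than a Gauss-Newton step, consistent with the longer computation time the abstract reports.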

Incremental Semantics-Aided Meshing from LiDAR-Inertial Odometry and RGB Direct Label Transfer

Muhammad Affan, Ville Lehtola, George Vosselman

University of Twente

Geometric high-fidelity mesh reconstruction from LiDAR-inertial scans remains challenging in large, complex indoor environments, such as cultural buildings, where point cloud sparsity, geometric drift, and fixed fusion parameters produce holes, over-smoothing, and spurious surfaces at structural boundaries. We propose a modular, incremental RGB + LiDAR pipeline that generates incremental semantics-aided high-quality meshes from indoor scans through scan frame-based direct label transfer. A vision foundation model labels each incoming RGB frame; labels are incrementally projected and fused onto a LiDAR-inertial odometry map; and an incremental semantics-aware Truncated Signed Distance Function (TSDF) fusion step produces the final mesh via marching cubes. This frame-level fusion strategy preserves the geometric fidelity of LiDAR while leveraging rich visual semantics to resolve geometric ambiguities at reconstruction boundaries caused by LiDAR point-cloud sparsity and geometric drift. We demonstrate that semantic guidance improves geometric reconstruction quality; quantitative evaluation is therefore performed using geometric metrics on the Oxford Spires dataset, while results from the NTU VIRAL dataset are analyzed qualitatively. The proposed method outperforms state-of-the-art geometric baselines ImMesh and Voxblox, demonstrating the benefit of semantics-aided fusion for geometric mesh quality. The resulting semantically labelled meshes are of value when reconstructing Universal Scene Description (USD) assets, offering a path from indoor LiDAR scanning to XR and digital modeling.
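The incremental fusion idea can be illustrated with the classic per-voxel TSDF update, a weighted running average of truncated signed distances. This is a sketch of the standard scheme only: how the proposed method lets semantic labels influence the fusion is not described in the abstract and is not modelled here, and all names and parameter values are hypothetical.

```python
def truncate(sdf, trunc):
    """Clamp a signed distance to the truncation band [-trunc, trunc]."""
    return max(-trunc, min(trunc, sdf))


def fuse_observation(voxel, sdf_obs, trunc, w_obs=1.0, w_max=100.0):
    """Merge one new signed-distance observation into a voxel.

    voxel is a (distance, weight) pair; the result is the weighted
    running average, with the weight capped at w_max so that the
    voxel can still adapt to later observations.
    """
    dist, weight = voxel
    d_obs = truncate(sdf_obs, trunc)
    new_weight = weight + w_obs
    new_dist = (weight * dist + w_obs * d_obs) / new_weight
    return new_dist, min(new_weight, w_max)


# An empty voxel receives two observations from consecutive scans.
voxel = (0.0, 0.0)
voxel = fuse_observation(voxel, sdf_obs=0.3, trunc=0.2)   # clamped to 0.2
voxel = fuse_observation(voxel, sdf_obs=0.0, trunc=0.2)
```

In a fixed-parameter scheme, `trunc` and `w_obs` are global constants; a semantics-aware variant could plausibly vary them per label, but that is speculation beyond what the abstract states.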

Evaluation of systematic and random errors in occupancy grid maps

Yuguang Liu¹, Marko Radanovic¹, Krista A. Ehinger², Kourosh Khoshelham¹

¹Department of Infrastructure Engineering, The University of Melbourne, Australia; ²School of Computing and Information Systems, The University of Melbourne, Australia

Map evaluation for occupancy grid mapping (OGM) is critical in the field of high-definition mapping of the road environment for autonomous vehicles. Existing methods cannot adequately evaluate the systematic and random errors that might be present in OGM. This article introduces two evaluation metrics for OGM under LiDAR position uncertainty: Mean Signed Distance (MSD) and Mean Absolute Deviation (MAD). MSD quantifies systematic displacement of occupied cells, while MAD measures random error exhibited as boundary thickening. Unlike classification-based, probabilistic, and geometric metrics, MSD and MAD directly isolate displacement and thickening effects in OGM. We validate both metrics in a controlled synthetic environment and on a real indoor LiDAR dataset, showing better performance than conventional metrics.
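The abstract does not give formulas for the two metrics. One plausible reading, shown purely as an assumption, takes the signed distance of each occupied cell to the reference boundary, reports the mean as MSD, and reports the mean absolute deviation about that mean as MAD.

```python
def mean_signed_distance(signed_distances):
    """Average signed offset of occupied cells from the reference boundary."""
    return sum(signed_distances) / len(signed_distances)


def mean_absolute_deviation(signed_distances):
    """Average absolute spread of the offsets around their mean (assumed definition)."""
    mean = mean_signed_distance(signed_distances)
    return sum(abs(d - mean) for d in signed_distances) / len(signed_distances)


# Hypothetical signed distances in metres for three occupied cells.
offsets = [0.1, 0.3, 0.2]
msd = mean_signed_distance(offsets)
mad = mean_absolute_deviation(offsets)

# Shifting every cell by the same amount changes MSD but not MAD.
shifted = [d + 0.5 for d in offsets]
msd_shifted = mean_signed_distance(shifted)
mad_shifted = mean_absolute_deviation(shifted)
```

Under this reading, a uniform displacement of the map moves only MSD, while a thicker boundary raises only MAD, which matches the separation of systematic and random error described above.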

Deep learning-based building detection using high-resolution RGBI orthophotos and DSMs

Mohamed Fawzy¹,², Attila Juhasz¹, Arpad Barsi¹

¹Department of Photogrammetry and Geoinformatics, Faculty of Civil Engineering, Budapest University of Technology and Economics, Műegyetem rkp. 3, H-1111 Budapest, Hungary, {mohamed.fawzy, juhasz.attila, barsi.arpad}@emk.bme.hu; ²Civil Engineering Department, Faculty of Engineering, Qena University, 83523 Qena, Egypt, mohamedfawzy@eng.svu.edu.eg

Deep learning techniques have demonstrated a promising efficacy for building feature extraction, presenting practical strategies to lessen the labour-intensive work of map updating, change detection, and urban growth monitoring. To address the labour-consuming challenges, a U-Net-based convolutional neural network model is developed to generate building maps automatically using high-resolution RGBI orthophoto and DSM data. The approach shows the effectiveness of the U-Net-based semantic segmentation for urban scene analysis. The presented procedures collect, preprocess, and combine orthophoto with DSM in order to train, apply, and assess the U-Net model for building extraction in urban environments using two input scenarios: (1) solely RGBI orthophoto and (2) RGBI orthophoto integrated with DSM. Four standard metrics: completeness, correctness, quality, and overall accuracy are applied to evaluate the model outputs, comparing the single orthophoto input to the combined orthophoto with DSM for building detection. The significant impact of the DSM and RGBI pairing is demonstrated by the heightened reliability of the data integration strategy when estimating buildings within nearby similar objects like roads and impervious surfaces. However, a few challenges related to the model's generalisation are noticed across complex urban contexts, including tree occlusions, unreferenced building extensions, and height irregularities surrounding structures. The findings highlight the potential of multimodal data fusion in urban investigations and reveal how it can improve the mapping of built-up assets. Final results argue that DSM incorporation significantly enhances building classification performance using deep learning frameworks for geospatial applications, particularly in complex urban environments where single data and traditional image-based segmentation methods face limitations.
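The four evaluation metrics named above have widely used definitions in terms of true and false positives and negatives. The sketch below applies those standard formulas to made-up pixel counts; the counts are illustrative and not taken from the study.

```python
def building_detection_scores(tp, fp, fn, tn):
    """Standard detection scores from building / non-building pixel counts."""
    return {
        "completeness": tp / (tp + fn),        # share of reference buildings found
        "correctness": tp / (tp + fp),         # share of detections that are buildings
        "quality": tp / (tp + fp + fn),        # combines both error types
        "overall_accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Hypothetical counts for one test tile.
scores = building_detection_scores(tp=80, fp=20, fn=10, tn=890)
```

Quality is never higher than either completeness or correctness, so it is the strictest of the three building-specific scores, while overall accuracy can look high simply because background pixels dominate.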

Simulation of Stationary and Mobile Laser Scanning with VRscan3D

Denys Gorkovchuk¹, Julia Horkovchuk¹, Maria Chizhova², Darius Popovas³, Thomas Luhmann³

¹Kyiv National University of Construction and Architecture; ²Otto-Friedrich Universität Bamberg; ³Institute for Applied Photogrammetry and Geoinformatics

The VRscan3D project introduces a virtual simulation environment for stationary and mobile laser scanning designed to enhance education, research, and AI-based point cloud analysis. Developed using Unreal Engine, the simulator replicates the physical behavior of real terrestrial laser scanners, allowing users to perform realistic scanning operations within immersive 3D environments. The system reproduces manufacturer-specific parameters such as range noise, beam divergence, and intensity, generating synthetic point clouds that closely approximate real data.
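
As a rough illustration of how scanner parameters such as range noise and beam divergence can enter a simulation, the sketch below perturbs a true range with zero-mean Gaussian noise and computes the laser footprint diameter at a given distance. This is not VRscan3D code: the Gaussian noise model, the full-angle divergence convention, and all names are assumptions made for the example.

```python
import math
import random


def footprint_diameter(range_m, divergence_mrad, exit_diameter_m=0.0):
    """Beam footprint at a given range, assuming full-angle divergence in mrad."""
    half_angle = divergence_mrad * 1e-3 / 2.0
    return exit_diameter_m + 2.0 * range_m * math.tan(half_angle)


def noisy_range(true_range_m, sigma_m, rng):
    """Add zero-mean Gaussian range noise (an assumed noise model)."""
    return true_range_m + rng.gauss(0.0, sigma_m)


if __name__ == "__main__":
    rng = random.Random(42)
    print("footprint at 100 m:", footprint_diameter(100.0, 0.3), "m")
    print("simulated ranges:", [round(noisy_range(10.0, 0.002, rng), 4) for _ in range(3)])
```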

VRscan3D enables users to plan and execute virtual scanning campaigns, analyze data quality, and understand the influence of scanning geometry, surface materials, and user behavior. Recent developments include dynamic scene simulation with moving objects, integration of user-imported environments, and support for mobile scanning trajectories—handheld, vehicle-mounted, or UAV-based—reflecting natural oscillations and movement patterns.

In addition to training and education, VRscan3D serves as a generator of synthetic point clouds with known ground truth, facilitating the development and validation of AI algorithms for object detection, segmentation, and classification. Comparative studies between simulated and real scans demonstrate high similarity in terms of accuracy, resolution, and completeness.

By bridging real-world surveying practice and virtual learning, VRscan3D offers a cost-effective, accessible platform for universities and professionals lacking physical equipment or facing mobility restrictions. It represents a new step toward open, immersive, and intelligent learning environments in geospatial education and research.

Symmetry-aware Texture Refinement for 3D Building Models via Massing Decomposition and Generative AI

Fan Xue¹, Yijie Wu², Maosu Li³

¹The University of Hong Kong, Hong Kong S.A.R. (China); ²The Hong Kong Polytechnic University, Hong Kong S.A.R. (China); ³The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China

Three-dimensional (3D) building models with accurate geometry and realistic textures remain essential for city information modeling and digital twin applications. However, photogrammetric reconstructions consistently suffer from severe texture defects caused by occlusions, shadows, distortions, and projection errors. Existing approaches either rely on rigorous photometric optimization that demands topological correctness and multi-view imagery, or employ flexible AI-driven generation that leverages semantics but often lacks geometric constraints.

This paper presents a novel hybrid framework that exploits architectural regularities—specifically massing decomposition and partial symmetries—to guide high-fidelity texture refinement. We first decompose building meshes into mass-aligned convex volumes using MorphCut. Textures are then reprojected onto these volumes, followed by Building Section Skeletons to pair symmetric facades and establish precise geometric correspondences. Finally, generative AI is applied using symmetry-aware constraints to achieve contextually accurate inpainting and correction.

Pilot studies on three Hong Kong buildings demonstrate robust decomposition, faithful texture transfer, and effective defect mitigation, while revealing current limitations of unconstrained generative models in preserving floor counts and structural regularity. The proposed symmetry-guided pipeline notably advances the reliable and semantically coherent reconstruction of textures for complex urban buildings.

AI-Driven 3D reconstruction and quality assessment for Cultural Heritage: first results from the HERITALISE project

Filiberto Chiabrando¹, Andrea Maria Lingua², Alessio Martino¹, Francesca Matrone², Alessandra Spadaro²

¹Laboratory of Geomatics for Cultural Heritage (LabG4CH), Department of Architecture and Design (DAD), Politecnico di Torino, Viale Pier Andrea Mattioli, 39, Torino (TO), Italy; ²Geomatics Lab, Department of Environment, Land and Infrastructure Engineering (DIATI), Politecnico di Torino, Corso Duca degli Abruzzi, 24, Torino (TO), Italy

The accurate digital documentation of Cultural Heritage (CH) assets demands workflows capable of integrating heterogeneous, multiscale datasets while preserving both geometric fidelity and radiometric completeness. This paper presents the first results of the AI-based processing pipeline developed within the HERITALISE project (Horizon Europe, 2025–2028), applied to three multiscale case studies at the Reggia di Venaria Reale (Turin, Italy): an outdoor-indoor UAV photogrammetric survey, a kinematic SLAM acquisition of a contemporary sculpture garden, and a close-range dataset of an 18th-century decorative artefact. 3D Gaussian Splatting (3DGS) is evaluated as a novel view synthesis method across all three scenarios, demonstrating strong photorealistic rendering capabilities, particularly for complex material properties and geometrically challenging interiors, whilst highlighting current limitations for metric surveying applications. A two-stage crack detection workflow, combining tile-based text-prompted segmentation with SAM3 and multiview ray-based reprojection onto the reconstructed mesh, is validated on UAV imagery, achieving an 84.9% ray–mesh intersection rate. Finally, a standardised evaluation framework is proposed, encompassing adaptive, scale-dependent geometric and radiometric metrics organised into reference-based and no-reference assessment scenarios, aggregated into a transparent synthetic quality score with three adaptive quality classes. The proposed methodology contributes toward a reproducible, sensor-agnostic standard for the assessment of AI-generated CH documentation products.

Haul Road Extraction in Open-Pit Mines via Dual-Encoder RGB–DSM Transformer Fusion

Loghman Moradi, Kamran Esmaeili

University of Toronto, Canada

Haul roads are essential to open-pit mines, acting like the mine’s circulatory system. Keeping accurate, up-to-date maps of these roads is critical for maintenance, safety, and efficient material handling, yet automating this task is challenging. Traditional deep learning models that rely only on RGB images often fail in mining environments, where road surfaces resemble bare earth, dusty terrain, or shadowed areas. To address this, we propose a dual-encoder transformer that combines UAV-captured RGB images with DSM data using stage-wise cross-attention, leveraging both visual and topographic information. Two SegFormer encoders process each data type separately, creating detailed feature representations that are fused at each stage. This allows the model to learn specialized information while sharing knowledge between modalities. A lightweight All-MLP decoder produces the final segmentation map. We tested our method on a high-resolution dataset of 12,000 tiles from the Mildred Lake open-pit mine in Fort McMurray, Canada. Our model achieves 80.8% mIoU, 88.7% F1-score, and 73.7% road accuracy, outperforming an RGB-only baseline by 3.3%, 2.4%, and 7.8 points, respectively. Ablation studies demonstrate that including DSM data consistently improves recall and road detection, especially in areas where RGB information alone is ambiguous or terrain is complex.

Benchmarking Local Registration Algorithms on Multi Temporal and Multi Spatial Point Clouds

Tommaso Mainiero, Jad Ghantous, Nives Grasso, Vincenzo Di Pietra

Department of Environment, Land and Infrastructure Engineering, Politecnico di Torino, Italy

This study presents a systematic benchmarking framework to evaluate the performance of local point cloud registration algorithms and their impact on geomorphological change detection. Three widely used methods—Iterative Closest Point (ICP), Point-to-Plane ICP, and Generalized ICP (GICP)—were tested across two alpine case studies in Italy (Rio Cucco catchment and Belvedere Glacier), considering different surface types and initial alignment conditions.

All three methods were run on standardized voxelized patches (0.3 m) under varying initial alignment and terrain conditions. Performance was evaluated through median distance, cloud-to-cloud mean distance, and computation time.
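
The patch preparation and distance metrics mentioned above can be sketched as follows. This is a brute-force illustration in plain Python, assuming centroid-based voxel downsampling and nearest-neighbour cloud-to-cloud distances; the study's actual implementation is not described in the abstract, and a spatial index would be needed for real point clouds.

```python
import math
from statistics import mean, median


def voxel_downsample(points, voxel=0.3):
    """Replace all points falling in the same voxel by their centroid."""
    cells = {}
    for p in points:
        key = tuple(math.floor(c / voxel) for c in p)
        cells.setdefault(key, []).append(p)
    return [tuple(sum(c) / len(pts) for c in zip(*pts)) for pts in cells.values()]


def cloud_to_cloud(source, target):
    """Mean and median nearest-neighbour distance from source to target (O(n*m))."""
    dists = [min(math.dist(p, q) for q in target) for p in source]
    return mean(dists), median(dists)


if __name__ == "__main__":
    raw = [(0.05, 0.05, 0.05), (0.15, 0.15, 0.15), (1.0, 1.0, 1.0)]
    print(voxel_downsample(raw))
    print(cloud_to_cloud([(0, 0, 0), (1, 0, 0)], [(0, 0, 0.1), (1, 0, 0.3)]))
```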

Results highlight the strong influence of surface morphology on algorithmic stability: rocky areas ensure reliable convergence, while dense vegetation introduces ambiguity and reduced accuracy. GICP provided the best compromise between robustness and efficiency.

The study further highlights that integrating robust outlier rejection significantly improves statistical consistency and reduces LoD95. The proposed approach provides a reproducible framework for optimizing co-registration strategies and improving the accuracy of geomorphological monitoring in high-relief environments.

Human Trajectory Prediction on UAV Images: A Comparative Study

Rafael D. M. da Hora¹, Daniel R. Santos¹, Maurício C. M. Paulo¹, Felipe Ferrari¹, Raul Q. Feitosa², Paulo F. F. Rosa¹

¹Military Institute of Engineering, Brazil; ²Pontifical Catholic University, Brazil

Human trajectory prediction from video is a fundamental research task for many civil and defense applications. Compared to trajectory prediction based on a single frame, human trajectory prediction in videos, especially from unmanned aerial vehicle (UAV) platforms, is challenging because of the time-series prediction analyses required. As frames in a video stream are highly correlated, trajectory detection in UAV images is affected by particular factors such as oblique camera views and platform motion. This study aims to identify the most robust and accurate deep learning model for UAV videos by comparing three distinct categories: classical machine learning, established deep learning architectures, and computationally efficient models based on multi-layer perceptrons (MLPs). We propose an analysis based only on bounding-box center coordinates instead of image scenes. The results show that a simple linear architecture provided the best performance, highlighting the importance of these mechanisms in predicting human motion from trajectory data alone.
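
To make the idea of predicting from bounding-box centers alone concrete, the sketch below fits a straight line to each coordinate over the observed frames and extrapolates it. It is a simple least-squares stand-in, not the linear architecture evaluated in the study, and the ADE/FDE error measures are the usual trajectory-prediction metrics, assumed here for illustration.

```python
import math


def fit_line(values):
    """Least-squares slope and intercept of values against frame index 0..n-1."""
    n = len(values)
    t_mean = (n - 1) / 2
    v_mean = sum(values) / n
    num = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    den = sum((t - t_mean) ** 2 for t in range(n))
    slope = num / den
    return slope, v_mean - slope * t_mean


def predict_centers(observed, horizon):
    """Extrapolate (x, y) box centers for the next `horizon` frames (needs >= 2 frames)."""
    xs, ys = zip(*observed)
    (ax, bx), (ay, by) = fit_line(xs), fit_line(ys)
    n = len(observed)
    return [(ax * t + bx, ay * t + by) for t in range(n, n + horizon)]


def ade_fde(pred, truth):
    """Average and final displacement error between two trajectories."""
    errors = [math.dist(p, q) for p, q in zip(pred, truth)]
    return sum(errors) / len(errors), errors[-1]


if __name__ == "__main__":
    future = predict_centers([(0, 0), (1, 2), (2, 4)], horizon=2)
    print(future, ade_fde(future, [(3, 6), (4, 9)]))
```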

Multi-technique approach for 3D documentation of rock walls in narrow gorges

Antonio Tomás Mozas-Calvache, José Luis Pérez-García, José Miguel Gómez-López, Diego Vico-García, Jorge Delgado-García

University of Jaén, Spain

This study presents a robust multi-technique methodology for generating complete, high-accuracy 3D documentation of highly constrained natural heritage sites, addressing the limitations of single-technique geomatic approaches. The research focuses on two challenging gorge environments in Southern Spain: Los Cañones de Río Frío and El Caminito del Rey. Both sites feature extreme vertical walls (up to 300 meters) and narrow passages that complicate GNSS-RTK positioning and render individual UAV, TLS, or terrestrial photogrammetry techniques unfeasible due to occlusions and safety/logistical constraints. The proposed workflow centers on data fusion, leveraging LiDAR data for core geometry and photogrammetry for texture and gap-filling. Data acquisition integrated multiple sensors, including UAV LiDAR/Photogrammetry, Terrestrial Laser Scanning (TLS), Mobile Mapping Systems (MMS), and Spherical Photogrammetry (SP). A key methodological innovation involves deriving second-order Ground Control Points (GCPs) from UAV photogrammetry to georeference other data in areas with poor satellite coverage, significantly reducing fieldwork while maintaining accuracy. The highly precise TLS point cloud was used as the geometric base for the final model. The resulting products, including high-density point clouds, 2 cm orthoimages, and 3D models, demonstrate comprehensive coverage and high accuracy (about 4 cm for georeferenced data), enabling 2.5D rockfall simulation and establishing a foundation for a Digital Twin of both gorges.

Augmented and Mixed Reality Scene Alignment Through 3D-to-3D Learning-Based Cross-Source Point Cloud Registration

Juan Sebastian Sardi Barzallo¹, Volker Coors²

¹Stuttgart, Technical University of Applied Sciences; ²Stuttgart, Technical University of Applied Sciences

With the fast development of reality capture technology and the increasing availability and accessibility of devices capable of capturing 3D point clouds, applications in which cross-source Point Cloud Data (PCD) interact are becoming more frequent. Augmented and Mixed Reality (AR/MR) technologies are pivotal for the integration of digital and physical environments by overlaying Digital Twin (DT) models onto real contexts, and are themselves capable of producing real-time 3D point cloud data. Nevertheless, the integration of AR/MR real-time 3D point cloud data with other sources such as LiDAR data is still an open field for research, especially for fundamental tasks such as scene alignment and camera localization. Conventional vision-based methods are vulnerable to environmental variations, making suitable camera localization and scene alignment challenging to achieve. This work proposes an exclusively 3D-to-3D-based methodology for AR/MR scene alignment and camera localization, addressing the challenges of cross-source point cloud registration in scenarios with large size disparity. By combining the learning-based cross-source point cloud registration method Voxel Representation and Hierarchical Correspondence Filtering (VRHCF) with the TEASER++ algorithm, our approach effectively manages asymmetric heterogeneous point cloud data, achieving promising registration results, especially in extensive indoor settings. The qualitative results suggest improvements over existing studies, despite outlier challenges in outdoor environments that warrant further research. This study highlights the potential of, and the essential need for, advanced methodologies to enable seamless interactions between digital and physical worlds.

Semantic-Guided High-Fidelity Indoor Scene Reconstruction Based on 3D Gaussian Splatting

Mingyue Dong¹, Xianwei Zheng¹, Jiansi Yang¹, Linwei Yue², Jianya Gong¹

¹Wuhan University; ²China University of Geosciences

Indoor 3D scene reconstruction is essential for digital twins and intelligent spatial applications but remains challenging due to severe occlusions, weak textures, and complex geometric structures. This paper presents a semantic-guided high-fidelity indoor reconstruction framework based on 3D Gaussian Splatting (3DGS), which achieves high-precision geometry and photorealistic rendering through semantic-aware optimization. First, a high-quality geometric prior generation scheme is developed by integrating a 2D depth prediction network to enhance noisy depth data captured by mobile devices. The refined depth maps are processed by computing spatial gradients to derive surface normals in world coordinates, providing geometric supervision for the position and orientation of Gaussian ellipsoids. A projection-error-based filtering mechanism ensures consistency across multiple views. Second, a semantic-guided differentiated reconstruction framework is introduced. Using a pretrained segmentation model (SAM), the method distinguishes between large weak-texture areas and fine-detail regions. Normal regularization improves surface smoothness in planar regions, while detail-aware weighting strengthens local geometric fidelity. Additionally, a multi-view semantic consistency strategy jointly optimizes color and geometry across viewpoints, enhancing global coherence and reducing overfitting. Experiments on ScanNet++ and Mushroom datasets demonstrate that the proposed method surpasses state-of-the-art baselines in rendering quality and geometric accuracy. It effectively reconstructs continuous surfaces and detailed structures, showing strong potential for applications in virtual reality, digital twins, and real-time indoor modeling.
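
The step of deriving surface normals from depth gradients can be sketched with a pinhole camera model: neighbouring pixels are back-projected to 3D and the normal is the cross product of the two local tangent vectors. This is a generic forward-difference illustration, not the authors' implementation; the intrinsics (fx, fy, cx, cy) and the optional camera-to-world rotation R are assumed inputs.

```python
import math


def backproject(u, v, d, fx, fy, cx, cy):
    """Pixel (u, v) with depth d to a 3D point in the camera frame."""
    return ((u - cx) * d / fx, (v - cy) * d / fy, d)


def normal_at(depth, u, v, fx, fy, cx, cy, R=None):
    """Unit normal at pixel (u, v) from forward differences of the depth map.

    depth is indexed as depth[row][col]; R (3x3 nested list) optionally
    rotates the normal from the camera frame into world coordinates.
    """
    p = backproject(u, v, depth[v][u], fx, fy, cx, cy)
    px = backproject(u + 1, v, depth[v][u + 1], fx, fy, cx, cy)
    py = backproject(u, v + 1, depth[v + 1][u], fx, fy, cx, cy)
    a = [px[i] - p[i] for i in range(3)]
    b = [py[i] - p[i] for i in range(3)]
    n = [a[1] * b[2] - a[2] * b[1],
         a[2] * b[0] - a[0] * b[2],
         a[0] * b[1] - a[1] * b[0]]
    length = math.sqrt(sum(c * c for c in n))
    n = [c / length for c in n]
    # Orient the normal towards the camera, which sits at the origin.
    if sum(n[i] * p[i] for i in range(3)) > 0:
        n = [-c for c in n]
    if R is not None:
        n = [sum(R[i][j] * n[j] for j in range(3)) for i in range(3)]
    return tuple(n)


if __name__ == "__main__":
    flat_wall = [[2.0] * 3 for _ in range(3)]  # fronto-parallel plane at 2 m
    print(normal_at(flat_wall, 1, 1, 500.0, 500.0, 1.0, 1.0))
```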

Enhanced DUSt3R for Underwater 3D Reconstruction in Shallow Water Environments

Tsuyoshi Shimano, Takashi Fuse

The University of Tokyo, Japan

Shallow-water environments present significant challenges for underwater photogrammetry due to light caustics and the combined effects of absorption and scattering caused by water turbidity. These optical disturbances degrade image quality, disrupt feature matching, and ultimately reduce the reliability of 3D reconstruction using traditional SfM (Structure from Motion) pipelines. In this study, we focus on these two dominant factors and investigate a 3D reconstruction framework inspired by recent feed-forward architectures such as DUSt3R (Dense and Unconstrained Stereo 3D Reconstruction). To support this approach, we develop a synthetic data generation pipeline capable of simulating shallow-water visual conditions. Preliminary experiments indicate a possible trend for integrating physics-aware image formation with DUSt3R-type feed-forward reconstruction. However, several limitations remain: the current model does not yet achieve stable accuracy, real-world underwater validation has not been conducted, and computation costs remain high due to complex training procedures. Future work will focus on refining the network architecture, exploring DUSt3R-derived multi-view and high-fidelity extensions, accelerating computation, and validating the pipeline in real shallow-water environments. Additionally, integrating advanced rendering techniques may further improve the refinement of 3D reconstruction.

Evaluating SfM Techniques for DEM Production from VHR Satellite Imagery in Urban Contexts

Gabriele Lo Grasso, Valentina A. Girelli, Emanuele Mandanici

Alma Mater Studiorum - University of Bologna, Italy

Digital Surface Models (DSMs) provide the fundamental elevation data required for generating 3D city models, which support a wide range of analyses such as solar potential estimation, urban heat island assessment, and infrastructure monitoring.

Advances in very high-resolution satellite stereo imaging, airborne LiDAR, and aerial photogrammetry have made it possible to generate DSMs at fine spatial resolution using different acquisition geometries and multi-view reconstruction techniques. However, these data sources differ substantially in terms of spatial resolution, viewing geometry, and surface visibility, leading to variations in elevation accuracy and morphological completeness. Airborne LiDAR surveys can provide highly detailed and accurate three-dimensional point clouds compared to aerial photogrammetry, but are associated with high acquisition and processing costs, as well as logistical constraints.

This study presents a comparative analysis of the DSMs derived from WV-3 panchromatic stereo imagery and oblique aerial photographs processed with the Structure-from-Motion (SfM) approach, focusing on the capability of SfM to reconstruct the complex urban morphology. The study area, a district of the city of Bologna, is characterized by a heterogeneous urban texture including compact mid-rise residential blocks, industrial facilities, vegetated zones, and open spaces, making it an ideal test site for comparing elevation models derived from different sensors and acquisition geometries.

Canopy Entropy Sensitivity Analysis for Scalable Canopy Structural Complexity Estimation

Bin Wang, Yuqi Lei, Zheng Xu, Wen Xiao

China University of Geosciences（Wuhan）, China, People's Republic of

Canopy Entropy (CE) quantifies 3-D forest heterogeneity from LiDAR, but its reliability depends on point density and kernel bandwidth. Using 11 sub-sampled airborne datasets (12–240 pts m⁻²) and bandwidths 0.1–2 m over a 20 ha Jiangxi plot, we show CE is stable (CV < 0.6 %) above 72 pts m⁻², whereas below 50 pts m⁻² it falsely inflates (> +5 %). CE grows logarithmically with bandwidth, saturating beyond 1 m; 0.2 m is optimal at landscape scale. Maintain ≥ 50 pts m⁻² and h ≈ 0.2 m for unbiased canopy-complexity mapping. An Investigation of the Application of GCE for Comparing Cross-Scale Structural Complexity Using Simulated Datasets.

High-Precision Point Cloud Registration Method Based on Planar and Linear Features

Chenxin Yang, Kazuha Kumazawa, Saki Komoriya, Hiroshi Masuda

The University of Electro-Communications, Japan

Accurate registration of point clouds obtained from different viewpoints is essential for constructing consistent and reliable 3D models. Terrestrial laser scanner (TLS) data are typically represented in local coordinate systems centered at individual scanner positions, requiring transformation into a common reference frame. However, achieving high-accuracy registration for large-scale datasets remains challenging. Even small rotational errors in rigid transformations can result in significant positional deviations over long distances. Conventional registration methods, such as the Iterative Closest Point (ICP) algorithm, perform well in dense regions but often produce misalignments in sparse or geometrically uniform areas.
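
The effect of a small rotational error over long distances follows from simple geometry: a point at distance d from the rotation centre moves along a chord of length 2·d·sin(θ/2) when the scan is rotated by θ. The sketch below evaluates this; the function name and the example values are illustrative only.

```python
import math


def displacement_from_rotation(distance_m, rot_error_deg):
    """Chord length travelled by a point at distance_m under a rotation error."""
    theta = math.radians(rot_error_deg)
    return 2.0 * distance_m * math.sin(theta / 2.0)


if __name__ == "__main__":
    # A 0.01 degree error already shifts a point 100 m away by about 17 mm.
    for d in (10.0, 50.0, 100.0):
        print(d, "m ->", round(displacement_from_rotation(d, 0.01) * 1000, 2), "mm")
```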

This study presents a high-precision point cloud registration approach that integrates global geometric features—such as planes and lines—with local point-based constraints. Plane and line features are extracted using RANSAC-based detection and incorporated into an enhanced ICP framework, improving both stability and convergence in large-scale environments.
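
RANSAC-based plane detection, as named above, can be illustrated with a minimal sketch: three points are sampled repeatedly, a plane is fitted through them, and the plane with the most inliers is kept. The threshold, iteration count, and function names are assumptions for the example; how the detected planes enter the enhanced ICP is not specified in the abstract and is not shown.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def plane_from_points(p, q, r):
    """Plane (unit normal n, offset d) with n.x + d = 0, or None if degenerate."""
    u = [q[i] - p[i] for i in range(3)]
    v = [r[i] - p[i] for i in range(3)]
    n = (u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0])
    norm = math.sqrt(dot(n, n))
    if norm < 1e-12:
        return None  # collinear sample
    n = tuple(c / norm for c in n)
    return n, -dot(n, p)


def ransac_plane(points, threshold=0.02, iterations=200, seed=0):
    """Return the best (normal, offset) model and its inlier points."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        model = plane_from_points(*rng.sample(points, 3))
        if model is None:
            continue
        n, d = model
        inliers = [p for p in points if abs(dot(n, p) + d) <= threshold]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers


if __name__ == "__main__":
    pts = [(float(x), float(y), 1.0) for x in range(5) for y in range(5)]
    pts += [(0.0, 0.0, 5.0), (1.0, 1.0, -3.0), (2.0, 2.0, 9.0)]  # outliers
    model, inliers = ransac_plane(pts)
    print(model, len(inliers))
```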

Experimental evaluations using real TLS datasets acquired from an industrial factory demonstrate that the proposed hybrid ICP method significantly outperforms conventional approaches. The integration of global geometric features effectively reduces local misalignments and improves registration accuracy, particularly in regions with uneven point density or limited structural variation.

RTK-Guided Gaussian Splatting Pipeline for Georeferenced Urban 3D Reconstruction

Cheolhwan Kim¹, Wonjun Choi¹, Youngmok Kwon¹, Jungho Lee², Minhyeok Lee², Hong-Gyoo Sohn¹

¹Dept. of Civil and Environmental Engineering, Yonsei University, Seoul, Republic of Korea; ²Dept. of Electrical and Electronic Engineering, Yonsei University, Seoul, Republic of Korea

Automated 3D reconstruction technologies utilizing multi-source spatial data have gained significant attention in recent years. While conventional approaches rely on registration-based multi-sensor integration, recent Gaussian Splatting techniques have shown strong potential for large-scale modeling using only monocular imagery. However, existing 3DGS frameworks operate in relative coordinate systems and lack alignment with absolute geospatial references, limiting their applicability for real-world mapping.

To address these challenges, we propose a georeferenced Gaussian Splatting framework that integrates RTK-GPS camera position measurements directly into the training process. Initial camera parameters and sparse point clouds are estimated using an image-based SfM pipeline and subsequently aligned to a global coordinate frame through a similarity transformation based on RTK-GPS measurements acquired alongside the imagery. During coarse GS training, per-camera translation and rotation corrections are jointly optimized to compensate for geometric errors introduced during global frame alignment. The translation updates are guided toward RTK-GPS-measured positions, while a reprojection constraint based on SfM sparse 3D observations preserves the multi-view geometric consistency established by bundle adjustment.
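
The global alignment step relies on a similarity transformation (scale, rotation, translation) between SfM camera positions and RTK-GPS measurements. As a simplified illustration, the sketch below solves the planar (2D) case in closed form using complex numbers. This is an assumption made to keep the example short: the actual pipeline works in 3D, where an SVD-based solution would typically be used, and the function names are hypothetical.

```python
def fit_similarity_2d(src, dst):
    """Least-squares 2D similarity dst ~ m * src + t, with m = scale * exp(i*angle)."""
    a = [complex(x, y) for x, y in src]
    b = [complex(x, y) for x, y in dst]
    ca = sum(a) / len(a)
    cb = sum(b) / len(b)
    num = sum((q - cb) * (p - ca).conjugate() for p, q in zip(a, b))
    den = sum(abs(p - ca) ** 2 for p in a)
    m = num / den
    t = cb - m * ca
    return m, t


def apply_similarity_2d(m, t, points):
    """Map points from the local SfM frame into the global frame."""
    out = []
    for x, y in points:
        z = m * complex(x, y) + t
        out.append((z.real, z.imag))
    return out


if __name__ == "__main__":
    sfm = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]     # local camera positions
    rtk = [(10.0, 5.0), (10.0, 7.0), (8.0, 5.0)]   # same cameras in the global frame
    m, t = fit_similarity_2d(sfm, rtk)
    print("scale:", abs(m), "transform:", m, t)
    print(apply_similarity_2d(m, t, sfm))
```

The residuals that remain after such an alignment are what the per-camera translation and rotation corrections described above are meant to absorb.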

The proposed method generates 3DGS outputs aligned with an absolute coordinate system with only marginal degradation in rendering metrics such as PSNR, SSIM, and LPIPS. Mesh conversion and surface-distance comparison with laser scanning data further validate the reliability of the reconstructed geometry. This work demonstrates the feasibility of real-world georeferenced modeling using Gaussian Splatting-based scene representation.

Shape Reconstruction from Large Scale Point Clouds Using Planar Adjacency Relations

Yusuke Nagasawa, Hiroshi Masuda

The University of Electro-Communications, Japan

Digital twins of production facilities, represented as 3D virtual environments generated from point cloud data, are increasingly demanded for efficient facility management. Although terrestrial laser scanners (TLS) enable high-density 3D acquisition of such environments, the resulting point clouds are extremely large in data size. In practical applications, lightweight mesh models are therefore required as a substitute for raw point cloud data. However, TLS measurements often contain occlusions and missing regions, making it challenging to reconstruct complete mesh models directly from incomplete point clouds. Many objects installed in production facilities, such as equipment platforms, fences, columns, and ladders, consist mainly of planar surfaces. Efficient plane detection methods have been developed for large-scale point clouds (Masuda, 2015; Takeda, 2024). For objects composed of planes, 3D models can be reconstructed from the detected planes. However, industrial point clouds are extremely large, including many densely sampled planar regions. Furthermore, many existing methods focus on standard components with fixed shapes, such as pipe structures, and are not applicable to objects with more flexible geometries. To overcome these limitations, this study first converts the detected planar regions into simplified mesh representations to reduce data volume. We then construct a planar adjacency graph that preserves spatial relationships and geometric attributes between planes. Finally, we reconstruct the target structure by identifying and assembling appropriate subsets of planes.
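
A planar adjacency graph of the kind described above can be sketched as a dictionary of plane patches with edges between patches that lie close to each other. The adjacency test used here (overlap of axis-aligned bounding boxes within a tolerance) and the stored edge attribute (angle between normals) are assumptions for illustration; the abstract does not specify the criteria or attributes used in the study.

```python
import math
from itertools import combinations


def boxes_touch(a, b, tol):
    """True if two axis-aligned bounding boxes overlap or lie within tol."""
    return all(a["min"][i] - tol <= b["max"][i] and b["min"][i] - tol <= a["max"][i]
               for i in range(3))


def angle_deg(n1, n2):
    """Unsigned angle between two unit normals, in degrees (0..90)."""
    c = abs(sum(x * y for x, y in zip(n1, n2)))
    return math.degrees(math.acos(min(1.0, c)))


def build_adjacency(planes, tol=0.05):
    """Graph as {plane: {neighbour: angle between normals}}."""
    graph = {name: {} for name in planes}
    for a, b in combinations(planes, 2):
        if boxes_touch(planes[a], planes[b], tol):
            ang = angle_deg(planes[a]["normal"], planes[b]["normal"])
            graph[a][b] = ang
            graph[b][a] = ang
    return graph


if __name__ == "__main__":
    planes = {
        "floor": {"normal": (0, 0, 1), "min": (0, 0, 0), "max": (2, 2, 0)},
        "wall": {"normal": (1, 0, 0), "min": (0, 0, 0), "max": (0, 2, 3)},
        "platform": {"normal": (0, 0, 1), "min": (10, 10, 5), "max": (12, 12, 5)},
    }
    print(build_adjacency(planes))
```

Subsets of planes that form a candidate structure can then be searched as connected subgraphs with the expected angles between neighbours.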

In-situ LiDAR-assisted backpack camera system calibration for forest mapping

Raja Manish, Songlin Fei, Ayman Habib

Purdue University, United States of America

Backpack mapping systems equipped with LiDAR sensors and RGB cameras, and an optional GNSS/INS direct georeferencing unit, are increasingly used in forest inventory applications. A key prerequisite to deriving accurate mapping products from these platforms is system calibration to establish the mounting parameters relating the LiDAR and camera sensors to the IMU body frame of the GNSS/INS unit. Conventional system calibration procedures entail specific trajectory and target deployment at the calibration site, followed by a labor-intensive identification of targets in imagery and LiDAR point cloud. Given the significance of multi-modal data alignment for forest inventory, this study explores an alternative approach for camera–LiDAR system calibration.

Bundle Adjustment for Satellite Attitude Jitter

Shun Zhou, Hongbo Pan

Central South University, China, People's Republic of

To address the limitations of existing RFM bias-compensation methods, which have difficulty handling complex attitude jitter and lack fully automated processing, this study introduces an innovative Bundle Adjustment (BA) approach that incorporates adaptively determined spline smoothing parameters. The method constrains the smoothing term of the spline using prior matching accuracy and enables the adaptive estimation of the smoothing parameter within the BA process. Because the procedure requires no manual intervention and the adaptive smoothing term retains reasonable physical interpretation, the proposed approach is broadly applicable to the correction of attitude jitter in linear pushbroom satellite systems.

A Comparative Study of MVS and NeRF Approaches for Dense 3D Reconstruction of Mediterranean Coral

Paolo Rossi¹, Riccardo Roncella¹, Cristina Castagnetti²

¹University of Parma, Department of Engineering and Architecture, 43124, Parma, Italy; ²University of Modena and Reggio Emilia, Department of Engineering, 41125, Modena, Italy

This work investigates the potential of optimizing underwater image acquisition while preserving reconstruction quality. A comparative evaluation of Multi-View Stereo (MVS) and Neural Radiance Fields (NeRF) is conducted, focusing on their performance in terms of completeness and robustness under conditions of reduced image availability. The study concentrates on underwater scenes involving Mediterranean coral species, where traditional photogrammetric methods often encounter difficulties due to occlusions and low-texture surfaces. The analysis is based on datasets acquired under controlled conditions, allowing for a direct comparison of the dense reconstruction capabilities of both approaches. The impact of decreasing the number of input images on reconstruction completeness and model accuracy is assessed, with results benchmarked against a reference dataset obtained using a triangulation laser scanner.

A progressive framework for 3D scene understanding from multi-view satellite imagery

Xuejun Huang¹, Yi Wan¹,², Xinyi Liu¹,², Yongxiang Yao¹, Dong Wei¹, Yongjun Zhang¹,²

¹School of Remote Sensing and Information Engineering, Wuhan University, Wuhan, 430079, Hubei, China; ²Technology Innovation Center for Collaborative Applications of Natural Resources Data in GBA, Ministry of Natural Resources, Guangzhou, 510075, Guangdong, China

3D scene understanding is critical for applications like smart city management and urban planning. However, existing methods often treat 2D semantic understanding and 3D reconstruction as independent tasks, limiting the ability to create a unified 3D semantic representation. This separation hinders the accuracy, interpretability, and scalability of large-scale 3D scene understanding.

In this work, we propose a progressive, three-stage pipeline that seamlessly connects multi-view semantic understanding, self-supervised 3D reconstruction, and end-to-end semantic-level scene understanding. The approach gradually integrates semantic and geometric cues—first establishing reliable semantic priors, then recovering scene geometry without height supervision, and ultimately combining both into a unified 3D representation for more accurate scene understanding.

Beyond geometry: Reflectance-calibrated 3D Gaussians using LiDAR and imagery for photometrically robust Reconstruction

Yaoyu Li¹, Dedong Zhang²,³

¹Hinton STAI Institute, East China Normal University, Minhang, Shanghai 200241, China; ²Department of Geography and Environmental Management, University of Waterloo, 200 University Avenue West, Waterloo, Ontario N2L 3G1, Canada; ³Tianfu Jiangxi Laboratory, Chengdu, Sichuan, 641419, China

This paper introduces LIG-3DGS, a novel framework for robust 3D reconstruction and novel view synthesis under conditions where standard image-based methods struggle. The core of our approach lies in the deep integration of LiDAR geometry and intensity information with a 3D Gaussian Splatting (3DGS) representation. Our qualitative and quantitative experiments demonstrate that LIG-3DGS significantly outperforms standard 3DGS and geometry-only baseline methods under challenging photometric conditions. By bridging the geometric precision of active sensing with the high-fidelity rendering of neural approaches, this work opens a promising pathway toward all-weather, high-fidelity 3D scene understanding.

Non-destructive extraction of vertical leaf base and inclination angles distribution in field maize

Lei Lei¹,², Zhenhong Li¹,², Guijun Yang¹,²,³, Hao Yang³

¹Key Laboratory of Loess, Xi’an 710054, China; ²College of Geological Engineering and Geomatics, Chang'an University, Xi’an 710054, China; ³Information Technology Research Centre, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China

Distributions of leaf base and inclination angles are important crop phenotypic traits, influencing light interception and productivity. LiDAR provides unprecedented detail of the 3D structure of the crop canopy. Recent research mainly focuses on the leaf base and inclination angles of maize at the individual level or at lower planting density. It is difficult to extract the distributions of leaf base and inclination angles of maize in the field because the leaves interlock and overlap. In this study, we propose a high-throughput method to extract the distributions of leaf base and inclination angles of maize in the field. Following the separation of the leaf and stem of maize, hollow cylinders with different thicknesses were used to extract the local leaf points from the separated leaf points based on each fitted stem line, and the DBSCAN algorithm and singular value decomposition were used to calculate the leaf base and inclination angles. The distributions of leaf base and inclination angles of maize in the field with different cultivars, planting densities, and growth stages were extracted and analyzed, and the results agreed well with the validation data. The high-throughput extraction of these distributions in maize fields holds significant importance for studying the optimal maize cultivar in conjunction with radiative transfer models.
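As a rough illustration of the angle computation step, the sketch below estimates the dominant direction of a small cluster of leaf points and converts it to an inclination angle. It is not the authors' implementation: the function names are hypothetical, the angle is assumed to be measured from the horizontal plane, and power iteration on the covariance matrix stands in for the singular value decomposition named in the abstract (both yield the first principal direction of the centered points).

```python
import math

def principal_direction(points, iterations=200):
    """Dominant direction of a 3D point cluster.

    Power iteration on the covariance matrix is used here as a stand-in for
    SVD; it converges to the first principal axis unless the start vector
    happens to be orthogonal to that axis.
    """
    n = len(points)
    centroid = [sum(p[i] for p in points) / n for i in range(3)]
    centered = [[p[i] - centroid[i] for i in range(3)] for p in points]
    cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(3)]
           for i in range(3)]
    v = [1.0, 1.0, 1.0]  # arbitrary start vector (assumption)
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(3)) for i in range(3)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v

def inclination_deg(direction):
    """Angle between a direction and the horizontal plane, in degrees (0 to 90)."""
    norm = math.sqrt(sum(x * x for x in direction))
    return math.degrees(math.asin(min(1.0, abs(direction[2]) / norm)))

# Hypothetical leaf segment rising at 45 degrees in the x-z plane.
leaf_points = [(float(t), 0.0, float(t)) for t in range(10)]
angle = inclination_deg(principal_direction(leaf_points))
```

In a full pipeline the input cluster would come from the DBSCAN step, applied to the points selected by the hollow cylinders around each stem line.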

Extraction of CCTV Surveillance Coverage Based on UAV Mesh and CCTV Image

Wonjun Choi¹, Youngmok Kwon², Cheolhwan Kim³, Hong-Gyoo Sohn⁴

¹Dept. of Civil and Environmental Engineering, Yonsei University, Seoul, Republic of Korea; ²Dept. of Civil and Environmental Engineering, Yonsei University, Seoul, Republic of Korea; ³Dept. of Civil and Environmental Engineering, Yonsei University, Seoul, Republic of Korea; ⁴Dept. of Civil and Environmental Engineering, Yonsei University, Seoul, Republic of Korea

This study presents a geometric framework for recovering missing CCTV camera parameters and deriving reliable three-dimensional viewshed coverage by matching UAV-based 3D mesh models with real CCTV imagery. Most CCTV metadata only contains approximate latitude and longitude, while essential calibration parameters such as azimuth, tilt angle, focal length, and field of view are unavailable. Without these parameters, visibility analysis in urban environments becomes inaccurate due to unaccounted building occlusions. To address this, a coarse-to-fine pipeline is proposed. In the coarse stage, camera tilt is estimated from the CCTV image using a monocular surface normal estimation model, and camera yaw is determined by matching cylindrical panoramic renderings of the mesh against the CCTV image using a dense feature matching network. In the fine stage, perspective projection images are rendered at 1 m height intervals using the estimated orientation, and each candidate is matched against the CCTV image to identify the optimal camera height. The rendering process simultaneously records world coordinates for every visible pixel, enabling direct extraction of 3D-2D ground control point correspondences from the best-matched candidate. Outlier correspondences are removed through Fundamental Matrix RANSAC, and spatially distributed representative points are selected via agglomerative clustering. Camera parameters are then estimated using an improved Perspective Projection Model with rotation matrix orthogonality constraints and weighted least squares adjustment. The recovered parameters are used to generate three-dimensional viewshed polygons. The method was tested on 41 CCTV cameras on a university campus and validated using reprojection error and ground-truth camera positions.

Volume estimation and accuracy assessment of unauthorised material deposits using airborne photogrammetry and laser scanning for environmental inspection

Markéta Potůčková¹, Alex Šrollerů¹, Martin Marko², Eva Štefanová¹

¹Charles University, Faculty of Science, Department of Applied Geoinformatics and Cartography, Albertov 6, Prague 2, Czechia; ²Czech Environmental Inspectorate, Na Břehu 267/1a, Prague 9, Czechia

Determining the volume of unauthorised stockpiles or material deposits is a common task for environmental inspection authorities. Although UAV photogrammetry and laser scanning are widely adopted in many fields today, their use within environmental inspectorate practice may still be limited in some countries. In addition, archives of aerial imagery and laser scanning data maintained by national mapping agencies offer valuable resources for retrospective analyses of terrain changes caused by unauthorised material deposits; however, their potential has not yet been fully realised.

The objectives of this study are to: (1) present a comparative analysis of UAV photogrammetry and laser scanning in relation to terrestrial GNSS measurements for determining the volume of larger stockpiles and apply a model for volume accuracy assessment; and (2) demonstrate both the potential and limitations of using archived aerial imagery and laser scanning data for retrospective terrain-change analysis, with a focus on estimating the thickness and volume of deposits and their accuracy.

Both objectives stem from the current needs of the environmental inspectorate. Volume estimation can be highly sensitive because of associated penalties; therefore, understanding the accuracy and limitations of the applied methods is crucial. When time constraints are not an issue and dense vegetation poses a challenge (including grass cover that cannot be penetrated by laser signals), terrestrial GNSS or traditional surveying remain the most reliable options. Nevertheless, airborne photogrammetry and laser scanning offer undeniable advantages in terms of operability and retrospective analysis.
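For readers unfamiliar with the underlying computation, the following minimal sketch shows volume estimation by differencing two gridded elevation models, with a very simple error propagation. The abstract does not specify the accuracy model used, so the propagation here is an assumption: it treats per-cell height-difference errors as independent with a common standard deviation `sigma_dh`, which tends to be optimistic when systematic errors are present. All names are hypothetical.

```python
import math

def deposit_volume(dem_before, dem_after, cell_size, sigma_dh=0.0):
    """Volume between two co-registered DEM grids and a naive 1-sigma estimate.

    dem_before, dem_after: 2D lists of heights (m) on the same grid.
    cell_size: grid spacing (m).
    sigma_dh: assumed standard deviation of each height difference (m),
              treated as uncorrelated between cells (assumption).
    """
    cell_area = cell_size ** 2
    diffs = [after - before
             for row_b, row_a in zip(dem_before, dem_after)
             for before, after in zip(row_b, row_a)]
    volume = sum(diffs) * cell_area
    sigma_volume = cell_area * sigma_dh * math.sqrt(len(diffs))
    return volume, sigma_volume

# Hypothetical 2 x 2 grid with 2 m cells and a uniform 1.5 m thick deposit.
before = [[0.0, 0.0], [0.0, 0.0]]
after = [[1.5, 1.5], [1.5, 1.5]]
volume, sigma_volume = deposit_volume(before, after, cell_size=2.0, sigma_dh=0.1)
```

If height errors were fully correlated across cells (for example a vertical datum offset between archive epochs), the uncertainty would scale with the number of cells rather than its square root, which is one reason an explicit accuracy model matters for penalty-relevant volumes.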

Improved ICP Algorithm Constrained by Intensity Gradient for Urban Airborne Array InSAR Point Cloud Registration

Lijun Lu¹, Fangfang Ji², Shucheng Yang¹, Chunquan Cheng¹, Hanchao Zhang¹

¹State Key Laboratory of Spatial Datum, Chinese Academy of Surveying and Mapping; ²Capital Normal University, China, People's Republic of

Airborne array InSAR achieves high-precision three-dimensional reconstruction through multi-baseline interferometric height measurement, holding significant application value in urban spatial structure monitoring and surface deformation analysis. However, the acquired urban InSAR point clouds are often affected by multiple factors, including platform attitude errors, system calibration inaccuracies, and multi-angle imaging geometric discrepancies, leading to noticeable spatial biases among different datasets. To achieve geometric consistency across multi-baseline data, high-accuracy point cloud registration has become a crucial step in InSAR data fusion processing. Therefore, this research proposes an improved ICP algorithm constrained by intensity gradients for registering urban airborne array InSAR point clouds. The algorithm integrates geometric and electromagnetic scattering features. Experimental results demonstrate that the proposed method exhibits superior robustness and registration performance in complex urban scattering environments, providing effective technical support for 3D reconstruction of SAR point clouds.

AiDroneTree: A Novel AI Deep Learning Based Network for Individual Tree Detection Using UAV-Derived Point Cloud in Dense Urban and Forest Landscapes

Sina Jarahizadeh, Bahram Salehi

State University of New York College of Environmental Science and Forestry, Department of Environmental Resources Engineering, 1 Forestry Dr., Syracuse, NY 13210 USA

Individual Tree Detection (ITD) is a primary step for estimating tree attributes such as spatial distribution, geometry, and species used in forest management, urban planning, and carbon accounting. While traditional field-based inventories are accurate, they are costly, labour-intensive, and limited in coverage. High-resolution UAV LiDAR offers a scalable alternative, and deep learning (DL)-based object detection methods further enable automated ITD at large scales. In contrast to RGB imagery, UAV LiDAR can be transformed into multi-band representations that capture rich structural and textural information, which enhances ITD performance. However, previous methods still struggle with complex forest conditions, including overlapping crowns, and with computational inefficiency when processing high-resolution, multi-band data. To address these issues, we propose AiDroneTree, a novel one-stage DL object-detection framework for multi-band rasterized UAV LiDAR that enables more accurate and efficient tree detection in dense and heterogeneous forests. The AiDroneTree architecture detects and segments individual trees by combining a custom-built backbone and head, optimized for detecting small trees in complex canopy environments, with Convolutional Blocks with Concatenate (CBC), LeakyReLU activations, and tunable layers throughout, and it outputs a bounding box and confidence score for each tree. The results have been evaluated against YOLO on datasets captured from various environments with different tree shapes, sizes, and densities. The quantitative and qualitative results show that AiDroneTree outperforms YOLO in various forest conditions and achieves 91% accuracy, 93% precision, 92% recall, and a 92% F1-score.
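As a quick consistency check on the reported detection metrics, the F1-score is the harmonic mean of precision and recall. The small sketch below uses the rounded values from the abstract as inputs, so the result is only indicative; the function name is hypothetical.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Rounded values taken from the abstract: 93% precision, 92% recall.
f1 = f1_score(0.93, 0.92)
```

With these inputs the harmonic mean is about 0.925, which is compatible with the reported 92% given that the published precision and recall are themselves rounded.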

Integrated MBES-based Assessment of Dam Tailrace Structure and Geomorphology

Sehoon Oh, Jiwan Hong, Geonu Park, Daegeon Woo, Joon Heo

Yonsei University, Korea, Republic of (South Korea)

The dam tailrace is a critical zone for dam safety, as high-energy spillway flows can deteriorate concrete slabs and drive scour along the downstream riverbed. However, this zone is difficult to access, and structural and geomorphic conditions are often assessed independently, limiting integrated understanding of their coupled behavior. Multibeam echo sounding (MBES) helps close this gap by providing high-resolution underwater topography and enabling simultaneous mapping of engineered concrete surfaces and erodible beds within a single survey. When deployed on unmanned surface vehicle (USV) platforms, MBES allows safe and efficient bathymetric mapping in narrow or high-energy downstream channels, supporting more complete characterization of tailrace conditions.

In this study, a USV-mounted MBES was used to acquire high-density underwater measurements across the tailrace of Daecheong Dam, capturing both the concrete stilling basin and the downstream alluvial bed. The resulting point cloud was segmented into two functional zones: (1) the concrete slab zone, where planar-deviation metrics quantified slab misalignment, elevation offsets, and localized deformations; and (2) the downstream zone, where terrain-based depression analysis delineated scour features and characterized their depth, extent, and morphology. By relating structural anomalies observed along the slab surface to the spatial distribution and severity of downstream scour, we perform a coupled slab–scour assessment that links block-level distress to localized erosion patterns near the apron-end transition.

This integrated approach demonstrates how MBES, combined with geospatial analysis, can support comprehensive underwater inspection and contribute to improved operational monitoring and hazard mitigation for large hydraulic structures.
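The planar-deviation idea used for the concrete slab zone can be sketched as follows: fit a reference plane to points assumed to lie on an intact slab, then report the signed perpendicular distance of every sounding from that plane. This is only an illustration under stated assumptions. The abstract does not say how the reference plane is obtained, so an ordinary least-squares fit of z = a·x + b·y + c is assumed here, and all names are hypothetical.

```python
import math

def _det3(m):
    """Determinant of a 3 x 3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def fit_plane(points):
    """Least-squares plane z = a*x + b*y + c via the normal equations."""
    n = float(len(points))
    sx = sum(p[0] for p in points)
    sy = sum(p[1] for p in points)
    sz = sum(p[2] for p in points)
    sxx = sum(p[0] * p[0] for p in points)
    syy = sum(p[1] * p[1] for p in points)
    sxy = sum(p[0] * p[1] for p in points)
    sxz = sum(p[0] * p[2] for p in points)
    syz = sum(p[1] * p[2] for p in points)
    normal_matrix = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    rhs = [sxz, syz, sz]
    det = _det3(normal_matrix)
    if abs(det) < 1e-12:
        raise ValueError("points are degenerate (collinear or too few)")
    coeffs = []
    for k in range(3):  # Cramer's rule, adequate for a 3 x 3 system
        replaced = [row[:] for row in normal_matrix]
        for i in range(3):
            replaced[i][k] = rhs[i]
        coeffs.append(_det3(replaced) / det)
    return tuple(coeffs)

def plane_deviations(points, plane):
    """Signed perpendicular distance of each point from the plane (positive = above)."""
    a, b, c = plane
    scale = math.sqrt(a * a + b * b + 1.0)
    return [(z - (a * x + b * y + c)) / scale for x, y, z in points]

# Hypothetical slab sloping 0.1 m per metre in x, plus one sounding 5 cm above it.
slab = [(0.0, 0.0, 2.0), (10.0, 0.0, 3.0), (0.0, 10.0, 2.0), (10.0, 10.0, 3.0)]
plane = fit_plane(slab)
deviation = plane_deviations([(5.0, 5.0, 2.55)], plane)[0]
```

Per-block statistics of these deviations (mean offset, maximum, standard deviation) would be one plausible way to express slab misalignment and localized deformation.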

High-detail 3D surveying and digital restoration of historical xylographic stamps: The Ulisse Aldrovandi case

Anna Forte, Maria Alessandra Tini, Valentina Girelli, Gabriele Bitelli, Luca Vittuari

University of Bologna, Dept. of Civil, Chemical, Environmental and Materials Engineering DICAM, Bologna, Italy

This contribution presents a digital workflow for the virtual restoration and functional recovery of a historic xylographic matrix created by the 16th-century naturalist Ulisse Aldrovandi and preserved at Palazzo Poggi, University of Bologna. Although not physically broken, the pearwood block had undergone subtle yet significant geometric deformation over the centuries, preventing it from producing a complete and accurate print.

The project employed high-resolution structured-light scanning to generate a detailed 3D model of the engraved surface, capturing its geometry with sub-millimetric accuracy. From the resulting 31-million-polygon mesh, approximately 7000 points corresponding to the peaks of the engravings were manually extracted and interpolated to model the deformation. A corrective digital transformation was then applied directly to the mesh vertices, restoring the planarity originally required for printing without altering the object itself.

This case study demonstrates the potential of integrating high-resolution 3D surveying and digital modelling to address subtle geometric deterioration in historical artefacts. The method offers a fully non-invasive and reversible approach that can be extended to other wooden matrices or similarly sensitive cultural heritage objects. Future work includes testing additional surveying techniques and evaluating the reproducibility of the proposed workflow across a wider set of materials and conditions.
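The corrective transformation described above can be pictured as: interpolate the heights of the extracted engraving peaks into a continuous deformation surface, then shift each mesh vertex so that this surface becomes a flat printing plane. The sketch below illustrates that idea only. The abstract does not name the interpolation method, so inverse distance weighting is an assumption, the correction is assumed to act purely along z, and all names are hypothetical.

```python
def idw_height(x, y, control_points, power=2.0):
    """Inverse-distance-weighted height at (x, y) from (x, y, z) control points."""
    numerator = 0.0
    denominator = 0.0
    for cx, cy, cz in control_points:
        d2 = (x - cx) ** 2 + (y - cy) ** 2
        if d2 == 0.0:
            return cz  # exactly on a control point
        weight = 1.0 / d2 ** (power / 2.0)
        numerator += weight * cz
        denominator += weight
    return numerator / denominator

def flatten_vertices(vertices, peak_points, target_z=0.0):
    """Shift each vertex in z so the interpolated peak surface lands on target_z."""
    return [(x, y, z - (idw_height(x, y, peak_points) - target_z))
            for x, y, z in vertices]

# Hypothetical peaks of a warped block (heights in mm) and two mesh vertices:
# one on a peak, one in a groove 0.2 mm below the peak at the same position.
peaks = [(0.0, 0.0, 0.3), (10.0, 0.0, 0.0), (0.0, 10.0, 0.1)]
mesh = [(0.0, 0.0, 0.3), (0.0, 0.0, 0.1)]
flattened = flatten_vertices(mesh, peaks)
```

Because every vertex is moved by the local deformation rather than projected onto a plane, groove depth relative to the peaks is preserved. A brute-force loop like this would be slow for a mesh with tens of millions of polygons, so a spatial index or a gridded surface would be needed in practice.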

Multi-class deterioration detection using data-centric approach from UAV-based bridge inspection applications

Ya-Li Lin¹, Jiann-Yeou Rau¹, Chao-Hung Lin¹, Wei-Shen Lai², Chih-Chao Hu²

¹National Cheng Kung University, Chinese Taipei; ²Institute of Transportation, Ministry of Transportation and Communications, Chinese Taipei

Modern AI applications increasingly rely on visual data for perception and decision-making, yet their reliability is fundamentally constrained by data quality and representativeness. Bridge inspection exemplifies this challenge: UAV imagery of bridge surfaces often exhibits complex textures, overlapping deterioration types, and severe class imbalance, limiting the performance of conventional deep models. To address these issues, this study proposes a data-centric approach within an integrated UAV-based bridge inspection framework. High-resolution UAV images are processed through photogrammetric calibration using Structure-from-Motion (SfM) and bundle adjustment, while a Swin-Unet segmentation model is trained with a data-centric sampling strategy that evaluates image patches through coverage, boundary, texture, and edge-entropy indicators to select representative samples. Experiments demonstrate that the proposed method achieves substantial improvements in mean IoU and F1-score compared with random cropping. The resulting multi-class deterioration maps are spatially integrated with 3D bridge models, forming a foundation for digital-twin-based inspection and confirming the effectiveness of data-centric optimization in enhancing the robustness of AI-driven infrastructure assessment.

DamViT: Vision Transformer–Based Robust Segmentation and 3D Mapping of Concrete Dam Damage from UAV Imagery

Jiwan Hong, Sehoon Oh, Joon Heo

Yonsei University, Korea, Republic of (South Korea)

Concrete dams require regular inspection because surface cracking and spalling can threaten durability and safety, yet UAV images of dam faces are often affected by low-light, blur, over-exposure, and stain-like discoloration that confuse automated crack segmentation. This contribution presents DamViT, a Vision Transformer–based framework for robust pixel-wise segmentation and 3D mapping of damage on concrete dams. UAV RGB images are annotated into three classes (background, crack, spalling) and used to train a SegFormer-based network equipped with two lightweight components: a degradation-aware module that estimates a per-pixel degradation map and guides feature extraction under low-quality imaging, and a stain-aware training strategy that explicitly balances stain-rich non-damage patches with damaged regions to reduce false positives on surface stains. The resulting three-class masks are back-projected onto a photogrammetrically reconstructed 3D dam mesh using camera poses and intrinsics, enabling computation of crack length, spalling area, and their spatial distribution in the structural coordinate system. The proposed pipeline links UAV imaging, robust segmentation, and quantitative 3D damage mapping to support dam safety management.

An end-to-end pipeline for 3D building modeling, texturing, and semantic integration from UAV data

HyunSoo Kim¹, DinhMinh Bui¹, Ji Sang Park², Jun Su Kim³, Changjae Kim¹

¹Dept. of Civil and Environmental Engineering, College of Engineering, MyongJi University, Republic of Korea; ²Principal Researcher, Mobility and Navigation Research Section, Electronics and Telecommunication Research Institute, Daejeon, Republic of Korea; ³AI Technology Team, Geostory Co., Republic of Korea

This study proposes an end-to-end automated pipeline for the generation, texturing, and semantic enhancement of 3D building models using UAV-based multi-source data, including imagery, image-derived point clouds, and orthophotos. The pipeline consists of three sequential stages: automatic 3D modeling, post-processing and texturing, and semantic integration. In the first stage, building candidates are automatically extracted from UAV-derived point clouds and orthophotos to generate geometric 3D models. The second stage refines the geometry through manual correction and applies texture mapping using UAV imagery and camera orientation parameters to enhance visual realism. In the third stage, façade images derived from building textures are processed through learning-based operators to detect semantic components such as windows. The detected 2D semantic information is converted into 3D coordinates and integrated into the textured 3D models, forming CityGML-like hierarchical structures within a .json framework. The resulting models contain both geometric and semantic information, offering high compatibility with CityGML and CityJSON standards. The proposed workflow demonstrates the potential for efficient, data-driven, and automated urban model generation that supports digital twin construction and spatial database updating. Future work will focus on incorporating LiDAR-based point clouds to further improve automation and semantic accuracy within the CityGML 3.0 framework.

Comparison of Crack Detection Performance According to Caustic Noise Removal Methods in Shallow-Water ROV Imagery

Daegeon Woo, Jiwan Hong, Changjoon Oh, Geonu Park, Sehoon Oh, Joon Heo

Yonsei University, Korea, Republic of (South Korea)

This contribution investigates how caustic noise—bright, wave-induced light patterns—affects crack detection performance in shallow-water ROV imagery acquired at Daecheong Dam. Although many studies address underwater challenges such as turbidity, color attenuation, and motion blur, the optical distortions caused by caustic flicker have received little attention, despite being one of the most dominant artifacts in the 0–3 m depth range. Using real ROV video frames, we generated paired datasets with and without caustic-removal preprocessing and evaluated their impact on two lightweight CNN-based crack detection models (YOLOv5 and a transfer-learning AlexNet variant). Four filtering strategies were tested, including physics-based temporal median and motion-compensated averaging, as well as learning-based DeepCaustics and an FFT-residual method adapted from RecGS. Experimental results show that caustic-removal preprocessing consistently reduces false positives and improves crack visibility under diverse lighting conditions. The findings demonstrate that caustic noise is a critical but often overlooked source of detection instability in shallow-water inspections. The study emphasizes the importance of integrating simple, unsupervised caustic-mitigation steps into ROV-based monitoring pipelines to enhance the reliability of underwater infrastructure assessment.
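Of the four filtering strategies, the temporal median is the simplest to picture: caustics flicker from frame to frame, so the per-pixel median over a short window of aligned frames suppresses them while static structure such as cracks is kept. The sketch below assumes grayscale frames that are already co-registered (no motion compensation) and uses hypothetical names; it is not the pipeline evaluated in the study.

```python
from statistics import median

def temporal_median(frames):
    """Per-pixel median over a list of equally sized grayscale frames.

    frames: list of 2D lists (rows of pixel intensities), assumed to be
    aligned so that the same index refers to the same scene point.
    """
    height = len(frames[0])
    width = len(frames[0][0])
    return [[median(frame[r][c] for frame in frames) for c in range(width)]
            for r in range(height)]

# Hypothetical 1 x 2 pixel frames: a bright caustic hits a different pixel
# in the first and third frame.
frames = [
    [[10, 200]],
    [[12, 11]],
    [[250, 13]],
]
filtered = temporal_median(frames)
```

The bright transient values are discarded because each one appears in only one of the three frames at a given pixel. With a moving ROV the frames would first need to be registered, which is what the motion-compensated averaging variant addresses.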

Efficient Boundary Refinement for Classification of MMS Point Clouds

Makoto Nakano¹, Keita Hiraoka¹, Genki Takahashi², Hiroshi Masuda¹

¹The University of Electro-Communications, Japan; ²Kokusai Kogyo Co., Ltd., Japan

Mobile Mapping Systems (MMS) provide dense point clouds essential for 3D mapping and infrastructure management, where semantic labeling is required to segment points into meaningful objects. Previous studies have shown that multiscale geometric features effectively capture local context for this task. Building on our previous work using multiscale features with efficient two-stage neighborhood search, we applied Contrastive Boundary Learning (CBL) to enhance classification accuracy near object boundaries. While CBL significantly improved boundary recognition, it also increased computational cost compared to Random Forest–based segmentation, limiting its practicality for large-scale datasets. In this study, we analyze the trade-off between segmentation accuracy and inference time in CBL-based boundary refinement. We further explore strategies to reduce computation while maintaining sufficient accuracy, aiming to achieve an optimal balance for practical MMS point cloud processing.

Reconstruction and Evolution Simulation of Ancient Road Networks in the Yuncheng Region Based on Multi-Modal Data Fusion

Jingjue Jia¹, Mingyi Du¹, Qiang Chen¹, Zhenhua Gao²

¹Beijing University of Civil Engineering and Architecture, China, People's Republic of; ²Shanxi Provincial Research Institute of Archaeology, China, People's Republic of

Ancient transport networks are central to studies of historical geography, regional socio-economic systems, and human mobility patterns. Traditional network reconstruction has relied primarily on the Least-Cost Path (LCP) model; however, the LCP’s “single-optimal” assumption is overly simplistic and cannot capture common historical realities such as the coexistence of multiple routes. Although probabilistic approaches such as Circuit Theory (CT) and behaviorally explicit methods such as Agent-Based Modeling (ABM) have been developed, a systematic, integrated framework that combines these approaches remains underdeveloped. Using the Yuncheng area of Shanxi Province as a case study, this paper systematically compares and integrates three distinct network models by constructing LCP, CT, and ABM networks and quantitatively comparing their differences in path morphology and predictive logic. The resulting multimodal, integrated probabilistic road network synthesizes the strengths of the three approaches and provides precise, high-confidence target areas for archaeological survey.
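To make the "single-optimal" assumption of the LCP model concrete, the sketch below runs Dijkstra's algorithm on a small cost raster and returns exactly one route, however close the alternatives may be. It is a generic illustration, not the study's implementation: the cost surface, the 4-neighbour connectivity, and the convention that moving into a cell costs that cell's value are all assumptions, and the names are hypothetical.

```python
import heapq
from math import inf

def least_cost_path(cost, start, goal):
    """Single least-cost route on a grid using Dijkstra's algorithm.

    cost: 2D list; entering cell (r, c) costs cost[r][c] (assumption).
    start, goal: (row, col) tuples. Returns (path, total_cost).
    """
    rows, cols = len(cost), len(cost[0])
    best = {start: 0.0}
    parent = {}
    heap = [(0.0, start)]
    while heap:
        accumulated, cell = heapq.heappop(heap)
        if cell == goal:
            break
        if accumulated > best.get(cell, inf):
            continue  # stale queue entry
        r, c = cell
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                candidate = accumulated + cost[nr][nc]
                if candidate < best.get((nr, nc), inf):
                    best[(nr, nc)] = candidate
                    parent[(nr, nc)] = cell
                    heapq.heappush(heap, (candidate, (nr, nc)))
    if goal not in best:
        return None, inf
    path = [goal]
    while path[-1] != start:
        path.append(parent[path[-1]])
    path.reverse()
    return path, best[goal]

# Hypothetical terrain: a costly ridge (9) separates the start from the goal.
terrain = [
    [1, 1, 1],
    [9, 9, 1],
    [1, 1, 1],
]
route, total = least_cost_path(terrain, (0, 0), (2, 0))
```

The algorithm reports only the detour around the ridge and says nothing about the shorter but costlier crossing, which is the kind of coexisting route that probabilistic approaches such as Circuit Theory are meant to capture.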

Assessing Stream Morphology Using High Resolution and Thermal UAV Imagery

Anastasia Umstott, Yanli Zhang, Carmen Montana Schalk

Stephen F Austin State University, United States of America

To protect and promote fish resources, fish habitat needs to be assessed and a “standard” for good or poor habitat established for specific fish species. For this study, high-resolution UAV images, including thermal images, were collected with an Anzu Raptor T for selected streams in East Texas. Orthomosaic generation and classification analysis were performed to produce accurate maps representing open water, channel substrate, and riparian vegetation. This approach provides a rapid means to assess streams. Future efforts will target finer geomorphic unit classifications (e.g., pool, riffle, run) across multiple river systems. This information can be critical for freshwater habitat management and restoration.

Road marking condition assessment from drone imagery via detector-guided segmentation and Gaussian mixture damage modeling

Dinh Minh Bui, JuBin Lee, HyunSoo Kim, SoMin Han, ChangJae Kim

Department of Civil and Environmental Engineering, College of Engineering, Myongji University.

Road marking condition assessment is essential for transportation safety and road asset management, yet conventional inspection methods remain labor-intensive and inefficient. This study proposes an automated workflow for assessing road-marking conditions from drone imagery by combining object detection with a detector-guided segmentation strategy. First, road-marking regions are localized through a lightweight detector optimized for aerial viewpoints. The detected regions are then refined using a segmentation module that produces pixel-accurate masks, enabling reliable extraction of surface-level deterioration such as fading, cracking, and structural discontinuities.

The proposed approach was evaluated on drone datasets collected under varying flight altitudes and illumination conditions. Experimental results indicate that detector-guided segmentation significantly improves robustness to background clutter and enhances segmentation accuracy compared to single-stage models. The method also supports quantitative condition scoring, making it suitable for integration into municipal inspection workflows.

This contribution demonstrates the potential of combining detection and segmentation for large-scale, drone-based road-marking assessment, offering a practical solution for automated infrastructure monitoring.

Quantitative Analysis of LiDAR Accuracy for Mapping Applications

Ahmed Elaksher¹, Tarig Ali², Abdullatif Alharthy³

¹NMSU, United States of America; ²American University of Sharjah, UAE; ³Ministry of National Guard, KSA

Airborne laser scanning (LiDAR) technology has demonstrated exceptional capability in rapidly capturing dense point clouds and accurately representing complex surface features. It has been successfully applied across numerous geospatial and engineering disciplines with highly promising outcomes. The accuracy of any derived product inherently depends on the quality of the original LiDAR data and the processing methods employed. Therefore, evaluating data quality is an essential prerequisite for reliable analysis and application.

This study presents a quantitative assessment of LiDAR system performance, focusing on the intrinsic accuracy of the laser measurements themselves—an aspect often underexplored in existing literature. The evaluation was conducted through detailed field surveying using GPS triangulation and leveling techniques. Results reveal both planimetric and vertical accuracy characteristics, with a total elevation discrepancy of approximately 0.12 m and a horizontal RMSE near 0.50 m. The identified discrepancies exhibit two distinct components: a short-period random variation associated with the LiDAR ranging system, and a lower-frequency component influenced by biases in the geopositioning subsystem.
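The accuracy figures above are of the kind obtained by comparing LiDAR-derived coordinates with independently surveyed check points. The sketch below shows the usual RMSE computations for the vertical and horizontal components. The sample numbers are made up for illustration and are not data from the study, and the function names are hypothetical.

```python
import math

def vertical_rmse(lidar_z, check_z):
    """RMSE of elevation differences between LiDAR and check points."""
    diffs = [lz - cz for lz, cz in zip(lidar_z, check_z)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

def horizontal_rmse(lidar_xy, check_xy):
    """RMSE of planimetric (radial) offsets between LiDAR and check points."""
    squared = [(lx - cx) ** 2 + (ly - cy) ** 2
               for (lx, ly), (cx, cy) in zip(lidar_xy, check_xy)]
    return math.sqrt(sum(squared) / len(squared))

# Made-up example values (metres), for illustration only.
rmse_z = vertical_rmse([100.1, 99.9], [100.0, 100.0])
rmse_xy = horizontal_rmse([(0.3, 0.4), (0.0, 0.0)], [(0.0, 0.0), (0.0, 0.0)])
```

RMSE alone does not separate the short-period random component from the lower-frequency bias described in the abstract; that would require looking at how the residuals vary along the flight line or over time.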

Image-assisted aerial LiDAR completion with morphology-guided Gaussian splatting

Siyuan Zou¹, Yongjun Zhang², Zhiwei Li¹, Hongbo Pan¹, Xinyi Liu², Haojun Tang¹, Hai Kan³

¹School of Geoscience and Info-Physics, Central South University; ²School of Remote Sensing and Information Engineering, Wuhan University; ³School of Resource and Environmental Sciences, Wuhan University

Airborne LiDAR offers high geometric accuracy and efficient wide-area coverage, and has been widely used in applications such as urban 3D reconstruction, forestry inventory, topographic mapping, and powerline extraction. However, due to near-nadir acquisition geometry and occlusions, vertical structures such as building façades are often under-sampled, resulting in large voids in the point cloud. Traditional geometric hole-filling methods, including Moving Least Squares, Poisson surface reconstruction, and mesh repair, are effective for small gaps, but they often suffer from over-smoothing, structural distortion, and topological discontinuities when applied to large-scale missing regions.

Meanwhile, multi-view imagery can recover continuous surfaces through dense matching or Gaussian Splatting, but the reconstruction quality still depends heavily on the completeness of the initial geometry. When the initial triangulated points or geometric priors are incomplete, façade regions remain prone to fragmentation and noise. This paper proposes an image-assisted LiDAR completion framework that models LiDAR completion as continuous surface reconstruction with explicit Gaussians. Through anisotropic Gaussian initialization and tangent-plane-guided densification, the method preserves façade geometry and improves the completeness and accuracy of LiDAR-image fusion reconstruction.