
Conference Agenda

Overview and details of the sessions of this conference. Please select a date or location to show only the sessions on that day or at that location. Please select a single session for a detailed view (with abstracts and downloads if available).

Daily Overview

Location: 716B
175 theatre

Date: Monday, 06-July-2026

8:30am - 10:00am

ThS14: AI-Augmented Photogrammetry - Bridging Learning-based Approaches and Classical Geometric-based 3D Methods
Location: 716B

8:30am - 8:45am

Combining Photogrammetry and Gaussian Splatting

Fabio Remondino¹, Elisa Mariarosaria Farella¹, Gianluca Bertolasi¹, Simone Rigon¹, Rongjun Qin²

¹3D Optical Metrology (3DOM) unit, Bruno Kessler Foundation (FBK), Trento, Italy; ²Department of Civil, Environmental and Geodetic Engineering, The Ohio State University, USA

Among image-based methods, traditional photogrammetry is a well-established 3D reconstruction technique that delivers highly accurate metric products and is widely used for documentation and mapping in many domains. Its reconstruction capability is, however, conditioned by the characteristics of the captured scene: it performs well in well-textured areas but is limited where non-collaborative surfaces, such as reflective or transparent objects, are present. In such cases, the photogrammetric reconstruction is often affected by noise, incomplete geometry and artifacts, which reduce the quality of the final model. In recent years, different AI-based reconstruction methods have emerged as alternative (or complementary) 3D reconstruction and rendering solutions. In particular, 3D Gaussian Splatting (GS) has demonstrated impressive capabilities in rendering photorealistic scenes with high visual fidelity in challenging situations. However, its application in large-scale scenarios, or where highly accurate 3D metric products are required, is still limited by the high computational resources needed and by the fact that GS methods are intrinsically optimized for photometric rendering quality. To address these bottlenecks, this work proposes a hybrid reconstruction pipeline that leverages the strengths of each technique. The method exploits the accurate geometry of photogrammetry in well-textured regions and the stronger capabilities of GS to improve completeness and visual appearance in areas featuring non-collaborative surfaces. A fusion strategy is proposed to combine the two products into a single 3D model, and results are presented on two aerial datasets and one terrestrial dataset.

8:45am - 9:00am

Refraction-Aware Two-Media NeRF for Underwater 3D Reconstruction

Markus Brezovsky¹, Anatol Guenthner², Frederik Schulte³, Lukas Winiwarter³, Boris Jutzi², Gottfried Mandlburger¹

¹Department of Geodesy and Geoinformation, TU Wien, Vienna, Austria; ²Institute of Photogrammetry and Remote Sensing, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany; ³Unit of Geometry and Surveying, University of Innsbruck, Innsbruck, Austria

Neural Radiance Fields (NeRFs) (Mildenhall et al., 2020) have revolutionized novel view synthesis, but standard formulations assume straight rays in a single, homogeneous medium. In underwater scenarios, refraction at the air–water interface bends light paths and, if ignored, leads to distorted 3D structure and missing underwater points. Refraction-aware NeRF variants such as NeRFrac (Xue et al., 2022) demonstrate the benefit of modeling refraction, but are limited to a single underwater medium and to standalone implementations. Recent work has applied NeRFrac to through-water reconstruction (Brezovsky et al., 2025) and introduced a simulation framework for two-media scenes (Schulte et al., 2025). Building on these ideas, we introduce the general concept of a two-media NeRF and demonstrate its integration into the Nerfstudio framework (Tancik et al., 2023), with the goal of extracting metrically meaningful underwater point clouds rather than only improving image-based metrics.
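As an illustration of the ray bending mentioned in the abstract above, the following minimal sketch refracts a single ray at a flat air–water interface using the vector form of Snell's law. It is not the authors' implementation: the function name `refract`, the flat-interface assumption, and the refractive indices (1.0 for air, 1.333 for water) are assumptions chosen for the example.

```python
import math


def refract(direction, normal, n_in=1.0, n_out=1.333):
    """Refract a unit ray direction at a flat interface (vector form of Snell's law).

    direction: unit vector of the incoming ray.
    normal:    unit interface normal pointing back toward the incoming side.
    n_in/n_out: refractive indices (assumed values: air = 1.0, water = 1.333).
    Returns the refracted unit vector, or None on total internal reflection.
    """
    eta = n_in / n_out
    # Cosine of the incidence angle (positive when the ray hits the front side).
    cos_i = -sum(d * n for d, n in zip(direction, normal))
    # Snell's law: sin(theta_t) = eta * sin(theta_i).
    sin2_t = eta * eta * (1.0 - cos_i * cos_i)
    if sin2_t > 1.0:
        return None  # total internal reflection, no transmitted ray
    cos_t = math.sqrt(1.0 - sin2_t)
    return tuple(eta * d + (eta * cos_i - cos_t) * n
                 for d, n in zip(direction, normal))


if __name__ == "__main__":
    s = math.sqrt(0.5)
    # A ray travelling downward at 45 degrees onto a horizontal water surface.
    bent = refract((s, 0.0, -s), (0.0, 0.0, 1.0))
    angle = math.degrees(math.asin(bent[0]))
    print(f"refracted angle from vertical: {angle:.2f} degrees")
```

A renderer that ignores this step keeps marching along the original 45-degree direction below the surface, which is the source of the distorted underwater geometry the abstract describes.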

9:00am - 9:15am

CENS: A Coverage-efficient Pixel Sampling Strategy for enhancing NeRF-generated Point Cloud Fidelity

Perpetual Hope Akwensi, Frederik Schulte, Lukas Winiwarter

Unit of Geometry and Surveying, Universität Innsbruck, Austria

Many geospatial workflows critically depend on high-fidelity 3D point clouds for applications such as change detection, orthophoto generation, and modeling. However, NeRF-generated point clouds often suffer from sampling inefficiencies inherent in the predominant random pixel sampling approach. We identify spatial redundancy as one such inefficiency: random sampling has the inevitable consequence of sampling large, low-texture patches more frequently than detailed, high-frequency textured regions. As a result, low-texture areas tend to be oversampled while other pixels remain unsampled, regardless of their importance to the reconstruction task.

To overcome this, we propose CENS (Coverage-Efficient Non-Redundant Sampling), a deterministic pixel sampling strategy that maximizes spatial coverage, eliminates intra-image sample repetition, and ensures reproducibility via structured initialization.

Evaluated on the Jamtal valley dataset, CENS achieves comparable geometric accuracy (C2M: mean = -0.0027 vs. -0.0011 m; standard deviation = 0.027 vs. 0.028 m) using 50% fewer training steps (11,232 vs. 22,464), while yielding 28.2% more points, higher orthophoto fidelity, and improved point cloud completeness. Beyond CENS, we also explored NeRFs for ALS point cloud simulation, achieving realistic occlusion patterns and accuracy within UAV photogrammetry standards (Vertical RMSE = 24 mm; Horizontal RMSE = 17 mm).

Crucially, CENS positions NeRFs as a scalable, practical solution for geospatial point cloud and orthophoto generation, advancing them toward real-world mapping workflows, and integrates seamlessly into Nerfstudio.
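The abstract above does not spell out how CENS orders its samples, so the following is only a generic sketch of the underlying idea: a deterministic pixel order that covers every pixel of an image once before any pixel repeats. The stride-based ordering and the names `coverage_order` and `sample_batches` are hypothetical and should not be read as the CENS algorithm.

```python
import math


def coverage_order(num_pixels, start=0):
    """Yield every flat pixel index exactly once, in a fixed, spread-out order."""
    # A stride near num_pixels / golden ratio keeps consecutive samples far apart.
    stride = max(1, round(num_pixels / 1.618033988749895))
    # A stride coprime with num_pixels visits all indices before repeating.
    while math.gcd(stride, num_pixels) != 1:
        stride += 1
    index = start % num_pixels
    for _ in range(num_pixels):
        yield index
        index = (index + stride) % num_pixels


def sample_batches(num_pixels, batch_size, start=0):
    """Split the deterministic order into training batches without repeats."""
    order = list(coverage_order(num_pixels, start))
    return [order[i:i + batch_size] for i in range(0, num_pixels, batch_size)]


if __name__ == "__main__":
    # A tiny 3 x 4 "image" with 12 pixels, sampled 4 pixels per step.
    for step, batch in enumerate(sample_batches(12, 4)):
        print(step, batch)
```

In contrast to uniform random sampling, the same `start` value always reproduces the same batches, and no pixel is drawn twice within one pass over the image.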

9:15am - 9:30am

Explicit Reconstruction of thermal Environments based on dual-modal neural Radiation Fields for diagnosing Building Facade Defects

Sirui Chen¹, Yipeng Lu², Zhe Chen², Fuxun Liang¹,², Zhen Dong²

¹School of Urban Design, Wuhan University, Wuhan, China; ²State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing (LIESMARS), Wuhan University, Wuhan, China

This research presents a multi-modal framework for the explicit 3D reconstruction of building thermal environments to diagnose facade defects. The framework is centered on a dual-branch Neural Radiance Field (NeRF) architecture that fuses fine-grained geometric information from RGB images with quantitative thermal information from thermal infrared (TIR) data. For practical diagnostics, the framework integrates a Signed Distance Function (SDF) to implicitly learn a high-fidelity surface representation. An explicit triangular mesh is then extracted from this implicit field using the Marching Cubes algorithm. The resulting model achieves geometric accuracy and thermal fidelity, enabling the clear visualization, localization, and analysis of thermal anomalies such as thermal bridges, cavities, and moisture ingress in their correct spatial context.

9:30am - 9:45am

Assessing the Reconstruction Potential of 3D Vision Foundation Models for Oblique Photogrammetry

Junfan Wang¹, Feng Liu², Zhihao Jia¹, Han Hu¹,³, Min Chen¹,³, Xuming Ge¹,³, Ping Wen³,⁴, Chong Wang³,⁴, Qing Zhu¹,³

¹Faculty of Geosciences and Engineering, Southwest Jiaotong University, 611756 Chengdu, China; ²CRSC Communication & Information Group Co., Ltd.; ³Yunnan Engineering Research Center of 3D Real Scene, Kunming 650500, China; ⁴Kunming Engineering Corporation Limited, Kunming 650500, China

3D vision foundation models, which directly regress 3D geometry from 2D images in an end-to-end manner, have recently attracted growing attention in the computer vision community. However, their potential for oblique 3D reconstruction has not been systematically evaluated. To this end, we establish an automated evaluation pipeline to benchmark these models on oblique imagery. Our experiments reveal that, thanks to their powerful zero-shot generalization, 3D vision foundation models can robustly estimate camera parameters and generate dense point clouds under sparse-view and low-overlap conditions, with some rivaling traditional photogrammetry configured with redundant observations. Counterintuitively, when processing more than two views, two-view reasoning foundation models that use explicit PnP-RANSAC for global alignment consistently outperform multi-view reasoning foundation models that infer multi-view relationships via implicit attention mechanisms. Notably, incorporating known camera parameters as conditioning inputs, which act as weak supervision rather than rigid geometric constraints, yields only marginal accuracy improvements. Because they are built on the ViT architecture, these foundation models face scalability bottlenecks on large-scale and high-resolution oblique imagery, and their prevalent ideal pinhole camera assumption still makes explicit distortion correction an unavoidable preprocessing step.

9:45am - 10:00am

Evaluating the Performance of 3D Vision Foundation Models for DSM Reconstruction from Satellite Images

Liupeng Su¹, Yuhao Ye¹, Han Hu¹, Zeyuan Dai²,³, Qianrui Guo⁴, Heyi Li⁴, Yulin Ding¹, Qing Zhu¹

¹Faculty of Geosciences and Engineering, Southwest Jiaotong University, Chengdu 611756, Sichuan, China; ²Department of Military Oceanography and Hydrography and Cartography, Dalian Naval Academy, Dalian 116018, China; ³Key Laboratory of Hydrographic Surveying and Mapping of PLA, Dalian Naval Academy, Dalian 116018, China; ⁴Institute of Remote Sensing Satellite, China Academy of Space Technology, Beijing 100094, China

Three-dimensional (3D) reconstruction from satellite imagery is a critical research topic in the fields of remote sensing and geoinformation science. Although 3D Vision Foundation Models (3D VFMs) have demonstrated remarkable performance in reconstructing natural scenes, their capability to handle high-resolution satellite imagery has not been systematically evaluated. This study presents a comprehensive assessment of seven representative 3D VFMs for satellite-based 3D reconstruction and integrates four point-cloud alignment strategies. Rigorous comparisons were conducted against high-precision LiDAR-derived Digital Surface Models (DSMs) using two publicly available multi-view satellite datasets, WHU-TLC and MVS3D. Experimental results show that, on the high-resolution MVS3D dataset, the Depth Anything v2 (DAV2) model combined with the Affine alignment strategy achieved the best overall performance, producing DSMs with a Mean Absolute Error (MAE) of 1.75 m and a Root Mean Square Error (RMSE) of 3.24 m, corresponding to accuracy improvements of 8.4% and 13.6%, respectively, and significantly outperforming all other model-strategy combinations. In contrast, on the lower-resolution WHU-TLC dataset, all 3D VFMs exhibited notable performance degradation, and the reconstructed results showed limited practical value, revealing persistent generalization challenges for current models in low-resolution scenarios. Overall, this study systematically quantifies the performance of 3D VFMs in satellite image-based 3D reconstruction, confirming their strong potential for high-resolution satellite applications and providing valuable insights for enhancing model robustness and generalization across complex urban and low-resolution environments.
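For readers unfamiliar with the two error measures quoted in the abstract above, the following minimal sketch computes MAE and RMSE between a reconstructed DSM and a reference DSM given as flat lists of heights in metres. The function name `dsm_errors`, the `nodata` convention, and the sample values are assumptions for illustration and are not taken from the study.

```python
import math


def dsm_errors(predicted, reference, nodata=None):
    """Return (MAE, RMSE) over cells where both DSMs hold a valid height."""
    # Height differences, skipping cells flagged as nodata in either grid.
    diffs = [p - r for p, r in zip(predicted, reference)
             if p != nodata and r != nodata]
    if not diffs:
        raise ValueError("no valid cells to compare")
    mae = sum(abs(d) for d in diffs) / len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return mae, rmse


if __name__ == "__main__":
    # Made-up heights in metres; -9999.0 marks a cell without data.
    reconstructed = [101.2, 98.7, -9999.0, 105.0]
    lidar = [100.0, 99.0, 102.5, 101.0]
    mae, rmse = dsm_errors(reconstructed, lidar, nodata=-9999.0)
    print(f"MAE = {mae:.2f} m, RMSE = {rmse:.2f} m")
```

Because RMSE squares each difference before averaging, a few large outliers raise it well above the MAE, which is consistent with the gap between the 1.75 m MAE and 3.24 m RMSE reported above.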

1:30pm - 3:00pm

Forum1A: Observing the Earth as One: Making space for everyone in Remote Sensing, Photogrammetry, and Spatial Information Science
Location: 716B

3:30pm - 5:15pm

Forum1B: Observing the Earth as One: Making space for everyone in Remote Sensing, Photogrammetry, and Spatial Information Science
Location: 716B