Conference Agenda

Session

WG II/4C: AI/ML for Geospatial Data

Time:

Thursday, 09-July-2026:

3:30pm - 5:15pm

Location: 715B

125 theatre

Session Topics:

AI/ML for Geospatial Data (WG II/4)

External Resource: http://www.commission2.isprs.org/wg4

Presentations

3:30pm - 3:45pm

DeepChoice: Learning View Weighting for Image-Guided 3D Semantic Segmentation

Antoine Carreaud¹,², Digre Frinde¹, Shanci Li¹, Jan Skaloud², Adrien Gressin¹

¹University of Applied Sciences Western Switzerland (HES-SO / HEIG-VD); ²ESO lab, EPFL, Switzerland

Multi-view image-to-point label transfer is an effective strategy for 3D semantic segmentation, but its performance largely depends on how predictions from multiple image observations are fused for each 3D point. Most existing pipelines rely on hard voting or handcrafted weighting rules, which do not explicitly learn the reliability of each view under varying geometric and image-quality conditions. In this paper, we introduce DeepChoice, a lightweight view-weighting module for image-guided 3D semantic segmentation. For each visible observation of a 3D point, DeepChoice exploits a compact set of visibility cues, including incidence angle, range, contrast, sharpness, signal-to-noise ratio, and saturation, to predict normalized per-view weights used to aggregate 2D semantic class probabilities into final 3D point-wise predictions. The method is sensor-agnostic, requires no meshing, and can be integrated as a replacement for standard multi-view fusion rules. Experiments on the full GridNet-HD benchmark show that DeepChoice improves over hard voting by 3.85 mIoU points and over mean-probability fusion by 1.26 points, while reducing the gap with the AnyView oracle upper bound. The largest gains are observed on thin and difficult classes such as conductors, pylons, and insulators. Furthermore, a complementary evaluation on the Images PointClouds Cultural Heritage dataset shows that the proposed weighting strategy remains beneficial under a very different acquisition context and scene structure, yielding a 1.55 mIoU point improvement over hard voting. These results show that learning how to weight views is a simple yet effective way to strengthen image-guided 3D semantic segmentation pipelines. Code is publicly available at: https://huggingface.co/heig-vd-geo/DeepChoice.
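The three fusion rules compared in this abstract (hard voting, mean-probability fusion, and weighted fusion with normalized per-view weights) can be illustrated with a minimal sketch. This is not the DeepChoice implementation: the function names are hypothetical, and the per-view scores passed to `weighted_fusion` are assumed to come from some learned model, with softmax used here as one plausible way to normalize them.

```python
import math
from typing import List, Sequence


def argmax(values: Sequence[float]) -> int:
    """Index of the largest value; ties resolve to the lowest index."""
    return max(range(len(values)), key=lambda i: values[i])


def hard_vote(view_probs: Sequence[Sequence[float]]) -> int:
    """Each view votes for its most likely class; the majority class wins."""
    votes = [0] * len(view_probs[0])
    for probs in view_probs:
        votes[argmax(probs)] += 1
    return argmax(votes)


def mean_fusion(view_probs: Sequence[Sequence[float]]) -> List[float]:
    """Average the class probabilities over all views with equal weight."""
    n_views = len(view_probs)
    n_classes = len(view_probs[0])
    return [sum(p[c] for p in view_probs) / n_views for c in range(n_classes)]


def softmax(scores: Sequence[float]) -> List[float]:
    """Turn arbitrary scores into positive weights that sum to one."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_fusion(view_probs: Sequence[Sequence[float]],
                    view_scores: Sequence[float]) -> List[float]:
    """Aggregate class probabilities with normalized per-view weights.

    view_scores are hypothetical reliability scores, one per view
    (assumed here to be the raw output of a learned weighting model).
    """
    weights = softmax(view_scores)
    n_classes = len(view_probs[0])
    return [sum(w * p[c] for w, p in zip(weights, view_probs))
            for c in range(n_classes)]


# Toy example: two weak views favour class 0, one confident view favours class 1.
views = [[0.6, 0.4], [0.6, 0.4], [0.1, 0.9]]
scores = [0.0, 0.0, 2.0]  # assumed: the third view is judged more reliable

print(hard_vote(views))                # class chosen by majority vote
print(mean_fusion(views))              # equal-weight probabilities
print(weighted_fusion(views, scores))  # reliability-weighted probabilities
```

In this toy case the majority vote follows the two weak views, whereas both probability-based rules let the confident view dominate, and the weighting amplifies that effect.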

3:45pm - 4:00pm

Semantic Segmentation of Textured Non-manifold 3D Meshes using Transformers

Mohammadreza Heidarianbaei, Max Mehltretter, Franz Rottensteiner

Leibniz University Hannover, Germany

Textured 3D meshes jointly encode geometry, topology, and appearance, yet their irregular structure poses significant challenges for deep-learning-based semantic segmentation.

While a few recent methods operate directly on meshes without imposing geometric constraints, they typically overlook the rich textural information also provided by such meshes. We introduce a texture-aware transformer that learns directly from raw pixels associated with each mesh face, coupled with a new hierarchical learning scheme for multi-scale feature aggregation.

A texture branch summarizes all face-level pixels into a learnable token, which is fused with geometrical descriptors and processed by a stack of Two-Stage Transformer Blocks (TSTB), which allow for both a local and a global information flow.

We evaluate our model on the Semantic Urban Meshes (SUM) benchmark and a newly curated cultural-heritage dataset comprising textured roof tiles with triangle-level annotations of damage types.

Our method achieves 81.9% mF1 and 94.3% OA on SUM, and 49.7% mF1 and 72.8% OA on the new dataset, substantially outperforming existing approaches.

4:00pm - 4:15pm

Pothole Classification using Point Cloud Data: a Comparison between Machine Learning and Deep Learning

Kristin Eggen, Hongchao Fan

Norwegian University of Science and Technology, Norway

Automatic pothole detection is important for improving road maintenance and transportation safety. While image-based pothole detection often struggles under poor lighting and weather conditions, point cloud data provides a robust alternative by capturing detailed surface geometry. Machine learning has demonstrated strong performance in point cloud classification. Traditional machine learning is simpler and relies on handcrafted features, whereas deep learning models are more powerful, as they learn complex, high-dimensional patterns directly from the input data. Most existing work relies on deep learning models, which are time-consuming to train and require extensive labelled datasets. Potholes, however, can be well described by geometric features, making pothole detection well-suited for feature engineering. This paper compares traditional machine learning and deep learning approaches for pothole classification using point cloud data, to evaluate whether the added complexity and data demands of deep learning models are justified, or whether traditional machine learning techniques are sufficient for accurate classification. A dataset with labelled pothole instances is created to train both models. The traditional machine learning approach uses manually engineered geometric features as input to an ensemble classifier, while the deep learning model is trained on sampled data. Experimental results show that the traditional machine learning approach outperformed the deep learning model. These results suggest that for this particular task, where informative domain-specific features can be manually engineered, the traditional machine learning approach offers a more practical and efficient solution for real-world deployment, where labelled data may be limited.

4:15pm - 4:30pm

From Canopy to Crown: High-Fidelity Tree Facade Synthesis from Nadir LiDAR data

Raghav Sharma¹, Frank Zhang¹, Jane Liu², Baoxin Hu³

¹University of Fraser Valley; ²University of Toronto; ³York University

Synthesizing realistic façade views of individual trees from nadir-view remote sensing data would transform large-scale forest analysis, yet remains unsolved due to data scarcity and task ambiguity. We present the first conditional diffusion model to generate structurally plausible façade views of individual tree crowns from single nadir-view LiDAR rasters, leveraging the FOR-species20K benchmark dataset. Our approach integrates nadir projections with tree species and height within a U-Net-based denoising diffusion framework. Experiments demonstrate that nadir imagery alone is insufficient, but conditioning on species and height enables synthesis of visually realistic, species-specific façade views. The fully conditioned model achieves substantial gains in perceptual (LPIPS: 0.184) and structural (SSIM: 0.576) similarity, outperforming nadir-only baselines by more than twofold. Our results establish that ancillary attributes critically constrain the solution space, enabling diffusion models to infer plausible structures from ambiguous nadir input. This work demonstrates a scalable path to enriching nadir-based forest inventories with synthesized structural detail, reducing the need for resource-intensive ground surveys.

4:30pm - 4:45pm

Evaluation of Metric Monocular Depth Estimation Models Under Adverse Weather Conditions in Driving Scenarios

Nour Khalefa, Roberto Souza, Naser Elsheimy

University of Calgary, Canada

Metric monocular depth estimation has become increasingly important and is often used as a redundancy mechanism in autonomous driving, where accurate scene understanding is essential for safe decision-making. In this work, we evaluate three recently proposed models that represent the state of the art (Depth Anything, PackNet-SfM, and UniDepth) using zero-shot testing on the DrivingStereo dataset across diverse weather conditions, and benchmark their performance. Our analysis considers not only metric depth accuracy metrics but also each model’s ability to generalize under challenging environmental variations. While UniDepth achieves notable improvements over Depth Anything and PackNet-SfM, our results show that substantial progress is still needed for robust real-world deployment. To further assess its practical suitability for autonomous driving applications, we conduct a detailed examination of UniDepth’s strengths, limitations, and failure modes.
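The abstract does not name the accuracy metrics used. As an illustration only, the sketch below computes three metrics that are widely used for metric depth evaluation: absolute relative error (AbsRel), root mean squared error (RMSE), and the δ < 1.25 threshold accuracy. The function name and the choice to skip non-positive depths are assumptions, not details taken from the paper.

```python
import math
from typing import Dict, Sequence


def depth_metrics(pred: Sequence[float], gt: Sequence[float]) -> Dict[str, float]:
    """Compute AbsRel, RMSE and delta<1.25 over pixels with valid depth.

    Pixels where the ground truth or the prediction is not strictly
    positive are skipped (an assumption about how invalid depth is encoded).
    """
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0 and p > 0]
    if not pairs:
        raise ValueError("no valid depth pairs to evaluate")
    n = len(pairs)

    # Mean of |prediction - truth| / truth.
    abs_rel = sum(abs(p - g) / g for p, g in pairs) / n
    # Root of the mean squared difference, in the same unit as the depths.
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n)
    # Fraction of pixels whose ratio to the truth is within a factor of 1.25.
    delta1 = sum(1 for p, g in pairs if max(p / g, g / p) < 1.25) / n

    return {"abs_rel": abs_rel, "rmse": rmse, "delta1": delta1}


# Toy example with depths in metres; the last pair is skipped as invalid.
predicted = [10.0, 20.0, 40.0, 5.0]
ground_truth = [10.0, 24.0, 30.0, 0.0]
print(depth_metrics(predicted, ground_truth))
```

Lower AbsRel and RMSE are better, while a higher δ < 1.25 score is better; reporting them per weather condition is one way to expose the kind of generalization gap the abstract describes.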

4:45pm - 5:00pm

Out-of-Distribution Detection for Real-World Honey Bee Monitoring Using Simulated Permanent Laser Scanning

William Albert¹,², Ronald Tabernig¹,², Jannik S. Meyer¹, Bernhard Höfle¹,²

¹3DGeo Research Group, Institute of Geography, Heidelberg University; ²Interdisciplinary Center for Scientific Computing (IWR), Heidelberg University

We present the first Open-Set Recognition (OSR) workflow for environmental monitoring with Permanent Laser Scanning (PLS) setups, using a Deep Neural Network (DNN) trained solely on simulated data. Such monitoring systems were previously only trained with real-world data and under the closed-set assumption, because they are commonly designed to observe a specific and predefined phenomenon (e.g., beach erosion, rockfall activity, vegetation change, animal behavior). The use of real-world data requires manual labeling, which is tedious given the large number of point clouds. For this reason, we use Virtual Laser Scanning of Dynamic Scenes (VLS-4D) in a PLS setup to investigate how knowledge from synthetic data can be applied to real-world PLS monitoring systems in open-set settings. We introduce a novel framework that enables OSR for animal monitoring (e.g., honey bees) using PLS data. The DNN is fine-tuned exclusively on a simulated LiDAR point cloud time series of flying honey bees, and integrates OSR to handle unknown classes during real-world deployment (e.g., butterflies, leaves, wrens, and hares). By leveraging deviations in feature embeddings of the DNN, our method reliably distinguishes the known honey bee class from previously unseen classes, supporting robust monitoring under persistent distribution shifts. This approach reduces the dependence on extensive manual annotation of real-world point clouds, while maintaining reliable classification performance. It also highlights the potential of synthetic training data and OSR for environmental monitoring with PLS systems.
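One simple way to turn "deviations in feature embeddings" into an open-set decision is to measure how far a sample's embedding lies from the centroid of the known class. The sketch below illustrates that idea only; it is not the authors' method. The Euclidean distance, the `margin` parameter, and all function names are assumptions, and the embeddings are taken as given (for instance, from the penultimate layer of a trained network).

```python
import math
from typing import List, Sequence, Tuple


def centroid(embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Component-wise mean of a set of embedding vectors."""
    dim = len(embeddings[0])
    return [sum(e[i] for e in embeddings) / len(embeddings) for i in range(dim)]


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fit_known_class(embeddings: Sequence[Sequence[float]],
                    margin: float = 1.5) -> Tuple[List[float], float]:
    """Return the known-class centroid and a rejection threshold.

    The threshold is the largest training distance scaled by `margin`
    (a hypothetical tuning parameter, not taken from the abstract).
    """
    center = centroid(embeddings)
    threshold = margin * max(distance(e, center) for e in embeddings)
    return center, threshold


def is_known(embedding: Sequence[float],
             center: Sequence[float],
             threshold: float) -> bool:
    """Accept a sample as the known class if it lies close to the centroid."""
    return distance(embedding, center) <= threshold


# Toy 2-D embeddings standing in for simulated honey bee samples.
bee_embeddings = [[1.0, 0.0], [1.2, 0.1], [0.9, -0.1]]
center, threshold = fit_known_class(bee_embeddings)

print(is_known([1.1, 0.0], center, threshold))   # near the known cluster
print(is_known([-2.0, 3.0], center, threshold))  # far away, treated as unknown
```

A looser `margin` accepts more real-world variation at the cost of letting more unknown objects through, so in practice it would need tuning on held-out data.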