Conference Agenda

Session

ThS15: Data-Centric Learning for Geospatial Data

Time: Saturday, 11-July-2026, 10:30am - 12:00pm

Location: 714A

175 theatre

Session Topics:

Data-Centric Learning for Geospatial Data (ThS15)

Presentations

10:30am - 10:45am

The Potential of Copernicus Satellites for Disaster Response: Retrieving Building Damage from Sentinel-1 and Sentinel-2

Olivier Dietrich¹, Merlin Alfredsson¹, Emilia Arens², Nando Metzger¹, Torben Peters¹, Linus Scheibenreif¹, Jan Dirk Wegner², Konrad Schindler¹

¹ETH Zurich, Switzerland; ²University of Zurich, Switzerland

Natural disasters demand rapid damage assessment to guide humanitarian response. Here, we investigate whether medium-resolution Earth observation images from the Copernicus program can support building damage assessment, complementing very-high-resolution (VHR) imagery, whose availability is often limited. We introduce xBD-S12, a dataset of 10,315 pre- and post-disaster image pairs from both Sentinel-1 and Sentinel-2, spatially and temporally aligned with the established xBD benchmark. In a series of experiments, we demonstrate that building damage can be detected and mapped rather well in many disaster scenarios, despite the moderate 10 m ground sampling distance. We also find that, for damage mapping at that resolution, architectural sophistication does not seem to bring much advantage: more complex model architectures tend to struggle with generalization to unseen disasters, and geospatial foundation models bring little practical benefit. Our results suggest that Copernicus images are a viable data source for rapid, wide-area damage assessment and could play an important role alongside VHR imagery. We release the xBD-S12 dataset, code, and trained models to support further research.

10:45am - 11:00am

From Text to Map: AI-Based Graphic Translation of Information

Francesca Biolo, Franco Guzzetti, Isabella C.R. Balestreri

Politecnico di Milano, Department of Architecture, Built Environment and Construction Engineering, 20133 Milan, Italy

In recent years, technological advances, particularly in artificial intelligence (AI), have been changing various fields and spurring new research. This study focuses on the use of AI in cartography and historical studies. It is part of the PRIN project "Crafted in Stone / Recorded on Paper," which aims to document the heritage of small Italian municipalities by creating an open-access database. The research uncovered significant documents in Gandino, Italy, including a large-scale map and a 139-page textual register from the mid-eighteenth century. These documents were produced by land surveyors who measured municipal boundaries and properties using physical landscape markers. The original surveying method, although lost, shares similarities with modern land descriptions.

The study seeks to generate new maps from these textual registers using AI capabilities, aiming to replicate a historical mapping effort from the 1700s. Initial tests with an AI model involved reading the register, computing measurements, and creating coordinate tables. The results showed promise despite some inaccuracies. The goal is to develop an interdisciplinary method that graphically reconstructs information from written documents, enhancing access for historical and territorial analysis. The research will also explore further AI models and larger case studies to achieve this aim.

11:00am - 11:15am

From Pixels to Semantics: Can a Single Instruction-Tuned VLM Unify Geospatial Building Analysis?

Guneet Mutreja¹, Harisankar Harikumar², Chaikal Amrullah¹, Ksenia Bittner¹

¹Remote Sensing Technology Institute (IMF), German Aerospace Center (DLR); ²Karlsruhe Institute of Technology

The analysis of buildings from aerial imagery is a fundamental task for urban planning and disaster response, yet it traditionally requires a suite of specialized models for tasks like segmentation, detection, and semantic querying. The advent of generalist Vision-Language Models (VLMs) offers a new paradigm, but their adaptation to the specific, high-resolution remote sensing domain remains a significant challenge. This paper proposes and investigates a novel methodology for adapting a general-purpose VLM, Google's PaliGemma2, to function as a unified geospatial building analyzer. The core of this contribution is a data-centric pipeline that converts single-modality annotations (building polygons) into a rich, multi-task instruction-tuning dataset (16,500 samples) spanning segmentation, detection, Visual Question Answering (VQA), and captioning. A rigorous study is conducted to answer three critical questions: (1) Can a single instruction-tuned VLM outperform specialized models in a multi-task setting? (2) What are the synergistic benefits of multi-task learning? (3) How data-efficient is this adaptation process? The results demonstrate that the unified model significantly outperforms the zero-shot PaliGemma2 baseline and strong single-task fine-tuned variants on three out of four tasks, while remaining competitive on the fourth. A strong synergistic effect is found: multi-task training on both visual localization and semantic tasks improves performance on individual localization tasks. Furthermore, the analysis shows that high performance can be achieved with a surprisingly small instruction dataset. This work provides a complete methodology for efficiently adapting VLMs to multi-task geospatial analysis, suggesting a new path towards generalist models in remote sensing.
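As a rough illustration of how building polygons alone could be expanded into several instruction-tuning samples, the sketch below derives one record per task from a list of polygons. It is not the authors' pipeline: the function names, the prompt wording, the record fields, and the target serialization are all hypothetical and chosen only to make the idea concrete.

```python
from typing import Dict, List, Tuple

# A polygon is a list of (x, y) pixel vertices (assumed representation).
Polygon = List[Tuple[float, float]]


def bounding_box(polygon: Polygon) -> Tuple[float, float, float, float]:
    """Axis-aligned box (x_min, y_min, x_max, y_max) enclosing a polygon."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def serialize(values) -> str:
    """Turn coordinates into a space-separated string of integers."""
    return " ".join(str(int(round(v))) for v in values)


def build_samples(image_id: str, polygons: List[Polygon]) -> List[Dict[str, str]]:
    """Create one hypothetical sample per task from polygon annotations."""
    n = len(polygons)
    boxes = [serialize(bounding_box(p)) for p in polygons]
    outlines = [serialize(c for vertex in p for c in vertex) for p in polygons]
    return [
        # Segmentation: the target is the polygon outline itself.
        {"image": image_id, "task": "segmentation",
         "prompt": "segment building", "target": " ; ".join(outlines)},
        # Detection: the target is derived from the polygon extents.
        {"image": image_id, "task": "detection",
         "prompt": "detect building", "target": " ; ".join(boxes)},
        # VQA: a counting question answered from the number of polygons.
        {"image": image_id, "task": "vqa",
         "prompt": "How many buildings are visible?", "target": str(n)},
        # Captioning: a templated sentence built from the same count.
        {"image": image_id, "task": "captioning",
         "prompt": "caption", "target": f"An aerial image showing {n} building(s)."},
    ]
```

Calling `build_samples("tile_001", polygons)` on a tile returns four records that share the same image but differ in prompt and target, which is the sense in which a single annotation type can feed a multi-task dataset.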

11:15am - 11:30am

Geolocation-aware pretraining strategies for globally applicable remote sensing foundation models

Mojgan Madadikhaljan, Jonathan Prexl, Michael Schmitt

University of the Bundeswehr Munich, Germany

Foundation models have achieved remarkable success across various domains due to their ability to learn generalizable representations from large-scale, unlabeled datasets. In the geospatial domain, several foundation models have been developed to leverage the abundance of unlabeled remote sensing data and support Earth observation tasks across diverse regions and sensor types. However, the geolocation-dependent characteristics of remote sensing data introduce unique challenges in adapting these models to region-focused applications. By conducting a comprehensive empirical analysis across diverse geographical regions and tasks, we explore whether incorporating regional information during pretraining or fine-tuning improves performance on region-specific downstream tasks. We show that regional representation learning, as well as regional adaptation of features extracted from a globally trained foundation model, is beneficial when the region-specific performance of the downstream tasks is of interest. To this end, we also propose a regional adaptation to the globally trained foundation models to balance global diversity with regional representation learning for improved performance.

11:30am - 11:45am

An assessment of data-centric methods for label noise identification in remote sensing data sets

Felix Kröber¹,², Genc Hoxha², Ribana Roscher²

¹Forschungszentrum Juelich GmbH, Germany; ²University of Bonn, Germany

Label noise, in the sense of incorrect labels, is present in many real-world data sets and is known to severely limit the generalizability of deep learning models. In the field of remote sensing, however, automated treatment of label noise in data sets has received little attention to date. In particular, there is a lack of systematic analysis of the performance of data-centric methods that not only cope with label noise but also explicitly identify and isolate noisy labels. In this paper, we examine three such methods and evaluate their behavior under different label noise assumptions. To do this, we inject different types of label noise, with noise levels ranging from 10% to 70%, into two benchmark data sets, and then analyze how well the selected methods filter the label noise and how this affects task performance. Our analyses demonstrate the value of data-centric methods for both parts: label noise identification and task performance improvement. They also provide insights into which method is the best choice depending on the setting and objective. Finally, we show in which areas there is still a need for research in the transfer of data-centric label noise methods to remote sensing data. As such, our work is a step forward in bridging the methodological establishment of data-centric label noise methods and their usage in practical settings in the remote sensing domain.
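The evaluation protocol described above (inject noise at a known rate, then check how well a method recovers the corrupted labels) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name the noise types used, so uniform (symmetric) label flipping is assumed here, and both function names are hypothetical.

```python
import random
from typing import List, Sequence, Set, Tuple


def inject_symmetric_noise(
    labels: Sequence[int], num_classes: int, noise_rate: float, seed: int = 0
) -> Tuple[List[int], Set[int]]:
    """Flip a fixed fraction of labels to a different, uniformly chosen class.

    Returns the noisy labels and the set of indices that were flipped.
    """
    if not 0.0 <= noise_rate <= 1.0:
        raise ValueError("noise_rate must lie in [0, 1]")
    if num_classes < 2:
        raise ValueError("at least two classes are needed to flip a label")
    rng = random.Random(seed)
    n_flip = round(noise_rate * len(labels))
    flipped = set(rng.sample(range(len(labels)), n_flip))
    noisy = list(labels)
    for i in sorted(flipped):
        # Exclude the original class so every selected label really changes.
        alternatives = [c for c in range(num_classes) if c != labels[i]]
        noisy[i] = rng.choice(alternatives)
    return noisy, flipped


def identification_scores(flagged: Set[int], truly_noisy: Set[int]) -> Tuple[float, float]:
    """Precision and recall of a method's flagged indices against the injected noise."""
    hits = len(flagged & truly_noisy)
    precision = hits / len(flagged) if flagged else 0.0
    recall = hits / len(truly_noisy) if truly_noisy else 0.0
    return precision, recall
```

Because the flipped indices are known, any noise identification method can be scored by passing the indices it flags to `identification_scores`, and the same noisy labels can be used to measure the effect on downstream task performance.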

11:45am - 12:00pm

Automatic Extraction and Multi-Class Instance Segmentation of Rural Road Networks from Orthoimagery using YOLOv11 and SAHI Sliced Inference for Cadastral Update

Marsia Sanità¹, Lindo Nepi², Eva Savina Malinverni¹, Adriano Mancini², Roberto Pierdicca¹, Artur Warchoł³, Monika Balawejder⁴

¹Dept. of Civil, Building and Architecture, Marche Polytechnic University, 60131 Ancona, Italy; ²Department of Information Engineering (DII), Marche Polytechnic University, 60131 Ancona, Italy; ³Kielce University of Technology – Kielce, Poland; ⁴PANS State University of Applied Sciences in Jaroslaw, Poland

Extracting road networks from high-resolution imagery remains a significant challenge in geomatics, particularly in fragmented rural landscapes. The main difficulty is the spectral similarity between unpaved tracks and agricultural backgrounds, which can lead to classification errors. This study proposes an automated geospatial pipeline based on the YOLOv11 architecture. Specifically, the approach focuses on optimizing multi-class road detection in the rural areas of Kosina and Markowa, two villages in Poland. To reduce the computational effort caused by the large 9000 x 9000 px orthophotos and to improve the detection of small-scale features, the Slicing Aided Hyper Inference (SAHI) strategy was integrated. The high-resolution imagery was decomposed into optimized tiles, ensuring feature continuity across tile boundaries and preventing excessive GPU memory use. The instance segmentation model was trained on a custom-annotated dataset with seven labels (categories), such as internal paved roads, rural tracks, and railway infrastructure. The model achieved a high level of robustness, reaching a mean Average Precision (mAP@0.5) of 0.90. A confusion matrix shows quantitatively that the pipeline effectively distinguishes between complex classes, with low omission rates. The generated outputs are converted into the interoperable GeoJSON format, ensuring their integration into GIS environments. In conclusion, the experimental results demonstrate that the framework is valuable for emergency response logistics and urban planning. It offers a scalable and near real-time solution for updating national topographic databases.
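The slicing step can be illustrated with a small sketch that computes overlapping tile windows for a 9000 x 9000 px orthophoto. It does not use the SAHI library and is not the authors' configuration: the 1024 px tile size, the 20% overlap, and the function names are assumptions made for illustration only.

```python
from typing import List, Tuple

Window = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels


def tile_starts(length: int, tile: int, overlap_ratio: float) -> List[int]:
    """Start offsets along one axis so that tiles overlap and reach the image edge."""
    if tile >= length:
        return [0]
    stride = max(1, int(tile * (1.0 - overlap_ratio)))
    starts = list(range(0, length - tile, stride))
    # Anchor the last tile at the border so no pixels are left uncovered.
    starts.append(length - tile)
    return starts


def slice_windows(
    width: int, height: int, tile: int = 1024, overlap_ratio: float = 0.2
) -> List[Window]:
    """Overlapping tile windows covering the whole image."""
    windows = []
    for y in tile_starts(height, tile, overlap_ratio):
        for x in tile_starts(width, tile, overlap_ratio):
            windows.append((x, y, min(x + tile, width), min(y + tile, height)))
    return windows


windows = slice_windows(9000, 9000)
print(len(windows), windows[0], windows[-1])
```

The overlap is what preserves continuity across tile boundaries: a road segment cut by one tile edge appears whole in a neighbouring tile, so per-tile predictions can be merged afterwards, and each tile stays small enough to fit in GPU memory.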