Conference Agenda
WG II/4B: AI/ML for Geospatial Data
Session Topics: AI/ML for Geospatial Data (WG II/4)
External Resource: http://www.commission2.isprs.org/wg4
Presentations
1:30pm - 1:45pm
From Pixels to Polylines: Extracting City-scale Vectorized Roof Structures with Line Segment Detection Networks
1 3D Optical Metrology (3DOM) Unit, Bruno Kessler Foundation (FBK), Trento, Italy; 2 Technische Universität Berlin, Institute of Geodesy and Geoinformation Science, Berlin, Germany; 3 GeoPlato Engineering Inc., Bilkent Cyberpark, Ankara, Türkiye
Automatic extraction of vectorized roof structures above LOD2.0 remains challenging due to their geometric complexity and the presence of small and occluded elements on the roofs. Detecting fine-scale roof objects such as chimneys and dormer windows in very high resolution aerial imagery is still an active research topic. This study presents a workflow for automated detection and vectorization of roof structures at city scale using Line Segment Detection (LSD) networks. Compared to model-based building reconstruction approaches, LSD networks do not rely on pre-defined roof typologies and are able to extract complex roof structures and small objects on building roofs. For this purpose, a dataset comprising approximately 139,000 buildings with LOD2.2 roof structures and more than 2.2 million roof segments is generated using 8 cm GSD aerial imagery. An automated end-to-end workflow is developed, trained and tested on the available data. Experimental results indicate that roof structures suitable for LOD2.2 3D roofs can be extracted and vectorized with high accuracy, achieving 58.4% msAP and 73.1% mAPJ with the ULSD network. Robustness is further assessed by visual inspection in areas affected by roof-blocking objects such as trees and cast shadows.
1:45pm - 2:00pm
Automatic Large-Scale Topographic Mapping from High-Resolution Aerial Imagery
University of Twente, ITC Faculty Geo-Information Science and Earth Observation, The Netherlands
Topographic maps provide structured, polygonal representations of the Earth’s surface, delineating land-cover classes such as buildings, roads, water bodies, and vegetation. They form the foundation of national geospatial data infrastructures and support a wide range of applications, including urban planning, environmental monitoring, and cadastral management. However, the production and maintenance of such large-scale topographic maps still rely heavily on manual photo-interpretation and vector editing. While such human-in-the-loop workflows ensure geometric accuracy, they are labor-intensive, costly, and non-reproducible, limiting scalability and update frequency. Most existing polygonal outline extraction methods, in turn, are restricted to a single class, which typically leads to overlaps, gaps, and inconsistent shared boundaries when extended to multi-class mapping. Moreover, few studies have demonstrated nationwide implementation or validation, leaving the scalability and generalization of current methods largely unexplored. To address these challenges, this study develops a fully automated framework for large-scale topographic mapping directly from high-resolution aerial imagery. The framework aims to produce seamless, multi-class topographic maps in a single run that remain topologically consistent across diverse urban and rural regions in the Netherlands and beyond.
2:00pm - 2:15pm
Todo Fir Crown Instance Segmentation in dense Plantation Forest using Polar-FFT and Treetop Queries
1 Graduate School of Engineering, Hokkaido University; 2 Forestry Research Institute, Hokkaido Research Organization; 3 Faculty of Engineering, Hokkaido University
Instance segmentation of individual trees from UAV-derived orthomosaics and DSMs remains challenging in dense planted forests in Japan because SfM-derived DSMs often have blurred crown boundaries and unstable quality. We propose a Polar-FFT (PFFT) based method that encodes the local DSM shape around treetop candidates and integrates it into Mask2Former to suppress unreliable candidates and improve crown separation. Experiments on Todo fir (Abies sachalinensis) plantation data from two sites in Hokkaido showed that the method improved mAP75 from 52.18% to 55.47% and F1 at a confidence threshold of 0.5 from 89.86% to 92.08%, while reducing false positives by 41% without increasing false negatives. The results indicate that treetop-centered local shape cues are useful for instance segmentation in densely planted forests.
2:15pm - 2:30pm
An integrated yolo-seg and geometric analysis framework for construction zone detection and tubular marker damage assessment
1 Department of Civil and Environmental Engineering, College of Engineering, Myongji University; 2 Department of Future & Smart Construction Research, Korea Institute of Civil and Building Technology; 3 Department of Geoinformatic Engineering, Inha University
This study presents an integrated framework combining YOLOv9e-Seg and photogrammetric geometric analysis for detecting road-safety assets and assessing their condition using UAV imagery. Traffic cones and tubular markers, which define construction-zone boundaries, are difficult to detect due to their small size in high-resolution images. To address this, a crop-tiling strategy (512×512 pixels) was applied to enhance the representation of small objects. Polygon-based labeling was used to preserve fine object geometry, and YOLOv9e-Seg was trained to output instance masks and polygon coordinates. During testing, tiled predictions were restored to the global coordinate frame, and duplicate detections were removed by retaining only the highest-confidence results. The geometric analysis used segmentation-derived polygons to compute centroids and principal axes, distinguishing intact from damaged tubular markers by the angle difference between their axis vectors. For traffic cones, convex hulls constructed from centroid positions accurately delineated construction-zone boundaries. The proposed approach achieved its highest F1 score at a 512-pixel tile size, improving detection and segmentation of small, slender objects. These results demonstrate that the framework goes beyond basic detection and segmentation by enabling quantitative geometric interpretation and reliable construction-zone reconstruction from UAV data.
2:30pm - 2:45pm
From Aerial to Satellite: Can Super-Resolution Enable Label-Free Model Transfer?
German Aerospace Center (DLR), Germany
Satellite imagery enables large-scale remote sensing applications by providing frequent, wide-area coverage. However, its limited spatial resolution often restricts the use of satellite images in tasks that require detailed, fine-scale information. In contrast, aerial images offer a much higher spatial resolution, allowing the extraction of fine-grained features, but typically cover smaller, more localized areas. In this work, we investigate whether super-resolution (SR) methods can bridge the gap between aerial and high-resolution satellite imagery, enabling label-free model transfer without additional manual annotations. The idea is to enhance the spatial resolution of high-resolution satellite images so that models trained on aerial data can be applied to them directly. Towards this goal, a state-of-the-art SR algorithm is used to upscale three high-resolution satellite images to match the resolution of the aerial training data. Then, a segmentation network trained on an aerial image dataset is applied to segment roads and parking areas in the super-resolved satellite images. The approach is evaluated on an annotated dataset and compared to the results on the original satellite images. Additionally, we investigate its performance on a low-resolution aerial image. Our results demonstrate that SR facilitates the use of models trained on aerial image datasets for large-scale satellite applications without requiring new labels.
2:45pm - 3:00pm
Beyond Vision: How Language Affects Visual Grounding in UAV Imagery
1 Hinton STAI Institute and Key Laboratory of Geographic Information Science (Ministry of Education), East China Normal University, Shanghai 200241, China; 2 Shanghai Jiao Tong University, Shanghai 200241, China; 3 Department of Geography and Environmental Management, University of Waterloo, Waterloo, ON N2L 3G1, Canada
This study tackles multilingual and explicit-implicit gaps in Visual Grounding (VG) for UAV imagery, focusing on real-world UAV needs (e.g., disaster response) that require understanding implicit references. It evaluates the cross-linguistic robustness of Qwen2.5-VL-7B via Acc@0.5% across nine languages (Chinese, English, Japanese, Russian, Korean, German, French, Spanish, Portuguese). Key results: explicit VG (which uses visual attributes) outperforms implicit VG (which needs context or common sense) in every language. East Asian languages lead in both tasks, while Indo-European languages lag (e.g., Portuguese, with a 48.63% drop in implicit accuracy). Attention analysis shows that the model aligns better with East Asian linguistic structures. This work informs LVLM optimization for multilingual UAV applications and guides future cross-model comparisons.
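As an illustration of the evaluation metric named in the last abstract, the sketch below assumes that "Acc@0.5%" denotes the usual visual grounding accuracy: the share of predicted boxes whose intersection over union (IoU) with the ground-truth box is at least 0.5. That reading, the box format (x_min, y_min, x_max, y_max) and the function names `iou` and `acc_at_iou` are assumptions for illustration, not details taken from the paper.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Boxes are (x_min, y_min, x_max, y_max); this format is an assumption.
    """
    # Corners of the overlap rectangle (may be empty).
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def acc_at_iou(predictions, ground_truths, threshold=0.5):
    """Fraction of predictions whose IoU with the paired ground truth
    reaches the threshold (hypothetical helper, one box per query)."""
    if not ground_truths:
        raise ValueError("ground_truths must not be empty")
    hits = sum(
        iou(pred, gt) >= threshold
        for pred, gt in zip(predictions, ground_truths)
    )
    return hits / len(ground_truths)


# Toy example: the first prediction matches exactly, the second misses.
preds = [(0, 0, 10, 10), (0, 0, 10, 10)]
gts = [(0, 0, 10, 10), (20, 20, 30, 30)]
accuracy = acc_at_iou(preds, gts)
```

Under this reading, a per-language comparison would call `acc_at_iou` once per language on the same image set, with only the query language changed.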

