Conference Agenda

Session

WG III/1G: Remote Sensing Data Processing and Understanding

Time: Saturday, 11-July-2026, 10:30am - 12:00pm

Location: 713A

125 theatre

Session Topics:

Remote Sensing Data Processing and Understanding (WG III/1)

External Resource: http://www.commission3.isprs.org/wg1

Presentations

10:30am - 10:45am

YOLOv8m-CCFM-GSConv: Research on Lightweight Marine Oil Spill Target Detection Based on Improved YOLOv8m Model

Junjie Lu¹, Qingyang Wang¹,²,³, Bo Song¹, Jianwu Jiang¹,²,³, Bin Yang¹, Chen Jiao¹

¹College of Geomatics and Geoinformation, Guilin University of Technology, Guilin 541004, China; ²Guangxi Ecological Spatiotemporal Big Data Perception Service Laboratory, Guilin 541004, China; ³Guangxi Key Laboratory of Spatial Information and Geomatics, Guilin 541004, China

In marine oil spill detection, deep learning methods are gradually replacing traditional remote sensing image recognition approaches. However, complex models designed for higher accuracy often sacrifice recognition speed and therefore fail to meet the rapid-response requirements of terminal devices (Chai et al., 2025). Developing a lightweight detection model that balances high accuracy with real-time performance is thus crucial for improving marine oil spill emergency response (Liang et al., 2024). Building on the YOLOv8m model, this study introduces GSConv lightweight convolution (Li et al., 2024) and the CCFM cross-scale feature fusion module (Guo et al., 2025). These additions significantly improve multi-scale target detection and recognition accuracy in complex backgrounds while keeping the model lightweight, offering a novel and effective solution for marine oil spill target detection.

10:45am - 11:00am

Detecting moving vehicles on Sentinel-2 imagery using semi-automatic labeling from S2A/S2C tandem phase

Guillaume Buthmann¹, Florentin Poucin¹, Jérémy Anger¹,²

¹Kayrros SAS; ²Université Paris-Saclay, ENS Paris-Saclay, CNRS, Centre Borelli, 91190, Gif-sur-Yvette, France

During the commissioning phase of ESA's Sentinel-2C, tandem images with Sentinel-2A were acquired with a delay of 30 seconds. We present a novel, automated method for labeling moving vehicles in Sentinel-2 images, leveraging the temporal offset between these tandem acquisitions. We propose a filtering process that isolates pixels corresponding to vehicles that moved between the two acquisitions. We generate a training dataset based on this process, removing the need for a large manual labeling phase. The dataset is used to train a standard deep-learning-based vehicle detection model. Experimental results, as well as a validation study using ground-truth data from California, highlight the quality of the proposed labeling method, and show that a vehicle detection model can be successfully trained from quasi-simultaneous acquisitions.
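The idea of isolating pixels that changed between two quasi-simultaneous acquisitions can be illustrated with a minimal sketch. It is not the authors' filtering process, which the abstract does not detail: the function name `moving_pixel_mask`, the plain absolute-difference test, the threshold value, and the toy reflectance grids are all assumptions made for illustration, and the two images are assumed to be already co-registered.

```python
def moving_pixel_mask(first, second, threshold):
    """Flag pixels whose value changed by more than `threshold`
    between two co-registered acquisitions of the same scene."""
    same_shape = len(first) == len(second) and all(
        len(a) == len(b) for a, b in zip(first, second)
    )
    if not same_shape:
        raise ValueError("images must have the same shape")
    return [
        [abs(a - b) > threshold for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(first, second)
    ]


# Toy 3x4 reflectance grids (hypothetical values): one bright
# "vehicle" moves one pixel to the right between acquisitions.
s2a = [
    [0.10, 0.10, 0.10, 0.10],
    [0.10, 0.45, 0.10, 0.10],
    [0.10, 0.10, 0.10, 0.10],
]
s2c = [
    [0.10, 0.10, 0.10, 0.10],
    [0.10, 0.10, 0.45, 0.10],
    [0.10, 0.10, 0.10, 0.10],
]

mask = moving_pixel_mask(s2a, s2c, threshold=0.2)
candidates = [
    (r, c) for r, row in enumerate(mask) for c, flag in enumerate(row) if flag
]
print(candidates)  # [(1, 1), (1, 2)]
```

Both the vacated and the newly occupied pixel are flagged, which is why a difference mask alone yields candidate labels rather than final vehicle positions.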

11:00am - 11:15am

LAD-Enhancer: A Lightweight All in One Aerial Detection Enhancer Under Adverse Weather

Yu Wan¹, Jie Li¹, Liupeng Lin², Zaiyan Zhang¹, Qiangqiang Yuan¹, Huanfeng Shen²

¹School of Geodesy and Geomatics, Wuhan University, Wuhan 430079, China; ²School of Resource and Environmental Sciences, Wuhan University, Wuhan 430079, China

With the rapid development of aerial imaging technology, aerial target detection has become a research hotspot with broad applications in intelligent transportation, agricultural monitoring, and military surveillance. However, the performance of aerial detection models is often degraded under adverse weather conditions such as fog, sandstorms, and low illumination. In such environments, aerial images typically suffer from reduced contrast and color distortion, which significantly impairs the model’s ability to identify targets accurately. To address this, this work proposes a Lightweight All-in-One Aerial Detection Enhancer Under Adverse Weather (LAD-Enhancer). The enhancer processes and restores degraded aerial images, thereby improving the detection model’s ability to perceive potential targets. Unlike conventional image restoration models, LAD-Enhancer integrates detection labels as additional supervision during training to ensure that enhancement is detection-oriented rather than purely visual. Furthermore, a three-stage training strategy and a Mixture of Experts (MoE) framework are employed to adaptively classify and process images captured under different degradation conditions. Experimental results demonstrate that, with an increase of fewer than 3K parameters, the proposed LAD-Enhancer significantly improves detection performance under adverse weather conditions while leaving performance on clear-weather images almost unchanged.
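As a rough illustration of how a Mixture of Experts can route an input according to its degradation type, the sketch below blends the outputs of several experts using softmax gate weights. It is a generic MoE pattern, not the LAD-Enhancer architecture: the expert functions (`dehaze`, `brighten`, `identity`), their formulas, and the gate scores are hypothetical, and in a trained model the gate scores would come from a learned classifier rather than being set by hand.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of gate scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_of_experts(value, gate_scores, experts):
    """Blend expert outputs for one pixel value using gate weights."""
    weights = softmax(gate_scores)
    return sum(w * expert(value) for w, expert in zip(weights, experts))


# Hypothetical per-pixel experts for intensities in [0, 1].
def dehaze(v):
    # Stretch contrast by removing an assumed haze offset of 0.3.
    return min(1.0, max(0.0, (v - 0.3) / 0.7))


def brighten(v):
    # Gamma correction that lifts dark values.
    return min(1.0, v ** 0.5)


def identity(v):
    # Clear-weather input passes through unchanged.
    return v


experts = [dehaze, brighten, identity]

# Gate strongly favours the low-illumination expert for a dark pixel.
low_light = mixture_of_experts(0.25, [0.0, 4.0, 0.0], experts)
print(round(low_light, 3))
```

Because the identity expert is one of the options, a gate that recognises a clear image can leave it essentially untouched, which matches the abstract's goal of preserving clear-weather performance.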

11:15am - 11:30am

A Collaborative Detection Method of Small Unmanned Aerial Vehicle Target via Multi-modal Feature Fusion in Complex Background

Wen Jiang, Keyi Zhang, Yanping Wang, Yun Lin, Fukun Bi

North China University of Technology, Beijing, People's Republic of China

State-of-the-art methods for detecting small unmanned aerial vehicles (UAVs) still struggle in complex urban settings because of three persistent challenges: frequent target occlusion, high similarity in thermal radiation signatures between UAVs and their surroundings, and the inherently low visual saliency of small UAV targets. All three degrade detection performance. To tackle these issues, this paper introduces a novel multi-modal feature fusion collaborative detection (MFFCD) framework grounded in learnable spatial mapping. The architecture consists of three key components: (1) a multi-branch parallel feature extraction module (MBPFE) that simultaneously processes infrared, visible, and radar range-azimuth images, complemented by a feature fusion module (FFM) designed to enhance both intra-modal and inter-modal feature interactions; (2) an adaptive spatially-aware dynamic detection head module (DDH) that dynamically recalibrates feature weights to strengthen target representation and boost detection accuracy; and (3) a feature collaborative enhancement module (FCE) that employs a learnable affine transformation to align and fuse multi-modal features, thereby producing more robust and reliable detection outcomes. Extensive experiments show that the proposed MFFCD framework substantially outperforms existing methods under challenging urban conditions, achieving a 56.89% gain in Mean Average Precision (mAP) for small UAV detection.
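To make the notion of aligning modalities with an affine transformation concrete, the following sketch maps a coordinate from one sensor's grid into another's using a 2x3 affine matrix. It only illustrates the geometric operation: the function name `apply_affine` and the parameter values are hypothetical, and in the framework described above the parameters would be learned during training rather than fixed by hand.

```python
def apply_affine(params, point):
    """Map (x, y) through a 2x3 affine transform [[a, b, tx], [c, d, ty]]."""
    (a, b, tx), (c, d, ty) = params
    x, y = point
    return (a * x + b * y + tx, c * x + d * y + ty)


# Hypothetical parameters: scale by 2, then shift by (10, -5).
params = ((2.0, 0.0, 10.0), (0.0, 2.0, -5.0))

# A location in the source grid and its position in the target grid.
aligned = apply_affine(params, (3.0, 4.0))
print(aligned)  # (16.0, 3.0)
```

Once features from different sensors are expressed in a common grid, they can be fused position by position.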

11:30am - 11:45am

Infrared-Visible Image Fusion Method Based on Differential Feature Enhancement and Cross-Modal Attention

Huang Zhang¹, Lina Xu¹, Qing Zhou¹, Tiyou Zhou², Siyu Liu¹, Xincai Chang¹, Hao Li¹

¹Hubei Subsurface Multi-scale Imaging Key Laboratory, School of Geophysics and Geomatics, China University of Geosciences, Wuhan, 430074, China; ²State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan, 430079, China

Infrared and visible remote sensing image fusion is crucial for improving scene perception in complex environments, but existing autoencoder-based methods suffer from insufficient information interaction between modalities, inadequate deep feature fusion, and ineffective loss functions in extreme scenarios. To address these issues, this study proposes a Differential Feature Enhancement and Cross-modal Fusion (DFECF) method. The DFECF adopts an end-to-end architecture consisting of dual-stream encoders, cross-modal fusion modules, Transformer global perception modules, and decoders. Specifically, the Differential Enhancement (DE) module extracts differential information between infrared and visible features, combined with spatial and channel attention to enhance feature representation. The cross-modal fusion module adaptively integrates deep features based on channel attention, adjusting feature weights according to scene characteristics. The Transformer module supplements the global receptive field to capture long-range feature dependencies, and a joint loss function is designed to optimize fusion performance. Experimental results on public datasets show that the proposed method outperforms existing state-of-the-art methods in both subjective visual effects and objective evaluation metrics, especially in extreme environments such as strong light and thick smoke. It effectively improves the integrity of scene perception and provides high-quality data support for practical applications such as forest fire prevention, mining area monitoring, and autonomous driving.
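The combination of differential information with channel attention can be sketched in a few lines. This is an assumed simplification, not the DE module itself: it uses only channel attention (the abstract also mentions spatial attention), derives each channel weight from the mean absolute difference passed through a sigmoid, and the function name `differential_enhance` and the toy feature values are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def differential_enhance(ir, vis):
    """Exchange differential information between two feature stacks.

    `ir` and `vis` are lists of channels; each channel is a flat list
    of feature values of equal length.
    """
    enhanced_ir, enhanced_vis = [], []
    for ir_ch, vis_ch in zip(ir, vis):
        # Differential feature: what one modality has that the other lacks.
        diff = [a - b for a, b in zip(ir_ch, vis_ch)]
        # Channel attention: one weight per channel from the global
        # average of the absolute difference.
        weight = sigmoid(sum(abs(d) for d in diff) / len(diff))
        # Each stream is nudged toward the other by the weighted difference.
        enhanced_ir.append([a - weight * d for a, d in zip(ir_ch, diff)])
        enhanced_vis.append([b + weight * d for b, d in zip(vis_ch, diff)])
    return enhanced_ir, enhanced_vis


# Toy single-channel features (hypothetical values).
ir_feat = [[1.0, 1.0]]
vis_feat = [[0.0, 0.0]]
new_ir, new_vis = differential_enhance(ir_feat, vis_feat)
print(new_ir, new_vis)
```

When the two modalities already agree, the difference is zero and both feature stacks pass through unchanged.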