Conference Agenda

Overview and details of the sessions of this conference. Please select a date or location to show only the sessions on that day or at that location. Please select a single session for a detailed view (with abstracts and downloads, if available).

Daily Overview

Location: 714A
175 theatre

Date: Tuesday, 07-July-2026

8:30am - 10:00am

WG I/6B: Orientation, Calibration and Validation of Sensors
Location: 714A

8:30am - 8:45am

Evaluation and performance assessment of a novel UAV-borne laser scanner system

Gottfried Mandlburger¹, Elisabeth Ötsch¹, Philipp Knopf²

¹TU Wien, Department of Geodesy and Geoinformation, Austria; ²Knopfhoch GmbH, Austria

Miniaturized UAV laser scanning systems have advanced rapidly over the past decade, especially in the low-cost sector. DJI entered this field with the Zenmuse L-series, integrating GNSS/INS with compact scanners. While the first-generation L1 showed moderate precision, the L2 improved notably through reduced beam divergence. In November 2025, DJI released the Zenmuse L3. In this contribution, we assess its performance.

The main upgrade from L2 to L3 lies in the LiDAR unit: L3 uses a single 1535 nm laser instead of multiple 905 nm diodes, offers a symmetric 0.25 mrad beam divergence, and supports pulse repetition rates from 350 kHz to 2 MHz. High PRR operation is limited to altitudes ≤50 m due to missing multiple-time-around resolution. Scan modes include linear, non-repetitive, and a new star-shaped pattern.
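The altitude limit at high PRR can be illustrated with the standard pulse-ambiguity relation: without multiple-time-around resolution, an echo has to return before the next pulse is emitted. The following is a minimal sketch of that arithmetic, assuming the textbook relation R_max = c / (2 · PRR) and ignoring scan angle, atmosphere, and any sensor-specific margins. The function name is illustrative and not part of any vendor tooling.

```python
# Speed of light in vacuum, in metres per second.
SPEED_OF_LIGHT = 299_792_458.0


def max_unambiguous_range(prr_hz: float) -> float:
    """Largest range whose echo returns before the next pulse is emitted."""
    return SPEED_OF_LIGHT / (2.0 * prr_hz)


# The two pulse repetition rates mentioned in the abstract.
for prr_hz in (350e3, 2e6):
    range_m = max_unambiguous_range(prr_hz)
    print(f"{prr_hz / 1e3:.0f} kHz -> {range_m:.1f} m")
```

Under these assumptions, 2 MHz corresponds to roughly 75 m of slant range and 350 kHz to roughly 428 m, which is in line with restricting 2 MHz operation to low flying altitudes.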

L2 and L3 were tested at three sites in Lower Austria covering a warehouse, power-lines, and forests. Flights were conducted at 80 m AGL (350 kHz) and, for the warehouse, 50 m AGL (2 MHz). Precision, strip consistency, point density, feature separability, and vegetation penetration were evaluated using the scientific software OPALS.

L3 data showed sharper edges, reduced noise, and higher separability, yielding spline-fit residuals of 0.9 cm versus 2.6 cm for L2 when reconstructing a double-threaded power-line. Ground point coverage in forests increased from 18 % (L2) to 51 % (L3). Strip height differences are around 2 cm for both sensors, and L3 achieved sub-centimeter precision on sealed surfaces. Overall, L3 offers substantial gains in spatial resolution, precision, and vegetation penetration.

8:45am - 9:00am

Geometric and radiometric Calibration of a rotating multi-beam Lidar using a rotating tilted Platform

Heikki Hyyti, Matias Mäki-Leppilampi, Harri Kaartinen

Finnish Geospatial Research Institute FGI, Finland

Intrinsic calibration of rotating multi-beam lidars (RMBL) enables more precise measurements. We calibrated our sensor to improve its geometric and radiometric accuracy using a rotating tilted platform. The rotating mechanism widens the field of view of each lidar channel and allows all lasers of the sensor to measure the same areas in a room containing planar wall and floor sections. Therefore, we can collect measurements for geometric and radiometric calibration with a minimal number of calibration targets. Furthermore, we used data-based numerical minimization to estimate the calibration parameters for all 128 lidar channels in our RMBL sensor. For the intrinsic geometric calibration of the sensor, we estimated the elevation and azimuth angles of each laser. For the radiometry, we estimated a linear model for each laser to correct the intensity measurement. For a linear model, two different known diffuse reflectance targets are sufficient for the radiometric calibration. We tested our methods in two different environments: an office room and a longer corridor. We showed that the methods can improve the precision of the RMBL sensor significantly. Regarding geometry, we were able to reduce the error on average from 16.1 mm to 15.1 mm (6.2% improvement). For radiometry, we were able to improve the reflectance measuring accuracy on average from 9.5% error down to -0.9% error (91% improvement).
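The statement that two known reflectance targets suffice for a linear model can be sketched as a two-point fit. This is only an illustration under stated assumptions: the raw intensity values and target reflectances below are hypothetical, the function names are invented for this example, and the authors' per-channel numerical minimization is not reproduced here. The last lines re-derive the improvement percentages quoted in the abstract from its own numbers.

```python
def fit_two_point_linear(raw_a, ref_a, raw_b, ref_b):
    """Return (gain, offset) of the line through two (raw, reflectance) pairs."""
    gain = (ref_b - ref_a) / (raw_b - raw_a)
    offset = ref_a - gain * raw_a
    return gain, offset


def apply_calibration(raw, gain, offset):
    """Map a raw intensity reading to calibrated reflectance."""
    return gain * raw + offset


def relative_improvement(error_before, error_after):
    """Fractional reduction of the absolute error."""
    return (abs(error_before) - abs(error_after)) / abs(error_before)


# Hypothetical readings of one laser channel on 10 % and 50 % targets.
gain, offset = fit_two_point_linear(50.0, 0.10, 210.0, 0.50)
print(apply_calibration(130.0, gain, offset))

# Figures from the abstract: 16.1 mm -> 15.1 mm and 9.5 % -> -0.9 %.
print(f"{relative_improvement(16.1, 15.1):.1%}")
print(f"{relative_improvement(9.5, -0.9):.1%}")
```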

9:00am - 9:15am

Tightly-coupled joint Adjustment of static and kinematic Laser Scanning Data

Florian Pöppl, Philipp Amon, Nikolaus Studnicka, Martin Pfennigbauer, Andreas Ullrich

RIEGL Laser Measurement Systems GmbH, Austria

In recent years, laser scanning has evolved into a core surveying technology for 3D mapping, both statically from stationary scan positions (terrestrial laser scanning, TLS) and kinematically from moving platforms (kinematic laser scanning, KLS). Consequently, there is a growing demand for methods that efficiently and coherently support both static and kinematic data acquisition modes. This contribution presents a tightly-coupled approach for the co-registration of TLS and KLS data, which simultaneously integrates GNSS positions, inertial measurements, planar features extracted from both static and kinematic point clouds, and control information in a joint non-linear least-squares adjustment. This is neither just a transformation of the kinematic onto the static point cloud nor a simple correction of the trajectory in e.g., a strip adjustment, but rather a tightly coupled adjustment of static and kinematic data. This approach avoids the need for additional survey control for kinematic data by leveraging the static scan data as a proxy, enabling accurate georeferencing even in scenarios where the individual datasets cannot be reliably tied to control points. Results show that the co-registration notably improves the relative consistency of kinematic datasets with respect to a static reference. Such co-registration enables new use-cases for multi-modal data acquisition, such as change-detection in repeated kinematic data acquisitions with respect to a static reference dataset, or more flexible ways of integrating ground control in kinematic surveys.

9:15am - 9:30am

Position and Orientation from Asynchronous Lidar in GNSS Denied Environments

Craig Glennie, Francisco Haces-Garcia

University of Houston, United States of America

This study investigates the use of a distributed asynchronous lidar system for augmented position and orientation determination in Global Navigation Satellite Systems (GNSS) denied environments. An asynchronous lidar design is one in which the laser transmitter and detectors/receivers are disconnected and carried on separate platforms. This unique geometry offers observational redundancy that can be used to estimate the trajectory of the receiver platforms. The paper presents the results of simulation experiments, first examining single-epoch solutions and then considering estimates of position and orientation along simulated flight trajectories. The results show that as long as the laser transmitter is operated above the GNSS-denied environment, the system is able to simultaneously estimate position and orientation for multiple receiver drones, even for extended periods of GNSS outages. The accuracy of position and orientation estimation depends on the exact flight path and the number of lidar receivers in the solution, but with favorable geometry the accuracy of position estimation can approach that provided by a high-precision GNSS solution.

9:30am - 9:45am

Extraction of Image-to-Lidar Correspondences and their Impact on Optimal Sensor Fusion

Kyriaki Mouzakidou, Aurélien Arnaud Brun, Jan Skaloud

Earth Sensing & Observation Laboratory (ESO), Ecole Polytechnique Fédérale de Lausanne (EPFL), Switzerland

This work extends our initial proof-of-concept via emulations on the benefits of relative spatial constraints between imagery and lidar point clouds in a factor-graph-based optimization with satellite positioning (GNSS) and raw inertial readings (Mouzakidou et al., 2025). Here, we demonstrate practically the automatic extraction and integration of 2D-3D correspondences established in the 3D domain within rough natural terrain flown over by an aircraft with sensors of high quality. We show that considering cross-domain (i.e. 2D-3D) constraints enables the calibration of internal camera parameters and its boresight on the job, i.e. within mapping flight configurations, where conventional approaches fail. The common optimization of raw IMU data with such constraints improves the respective agreements between the lidar and image dense clouds, achieving consistency at ground resolution level, which is not the case for the conventional (standard) processing of acquired data.

9:45am - 10:00am

GNSS-Constrained Motion Estimation for Robust Visual-Inertial-Odometry Initialization

Chunqi Dai, Sagi Filin

Technion - Israel Institute of Technology, Haifa, Israel

Visual-inertial odometry (VIO) plays a key role in modern navigation and mapping systems.

For their successful integration, an initialization phase, in which IMU-related bias factors are estimated, becomes a fundamental step.

Without one, the subsequent nonlinear estimation of the platform pose may fail to converge or completely diverge.

As reliance on visual and inertial information may exhibit instability due to error accumulation with time, incorporating absolute positioning information through global navigation satellite system (GNSS) measurements may enhance its robustness and accuracy.

Accordingly, GNSS and visual-inertial initialization frameworks have been receiving growing attention in recent years. Current strategies tend to follow a loosely-coupled formulation that first initializes the VIO trajectory and then aligns it with GNSS measurements.

Such strategies are multi-stage, nonlinear, and computationally expensive, motivating us to introduce an alternative framework in which GNSS position is integrated with the raw visual-inertial measurements to form absolute translation constraints.

Accordingly, we achieve a closed-form, linear and globally consistent drift-free solution which is computationally efficient and requires neither 3D reconstruction nor nonlinear refinement, as common approaches do.

Testing our initialization formulation on benchmark multi-sensor datasets, results show that we outperform current baselines while exhibiting robustness in challenging scenarios.

1:30pm - 3:00pm

WG I/3: Multispectral, Hyperspectral and Thermal Sensors
Location: 714A

1:30pm - 1:45pm

First Field Validation of a New VNIR/SWIR-Based Six-Band Multi-Camera System for UAVs over Winter Wheat

Alexander Jenal¹, Fabian Reddig², Andreas Bolten², Leon Vehlken², Hubert Hüging³, Thuy Huu Nguyen³, Jens Bongartz¹, Georg Bareth²

¹Application Center for Machine Learning and Sensor Technology (AMLS), University of Applied Sciences Koblenz, Germany; ²Institute of Geography, GIS & Remote Sensing Group, University of Cologne, Germany; ³Institute of Crop Science and Resource Conservation (INRES), University of Bonn, Germany

Shortwave infrared (SWIR) imaging from uncrewed aerial vehicles (UAVs) remains rare despite strong sensitivity to canopy water and protein. We present the first field validation of a six-band VNIR/SWIR multi-camera system designed for plot-scale monitoring of winter wheat using mid-sized UAVs. The payload utilized narrow bandpass filters (910, 980, 1100, 1200, 1510, and 1650 nm; FWHM 10–12 nm) and was operated at an altitude of approximately 30 meters above ground level, achieving a ground sampling distance of approximately 4 cm. Empirical line calibration, employing in-scene gray panels, was validated against material-distinct panels and spectroradiometer measurements. The spectral response functions were approximated using Gaussian convolution due to the narrow passbands. Five bands (980–1650 nm) exhibited excellent performance: empirical line model fits achieved R² values approaching 1.000 (RMSE = 0.003–0.009), independent panel validation demonstrated near-unity slopes (R² = 0.998–0.999; RMSE = 0.005–0.013), and plot-level canopy measurements (n=36) maintained strong agreement between camera and spectroradiometer (slopes = 0.943–1.079; R² = 0.58–0.85; RMSE = 0.010–0.023). Two SWIR normalized ratio indices exhibited robust cross-sensor agreement: NRI[1100,1200] (R² ≈ 0.93) and NRI[1650,1510] (R² ≈ 0.90). The 910 nm channel displayed systematic errors (slope = 0.442±0.040 for plots; MAPE ≈ 33%) due to identified out-of-band leakage from incomplete long-wave blocking, leading to its exclusion from accuracy claims. Mitigation strategies include higher optical density short-pass blocking and system-level spectral response function verification. The filter-reconfigurable payload provides quantitative reflectance and robust SWIR indices at the plot scale by integrating panel-anchored empirical line modeling with bandpass-aware harmonization, thereby advancing operational SWIR monitoring capabilities for precision agriculture applications.
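The two processing steps named in this abstract, empirical line calibration against gray panels and normalized ratio indices, can be sketched in a few lines. This is a hedged illustration, not the authors' pipeline: the panel digital numbers and reflectances are hypothetical, the function names are invented here, and the index is assumed to follow the common normalized-difference form NRI[a,b] = (R_a − R_b) / (R_a + R_b), which the abstract does not spell out.

```python
def fit_empirical_line(panel_dn, panel_reflectance):
    """Ordinary least-squares line mapping digital numbers to reflectance."""
    n = len(panel_dn)
    mean_dn = sum(panel_dn) / n
    mean_ref = sum(panel_reflectance) / n
    sxx = sum((dn - mean_dn) ** 2 for dn in panel_dn)
    sxy = sum(
        (dn - mean_dn) * (ref - mean_ref)
        for dn, ref in zip(panel_dn, panel_reflectance)
    )
    gain = sxy / sxx
    offset = mean_ref - gain * mean_dn
    return gain, offset


def normalized_ratio_index(reflectance_a, reflectance_b):
    """Assumed form of NRI[a,b]: (R_a - R_b) / (R_a + R_b)."""
    return (reflectance_a - reflectance_b) / (reflectance_a + reflectance_b)


# Hypothetical in-scene gray panels for one band: DN versus known reflectance.
gain, offset = fit_empirical_line([1000, 2000, 3000], [0.10, 0.30, 0.50])

# Convert two hypothetical canopy DNs (1100 nm and 1200 nm bands) and form the index.
# For brevity the same line is reused; in practice each band has its own fit.
r_1100 = gain * 2500 + offset
r_1200 = gain * 2000 + offset
print(normalized_ratio_index(r_1100, r_1200))
```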

1:45pm - 2:00pm

PanX.4: A Gyrocopter‑Borne Six‑Band VNIR Multicamera System for Sentinel-2‑Aligned Multitemporal Vegetation Monitoring

Alexander Jenal¹, Felix Kröber²,⁵, Christopher Frank³, Lina Krisztian⁴, Markus Metz⁴, Ribana Roscher⁵, Jens Bongartz¹

¹Application Center for Machine Learning and Sensor Technology (AMLS), University of Applied Sciences Koblenz, Germany; ²Institute of Bio- and Geosciences, Forschungszentrum Jülich, Germany; ³CISS TDI GmbH, Germany; ⁴mundialis GmbH & Co. KG, Germany; ⁵Institute of Geodesy and Geoinformation, University of Bonn, Germany

This contribution presents PanX.4, a gyrocopter-borne six-band VNIR multicamera system developed within the KIBI project on AI-based identification and classification of protected plant communities (mFUND, FKZ 19F2276) to support cross-scale monitoring at Natura 2000 sites. The system is designed for spectral alignment with Sentinel-2 MSI bands B02–B06 and B08 and is integrated into a tri-sensor airborne suite on the FlugKit carrier platform together with a high-resolution RGB camera and a complementary six-band VNIR–SWIR imaging system. Using system-level spectral response characterization and spectral band adjustment factor (SBAF) analysis based on 1,057 ECOSTRESS spectra, the study quantifies the harmonization quality between PanX.4 and Sentinel-2A, S2B, and S2C. All bands achieved R² > 0.99, while comparative screening of alternative spectral configurations showed that careful band design is critical, particularly in the red-edge region. An additional inter-satellite sensitivity analysis further indicates that harmonization should account for band-dependent differences between Sentinel-2 units when multitemporal airborne and satellite observations are combined. To support multitemporal habitat monitoring, the paper also analyzes 86,947 first-mowing observations from 2017 to 2024 and derives a three-window acquisition concept synchronized with pre-mowing, post-regrowth, and senescence phases. This creates an operationally relevant framework for planning repeated airborne campaigns that can support validation, boundary refinement, and future machine-learning workflows for habitat classification. The contribution therefore establishes the sensor-design, spectral-harmonization, and temporal-planning basis for Sentinel-2-consistent airborne monitoring at sub-meter resolution. Operational airborne image products and in-flight validation are beyond the present contribution and form the next step for future deployment.

2:00pm - 2:15pm

Atmospheric correction of aerial imagery using satellite-derived reflectance data

Alexane Nghien, Manchun Lei, Mathieu Brédif

Univ Gustave Eiffel, Géodata Paris, IGN, LASTIG

Atmospheric correction of large-scale aerial imagery remains a major challenge, mainly due to the difficulty of accurately estimating atmospheric parameters within the images. This study proposes a novel atmospheric correction method based on satellite-derived Surface Reflectance (SR). The method is a semi-empirical linear correction approach that leverages Pseudo-Invariant Features (PIFs) as reference points. Experimental results show that the proposed method achieves performance comparable to radiative transfer model approaches when accurate atmospheric parameters are available, and provides more reliable corrections when such parameters are uncertain or unavailable.

2:15pm - 2:30pm

Abundance Estimation Methods in Spectral Unmixing for Real Data

Daniele Cerra, Miguel Pato, Emiliano Carmona

German Aerospace Center (DLR), Germany

Spectral unmixing estimates the fractional abundances of materials, having associated spectra called endmembers, in pixels acquired by imaging spectrometers. Validation of abundance estimation methods typically relies on synthetic data or comparisons to results obtained by other algorithms. This study considers results of typical abundance estimation algorithms on the DLR HySU (HyperSpectral Unmixing) benchmark dataset, which contains actual imaging spectrometer data acquired over several arrangements of known-size material patches for physically traceable validation. Abundance estimates are compared against measured target areas in pixels with different degrees of mixture. We evaluate least squares and sparse unmixing methods across different noise scenarios on real data, and by contaminating the library through addition of non-relevant endmembers. Additionally, as a way to approximate hard sparsity constraints, we enforce cardinality constraints on endmember subsets, identifying those minimizing abundance errors relative to the full library. Results suggest that fully constrained least squares usually yields the best results, but struggles in cases of highly mixed pixels. Finally, we test quantization of abundance values as a way to enforce sparsity in non-negative least squares, with limited but encouraging results. Overall, the increase in accuracy obtained by enforcing sparse solutions supports the use of computationally efficient sparse unmixing methods in practical scenarios, part of which may become feasible if quantum computing capabilities improve in the future.
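To make the idea of fully constrained least squares concrete, the sketch below handles only the two-endmember special case, where the non-negativity and sum-to-one constraints reduce the problem to one abundance on the interval [0, 1] and the solution is the clipped projection onto the line between the endmembers. The spectra are hypothetical and the function name is invented for this example. The general multi-endmember solvers evaluated in the study are not reproduced here.

```python
def fcls_two_endmembers(pixel, endmember_1, endmember_2):
    """Abundances (a1, a2) with a1 + a2 = 1 and both >= 0 for two endmembers.

    Minimizes ||pixel - (a1 * e1 + (1 - a1) * e2)||^2 over a1 in [0, 1].
    The problem is a one-dimensional convex quadratic, so clipping the
    unconstrained minimizer gives the constrained optimum.
    """
    direction = [e1 - e2 for e1, e2 in zip(endmember_1, endmember_2)]
    residual = [p - e2 for p, e2 in zip(pixel, endmember_2)]
    numerator = sum(r * d for r, d in zip(residual, direction))
    denominator = sum(d * d for d in direction)
    a1 = min(1.0, max(0.0, numerator / denominator))
    return a1, 1.0 - a1


# Hypothetical three-band endmember spectra and a 30 % / 70 % mixed pixel.
e1 = [0.1, 0.2, 0.3]
e2 = [0.5, 0.6, 0.4]
mixed = [0.3 * a + 0.7 * b for a, b in zip(e1, e2)]
print(fcls_two_endmembers(mixed, e1, e2))
```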

2:30pm - 2:45pm

Operational Band-to-Band Correction and Attitude Refinement of Pelican-2: dual-panchromatic Attitude Restitution and selective Bundle Adjustment with preliminary Application to Earthquake Displacement and DEM Generation

Saif Aati, Antonio Martos, Eric L. Peters, Frank Warmerdam, Graham Mills, Adam Weber, Luna Gray, Minh Radel

Planet Labs PBC

The Pelican satellite constellation, first launched by Planet Labs in 2025, continues the high-resolution imaging capability established by the SkySat program. The change to a pushbroom sensor in Pelican presents new geometric challenges: satellite attitude variations and platform instabilities during acquisitions can produce band misregistration and geolocation errors that degrade downstream products. This paper presents an operational workflow developed for Pelican imagery, validated on Pelican-2, a technology demonstration satellite. The approach exploits the dual-panchromatic focal plane configuration to independently measure satellite wobble to greater accuracy than the onboard attitude sensors, combined with selective bundle adjustment and B-spline spatial correction to achieve sub-pixel band alignment without dense ground control points. Validation on 963 Pelican-2 scenes demonstrates sub-pixel band-to-band registration accuracy (RMSE < 0.12 px) and 4 m CE90 geolocation accuracy. Applications illustrate the potential for operational geoscience workflows: earthquake surface displacement mapping of the March 2025 Myanmar M7.7 rupture detects 4.0 m co-seismic offsets on the Sagaing Fault with minimal post-processing, and digital surface model generation from an opportunistic multi-view acquisition yields preliminary elevation products free of jitter artifacts. Results establish feasibility for constellation-scale processing and inform next-generation Pelican development.

3:30pm - 5:15pm

WG III/4A: Landuse and Landcover Change Detection
Location: 714A

3:30pm - 3:45pm

ChangeDINO: DINOv3-Driven Building Change Detection in Optical Remote Sensing Imagery

Ching-Heng Cheng¹, Chih-Chung Hsu²

¹National Cheng Kung University, Tainan, Taiwan; ²National Yang Ming Chiao Tung University, Hsinchu, Taiwan

Remote sensing change detection (RSCD) aims to identify pixel-wise surface changes from co-registered bi-temporal images. However, many deep learning–based RSCD methods rely solely on change-map annotations and underuse the semantic information in non-changing regions, which limits robustness under illumination variation, off-nadir views, and scarce labels.

This paper presents ChangeDINO, an end-to-end multiscale Siamese framework for optical building change detection. The model fuses a lightweight backbone stream with features transferred from a frozen DINOv3, yielding semantic- and context-rich pyramids even on small datasets. A spatial–spectral differential transformer decoder then exploits multi-scale absolute differences as change priors to highlight true building changes and suppress irrelevant responses. Finally, a learnable morphology module refines the upsampled logits to recover clean boundaries. Experiments on four public benchmarks demonstrate that ChangeDINO achieves strong accuracy and robustness under cross-temporal appearance variations, yielding cleaner building boundaries with improved data efficiency.

3:45pm - 4:00pm

Hie-DinoMamba: Hierarchical DINOv3 and Mamba Architecture for Multi-Class Building Change Detection

Youngwoong Yoon¹, Jangwoo Cheon¹, Hwiyoung Kim¹, Impyeong Lee²

¹Geospatial Team, Innopam, Seoul, Republic of Korea; ²Department of Geoinformatics, University of Seoul, Seoul, Republic of Korea

Multi-class building change detection in high-resolution aerial imagery is essential for urban monitoring, yet remains challenging due to severe class imbalance and the limited representational capacity of encoders trained from scratch. We propose Hie-DinoMamba, a novel architecture that integrates a frozen 1.1B-parameter DINOv3-L encoder—pre-trained on the SAT-493M satellite dataset—with a newly designed Hierarchical Mamba FPN decoder. To bridge the domain gap between satellite pre-training and aerial imagery without incurring prohibitive computational costs, we adapt the encoder using parameter-efficient Low-Rank Adaptation (LoRA), updating only a small fraction of parameters while preserving the encoder's rich pre-trained knowledge. The decoder fuses multi-scale feature pairs from both time points via channel-wise concatenation and 1×1 projection, then refines them in a top-down manner using Visual State Space Model (VSSM) blocks that capture long-range spatial context with linear complexity. A dual-loss strategy decouples semantic classification (Focal Loss) from boundary delineation (Focal Tversky + Dice Loss), optimizing each objective at a different hierarchical level. On a 4-class aerial building change detection benchmark (41,548 image pairs, 0.1 m resolution, Seoul), Hie-DinoMamba achieves a state-of-the-art mIoU of 65.12% and Kappa of 75.77%, improving over the strongest baseline by 2.1 percentage points. An ablation study confirms that LoRA adaptation is the most critical component. Qualitative analysis further demonstrates robust generalization to geographically unseen regions.
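The focal loss used for the semantic branch addresses class imbalance by down-weighting pixels that are already classified well. The sketch below shows the widely used binary form, FL = −α_t (1 − p_t)^γ log(p_t), for a single pixel. It is an assumption-laden illustration: the abstract describes a 4-class setting and gives no hyperparameters, so the binary simplification and the default values of `gamma` and `alpha` are choices made here, not the authors' configuration.

```python
import math


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    """Focal loss for one pixel.

    p: predicted probability of the positive class.
    y: ground-truth label, 1 (positive) or 0 (negative).
    gamma, alpha: illustrative defaults, not taken from the abstract.
    """
    p = min(max(p, eps), 1.0 - eps)          # guard against log(0)
    p_t = p if y == 1 else 1.0 - p           # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# A confident correct prediction contributes far less than an uncertain one.
print(binary_focal_loss(0.95, 1), binary_focal_loss(0.55, 1))
```

With `gamma=0` the modulating factor disappears and the expression reduces to class-weighted cross-entropy, which is a convenient sanity check.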

4:00pm - 4:15pm

Stepwise Optimization and Ensemble Pipeline for Building Change Detection in High Resolution Satellite Imagery Using Mamba-Based Model

DongHyuk Jin¹, Junhwa Chi²

¹Department of Data Engineering, Pukyong National University, Busan, Republic of Korea; ²Division of Data Information Sciences, Pukyong National University, Busan, Republic of Korea

This study presents a stepwise optimization pipeline for high-resolution building change detection in dense urban environments using imagery from CAS500-1, Korea’s national land observation satellite. A dataset of 3,816 bi-temporal patch pairs from 29 urban regions was constructed to support model development and evaluation. A Mamba-based architecture, incorporating efficient global context modeling, was adopted as the baseline for binary change detection.

To enhance performance, the pipeline introduced three sequential optimization stages. First, normalization techniques suited for 12-bit radiometric imagery were compared, including percentile-based scaling, gamma adjustment, and logarithmic transformation. Second, augmentation strategies were evaluated, contrasting standard geometric augmentation with extended optical and temporal augmentation designed to improve generalization in structurally complex urban environments. Third, multiple ensemble strategies, ranging from simple averaging to confidence-weighted and hierarchical aggregation, were examined to overcome the limitations of individual model sizes.
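The three normalization families compared in the first stage can be sketched for 12-bit digital numbers as follows. This is a minimal illustration under stated assumptions: the percentile bounds, the gamma value, and the function names are chosen here for demonstration and are not the settings reported by the authors.

```python
import math

MAX_12BIT = 4095  # largest value representable in 12 bits


def percentile(values, q):
    """Linearly interpolated percentile of a non-empty list, q in [0, 100]."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100.0
    lower = math.floor(position)
    upper = math.ceil(position)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (position - lower)


def percentile_scale(values, low_q=2.0, high_q=98.0):
    """Stretch values between two percentiles and clip to [0, 1]."""
    low = percentile(values, low_q)
    high = percentile(values, high_q)
    span = (high - low) or 1.0  # avoid division by zero on flat patches
    return [min(1.0, max(0.0, (v - low) / span)) for v in values]


def gamma_adjust(values, gamma=0.5):
    """Scale to [0, 1] by the 12-bit maximum, then apply a power law."""
    return [(v / MAX_12BIT) ** gamma for v in values]


def log_transform(values):
    """Logarithmic compression mapped to [0, 1]."""
    return [math.log1p(v) / math.log1p(MAX_12BIT) for v in values]


patch = [0, 100, 200, 300, 400, 4095]  # hypothetical 12-bit pixel values
print(percentile_scale(patch))
print(gamma_adjust(patch))
print(log_transform(patch))
```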

Model performance was assessed using a comprehensive set of pixel-level, change-pixel-level, contour-based, and object-based metrics to ensure robust evaluation of both spatial precision and structural consistency. Experimental results showed that gamma-based normalization, comprehensive augmentation, and selected ensemble strategies each contributed measurable improvements. Combining these optimized components yielded a final hierarchical ensemble that improved the F1-Score from 0.7629 to 0.8070, representing a substantial gain over the baseline model.

Overall, this work provides a validated and extensible optimization strategy for high-resolution satellite-based change detection, offering practical guidance for operational applications and adaptability to future ensemble configurations across diverse architectures.

4:15pm - 4:30pm

Leveraging Geospatial Foundation Models for Bi-Temporal Land-Cover Change Detection

Mozhdeh Shahbazi, Mikhail Sokolov, Charles Authier, Marjan Asgari

Canada Centre for Mapping and Earth Observation, Natural Resources Canada, Canada

Recent advances in geospatial foundation models have enabled scalable and transferable solutions for Earth observation (EO) tasks, which can make them good candidates to achieve the requirements mentioned above. Foundation models are large-scale artificial intelligence (AI) models trained on massive and diverse datasets. In the EO domain, these datasets may include imagery, elevation models, geographic coordinates, temporal tags, sensor spectral information, and descriptive metadata. These models excel at representation learning through self-supervised training, enabling them to capture rich descriptive features without requiring labelled data. Consequently, they can serve as powerful backbones for downstream tasks such as land-cover change monitoring.

Accordingly, this paper provides an overview of the development process of a geospatial foundation model, Planaura. It demonstrates how this model is best adapted to Canadian landscapes and how it is used to achieve the task of land-cover change detection. Planaura is now accessible publicly via the model hub at HuggingFace: [Link hidden for blind review process]

4:30pm - 4:45pm

A Transformer-Based Framework for Spatiotemporal Unmixing of Land–Water Mixtures in Multispectral Satellite Data

An Bao Nguyen¹, Andreas Schenk², Stefan Hinz²

¹KU Leuven, Leuven, Belgium; ²Karlsruhe Institute of Technology, Karlsruhe, Germany

This paper presents a novel transformer-based framework for spatiotemporally dynamic spectral unmixing of multispectral satellite imagery. Spectral unmixing is essential for analyzing mixed pixels in remote sensing, especially for small objects such as narrow rivers observed at coarse resolution, for example in Sentinel-2 data. Most deep-learning-based unmixing models account for a single scene and ignore the tempo-spatial variation of spectra and land-cover proportions.

To address this challenge, we introduce a unified deep learning architecture that leverages transformer attention mechanisms to exploit both spectral and auxiliary information causing spectral variations. The framework models the temporal and spatial evolution of abundances while simultaneously learning representative endmember spectra. By integrating cross-attention between spectral inputs, auxiliary variables, and temporal embeddings, the model can adapt to seasonal changes, illumination conditions, and scene-specific variability. The method is trained using synthetic mixtures derived from Sentinel-2 surface reflectance data.

Applied to monitoring small rivers with strong temporal, spatial, and intrinsic variability, the proposed approach demonstrates improved accuracy in estimating water abundances and extracting water spectra in highly mixed river pixels (mixtures of water and riverbank). The model effectively captures tempo-spatial transitions in water extent and sediment-laden river inflows, offering a more consistent representation than conventional unmixing techniques.

This work contributes a generalizable and end-to-end framework for handling dynamic unmixing scenarios in multispectral remote sensing. It provides new insights into the use of transformers for modeling spatiotemporal interactions and supports applications in environmental monitoring and water resource assessment.

4:45pm - 5:00pm

Land Cover Classification of Optical–SAR Imagery via Cross-Modal Interaction and Feature Alignment

Junqi Zhao, Min Chen, Wei Guo, Jinbo Zhang, Zelan Fu, Xuming Ge, Han Hu, Bo Xu, Qing Zhu

Faculty of Geosciences and Engineering, Southwest Jiaotong University, Chengdu, 611756, China

Land cover classification (LCC) plays a crucial role in geoscientific research and resource monitoring applications. Compared with traditional single-modal classification methods, multimodal fusion models can more effectively leverage the complementary information of optical and synthetic aperture radar (SAR) imagery, thereby improving classification performance in complex scenarios. However, due to the significant differences in the imaging mechanisms of the two sensors, inconsistencies in radiometric properties and spatial structures arise between optical and SAR images, posing challenges for cross-modal feature interaction and fusion. To address this issue, we propose a multimodal optical–SAR fusion network (MOSFNet) for high-precision LCC, which incorporates two core modules: the Feature Interaction Module (FIM) and the Feature Fusion Module (FFM). The FIM achieves complementary feature interaction between optical and SAR images through channel splitting and cross concatenation, while incorporating a coordinate attention mechanism to enhance the responsiveness of key land cover regions. The FFM leverages a 2D selective scan (SS2D) mechanism to implement bidirectional cross-modal feature alignment and gated fusion in the hidden state space, enabling deep correlation and adaptive integration of optical and SAR features. Experiments on the WHU-OPT-SAR dataset demonstrate that MOSFNet significantly outperforms existing methods in terms of classification accuracy and model generalization, providing an efficient and robust solution for high-precision land cover mapping with multi-source remote sensing imagery.

5:00pm - 5:15pm

Seasonal-Aware Scale-Semantic Consistency Alignment Change Detection Network

Bing Shao¹, Hanchao Zhang¹, Mingzhu Li², Yunkun Zou³, Ruiqian Zhang¹, Xiaogang Ning¹, Hao Wang¹

¹Chinese Academy of Surveying and Mapping, Beijing, China; ²Liaoning Technical University, Geomatics and Geographical Sciences, Fuxin, China; ³Joint Laboratory of Spatial Intelligent Perception and Large Model Application, Nanjing, China

Change detection in remote sensing imagery is a crucial method for obtaining dynamic information about land cover. However, pseudo-changes caused by seasonal variations pose a significant challenge to detection accuracy. Seasonal variations, such as vegetation phenology and snow cover, introduce global appearance differences that are often mistaken for actual land cover changes. This phenomenon is particularly prominent in long-term monitoring tasks, where pseudo-changes dominate the detection results. To address the global appearance differences and multi-scale feature fusion issues induced by seasonal changes, we propose a novel Seasonal-Aware Scale-Semantic Consistency Alignment Change Detection Network (SSCANet) for remote sensing image change detection. This approach incorporates a Seasonal-Aware Scale Alignment (ASA) module and a Seasonal-Aware Semantic Guided Fusion (SGF) module. By employing spatial scale transformation and semantic alignment, it reduces information mismatch in multi-scale feature fusion and enhances the perception of details in change regions. Experiments conducted on the GZ-CD and CDD datasets demonstrate that SSCANet achieves F1 scores of 89.21% and 97.82% and precision rates of 89.02% and 98.37%, respectively. These results represent significant improvements over other methods, demonstrating that SSCANet outperforms its counterparts in both overall accuracy and seasonal robustness. The findings confirm that this approach effectively suppresses seasonal false changes, enhancing the accuracy and reliability of change detection.
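Because F1 is the harmonic mean of precision and recall, the recall implied by the figures quoted above can be recovered with a short calculation. The sketch below assumes the reported F1 and precision refer to the same class and evaluation set, which the abstract does not state explicitly. The derived recall values are therefore an inference from the quoted numbers, not results reported by the authors.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


def recall_from_f1_and_precision(f1, precision):
    """Solve F1 = 2PR / (P + R) for R, giving R = F1 * P / (2P - F1)."""
    return f1 * precision / (2.0 * precision - f1)


# F1 and precision as quoted in the abstract for the two datasets.
for name, f1, precision in (("GZ-CD", 0.8921, 0.8902), ("CDD", 0.9782, 0.9837)):
    recall = recall_from_f1_and_precision(f1, precision)
    print(f"{name}: implied recall = {recall:.2%}")
```

Under that assumption the implied recall is about 89.4% for GZ-CD and about 97.3% for CDD.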