Conference Agenda

Overview and details of the sessions of this conference.

Daily Overview

Location: 714B
175 theatre

Date: Thursday, 09-July-2026

8:30am - 10:00am

WG III/9: Geospatial Environment and Health Analytics
Location: 714B

8:30am - 8:45am

Urban Livability Analysis Based on Multi-Source Remote Sensing Data

Hailun Dai¹,², Yu Liu¹, Tao Zhang¹, Guanghui Wang¹, Lijuan Zheng¹, Lei Miao¹, Ting Liu¹, Yunjia Zou¹, Wei Zhang¹,³, Jin Liu¹,²

¹Land Satellite Remote Sensing Application Center, Ministry of Natural Resources, People's Republic of China; ²Beijing SatImage Information Technology Co. Ltd, China; ³China University of Geosciences, Beijing, 100083, P. R. China

Against the background of city physical examination and assessment in territorial spatial planning, urban livability has become a focus of interest. Urban livability reflects residents' overall satisfaction with their living environment. Previous studies have been constrained by low data precision, coarse spatial scales, and limited practical applicability. To address these limitations, this study developed a refined livability evaluation framework using multi-source remote sensing data, with a primary emphasis on high-resolution domestic satellite imagery, including Gaofen (GF-1) and Ziyuan (ZY-3). Integrating Suomi NPP night-time light data and socio-economic datasets, the research assessed four key dimensions (safety and resilience, residential comfort, recreation convenience, and quality and vitality) in the cities of Wuhan and Yibin at a detailed kilometer-grid scale. Results revealed distinct spatial patterns of urban livability in the two cities: Wuhan's central urban areas exhibited higher, more clustered livability, driven largely by quality and vitality, whereas Yibin showed a more fragmented pattern with strengths in recreation convenience but relative weaknesses in residential comfort and urban vitality. This study underscores the significant value of high-resolution, multi-source remote sensing data in enabling precise, spatially explicit livability analysis, thereby providing a scientific basis for targeted spatial planning and urban quality enhancement.

8:45am - 9:00am

Integrated Remote Sensing and GIS-Based Assessment of Urban Morphology, Waterlogging, and Dengue Hotspots in Chennai (2021–2023)

Swetha Sureshkumar, Sulochana Shekhar

Central University of Tamil Nadu, India

Dengue transmission in rapidly urbanising tropical cities is shaped by the combined influence of climate variability, urban morphology, and short-term surface water dynamics. This study develops a remote sensing and GIS-based framework to investigate the interaction between built-up density, waterlogging, and dengue incidence in Chennai from 2021 to 2023. Multi-source datasets, including Sentinel-2 imagery, NICFI high-resolution LULC, NDVI, and NDWI indices, Google Open Buildings footprints, IMD daily climate variables, and geocoded dengue case records, were integrated into a harmonised spatial grid for systematic analysis. Waterlogging-prone zones were delineated using a Sentinel-2 water-frequency method to capture the post-rainfall surface water accumulation rather than only persistent water bodies. Spatial clustering of dengue cases was examined using kernel density estimation, Global and Local Moran’s I, and Getis-Ord Gi*, revealing strong spatial autocorrelation and persistent hotspots in older, densely built neighbourhoods such as Kodambakkam, Adyar, Guindy, Saidapet, and Velachery, where compact built-up patterns and drainage limitations facilitate vector breeding. Peripheral areas showed weaker clustering and lower disease intensity. To assess the climatic influences, a Distributed Lag Non-linear Model (DLNM) was employed to quantify the delayed and non-linear effects of rainfall, maximum temperature, and minimum temperature on dengue incidence. Results showed notable lagged responses, with rainfall and minimum temperature exhibiting strong delayed associations aligned with mosquito development and viral incubation cycles. By integrating climatic, hydrological, and urban structural metrics, this study provides a replicable geospatial workflow for identifying micro-scale dengue-risk environments, supporting evidence-based vector-control strategies and climate-resilient urban planning in tropical cities.
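As a rough illustration of the global spatial autocorrelation statistic named in the abstract above, the following sketch computes Global Moran's I for a short row of grid cells. The case counts, the simple neighbour-adjacency weights, and the function names `global_morans_i` and `chain_weights` are illustrative assumptions, not the authors' data or implementation.

```python
def global_morans_i(values, weights):
    """Global Moran's I for values x_i and a spatial weights matrix w_ij."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_w = sum(sum(row) for row in weights)
    # Cross-products of deviations between every pair of cells, weighted by adjacency.
    cross = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    var_sum = sum(d * d for d in dev)
    return (n / total_w) * (cross / var_sum)


def chain_weights(n):
    """Binary adjacency for n cells in a single row (hypothetical toy layout)."""
    return [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)] for i in range(n)]


# Hypothetical case counts per grid cell, not taken from the study.
clustered = [1, 1, 1, 5, 5, 5]
alternating = [1, 5, 1, 5, 1, 5]
w = chain_weights(6)

print("clustered:", global_morans_i(clustered, w))
print("alternating:", global_morans_i(alternating, w))
```

The clustered row yields a positive value (similar counts sit next to each other), while the alternating row yields a negative one. A real analysis would use the study grid's contiguity or distance-based weights and a permutation test for significance.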

9:00am - 9:15am

From Pixels to Pathogens: Multi-Scale Environmental Modeling of Tick-Borne Disease Risk

Kirsten Noltie, Dongmei Chen

Queen's University, Canada

Ticks are key vectors of human and animal disease, with Borrelia burgdorferi sensu stricto, the causative agent of Lyme disease, posing the greatest risk in North America. In Canada, Lyme disease cases are rising as the blacklegged tick (Ixodes scapularis) expands northward, driven by climate change, land cover shifts, and host movement. The Kingston, Frontenac, Lennox and Addington (KFL&A) region is a well-established hotspot, highlighting the importance of mechanistic models that realistically represent heterogeneous environmental drivers of transmission.

This study integrates multi-sensor Earth observation (MODIS, GEDI, Landsat) with climate, habitat, and ecological data to improve mechanistic tick phenology models. A hierarchical framework incorporates microclimate, landscape, and regional variables, enabling assessment of how sensor type, spatial resolution, and environmental gradients influence seasonal tick activity predictions. Model calibration and validation use field-collected tick and pathogen data, supplemented by citizen science observations.

By systematically linking EO to disease modeling, this approach improves the representation of environmental drivers, enhances predictive performance, and supports public health planning. The framework is transferable to other vector-borne diseases, advancing the integration of remote sensing into epidemiological forecasting at regional to national scales.

9:15am - 9:30am

Detection of Illegal Landfills on Satellite Imagery Using a Multi-agent Framework

Yehor Lytvynov¹, Viktoriia Hnatushenko¹,², Volodymyr Hnatushenko³, Christian Heipke²

¹Ukrainian State University of Science and Technologies; ²Leibniz University Hannover, Germany; ³Dnipro University of Technology

Illegal waste disposal sites pose significant ecological and public-health risks yet remain difficult to track with traditional field inspections. To address this, we propose a multi-agent detection framework that fuses textural, spectral, and contextual cues from medium-resolution satellite imagery. Three specialised agents (Waste-Pile, Road, and Industry detectors) are implemented as YOLO (You Only Look Once) convolutional models that generate partial hypotheses, which are then hierarchically aggregated through rule weights learned from expert-labelled samples. The system provides an interpretable set of object relations, allowing regulators to trace how individual cues contribute to the final decision. The method was validated on an independent test area near Taromske (Dnipropetrovsk region, Ukraine) and corroborated by ground surveys. Joint aggregation raised the posterior probability of the primary target cluster from 0.27 (single-detector confidence) to 0.91, while maintaining robustness to label noise and heterogeneous sensor characteristics. Compared with conventional CNN baselines, the proposed approach delivers three key advantages: explicit explainability of outputs, transferability to 10 m spatial resolution without extensive retraining, and seamless integration of heterogeneous evidence sources. The proposed framework can serve as a cost-effective backbone for regional and national waste-monitoring systems. Future work will focus on near-real-time processing of Sentinel-2 time series, incorporation of hyperspectral and thermal methane indicators to assess remediation stages, and extension of the feature set to other anthropogenic disturbances such as open-pit mining and construction debris.

9:30am - 9:45am

Building Deformation Monitoring and Safety Risk Assessment Based on PSI Technology

Naiyi Li¹,², Feng Zhao¹,², Wenqiang Yao¹,²

¹Shanghai Surveying and Mapping Institute, China; ²Shanghai Natural Resources Satellite Application Technology Center, China

Building on traditional PS-InSAR technology, this study proposes a building elevation estimation method based on long and short baseline iteration. It uses long time series of SAR images over multiple iterations to calculate building heights, which serve as prior information. Combined with the Interferometric Point Target Analysis (IPTA) method, it inverts building deformation information. K-means clustering is applied to the PS points, grouping points with similar deformation trends and mapping them to buildings. A building safety risk assessment system is established, which comprehensively evaluates the cumulative deformation and deformation rate of both the building structure and its foundation. The feasibility of the method is verified through a case study. The deformation of 9442 buildings is extracted in the study area, of which 245 buildings are in a high security risk state, and 2 buildings are in a high security risk state. The results provide urban construction management departments with decision-support data covering both the macro wide-area scale and individual buildings.

1:30pm - 3:00pm

WG III/8D: Remote Sensing for Agricultural and Natural Ecosystems
Location: 714B

1:30pm - 1:45pm

Spatial Aerodynamic Roughness of Forested Landscapes from Airborne LiDAR

Mahmoud Ahmed¹, Joris Timmermans¹, Roderik Lindenbergh¹, Massimo Menenti¹,²

¹Department of Geoscience and Remote Sensing, Delft University of Technology, The Netherlands; ²National Key Laboratory of Remote Sensing and Digital Earth, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, China

Accurately representing forest canopies in atmospheric models remains a major challenge due to the complex ways in which trees interact with airflow and modulate surface–atmosphere exchanges. Aerodynamic roughness is a key control variable in modelling frameworks related to air quality, meteorology, and atmospheric transport processes. In this study, we develop a physically based and spatially resolved framework to estimate aerodynamic roughness length from remote sensing observations. Specifically, using AHN (Actueel Hoogtebestand Nederland) airborne laser scanning data over a coniferous forest in Loobos, located within the Veluwe Natura 2000 region in the central Netherlands, we derive geometric roughness parameters and compare them qualitatively against eddy-covariance (EC) tower measurements at the site.

Results show that LiDAR-based roughness captures strong directional and structural variability driven by forest stand height and canopy heterogeneity, patterns that closely align with the anisotropy observed in the EC-derived displacement height and roughness length. Seasonal differences between leaf-on and leaf-off conditions further demonstrate the importance of canopy phenology in shaping aerodynamic behaviour. The spatial patterns resolved by the AHN data underscore the capacity of high-resolution laser scanning to reveal fine-scale canopy–atmosphere interactions that are entirely missed by traditional land-use-based roughness representations.

Additional opportunities remain for integrating complementary remote sensing observations (e.g., multispectral vegetation properties) to enhance the dynamical fidelity of the roughness estimates. The proposed framework provides an observation-driven pathway for parameterizing surface roughness, offering substantial potential for improving land-use representations in wind-flow and chemical transport models such as LOTOS-EUROS.

1:45pm - 2:00pm

Forest Canopy Height Mapping in Tanzanian Tropical Rainforests Using Multimodal Remote Sensing Data and Machine Learning

Soheil Zaghain¹,², Seyed Ehsan Khankeshizadeh¹,², Sadegh Jamali¹, Torbern Tagesson³, Ernest William Mauya⁴, Ali Mohammadzadeh², Filbert Francis¹

¹Department of Technology and Society, Faculty of Engineering, Lund University, P.O. Box 118, 221 00 Lund, Sweden; ²Department of Photogrammetry and Remote Sensing, Faculty of Geodesy and Geomatics Engineering, K. N. Toosi University of Technology, Tehran 19967-15443, Iran; ³Department of Earth and Environmental Sciences, Lund University, Sölvegatan 12, SE-223 62 Lund, Sweden; ⁴Department of Forest Engineering and Wood Sciences, College of Forestry, Wildlife and Tourism, Sokoine University of Agriculture, Morogoro, Tanzania

Forest canopy height (FCH) is a critical biophysical parameter that characterizes forest structure and provides fundamental information for estimating above-ground biomass and carbon stocks. The Global Ecosystem Dynamics Investigation (GEDI) Level 2A (L2A) product offers accurate canopy height observations; however, its point-based nature constrains spatial continuity in FCH mapping. This study integrates multimodal remote sensing datasets for continuous FCH mapping in Tanzania’s West Usambara (WUSA) forest, recognized globally for its rich biodiversity and ecological significance. Remote sensing data, including Sentinel-1 polarizations (VV and VH), Sentinel-2 spectral bands and vegetation indices, and the SRTM digital elevation model (DEM), were integrated and matched with GEDI canopy height data, which served as the reference for FCH modelling. The optimal feature set was derived by evaluating the performance of several feature selection and extraction methods, including Principal Component Analysis (PCA), Minimum Noise Fraction (MNF), Recursive Feature Elimination (RFE), Sequential Feature Selection (SFS), and the Select K-Best approach using F-value and mutual information scoring functions. The feature set derived from RFE, comprising ten features from all data sources, demonstrated the highest accuracy and reliability in FCH modelling. Subsequently, four machine learning algorithms, namely Random Forest (RF), Gradient Boosting Regressor (GBR), Support Vector Regressor (SVR), and Ordinary Least Squares (OLS), were evaluated for FCH modelling. RF achieved a higher R² than GBR, SVR, and OLS, with differences of 0.9%, 8.7%, and 16.4%, respectively. Therefore, the RF model, as the most reliable model, was employed for FCH mapping across the WUSA forest.

2:00pm - 2:15pm

Comparing DeepLabv3+ and Depth Anything V2 on Canopy Height Model Prediction on a Continental Scale Dataset of Australia

Kevin Yuelin Qiu¹, Rewanth Ravindran², Nicolas Pucino³,⁴,⁵, Dimitri Bulatov¹, Shaun R. Levick⁶, Martin Brandt⁷, Dorota Iwaszczuk², Tim McVicar³

¹Scene Analysis Department, Fraunhofer IOSB Ettlingen, Germany; ²Remote Sensing and Image Analysis, Technical University of Darmstadt, Germany; ³CSIRO Environment, Canberra, ACT, Australia; ⁴Fenner School of Environment and Society, Australian National University, Canberra, ACT, Australia; ⁵Climate Friendly Pty Ltd, Sydney, NSW, Australia; ⁶CSIRO Environment, Urrbrae, SA, Australia; ⁷Department of Geosciences and Natural Resource Management, University of Copenhagen, Copenhagen, Denmark

Canopy height models (CHMs) are raster maps representing normalized tree canopy height above ground and are often used as co-products for estimating carbon storage, forest degradation, and biodiversity at regional to global scales. While airborne LiDAR delivers the most accurate canopy height (CH) measurements, its high cost and limited temporal coverage motivate the use of spaceborne (multispectral) imagery combined with machine learning.

In this study, we compare two distinct deep-learning approaches for continental-scale CHM estimation from 3 m PlanetScope imagery: (1) a CNN-based regression model (DeepLabv3+), and (2) a monocular depth-estimation model (Depth Anything V2) based on a foundation model. We train/fine-tune both models on a curated dataset of 16,973 pairs of airborne point cloud-derived CHMs and PlanetScope imagery of Australia using a stratified sampling scheme to ensure balanced representation of vegetation structural classes. We then evaluate their generalizability on independent validation sets across Australia, across different heights, and under limited-data scenarios.

Through extensive quantitative and qualitative analysis, we show that the DeepLab-based regression model outperforms Depth Anything across all evaluation metrics, partly because it can incorporate additional spectral channels. DeepLab also learns more effectively from less data. On our dataset, the conventional CNN-based regression model performs better than the fine-tuned foundation model.

2:15pm - 2:30pm

Data-Driven vs Functional Approaches for Regionally Transferable Biomass Modeling Using Airborne LiDAR

Maxim Okhrimenko¹,², Guillermo Castilla², Craig Coburn¹, Chris Hopkinson¹

¹University of Lethbridge, Canada; ²Canadian Forest Service, Canada

To address the critical challenge of regional transferability for ALS-based above-ground biomass (AGB) models, we developed and applied a rigorous leave-one-region-out cross-validation (LORO-CV) framework. This protocol integrates a <1 SE “near-zero” bias filter to ensure models are not just accurate, but statistically free of regional bias.

With this framework, we compared two distinct modeling methods: a data-driven Best-Subset Selection (BSS) method and a Functional Regression (FR) method. The analysis was based on 163 field plots and co-located multispectral Titan ALS data from four regions in the Taiga Plains ecozone, Canada.

The BSS method identified a transferable linear model using height skewness, p95, and an intensity-weighted metric, which achieved 19.3% LORO-CV %RMSE and 2.0% mean absolute bias. Crucially, it passed our <1 SE bias screen in all regions. The FR model, relying only on height, achieved 22.4% LORO-CV %RMSE (4.1% bias) but failed the bias screen in two regions.

Our findings demonstrate that a systematic, bias-controlled data-driven method is effective for producing regionally transferable models. The results highlight the critical importance of ALS intensity metrics for this success, while also showing that the data-driven method currently surpasses the functional approach.
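The leave-one-region-out protocol with a bias screen described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions: a single-predictor linear model stands in for the multi-metric models, the data are synthetic, the screen is read as "absolute mean residual below one standard error of the residuals", and the names `fit_line` and `loro_cv` are hypothetical rather than the authors' code.

```python
import math
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def loro_cv(plots):
    """plots: list of (region, predictor, observed_agb) tuples."""
    by_region = defaultdict(list)
    for region, x, y in plots:
        by_region[region].append((x, y))
    results = {}
    for held_out, test in by_region.items():
        # Train on every region except the held-out one.
        train = [p for r, pts in by_region.items() if r != held_out for p in pts]
        a, b = fit_line([x for x, _ in train], [y for _, y in train])
        resid = [a + b * x - y for x, y in test]  # predicted minus observed
        n = len(resid)
        bias = sum(resid) / n
        sd = math.sqrt(sum((r - bias) ** 2 for r in resid) / (n - 1))
        se = sd / math.sqrt(n)
        rmse = math.sqrt(sum(r * r for r in resid) / n)
        mean_obs = sum(y for _, y in test) / n
        results[held_out] = {
            "rmse_pct": 100.0 * rmse / mean_obs,
            "bias": bias,
            "se": se,
            "passes": abs(bias) < se,  # assumed reading of the "<1 SE" screen
        }
    return results


# Synthetic plots: regions A and B share one relationship, region C is offset.
noise = [0.5, -0.5, 0.5, -0.5]
plots = []
for region, offset in [("A", 0.0), ("B", 0.0), ("C", 5.0)]:
    for x, e in zip([1.0, 2.0, 3.0, 4.0], noise):
        plots.append((region, x, 2.0 * x + offset + e))

for region, stats in loro_cv(plots).items():
    print(region, stats)
```

In this toy setup the offset region C fails the screen when held out, which is the kind of regional bias the protocol is meant to expose even when pooled accuracy looks acceptable.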

2:30pm - 2:45pm

Optimization of the National Biomass Allometric Equation Using Remote Sensing Data

Shweta Parajuli¹, Tramo K. Remmel², Richard L. Bello³

¹York University, Canada; ²York University, Canada; ³York University, Canada

The role of forests in carbon sequestration and regulation is important to understand, given the alarming rate of global warming caused by greenhouse gases. Understanding the structural characteristics of trees can help assess the potential of forests for carbon storage. Light Detection and Ranging (LiDAR) has emerged as a powerful remote sensing tool that is capable of providing detailed three-dimensional information about the forest. The increasing availability of aerial LiDAR data has provided opportunities to estimate forest biomass over larger extents. This study utilizes the available LiDAR data from the provincial repository of geospatial data to estimate the diameter at breast height (DBH), which is a key parameter in existing biomass allometric models. LiDAR-derived tree metrics were integrated with optical images to further differentiate forest type and to assess how it influences aboveground biomass estimates in a heterogeneous mixed-wood forest. This research contributes to improving our understanding of LiDAR's potential for estimating DBH, an area that has not been explored much. It also demonstrates how existing global biomass allometric equations can be combined with remote sensing technology to provide an efficient method of biomass estimation over larger extents and across diverse ecosystems.

2:45pm - 3:00pm

Turning rural infrastructure into smart sensors: high‑frequency agricultural monitoring for next‑generation precision farming

Xingli Qin¹, Bingfang Wu¹,², Miao Zhang¹,², Fangming Wu¹, Mengxiao Li¹,², Hongwei Zeng¹,², Fuyou Tian¹, Kaimin Sun³

¹State Key Laboratory of Remote Sensing and Digital Earth, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100101, China; ²College of Resources and Environment, University of Chinese Academy of Sciences, Beijing 100049, China; ³State Key Laboratory of Information Engineering in Surveying, Mapping, and Remote Sensing, Wuhan University, Wuhan, 430079, China

Communication towers equipped with cameras are widely distributed across rural landscapes but remain largely unused for scientific observation. This presentation introduces an AI-driven framework that transforms such existing infrastructure into a high-frequency, real-time agricultural monitoring system, complementing traditional satellite and UAV remote sensing.

The proposed system resolves three fundamental challenges that hinder tower-based sensing: (1) precise georeferencing of highly oblique imagery through a quaternion-based spatial transformation; (2) automated delineation of cultivated parcels via a GIS-guided, iterative segmentation process integrating the Segment Anything Model (SAM); and (3) intelligent recognition of crop types, growth stages, and farming activities using a multimodal large language model that fuses time-series imagery with contextual field data.

Validated through deployments in varied agricultural regions of China, the framework demonstrates stable operation and parcel-level accuracy for continuous monitoring within 1–2 km of each tower. The results indicate a practical pathway toward scalable, cost‑efficient, and autonomous agricultural information acquisition at high spatio‑temporal resolution.

3:30pm - 5:15pm

WG II/1A: Image Orientation and Fusion
Location: 714B

3:30pm - 3:45pm

AI-based Camera Pose Estimation on mixed Aerial and Ground Images: A comparative Study

Zichao Zeng, June Moh Goo, Jan Boehm

University College London, United Kingdom

Estimating camera poses jointly from aerial and ground imagery remains difficult because large viewpoint changes reduce overlap, alter appearance, and weaken the geometric assumptions relied on by both classical photogrammetry and recent AI-based reconstruction models. This paper presents a controlled comparison between a classic photogrammetric approach represented by COLMAP and a cross-view fine-tuned end-to-end model based on Dust3R. Tests are carried out on a London building scene containing 10 aerial and 29 ground images. Fine-tuned Dust3R reconstructs the full image set, whereas COLMAP successfully registers 24 ground-level images. Because both reconstructions are defined only up to an unknown similarity transform and no ground-truth poses are available, we evaluate the shared subset through 7-DoF similarity transformation analysis rather than direct metric pose errors. After transformation, the translation RMSE of the shared camera centres is 10.0% of the reconstructed scene diagonal in the fine-tuned Dust3R coordinate frame. We further compare pairwise geometric support using a unified fundamental-matrix RANSAC evaluation over 406 image pairs. The AI-based pipeline achieves substantially higher inlier ratios than the photogrammetric pipeline under the same verification settings, indicating more successful cross-view orientation. The study contributes a clearer evaluation protocol for mixed aerial-ground pose estimation without ground truth, together with an empirical analysis of robustness, alignment behaviour, and current limitations of both pipelines.

3:45pm - 4:00pm

Epipolar Rectification of a Generic Camera

Marc Pierrot Deseilligny, Ewelina Rupnik

Univ Gustave Eiffel, Géodata Paris, IGN, LASTIG

We propose a generic method for epipolar resampling that is not tied to a specific camera model. We demonstrate the effectiveness of the approach on central perspective, pushbroom, and pushbroom panoramic camera models. We also devise an epipolarability index that measures the suitability of an image pair for epipolar rectification, and provide a formal derivation of the ambiguity bound to epipolar resampling. An open-source implementation of the algorithm is available at github.com/micmacIGN/micmac

4:00pm - 4:15pm

ThermalAssist: Towards Efficient Annotation of Thermal Imagery

Jingwei Zhu¹,², Manoj Biswanath¹,³, Benjamin Busam¹,³

¹Chair of Photogrammetry and Remote Sensing, Technical University of Munich, Germany; ²School of Geospatial and Artificial Intelligence, East China Normal University, China; ³Munich Center for Machine Learning (MCML), Munich, Germany

Thermal infrared (TIR) imaging captures the surface temperature of objects and reveals heat-transfer patterns of buildings, which supports applications such as insulation inspection, energy leakage detection, and thermal bridge detection. However, TIR image datasets with reliable annotations for deep learning remain scarce, as the labeling process is time-consuming and tedious, and particularly challenging due to the low-texture and blurred features of TIR images. To address this challenge, we propose ThermalAssist, a geometry- and gradient-aware framework designed to assist thermal anomaly labeling in TIR imagery. By combining sparse manual annotations with dense correspondence via flow-based propagation, the framework efficiently transfers labels across image sequences while preserving semantic consistency and boundary integrity. Experiments on the TBBR dataset demonstrate that ThermalAssist can transfer labels between images, achieving up to 21% higher F1-score and 35% higher precision compared to state-of-the-art tracking-based baselines. It also helps identify missing annotations and boundary inconsistencies for quality checks. This work provides a foundational tool for quality-assured thermal annotation pipelines and represents a key step toward more scalable, reliable, and intelligent labeling of thermal imagery.

4:15pm - 4:30pm

Evaluation of recent AI-based point matching algorithms applied on aerial images

Pablo d'Angelo, Franz Kurz, Alaa Eddine Ben Zekri, Reza Bahmanyar

German Aerospace Center, Germany

Accurate image matching is essential for the precise orientation of airborne imagery, yet modern feature matchers are rarely evaluated on real aerial data with large temporal, seasonal, and radiometric changes. For this study, we introduce the AerialRefMatch dataset, which comprises 51 challenging aerial images and corresponding true-ortho reference data. We benchmark classical and deep learning–based matching algorithms on AerialRefMatch, considering two scenarios: matching original images and matching approx-orthorectified images generated using GNSS/IMU orientations. For each method, image-based ground control points are derived and used for single-image pose estimation; accuracy is assessed via independent checkpoints. Results show that directly matching on original images is very difficult: fewer than 14% of images can be oriented with pixel-level accuracy. When approx-orthorectification is used, performance improves substantially. JamMa, SIFT, and SuperPoint+LightGlue achieve pixel-level accuracy for up to 30% of images, with JamMa being most robust on difficult cases and SIFT-based variants being more precise on the easier ones. Deep detector-free models such as ELoFTR and RoMa are less accurate but more robust on the original images than other models. Overall, state-of-the-art deep learning-based matchers still struggle with large rotations, scale differences, and semantic differences; they benefit strongly from prior image orientation knowledge and lack sub-pixel precision.

4:30pm - 4:45pm

Faster than Light: An Embedded-Efficient Matching Model with ReLU Linear Attention

Ziang Wang¹, Tao He¹, Wei Cui¹, Yu Duan¹, Kaimin Sun¹, Haoyun Miao²

¹State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan, China; ²North Automatic Control Technology Institute, Taiyuan, China

Deep learning-based image matching faces a critical challenge when deployed on computationally constrained embedded aerial devices. Transformer-based architectures, particularly the scaled dot-product attention mechanism, incur high computational costs that limit inference speed for real-time applications. To address this bottleneck, we propose FastGlue, a sparse feature matching algorithm that adapts the LightGlue architecture through two targeted modifications: replacing the scaled dot-product attention with a ReLU-based linear attention module, and reducing the depth of the graph neural network. These changes reduce computational complexity while maintaining competitive matching performance. Evaluations on HPatches and MegaDepth-1500 benchmarks show that FastGlue achieves accuracy comparable to LightGlue while reducing inference time from 20.05 ms to 17.05 ms on GPU, and from 840.45 ms to 665.85 ms on an RK3588 embedded CPU. Our work demonstrates that targeted architectural simplifications can yield meaningful efficiency gains for deep learning-based feature matching on resource-constrained platforms.
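To illustrate why a ReLU-based linear attention can be cheaper than the quadratic form, the sketch below implements a generic ReLU-kernel linear attention in plain Python and checks it against the equivalent pairwise computation. It is a textbook-style formulation under assumptions (ReLU feature map, sum normalisation, a small `eps` for stability); the function names and toy inputs are hypothetical, and the abstract does not specify FastGlue's actual module.

```python
def relu(vec):
    """Element-wise ReLU, used here as the kernel feature map."""
    return [max(0.0, v) for v in vec]


def linear_attention(Q, K, V, eps=1e-6):
    """Aggregate keys and values once, then reuse the summary for each query."""
    d, dv = len(K[0]), len(V[0])
    kv = [[0.0] * dv for _ in range(d)]  # sum_j relu(k_j) v_j^T
    ksum = [0.0] * d                     # sum_j relu(k_j)
    for k, v in zip(K, V):
        fk = relu(k)
        for a in range(d):
            ksum[a] += fk[a]
            for b in range(dv):
                kv[a][b] += fk[a] * v[b]
    out = []
    for q in Q:
        fq = relu(q)
        denom = sum(fq[a] * ksum[a] for a in range(d)) + eps
        out.append([sum(fq[a] * kv[a][b] for a in range(d)) / denom
                    for b in range(dv)])
    return out


def pairwise_attention(Q, K, V, eps=1e-6):
    """Same weights computed per query-key pair (quadratic in sequence length)."""
    out = []
    for q in Q:
        fq = relu(q)
        w = [sum(a * b for a, b in zip(fq, relu(k))) for k in K]
        denom = sum(w) + eps
        out.append([sum(wj * v[b] for wj, v in zip(w, V)) / denom
                    for b in range(len(V[0]))])
    return out


# Toy inputs: 2 queries, 3 keys/values, feature dimension 2.
Q = [[1.0, -0.5], [0.2, 0.3]]
K = [[0.5, 1.0], [-1.0, 2.0], [0.3, -0.2]]
V = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

fast = linear_attention(Q, K, V)
slow = pairwise_attention(Q, K, V)
print(fast)
print(slow)
```

Both functions produce the same outputs up to floating-point rounding, but the linear form builds the key-value summary once, so its cost grows linearly rather than quadratically with the number of keypoints.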

4:45pm - 5:00pm

SCOP: An Open-Source and Educational JAX-Powered Framework for Generic Photogrammetric Bundle Adjustment

Adrien Gressin

University of Applied Sciences Western Switzerland (HES-SO / HEIG-VD)

We present SCOP, an open-source and educational framework for generic photogrammetric bundle adjustment built in Python and powered by JAX automatic differentiation. SCOP removes the need for manual Jacobian derivation by expressing all projection models as pure mathematical functions with automatically computed exact derivatives.

The framework supports multiple camera geometries (pinhole, fisheye, equirectangular) and optimization methods (Gauss-Newton, Gauss-Newton-Armijo, Levenberg-Marquardt, Gradient Descent). Its modular architecture, separating cameras, images, and observations, allows easy extension to new sensors and constraint types, including GNSS positions, ground control points, and geodetic observations.

A hybrid computation pipeline combines JAX for differentiation with a Rust backend for sparse Schur complement elimination, achieving ~0.5 s per iteration on a real-world dataset with 79k unknowns and 181k observations. Following classical least-squares photogrammetry, SCOP provides rigorous uncertainty estimation through covariance matrices, normalized residuals, and reliability indices. With synthetic data tools and interactive 3D visualization, it enables transparent teaching and reproducible research.

5:00pm - 5:15pm

TriCo-Net: Learning Semantically Aware Local Features via Triple Consistency

Longze Zhu¹,², Li Yan¹,², Hong Xie¹,², Hao Wu¹,², Shan Su¹,², Binbing Wang¹, Xiaoteng Yang¹, Junjie Yuan¹, Aoran Li³

¹Wuhan University, The School of Geodesy and Geomatics, Wuhan 430079, Hubei, China; ²Hubei Luojia Laboratory, Wuhan 430079, Hubei, China; ³Henan Normal University, The College of Software, Xinxiang 453000, Henan, China

Local feature matching in complex scenes is hindered by semantic ambiguity, where detectors often latch onto transient or repetitive patterns. We present TriCo-Net, which learns semantically aware and discriminative local features by enforcing a Triple Consistency (TriCo) principle across implicit semantics, scale, and spatial context. During training, an Implicit Semantic Strategy (ISS) distills cues from a segmentation teacher to modulate keypoint reliability and descriptor learning, while introducing no overhead at inference. A Scale-wise Semantic Harmonizer (SSH) aligns and fuses feature-pyramid levels to ensure cross-scale coherence, and a Global Context Propagator (GCP) broadcasts scene-level dependencies to resolve local ambiguities. On Aachen Day–Night v1.1, TriCo-Net achieves strong and consistent gains in visual localization, particularly under night conditions, and exhibits robustness to blur, noise, and large homographies. Ablations show complementary benefits from ISS, SSH, and GCP, with ISS contributing most at tight thresholds and at night. TriCo-Net narrows the day–night performance gap while maintaining mid-range throughput, offering a practical trade-off between robustness and efficiency.