Conference Agenda
|
Daily Overview | |
|
Location: 713B 125 theatre |
| Date: Tuesday, 07-July-2026 | |
| 8:30am - 10:00am | WG III/1I: Remote Sensing Data Processing and Understanding Location: 713B |
|
|
8:30am - 8:45am
OG-TPTV: A texture-preserving regularizer for hyperspectral image denoising Wuhan University, China Hyperspectral images (HSIs) are often severely degraded by mixed noise, such as Gaussian, stripe, and impulse noise during acquisition and transmission, which seriously impedes their subsequent applications. Therefore, HSI denoising is both crucial and challenging. In this work, we present a gradient-domain outlier-guided texture-preserved total variation (OG-TPTV) regularizer designed to remove mixed noise in HSIs. First, we utilize the mode-3 low-rank property of HSI gradient maps along the spectral dimension and apply a low-rank decomposition model to extract their spatial representation coefficients (SRCs). To improve the sparsity characterization of SRCs in the gradient subspace, an outlier-guided strategy is introduced. Specifically, we perform outlier detection on gradient maps to distinguish noise from texture structures and remove outliers to generate precise texture weighting maps. The resulting texture weight maps offer adaptive guidance for adjusting the strength of the sparsity constraints. Finally, a denoising method for HSIs is developed based on OG-TPTV. Extensive experiments on both synthetic and real HSIs demonstrate the superior denoising performance of our method. 8:45am - 9:00am
SpectralNet-X: Transformer-based Lossy Compression for Hyperspectral Satellite Data 1Fraunhofer IOSB, Germany; 2Karlsruhe Institute of Technology (KIT) Hyperspectral satellite missions generate massive data volumes that are difficult to transmit and store under tight onboard resource constraints, making effective lossy compression a key enabling technology. We propose SpectralNet-X, a transformer-based autoencoder for spectral-only compression of spaceborne hyperspectral imagery at a fixed compression ratio of 16. The encoder maps each spectrum to a low-dimensional latent code using a 1D convolutional projection followed by stacked self-attention layers with rotary position embeddings, and aggregates information via cross-attention pooling. The decoder reconstructs full-band spectra through an upsampling stack and per-band affine calibration. To improve reconstruction fidelity and generalization, SpectralNet-X is first pretrained in a masked-signal reconstruction task inspired by SimMIM and then fine-tuned with a mixed objective that combines mean-squared error and spectral angle mapper (SAM) terms using a scheduled weighting scheme. We evaluate SpectralNet-X on the large-scale HySpecNet–11k benchmark and in a mission-realistic cross-sensor setting, where models trained on HySpecNet–11k are tested on PRISMA hyperspectral scenes. Across PSNR, SSIM, and SAM, and when compared to three different compression autoencoders, SpectralNet-X achieves the lowest angular reconstruction errors while maintaining competitive distortion metrics and substantially reducing the fraction of spectra with large SAM outliers. These results indicate that transformer-based spectral compression is a promising candidate for robust, mission-realistic onboard hyperspectral data reduction. 9:00am - 9:15am
Sensitivity of Deep Learning Validation to Spatial Scale–Sample Size Interactions in Hyperspectral Imaging 1College of Civil Engineering, Taiyuan University of Technology, Taiyuan, China; 2Shanxi Key Laboratory of Civil Engineering Disaster Prevention and Control, Taiyuan, China; 3School of Design and the Built Environment, Curtin University, Perth, Australia; 4School of Computer Science and Technology, Aba Teachers College, Aba Zhou Validating the performance of deep learning models on satellite imagery is essential for ensuring model generalizability, decision reliability, and spatial transferability, particularly for hyperspectral images, which contain high-dimensional, spatially complex data. While it is well recognized that multiple spatial characteristics influence deep learning model performance, few studies have systematically examined how the interactions among these characteristics affect model validation sensitivity in hyperspectral contexts. This study investigates how the interaction between spatial scale (e.g., surrounding 3, 5, or 7 grids) and training sample size (e.g., 10%, 30%, or 50% of all data) influences the validation accuracy and sensitivity of deep learning models. An innovative validation sensitivity index is developed to quantify the change in accuracy per unit of spatial scale and sample size, enabling a more refined assessment of model robustness. The index is applied to three representative hyperspectral datasets covering diverse environmental and spectral conditions. Results show that spatial scale accounts for 0 to 21.0% of accuracy variation and training sample size for 5.6 to 36.5%, whereas their interaction accounts for 5.4 to 70.3%, indicating a nonlinear amplification effect. These findings may be explained by the compounded influence of data contextuality, spatial redundancy, and model overfitting dynamics. 
This study demonstrates the critical need to consider spatial interactions in validation design, offering new insights for enhancing the reliability of geospatial artificial intelligence (GeoAI) applications in remote sensing and spatial data science. 9:15am - 9:30am
Assessment of RTM-induced Surface Reflectance Differences between 6SV and VLIDORT under a Single Atmospheric-correction Framework 1Division of Earth Environmental Science (Major of Spatial Information Engineering), Pukyong National University, Republic of Korea; 2Professor, Division of Earth Environmental Science (Major of Spatial Information Engineering), Pukyong National University, Republic of Korea Surface reflectance is a foundational variable in optical remote sensing, as inaccuracies introduced during atmospheric correction can propagate and amplify across subsequent satellite-derived products. Nonetheless, the extent to which the choice of Radiative Transfer Model (RTM) affects reflectance retrieval has not been sufficiently examined. This study investigates how two widely used RTMs—6SV and VLIDORT—produce different surface reflectance outcomes when applied under consistent atmospheric and geometric conditions for the GEO-KOMPSAT-2B/GEMS instrument. To ensure comparability, both models were driven by identical GEMS aerosol properties and an equivalent LUT configuration. The comparison shows that while the two RTMs reproduce broadly similar spatial patterns, systematic quantitative differences remain in the retrieved reflectance. These differences vary depending on atmospheric and viewing conditions, particularly under higher aerosol loading. A sensitivity analysis further indicates that aerosol amount and scattering characteristics, alongside viewing geometry, are key factors influencing the magnitude of RTM divergence. Overall, this study provides a structured assessment of RTM-dependent variability in atmospheric correction and highlights the importance of model choice when interpreting or harmonizing surface reflectance products. The findings offer a basis for improving consistency in future GEMS-based retrievals and for advancing reliable surface reflectance generation in geostationary remote sensing. 9:30am - 9:45am
Attention-driven Cross-modal Self-supervised Learning for Label-efficient Hyperspectral-LiDAR DSM Classification 1Fraunhofer IOSB, Germany; 2Institute for Photogrammetry and Geoinformatics (ifp), University of Stuttgart, Germany Remote sensing acquisition systems rely on a range of platforms, from drones to satellite missions, to record multimodal Earth surface data. This fact encourages the preparation of datasets with complementary properties, thereby increasing their discriminative potential. A common complementary combination is between Hyperspectral and LiDAR-generated digital surface model data. While engaging, this fusion poses challenges for specific applications. Multiple works fuse these modalities at the feature level using vector concatenation, maximization, or averaging. Although functional, these methods omit target interactions between the modalities. Another challenge in remote sensing is the quantity and quality of labels required by deep learning methods, which are expensive, error-prone, and difficult to scale. We address the challenges above by proposing a self-supervised processing framework based on cross-modal attention that effectively fuses features at multiple levels, thereby exploiting complementary information across data streams. Specifically, our method is founded on a pseudo-Siamese network that reweights each modality’s features with information from the other via a mirrored cross-modal attention. The network’s objective is to maximize the similarity between the feature representations of both streams. A fusion network builds a latent representation using the learned encoders and attention modules. Then, a k-Nearest Neighbor classifier categorizes each sample within the representation using ten labels per class. Our experiments show that our spatial- and channel-spatial cross-modal attention approaches outperform well-established fusion methods for label-efficient land cover classification across datasets. 
Our findings lay the groundwork for fusion methods that effectively exploit inter-stream data relationships to encourage complementarity. 9:45am - 10:00am
GAN-based pan-to-rgb Image Translation for remote sensing Data 1Nanjing University of Aeronautics and Astronautics, China, People's Republic of; 2Yangtze Delta Region Institute of Intelligent Sensing (Nantong) Despite the rapid development of satellite sensors, acquiring high-resolution RGB images remains a challenge. In this paper, a GAN-based pan-to-rgb model built on multiscale features is proposed as a novel framework for generating high-resolution, high-fidelity RGB images from remote sensing panchromatic images. The spatial structure, texture, and color of the results are consistent with the real images, and the colors are naturally realistic and vibrant. Multiscale features and symmetric luminance color decoders are utilized to overcome the color desaturation, inaccuracy, and distortion of conventional algorithms. By combining CNNs for local feature modeling and transformers for global feature modeling, this approach learns pan-to-rgb mappings to produce high-resolution, high-fidelity RGB images in CIELAB space. In addition, a luminance distance loss and a color distance loss are utilized to prevent the coupling of luminance and color. Experimental validation on Gaofen-7 satellite data demonstrates that the FID, CF, and △CF indicators of the proposed algorithm improved by 2.90%, 11.77%, and 64.51%, respectively, relative to the comparison algorithms. |
| 1:30pm - 3:00pm | WG II/9B: Vision Metrology Location: 713B |
|
|
1:30pm - 1:45pm
Quantization-Aware Training for Efficient Object Detection on FPGAs: Case Studies Technical University of Munich, Germany Deploying object detection models for resource-constrained remote sensing applications necessitates on-board model inference capabilities. While Field Programmable Gate Arrays (FPGAs) offer massive parallelism as energy-efficient hardware platforms, model quantization remains essential to further balance computational efficiency with detection accuracy. Compared to post-training quantization methods that involve multiple-stage development with consistent dependency on domain datasets, quantization-aware training (QAT) integrates quantization constraints into training, providing a simpler pipeline for model compression. However, QAT introduces quantization errors to which smaller objects are more vulnerable. To address this issue, we propose object-scale-aware (OSA) regularization that amplifies quantization error penalties for smaller targets. Our approach is validated through two case studies: bird detection at airports and aerial-view building detection. We perform 8-bit QAT on YOLOX series models using the MVA2023 dataset and the Bavarian Building Dataset for the respective studies. Our method achieves up to 50.2 times inference acceleration with minimal accuracy loss on Xilinx Kria KV260 FPGAs compared to full-precision models. The ablation study and detection examples further demonstrate the effectiveness of OSA regularization in small object detection. 1:45pm - 2:00pm
Evaluation of Visual Place Recognition Methods for Image Pair Retrieval in 3D Vision and Robotics 1Karlsruhe Institute of Technology, Germany; 2Delft University of Technology, Netherlands A broad evaluation of state-of-the-art Visual Place Recognition methods is presented. The evaluation focuses on tasks where fast image pair retrieval is of high importance, such as image-driven scene registration, SLAM, or Structure-from-Motion correspondence search. This implies that the study shifts its focus away from typical Visual Place Recognition and towards scenarios of interest in computer vision and robotics. A sophisticated evaluation pipeline for retrieval and runtime performance is presented. Prepared datasets based on widely used benchmarks from different domains are utilized, such as indoor SLAM, outdoor object-centric scenes, and autonomous navigation in urban and suburban areas. 2:00pm - 2:15pm
MVM-IOD: An Industrial Object-Centric Benchmark Dataset for the Evaluation of 3D Reconstruction Methods KIT, Germany 3D object reconstruction, camera pose estimation, and novel view synthesis in industrial applications are challenging tasks, as errors are costly while the time window for solving these tasks is often limited. The complexity of typical industrial objects further complicates these tasks. Several datasets can be used to evaluate current methods on these tasks; however, most of them do not depict realistic industrial scenarios. We introduce the Machine Vision Metrology Industrial Object Dataset (MVM-IOD) that addresses this lack of datasets. The hardware setup to acquire the dataset consists of a camera, mounted upside down due to space restrictions, at the end effector of an industrial robot arm. Images of typical industrial objects are captured systematically by moving the camera on a hemisphere around the objects. MVM-IOD contains the camera poses, the acquired RGB images, and the 3D point cloud of 9 objects and 2 background choices, resulting in 18 scenes, which allows evaluation of all image-based methods that compute a 3D reconstruction, camera poses, and/or novel views. Based on our dataset, we extensively evaluate current state-of-the-art 3D reconstruction and camera pose estimation methods, such as Structure from Motion, Multi-View Stereo, Visual Geometry Grounded Transformer (VGGT), π3, as well as 2D Gaussian Splatting, and report our findings to create a baseline for future research. 2:15pm - 2:30pm
A Critical Synthesis of Uncertainty Quantification and Foundation Models for Semantic Segmentation Karlsruhe Institute of Technology, Germany Foundation models are increasingly breaking what seemed to be impossible not long ago by enabling unprecedented accuracy and cross-domain generalization. Yet their lack of interpretability, tendency to be overconfident, and sensitivity to real-world domain shifts pose critical challenges for safety- and mission-critical applications. Uncertainty quantification (UQ) offers a principled way to address these issues, but its integration into segmentation foundation models has yet to be explored. In this paper we present the first systematic evaluation of UQ methods applied to a foundation model for semantic segmentation. We fine-tune a lightweight DPT decoder on top of the pretrained SAM2 encoder to establish a simple yet competitive baseline and benchmark four representative UQ approaches – Monte Carlo Dropout, Deep Sub-Ensemble, Test-Time Augmentation, and Evidential Deep Learning – across Cityscapes, NYUv2, and two challenging out-of-domain settings. Our analysis compares segmentation accuracy, calibration, uncertainty quality, and inference time, revealing clear trade-offs between predictive performance, reliability, and computational cost. These results highlight both the promise and the current limitations of uncertainty-aware foundation models, pointing to the need for future work that jointly optimizes accuracy, robustness, and efficiency for real-world deployment. 2:30pm - 2:45pm
The Impact of CutMix on Reliability and Robustness in Semantic Segmentation Karlsruhe Institute of Technology, Germany Ensuring not only high accuracy but also reliable and robust predictions is critical for the deployment of semantic segmentation models in safety-critical applications such as autonomous driving. Despite the widespread use of CutMix – a simple yet powerful data augmentation strategy – its effect on reliability and robustness in dense prediction tasks remains unexplored. Motivated by recent findings that semi-supervised segmentation methods, where CutMix is a core component, can severely degrade reliability, this study isolates and systematically analyzes the influence of CutMix on segmentation accuracy, calibration, and uncertainty quality. We evaluate two representative architectures, the CNN-based DeepLabV3+ and the transformer-based SegFormer, across both in-domain and out-of-domain scenarios. Our results show that CutMix has only a minor impact on segmentation accuracy but consistently improves reliability, particularly under distribution shifts. These improvements indicate that CutMix primarily enhances the trustworthiness of the model’s calibration and uncertainty rather than the raw segmentation prediction itself. This distinction is crucial for safety-critical deployment, where reliable confidence estimates are as important as raw performance. 2:45pm - 3:00pm
Uncertainty Quality of VGGT: An Analysis on the DTU Benchmark Dataset Karlsruhe Institute of Technology, Germany Visual Geometry Grounded Transformer (VGGT) has already attracted a great deal of attention in a short period of time, not least due to the Best Paper Award at CVPR-2025. Similar to DUSt3R and MASt3R, VGGT aims to bring about a paradigm shift by replacing established methods like bundle adjustment and feature matching with a simple, unified, feed-forward neural network that predicts camera poses, depth maps, and dense 3D structure directly from multiple images of a scene in a few seconds. A key aspect is its ability to process an arbitrary number of views consistently in a single forward pass without any post-processing or iterative optimization. For photogrammetry, this opens new possibilities for real-time, scalable, and accessible 3D reconstruction. In this context, not only high reconstruction accuracy but also high-quality uncertainty estimates are crucial, as they foster trust and enable robust quality assurance. This paper therefore investigates the quality of VGGT’s uncertainty predictions. The analysis identifies an effective confidence threshold for filtering VGGT’s raw output and demonstrates that enhancing uncertainty quality holds strong potential for improving the accuracy of its 3D reconstructions. |
| 3:30pm - 5:15pm | WG IV/2B: Artificial Intelligence and Uncertainty Modeling in Spatial Analysis Location: 713B |
|
|
3:30pm - 3:45pm
Chat2Map: A ReAct-based Agent Framework for Automated Web Map Generation from Natural Language Instructions 1National Geomatics Center of China, China, People's Republic of; 2Nanjing Normal University, School of Geography, Nanjing, Jiangsu, China WebGIS platforms have revolutionized geospatial data dissemination, yet their adoption remains constrained by the steep learning curve of mapping library APIs. Frontend libraries like Leaflet, OpenLayers, and platforms such as Tianditu contain hundreds of classes and methods, requiring substantial programming expertise. This technical barrier prevents domain experts (urban planners, environmental scientists, public health officials) from independently creating the visualizations they need for analysis and decision-making. While Large Language Models (LLMs) have revolutionized code generation, they struggle with the domain-specific, low-resource APIs common in geospatial applications. When applied to specialized geospatial APIs, these models exhibit critical failures: they frequently "hallucinate" non-existent functions, misuse parameters, or generate syntactically plausible but semantically incorrect code. This unreliability stems from the underrepresentation of domain-specific libraries in LLMs' training corpora, creating a "last mile" problem that renders them unsuitable for professional geospatial development. This study proposes a ReAct-based agent framework for automated web map generation from natural language instructions. The framework constructs a stateful, cyclic workflow and enables human–AI interactive WebGIS code generation based on the Tianditu JavaScript API. Its effectiveness and generality are validated through multi-model evaluation (GPT-4, Claude 3, Llama 3, Qwen-Max), demonstrating robust performance across diverse application scenarios. Experimental results show that the framework achieves professional-grade quality in both directive-driven and data-driven geospatial visualization tasks. 3:45pm - 4:00pm
Bridging Human Intent and Geospatial Services: A Conceptual Framework and Feasibility Study for Text2GeoAPI National Geomatics Center of China, 100830 Beijing, China With the proliferation of online geospatial services, Geospatial Application Programming Interfaces (GeoAPIs) have become the backbone of modern spatial data interoperability. However, the high technical barriers of GeoAPIs, characterized by complex RESTful syntax and deterministic parameter requirements, create a significant "digital divide" for non-expert users. To bridge the gap between intuitive human spatial intent and technical service execution, this study proposes Text2GeoAPI, a novel conceptual framework for the automatic invocation and composition of geospatial services via natural language. We introduce the Intent-Entity-Operation (IEO) model to formalize spatial tasks, decoupling high-level semantic goals from atomic technical operations. We developed a modular prototype leveraging Large Language Models (LLMs) as cognitive engines to perform structured intent parsing, dynamic workflow planning, and multi-source result synthesis. Experimental evaluations using 100 diverse spatial queries demonstrate an overall task success rate of 86%, with the system effectively orchestrating multi-hop service chains (e.g., Geocoding → Isochrone Analysis → POI Search). The results confirm that Text2GeoAPI significantly lowers the threshold for accessing professional geospatial analysis, shifting the GIS paradigm from "tool-centric" to "intent-centric" intelligence. 4:00pm - 4:15pm
AI for Inclusive Winter Mobility: Multimodal Integration for Detecting Barriers Affecting People with Disabilities 1Center for Research in Geospatial Data and Intelligence (CRDIG), Department of Geomatics Sciences, Université Laval, 1055, Avenue du Séminaire, Quebec City, QC G1V 0A6, Canada; 2Center for Interdisciplinary Research in Rehabilitation and Social Integration (Cirris), Quebec City, QC G1M 2S8, Canada Winter accessibility poses critical challenges in cold-climate cities such as Québec, where snow and ice accumulation restrict the mobility of people with disabilities. This study presents an AI-driven multimodal framework designed to detect, classify, and map winter barriers affecting pedestrian accessibility in Québec City. Building upon the SNOWMAN project, synthetic image and textual datasets were developed to represent seven major snow- and ice-related obstacle categories, including icy ruts, deep snow, and uncleared sidewalks. The visual modality employed a self-supervised SimCLR model for snow-barrier classification (F1-score = 0.93), while the textual modality used a fine-tuned BERT classifier, achieving a perfect F1-score = 1.00 on validated synthetic descriptions. Canonical Correlation Analysis (CCA) aligned the two modalities into a shared latent space, enabling spatial fusion of visual and semantic embeddings for integrated analysis within the MobiliSIG Winter Mobility platform. The fused data produced dynamic accessibility maps revealing clusters of recurring winter hazards in known high-risk zones. The results confirm the feasibility of using synthetic multimodal data to simulate pedestrian-scale winter conditions and demonstrate the potential of multimodal AI for inclusive, data-driven mobility management in cold-climate cities. 4:15pm - 4:30pm
Assessing residential Land Efficiency with spatial–contextual GMM and human Activity big Data: a Case Study of Shenzhen 1Research Institute for Smart Cities & MNR Key Laboratory of Urban Land Resources Monitoring and Simulation, School of Architecture and Urban Planning, Shenzhen University; 2Faculty of Electrical Engineering and Computer Science, Ningbo University, Ningbo, 315211, China As China’s urban development shifts toward stock-based optimisation, identifying inefficient residential land has become important for urban regeneration. Existing approaches often rely on subjective weighting, linear analytical structures, or homogeneous treatment of different residential types, which weakens robustness and transferability. To address these limitations, this study proposes a data-driven framework that integrates mobile-phone signaling and other multi-source spatiotemporal big data in Shenzhen. Two dominant residential forms—formal residential communities and urban villages—are evaluated separately through a four-dimensional framework covering built form, activity vitality, economic efficiency, and environmental livability. Principal component analysis is used to estimate intrinsic dimensionality and initialize a parametric autoencoder. A spatially constrained Gaussian mixture model is then employed to identify inefficient residential clusters while preserving local coherence. The clustering results are interpreted using a random forest model and TreeSHAP, and externally validated by street-view imagery interpretation and limited field surveys. PCA retained five components for urban villages and six for formal residential communities, and the BIC selected six and five clusters for the two residential types, respectively. The results indicate that inefficient formal residential communities show scattered and island-like spatial patterns, whereas inefficient urban villages tend to form more continuous clusters along the edges of larger village agglomerations. 
Random forest and TreeSHAP further reveal that inefficient urban villages are more strongly associated with deficiencies in service accessibility and local socioeconomic conditions, whereas inefficient formal residential communities are more closely associated with lower residential vitality and relatively high development intensity. External validation indicates acceptable agreement with observed residential conditions. 4:30pm - 4:45pm
Reproducing Geospatial Crowdsourcing: How Consistent Is the Crowd? University of Stuttgart, Germany This paper investigates the long-term consistency and reliability of paid geospatial crowdsourcing on the online platform Microworkers.com. Over a five-month period, we conducted three crowdsourcing campaigns, each representing a task typical for remote sensing (pixel classification, point selection, and geometric outline acquisition), to assess whether repeated worker participation enhances data quality and reproducibility. Beyond individual task performance, we examine the broader question of whether crowdsourcing campaigns can yield reproducible results over extended periods. Despite the large and heterogeneous workforce of Microworkers.com, a substantial share of tasks was completed by recurring workers who consistently outperformed one-time participants. Furthermore, across all campaigns, data quality remained largely stable, with only minor variability between epochs. Additional statistical analyses confirm that reproducible outcomes are achievable, highlighting the potential of reliable and reproducible crowdsourcing results for geospatial data acquisition. 4:45pm - 5:00pm
Shaping the Colonial Port: Urban Networks and Spatial Form in the Early Modern Era Harbin Institute of Technology, Shenzhen, China, People's Republic of This abstract presents a comprehensive research framework examining the interplay between colonial trade networks and the spatial form of port cities during the early modern era. Firstly, the study constructs a geographic database of nearly 300 colonial port cities, using intercity trade data from East India Company archives as network edges to analyze their structural and morphological evolution. Secondly, it processes historical maps of colonial ports through a fine-tuned multimodal large language model to extract and classify spatial morphological features, establishing a systematic typology of urban form patterns. Thirdly, the research develops regression models to reveal correlations between network status and morphological patterns. Preliminary findings highlight Batavia's dominant yet volatile role within the network and reveal a trend toward decentralization over the 18th century. The research contributes to both urban historical studies and digital humanities by offering a scalable, comparative approach to interpreting colonial port cities as spatial manifestations of global economic and political forces, while establishing empirical relationships between network status and urban form characteristics. It further provides a refined framework for contextualizing their cultural heritage significance within trans-colonial networks. 5:00pm - 5:15pm
Vector generalization of the drainage network 1University of Brasília, Brazil; 2Institute of Engineering, Rio de Janeiro, Brazil; 3Pontifical Catholic University, Rio de Janeiro, Brazil This study explores the application of Graph Convolutional Networks (GCNs), specifically the GraphSAGE model, to the cartographic generalization of hydrographic networks in the state of Santa Catarina, Brazil. The generalization of river segments is critical for transitioning from detailed (1:25,000) to generalized (1:100,000) scales, and it is traditionally a manual, rule-based process. By modeling drainage systems as graphs and training deep learning models with data from the Brazilian Army's Geospatial Database (BDGEx), this research evaluates how geometric and semantic attributes influence generalization outcomes. These data follow the Brazilian Technical Specifications of the Geospatial Vector Data Structure (ET-EDGV) and therefore constitute systematic data from Brazilian institutions. The GraphSAGE model was trained four times, each time with a different combination of attributes such as segment length, sinuosity, polygon containment, and river flow regime. The model trained with all attributes achieved the highest accuracy (99.98%). Even models using geometric features surpassed 93% accuracy. These results highlight the effectiveness of GCNs in capturing structural patterns. This study compares GraphSAGE model outputs to those generated by the GeoData Loader for Mapserver (GDLMS), the current operational system for generalization, developed and used by the Geographic Service of the Brazilian Army. It also compares those generalizations to reference data acquired by manual generalization using the same 1:25,000 scale input. Visual analysis in GIS environments reveals that GCNs can be an alternative for generalization tasks. 
This research demonstrates the viability of using GeoAI methods for automating complex cartographic processes, offering a scalable and data-driven solution aligned with national geospatial data standards. |

