Conference Agenda
Session
ThS23A: Towards Large Cultural Heritage Foundation Models: Datasets, Semantic Alignment, and Component-Level Annotation
Session Topics: Towards Large Cultural Heritage Foundation Models: Datasets, Semantic Alignment, and Component-Level Annotation (ThS23)
Presentations
1:30pm - 1:45pm
Investigating The Form And Restoration Of The Diji Altar
Beijing University of Civil Engineering and Architecture, China, People's Republic of
The restoration of historic buildings is the primary subject of this study. The Diqi Altar, located along the central axis of Beijing, is both a significant historical landmark and an important remnant of China's ancient imperial sacrificial architecture. Existing studies have examined aspects of the altar such as its ritual hierarchy and the craftsmanship recorded in historical texts, but research gaps remain. Because the altar structure is damaged and the literature documents its structural form, platform base specifications, and stylistic evidence only sparsely, systematic research on restoration techniques is scarce, and the evidence has to be reconstructed from architectural principles. Addressing this gap matters for understanding the technical achievements and ceremonial principles of official architecture during the Ming and Qing dynasties, and for guiding the restoration and preservation of ancient buildings.

1:45pm - 2:00pm
A Digital Restoration Method for Earth God Altars from Discrete Components to Scene Reconstruction
1 Beijing University of Civil Engineering and Architecture, China, People's Republic of; 2 Ancient Chinese Architecture Museum, China, People's Republic of; 3 Beijing Institute of Archaeology, China, People's Republic of; 4 Beijing Digsur Science & Technology Com. Ltd, China, People's Republic of; 5 Beijing University of Civil Engineering and Architecture, China, People's Republic of
The digital preservation of open-air sites faces several challenges, including dispersed components, varied forms, and missing historical records. This study focuses on the Beijing Dizhitan and proposes and implements a workflow that integrates architectural morphology theory, archaeological typology, and modern digital technologies. The workflow forms a complete methodological chain: semantic annotation, classification, and virtual assembly of stone components; virtual restoration and model reconstruction of the site; and, finally, scene-level restoration and display evaluation. The restoration of the Dizhitan shows that this approach can return dispersed components to their proper positions in virtual space, and that it offers a replicable process in which academic research is embedded throughout the digital workflow. It provides a new technical approach and perspective for the preservation, study, and interpretation of immovable open-air cultural relics.

2:00pm - 2:15pm
Building a Multimodal Dataset of Rock Art: Integrating Text, Images, and 3D Point Clouds
Chang'an University, China, People's Republic of
This paper addresses the limitations of single-modal data in rock art cultural heritage preservation, such as incomplete information and fragmented semantics. It proposes a method for constructing a multimodal dataset that integrates text, images, and 3D point clouds. Text data is structured and semantically annotated using the ArchaeoBERT model; image data is obtained through web scraping, annotation, and augmentation; and point cloud data is captured by laser scanning and then denoised and registered. For alignment, features extracted with CNN, BERT, and PointNet++ are mapped to unified vector representations. A three-level quality control process is applied to keep the data accurate and reliable, and the authors report that information coverage increases by 47.3%. The dataset integrates semantic, visual, and spatial information, providing a multidimensional data foundation and practical reference for the digital preservation, 3D reconstruction, and cross-modal retrieval of rock art.

2:15pm - 2:30pm
Monocular Depth Estimation from UAV images for 3D documentation of architectural heritage: a Depth Anything v2-based approach
1 Politecnico di Torino (DIATI), Italy; 2 Politecnico di Torino (DAD), Italy
The rapid evolution of Monocular Depth Estimation (MDE) models, and in particular the emergence of recent foundation models such as Depth Anything v2 (Yang et al., 2024; Ranftl et al., 2022), is opening concrete perspectives for applying artificial intelligence in architectural and cultural heritage surveying. This research assesses the feasibility of using such models to obtain metric depth estimates from UAV imagery, acquired in both oblique and nadir views, with the broader goal of integrating neural networks into 3D documentation, HBIM, and GIS workflows for built heritage. The Depth Anything v2 models were originally trained for ground-level scenarios, where the camera typically operates 1–2 m above the ground and horizon distances extend up to 60–80 m. Applied to aerial imagery, particularly drone-based acquisitions, this produces a substantial domain gap: the network tends to interpret top-down landscapes as distant horizons and thereby compresses the depth scale. To address this, the study develops an experimental calibration and adaptation procedure that transforms the depth maps produced by the model into metrically consistent estimates coherent with architectural reality.
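The abstract of the monocular depth estimation talk does not say how its calibration procedure works. As an illustration only, one common baseline for turning a model's depth map into metric values is a least-squares scale-and-shift fit against a few reference points with known depth. The sketch below assumes that approach and uses synthetic data; the function names `fit_scale_shift` and `apply_calibration` and all numeric values are hypothetical, and this is not the authors' method.

```python
import numpy as np

def fit_scale_shift(pred, ref):
    """Least-squares scale s and shift t such that s * pred + t approximates ref."""
    pred = np.asarray(pred, dtype=float).ravel()
    ref = np.asarray(ref, dtype=float).ravel()
    # Design matrix with one column for the scale and one for the shift.
    A = np.stack([pred, np.ones_like(pred)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, ref, rcond=None)
    return float(s), float(t)

def apply_calibration(depth_map, s, t):
    """Apply the fitted affine correction to a whole depth map."""
    return s * np.asarray(depth_map, dtype=float) + t

# Synthetic stand-in data (assumption): a "true" metric depth map in metres
# and a model output that is an affinely compressed version of it.
rng = np.random.default_rng(0)
true_depth = rng.uniform(20.0, 60.0, size=(4, 5))
model_output = (true_depth - 5.0) / 12.0

# Hypothetical reference points, e.g. pixels whose metric depth is known
# from surveyed control points.
rows = np.array([0, 1, 2, 3])
cols = np.array([0, 2, 4, 1])
s, t = fit_scale_shift(model_output[rows, cols], true_depth[rows, cols])

calibrated = apply_calibration(model_output, s, t)
max_abs_error = float(np.max(np.abs(calibrated - true_depth)))
```

On real imagery the relation between model output and metric depth is usually not exactly affine, so a fit like this would leave residual errors; whether it should be performed on depth or on inverse depth depends on what the model outputs.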

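For the rock art dataset talk in this session, the abstract mentions mapping CNN, BERT, and PointNet++ features to unified vector representations without giving details. A minimal sketch of one way such a mapping could look is shown here, assuming one linear projection per modality into a shared space followed by L2 normalization. Random arrays stand in for the real extractor outputs, the projection weights are untrained placeholders, and all names and dimensions are hypothetical.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norms, eps)

def project(features, weight):
    """Map modality-specific features into the shared space."""
    return l2_normalize(features @ weight)

rng = np.random.default_rng(0)
n_items, shared_dim = 3, 8

# Stand-in feature sizes for the three modalities (assumed values).
dims = {"image": 16, "text": 12, "point_cloud": 10}
features = {name: rng.normal(size=(n_items, d)) for name, d in dims.items()}

# Placeholder projection matrices; in practice these would be learned so
# that matching items from different modalities land close together.
weights = {name: rng.normal(size=(d, shared_dim)) for name, d in dims.items()}

embeddings = {name: project(features[name], weights[name]) for name in dims}

# One unified vector per item: average the modality embeddings, renormalize.
fused = l2_normalize(np.mean([embeddings[name] for name in dims], axis=0))

# Cross-modal retrieval score: text queries (rows) against images (columns).
similarity = embeddings["text"] @ embeddings["image"].T
```

With trained projections, the largest entry in each row of `similarity` would indicate the image that best matches a given text record; with the random placeholders used here the scores carry no meaning.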
