| International Journal of Computer Applications |
| Foundation of Computer Science (FCS), NY, USA |
| Volume 187 - Number 77 |
| Year of Publication: 2026 |
| Authors: Jundi Yang, Heng Yao |
DOI: 10.5120/ijca2026926252
Jundi Yang, Heng Yao. A Knowledge-Graph–Driven Multimodal Large Model for Semantic Understanding and Controllable Generation of Intangible Cultural Heritage. International Journal of Computer Applications 187, 77 (Jan 2026), 1-8. DOI=10.5120/ijca2026926252
Intangible Cultural Heritage (ICH) encompasses complex layers of symbolic meaning expressed through motifs, crafts, rituals, and regional traditions. Contemporary multimodal generative models frequently overlook such domain-specific semantics, leading to visually appealing but culturally inaccurate outputs. To address this limitation, this paper introduces a unified knowledge-graph–driven multimodal generation framework that couples a structured ICH Knowledge Graph (KG), a domain-adapted Large Language Model (LLM), and a controllable diffusion-based text-to-image generator. The KG organizes motifs, techniques, symbolic associations, and regional contexts into a structured semantic space, which the LLM leverages to interpret user queries and retrieve culturally grounded constraints. These constraints are injected into the diffusion model through a multi-stage semantic fusion mechanism, enabling culturally faithful and controllable image synthesis. Experimental results across three curated ICH datasets demonstrate that the proposed framework outperforms representative baselines in cultural semantic accuracy, text–image alignment, and robustness to linguistic variation. The proposed approach provides a principled pathway for integrating symbolic cultural knowledge with modern generative models, supporting large-scale preservation, computational interpretation, and creative revitalization of intangible cultural heritage.
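The constraint-retrieval step described above can be illustrated with a minimal sketch. This is not the paper's implementation: the toy knowledge graph, motif entries, and function names below are hypothetical, and the LLM and diffusion components are abstracted away into a simple prompt-augmentation step that folds KG-derived cultural attributes into the text conditioning.

```python
# Toy ICH knowledge graph: motif -> structured cultural attributes.
# All entries are illustrative placeholders, not curated data.
ICH_KG = {
    "crane": {
        "symbolism": ["longevity", "fidelity"],
        "technique": "Su embroidery",
        "region": "Jiangsu",
    },
    "peony": {
        "symbolism": ["prosperity"],
        "technique": "blue-and-white porcelain painting",
        "region": "Jingdezhen",
    },
}

def retrieve_constraints(motif: str) -> dict:
    """Look up culturally grounded constraints for a motif (empty if unknown)."""
    return ICH_KG.get(motif, {})

def build_conditioned_prompt(user_query: str, motif: str) -> str:
    """Augment a user query with KG-derived constraints before
    passing it to a text-to-image generator."""
    c = retrieve_constraints(motif)
    if not c:
        return user_query  # no cultural grounding available; pass through
    parts = [
        user_query,
        f"motif: {motif}",
        f"symbolism: {', '.join(c['symbolism'])}",
        f"technique: {c['technique']}",
        f"region: {c['region']}",
    ]
    return "; ".join(parts)

prompt = build_conditioned_prompt("a silk panel with a crane", "crane")
print(prompt)
```

In the full framework these constraints would be injected into the diffusion model through the multi-stage semantic fusion mechanism rather than concatenated into the prompt; the sketch only shows where the structured semantics enter the generation path.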