Latent prediction-based generative semantic communication for video transmission in wireless networks
Lokumarambage, Maheshi and Sivalingam, Thushan and Dong, Feng and Rajatheva, Nandana and Fernando, Anil (2026) Latent prediction-based generative semantic communication for video transmission in wireless networks. IEEE Open Journal of the Communications Society, 7. pp. 3974-3986. ISSN 2644-125X (https://doi.org/10.1109/OJCOMS.2026.3684230)
Filename: Lokumarambage-etal-IEEE-OJCS-2026-Latent-prediction-based-generative-semantic-communication-for-video-transmission.pdf
Version: Final Published Version
Abstract
The increasing dominance of video traffic in intelligent sensing and control applications poses a major challenge to the capacity limits of modern wireless networks. Classical information theory sets fixed physical bounds on channel capacity, beyond which further improvement requires rethinking what information is transmitted. Semantic communication (SemCom) addresses this by sending only the semantics of the intended message. This paper presents a SemCom framework that leverages latent-space procedural video prediction with world-model-guided temporal dynamics. Instead of transmitting pixel data, the transmitter encodes high-level semantic representations of context frames and sends them through the physical channel. At the receiver, a temporal transformer predicts future latent states. The framework jointly optimizes perceptual, adversarial, and temporal objectives to preserve both visual quality and trajectory consistency under channel impairments. Experiments on video sequences from the BAIR robot pushing dataset show that the proposed method achieves lower normalized endpoint and velocity errors than a learned baseline combining real-time intermediate flow estimation (RIFE) with Better Portable Graphics (BPG), at a reduced bit rate compared with traditional codecs. The results indicate that incorporating temporal dynamics as semantics into the communication process enables efficient and anticipatory video transmission suitable for applications such as tele-robotic, underwater, and autonomous systems.
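The transmit-then-predict flow described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the learned encoder is replaced by hand-made latent vectors, the temporal transformer by a constant-velocity extrapolation, and the channel is assumed to be additive white Gaussian noise (the abstract only mentions "channel impairments"). The names `awgn` and `predict_future` and all parameter values are hypothetical.

```python
import numpy as np


def awgn(latents, snr_db, rng):
    """Add white Gaussian noise at the given SNR (dB), measured
    against the mean power of the transmitted latents (assumed channel)."""
    signal_power = np.mean(latents ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), latents.shape)
    return latents + noise


def predict_future(context, steps):
    """Constant-velocity extrapolation in latent space.
    A placeholder for the temporal transformer, not the paper's model."""
    velocity = context[-1] - context[-2]
    return np.stack([context[-1] + (k + 1) * velocity for k in range(steps)])


rng = np.random.default_rng(0)

# Hypothetical latents: 4 context frames, 8 dimensions, moving linearly.
context = np.arange(4, dtype=float)[:, None] * np.ones((1, 8))

# Only the context latents cross the channel; future frames are never sent.
received = awgn(context, snr_db=20.0, rng=rng)

# The receiver predicts the next 3 latent states from what it received.
future = predict_future(received, steps=3)
```

The point of the sketch is the division of labour: only the context latents use channel resources, while future latent states are produced at the receiver, which is where the bit-rate saving claimed in the abstract would come from.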
ORCID iDs
Lokumarambage, Maheshi, Sivalingam, Thushan, Dong, Feng, Rajatheva, Nandana and Fernando, Anil
ORCID: https://orcid.org/0000-0002-2158-2367
Item type: Article
ID code: 96105
Dates:
23 April 2026: Published
16 April 2026: Published Online
7 April 2026: Accepted
Subjects:
Science > Mathematics > Electronic computers. Computer science
Technology > Electrical engineering. Electronics Nuclear engineering > Telecommunication
Department: Faculty of Science > Computer and Information Sciences
Depositing user: Pure Administrator
Date deposited: 27 Apr 2026 11:31
Last modified: 09 Jun 2026 16:22
URI: https://strathprints.strath.ac.uk/id/eprint/96105