A distributed RAG-based framework for automated extraction of information from multiple types of resources
El Gemayel, Charbel and El Gemayel, Joseph and Constantin, Joseph (2026) A distributed RAG-based framework for automated extraction of information from multiple types of resources. In: 2025 IEEE/ACS 22nd International Conference on Computer Systems and Applications (AICCSA). IEEE, QAT. ISBN 979-8-3315-5693-8
Text: El-Gemayel-etal-2025-A-distributed-RAG-based-framework-for-automated-extraction.pdf (Accepted Author Manuscript, 698kB)
Abstract
Accessing authoritative information in areas such as healthcare, cybersecurity, and artificial intelligence remains a challenge due to the heterogeneity of data sources and the varying credibility of content. With the increasing integration of advanced technologies into daily life, there is an urgent need for systems that can streamline the retrieval of information and the extraction of knowledge from different formats. In this paper, we present a distributed, Retrieval-Augmented Generation (RAG)-based framework that aims to automate the extraction and structuring of information from multimodal resources, such as websites, PDFs, images, audio, and video. The framework supports real-time data processing and is optimized for the creation of open data sets in any subject area. To validate our approach, we applied it to cigars and beverages, using content from online articles, reviews, and posts. Our results show the framework’s potential to simplify data integration, improve usability, and enable scalable, contextual knowledge generation.
ORCID iDs
El Gemayel, Charbel, El Gemayel, Joseph (ORCID: https://orcid.org/0009-0004-4518-3071) and Constantin, Joseph
Item type: Book Section
ID code: 95234
Dates: 5 January 2026 (Published); October 2025 (Accepted)
Subjects: Science > Mathematics > Electronic computers. Computer science
Department: Faculty of Science > Computer and Information Sciences
Depositing user: Pure Administrator
Date deposited: 09 Jan 2026 11:38
Last modified: 03 Feb 2026 01:33
URI: https://strathprints.strath.ac.uk/id/eprint/95234