Optimizing network performance through AI-driven media-to-text conversion
DOI:
https://doi.org/10.37868/sei.v8i1.id793
Abstract
One of the most challenging issues in networks and cloud computing is the overhead caused by exchanged data. Several approaches have been proposed in the literature to mitigate this issue, yet methods that substantially reduce network overhead are still lacking. This work therefore proposes an AI-driven method that converts media data (e.g., images, audio, or video) into text to improve overall network performance. The method is evaluated in terms of bandwidth savings, throughput, and latency. The findings show that it achieves a 98% bandwidth reduction and 3.6 times higher throughput while maintaining high accuracy (BLEU-4 > 0.78 for captions, WER < 12% for speech). Moreover, statistical validation shows a significant improvement in latency (150 ms for audio and 950 ms for video) and a packet loss rate of 0.3%. Finally, the proposed method is considered adaptable to IoT, edge computing, and cloud systems owing to its cost-effectiveness.
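To make two of the reported metrics concrete, the following is a minimal Python sketch of how a bandwidth reduction percentage and a word error rate (WER) are conventionally computed. It is not the authors' evaluation code: the function names, the byte counts, and the sample sentences are hypothetical and chosen only for illustration, and WER is computed here with a standard word-level edit distance.

```python
def bandwidth_reduction_pct(media_bytes: int, text_bytes: int) -> float:
    """Percentage of bytes saved by sending text instead of the original media."""
    if media_bytes <= 0:
        raise ValueError("media_bytes must be positive")
    return 100.0 * (media_bytes - text_bytes) / media_bytes


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical sizes: a 1,000,000-byte media clip and a 20,000-byte transcript.
    print(bandwidth_reduction_pct(1_000_000, 20_000))
    # Hypothetical transcripts: one reference word is missing from the hypothesis.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

With these illustrative inputs, the first call yields a 98% reduction, and the second yields a WER of one error over six reference words, which would sit above the 12% threshold quoted in the abstract.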
Copyright (c) 2026 Ali Hussein Alnooh, Nawar A. Sultan, Ali Yasir Kuti

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.





