Sadik, Md. and Vijaya, P. and Revathi, Y. and Tanuja, V. Siva Naga and Soudhamini, B. and Vaishnavi, R. (2025) AI Based Voice Cloning System: From Text to Speech. International Journal of Innovative Science and Research Technology, 10 (4): 25apr834. pp. 1453-1461. ISSN 2456-2165

IJISRT25APR834.pdf - Published Version

Abstract

Rapid advances in Artificial Intelligence and Deep Learning have significantly improved Text-To-Speech (TTS) technology, enabling more accurate and natural voice conversion. This project presents a Voice Cloning System that uses a Transformer-based encoder and a GAN-based vocoder to generate high-quality, natural-sounding speech from text. The system supports two modes: Text-to-Speech (TTS), in which textual input is rendered in a default synthesized voice, and Voice Cloning, in which a new voice is replicated from a short audio sample. By employing a one-shot learning approach, the system adapts to a new speaker with minimal training data, making it efficient and scalable for real-world applications. The Transformer-based encoder captures linguistic and prosodic features, while the GAN-based vocoder enhances the realism of the generated speech by refining spectral details. The model generalizes across different speakers, which supports robustness even when it is trained on limited datasets. This project highlights the potential of deep generative models in speech synthesis and their impact on several domains: assistive technology, where the system can help individuals with speech impairments communicate more naturally; personalized virtual assistants that adapt to user preferences; and the entertainment industry, for voiceovers and character dubbing.
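The one-shot speaker adaptation described in the abstract can be illustrated with a minimal sketch of the data flow. It is not the authors' implementation: the function names, the toy feature values, and the choice of mean pooling as a stand-in for a trained speaker encoder are all assumptions made for illustration. The idea shown is only that a single short reference clip is reduced to one fixed-size speaker vector, which is then attached to every text-encoder state before decoding and vocoding.

```python
from statistics import fmean


def speaker_embedding(reference_frames):
    """Pool frame-level features from one reference clip into a single vector.

    Assumption: mean pooling stands in for a trained speaker encoder.
    """
    dims = len(reference_frames[0])
    return [fmean(frame[d] for frame in reference_frames) for d in range(dims)]


def condition_on_speaker(text_states, speaker_vec):
    """Append the same speaker vector to every text-encoder state."""
    return [state + speaker_vec for state in text_states]


# Toy, hypothetical values: three audio frames with two features each,
# and two text-encoder states with three features each.
reference_frames = [[0.2, 0.4], [0.4, 0.8], [0.6, 0.0]]
text_states = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5]]

spk = speaker_embedding(reference_frames)
conditioned = condition_on_speaker(text_states, spk)
```

In this sketch each conditioned state has five values: the three original text features followed by the two speaker features. A decoder and vocoder would consume these conditioned states; neither stage is modeled here.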

Item Type: Article
Subjects: T Technology > T Technology (General)
Divisions: Faculty of Engineering, Science and Mathematics > School of Electronics and Computer Science
Depositing User: Editor IJISRT Publication
Date Deposited: 30 Apr 2025 10:48
Last Modified: 30 Apr 2025 10:48
URI: https://eprint.ijisrt.org/id/eprint/619
