Template-Type: ReDIF-Article 1.0
Author-Name:Adeel Munir, Hammad Nasir, Madiha Sher, Arbab Masood Ahmad
Author-Email:adeelmodernite@gmail.com
Author-Workplace-Name: Department of Computer Systems Engineering, University of Engineering and Technology, Peshawar, Pakistan
Title:Voice Cloning and Synthesis Using Deep Learning: A Comprehensive Study
Abstract: This paper reviews current voice cloning and speech synthesis methods, focusing on how deep learning improves the quality, flexibility, and efficiency of AI-generated speech. We analyze three leading AI models, XTTS_v2, Whisper, and Llama 8B, in terms of their significance to virtual assistants, dubbing, and accessibility tools. XTTS_v2 improves on the voice cloning and TTS work in Tortoise: based on multilingual creative transfer, it runs faster with less computation time and generates more natural-sounding synthetic speech. Whisper is a transcription model that converts an audio waveform to text, which simplifies access to audio data. Llama 8B focuses on answering user questions to enhance interaction between AI and humans. Related work, including FastSpeech 2 [1], Neural Voice Cloning with a Few Samples [2], and Deep Learning-Based Expressive Speech Synthesis [3], also contributes to these advancements. This progress enhances machines' ability to communicate in an emotional, human-like way, leading to more sophisticated technology.
Keywords:Voice Cloning, Speech Synthesis, Deep Learning, Multilingual Zero-shot Multi-Speaker TTS (XTTS), Speaker Adaptation, Cross-Lingual TTS, Whisper, Llama 8B
Journal: International Journal of Innovations in Science and Technology
Pages:2225-2235
Volume:7
Issue:3
Year: 2025
Month:September
File-URL:https://journal.50sea.com/index.php/IJIST/article/view/1551/2242
File-Format: application/pdf
File-URL:https://journal.50sea.com/index.php/IJIST/article/view/1551
File-Format: text/html
Handle: RePEc:abq:IJIST:v:7:y:2025:i:3:p:2225-2235