Please use this identifier to cite or link to this item: http://dspace.dtu.ac.in:8080/jspui/handle/repository/18865
Title: SPEAKER IDENTIFICATION FROM VOICE SIGNALS USING HYBRID NEURAL NETWORK
Authors: BHATT, HARSHIT
Keywords: SPEAKER IDENTIFICATION
VOICE SIGNALS
HYBRID NEURAL NETWORK
CNN AND LSTM NETWORKS
Issue Date: Oct-2021
Publisher: DELHI TECHNOLOGICAL UNIVERSITY
Series/Report no.: TD - 5413;
Abstract: Identifying the speaker in an audio-visual environment is a crucial task that is now surfacing in the research domain. Researchers are increasingly turning to deep neural networks to match people with their respective voices. The applications of deep learning are many-fold: they include the ability to process huge volumes of data, robust training of algorithms, feasibility of optimization, and reduced computation time. Previous studies have explored recurrent and convolutional neural networks incorporating GRUs, Bi-GRUs, LSTMs, Bi-LSTMs, and more [1]. This work proposes a hybrid mechanism consisting of a CNN and an LSTM network fused using an early fusion method. We accumulated a dataset of 1,330 voices, each 3 seconds long, recorded through a Python script in .wav format. The dataset consists of 14 categories, of which we used 80% for training and 20% for testing. We optimized and fine-tuned the neural networks and modified them to yield optimum results. For the early fusion approach, we used the concatenation operation, which fuses the neural networks prior to the training phase. The proposed method achieves 97.72% accuracy on our dataset and outperforms existing baseline mechanisms such as MLP, LSTM, CNN, and RNN. This research contributes to ongoing work in the speaker identification domain and paves the way for future directions using deep learning.
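The abstract's hybrid architecture (a CNN branch and an LSTM branch fused by concatenation before the classifier) could be sketched roughly as below. This is an illustrative sketch only: the input shape, layer sizes, and feature representation are assumptions, not details taken from the thesis; only the two-branch concatenation fusion and the 14-way output follow the abstract.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Assumed input: a 3-second clip framed into 300 time steps of 40
# spectral features each (e.g. MFCCs) -- these numbers are illustrative.
inp = layers.Input(shape=(300, 40))

# CNN branch: captures local spectral patterns.
c = layers.Conv1D(64, kernel_size=5, activation="relu")(inp)
c = layers.GlobalMaxPooling1D()(c)

# LSTM branch: captures temporal dynamics across the clip.
l = layers.LSTM(64)(inp)

# Early fusion: concatenate the branch outputs so both are trained jointly.
fused = layers.Concatenate()([c, l])

# 14 output classes, matching the 14 speaker categories in the dataset.
out = layers.Dense(14, activation="softmax")(fused)

model = models.Model(inputs=inp, outputs=out)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

The fused model is then trained end to end on the 80% training split; concatenating before the dense classifier is what makes this "early" fusion, as opposed to averaging the predictions of two separately trained models.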
URI: http://dspace.dtu.ac.in:8080/jspui/handle/repository/18865
Appears in Collections:M.E./M.Tech. Information Technology

Files in This Item:
File: thesis Final.pdf (1.08 MB, Adobe PDF)

