Please use this identifier to cite or link to this item: http://dspace.dtu.ac.in:8080/jspui/handle/repository/18865
Full metadata record
DC Field | Value | Language
--- | --- | ---
dc.contributor.author | BHATT, HARSHIT | -
dc.date.accessioned | 2022-02-21T08:36:49Z | -
dc.date.available | 2022-02-21T08:36:49Z | -
dc.date.issued | 2021-10 | -
dc.identifier.uri | http://dspace.dtu.ac.in:8080/jspui/handle/repository/18865 | -
dc.description.abstract | Identifying the speaker in an audio-visual environment is a crucial task that is now surfacing in the research domain. Researchers are increasingly using deep neural networks to match people with their respective voices. The advantages of deep learning are manifold and include the ability to process huge volumes of data, robust training of algorithms, feasibility of optimization, and reduced computation time. Previous studies have explored recurrent and convolutional neural networks incorporating GRUs, Bi-GRUs, LSTMs, Bi-LSTMs, and many more [1]. This work proposes a hybrid mechanism consisting of a CNN and an LSTM network fused using an early fusion method. We collected a dataset of 1,330 voice recordings, each 3 seconds long, captured with a Python script in .wav format. The dataset consists of 14 categories; we used 80% for training and 20% for testing. We optimized and fine-tuned the neural networks and modified them to yield optimum results. For the early fusion approach, we used the concatenation operation, which fuses the neural networks prior to the training phase. The proposed method achieves 97.72% accuracy on our dataset and outperforms all the baseline mechanisms, such as MLP, LSTM, CNN, and RNN. This research contributes to ongoing work in the speaker identification domain and paves the way for future directions using deep learning. | en_US
dc.language.iso | en | en_US
dc.publisher | DELHI TECHNOLOGICAL UNIVERSITY | en_US
dc.relation.ispartofseries | TD - 5413; | -
dc.subject | SPEAKER IDENTIFICATION | en_US
dc.subject | VOICE SIGNALS | en_US
dc.subject | HYBRID NEURAL NETWORK | en_US
dc.subject | CNN AND LSTM NETWORKS | en_US
dc.title | SPEAKER IDENTIFICATION FROM VOICE SIGNALS USING HYBRID NEURAL NETWORK | en_US
dc.type | Thesis | en_US
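
The 80%/20% train/test split described in the abstract can be illustrated with a small sketch. This is not the thesis code: the abstract does not say how the split was drawn, so the shuffling, the fixed seed, and the helper name `split_indices` are assumptions made only for illustration. Only the dataset size (1,330) and the 80/20 ratio come from the abstract.

```python
import random

TOTAL_RECORDINGS = 1330  # dataset size stated in the abstract
TRAIN_FRACTION = 0.8     # 80% training, 20% testing, as stated in the abstract


def split_indices(n_items, train_fraction, seed=0):
    """Shuffle item indices and split them into train and test lists.

    Hypothetical helper: the shuffle and the seed are assumptions,
    not details taken from the thesis.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # fixed seed for repeatability
    n_train = round(n_items * train_fraction)
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = split_indices(TOTAL_RECORDINGS, TRAIN_FRACTION)
print(len(train_idx), len(test_idx))  # 1064 266
```

Under these assumptions, 1,064 recordings go to training and 266 to testing. A stratified split across the 14 categories would be an equally plausible reading of the abstract.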
Appears in Collections: M.E./M.Tech. Information Technology

Files in This Item:
File | Description | Size | Format
--- | --- | --- | ---
thesis Final.pdf | | 1.08 MB | Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.