Please use this identifier to cite or link to this item: http://dspace.dtu.ac.in:8080/jspui/handle/repository/21802
Full metadata record
dc.contributor.author: KUMAR, MANISH
dc.date.accessioned: 2025-07-08T08:43:10Z
dc.date.available: 2025-07-08T08:43:10Z
dc.date.issued: 2025-05
dc.identifier.uri: http://dspace.dtu.ac.in:8080/jspui/handle/repository/21802
dc.description.abstract: Face forgery detection has become increasingly critical as generative algorithms produce hyper-realistic images and videos that threaten privacy, security, and trust in digital media. Five state-of-the-art convolutional neural networks (Xception, ResNet50, EfficientNetB0, DenseNet121, and MobileNet) were benchmarked on a dataset of approximately 200,000 balanced real and fake images sourced from Flickr-Faces-HQ (FFHQ) and various AI-generated repositories. After applying data augmentation (rescaling, flips, rotations) and splitting the data into 70% training, 15% validation, and 15% test sets, each model was fine-tuned via transfer learning. Evaluation metrics included accuracy, precision, recall, F1-score, confusion matrices, and ROC-AUC. Xception achieved the highest test accuracy of 99.14%, outperforming DenseNet121 (98.67%), ResNet50 (97.92%), EfficientNetB0 (97.45%), and MobileNet (96.83%), illustrating the strength of depthwise-separable convolution blocks in revealing subtle forgery artifacts. Lightweight vision transformer architectures (DeiT, LeViT, MobileViT-XXS, and TinyViT) were also assessed, alongside three hybrid quantum-classical variants that embed parameterized quantum circuits into MobileViT-XXS and Swin-Tiny backbones. A separate 140,000-image dataset (70,000 FFHQ images and 70,000 StyleGAN-generated faces) was used, with multiple quantum gate configurations (RY only; RY with entanglement; RY, RX, and RZ) simulated with the PennyLane Lightning Qubit backend. Comparative analysis of training curves, confusion matrices, and classical performance metrics under consistent hyperparameters showed that MobileViT-XXS led the pure transformer models at 99.88% accuracy (TinyViT: 99.72%; LeViT: 99.55%; DeiT: 99.31%). Quantum-enhanced hybrids further improved detection: Swin-Tiny with RY, RX, and RZ rotations reached 97.42%, surpassing the RY-only (95.88%) and RY-entangled (96.17%) variants. These results demonstrate that transfer learning with specialized CNNs remains highly effective for deepfake detection; that compact vision transformers can match or exceed CNN performance with lower parameter counts; and that integrating quantum circuits uncovers fine-grained forgery cues, enabling real-time, resource-efficient authentication in mobile and streaming contexts. (An illustrative circuit sketch follows the metadata record below.) (en_US)
dc.language.iso: en (en_US)
dc.relation.ispartofseries: TD-8013
dc.subject: FACE FORGERY DETECTION (en_US)
dc.subject: HYBRID QUANTUM DEEP LEARNING MODELS (en_US)
dc.subject: MOBILENET (en_US)
dc.subject: MobileViT-XXS (en_US)
dc.title: FACE FORGERY DETECTION USING CLASSICAL AND HYBRID QUANTUM DEEP LEARNING MODELS (en_US)
dc.type: Thesis (en_US)
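
The abstract describes hybrid quantum-classical variants that embed parameterized quantum circuits, simulated with PennyLane's Lightning Qubit backend, into MobileViT-XXS and Swin-Tiny backbones using RY, RY-entangled, and RY/RX/RZ gate configurations. Below is a minimal, illustrative Python sketch of such a quantum layer; the qubit count, circuit depth, feature encoding, and the way the layer attaches to the classical backbone are assumptions made for illustration, not details taken from the thesis.

import pennylane as qml
from pennylane import numpy as np

n_qubits = 4  # assumed width of the quantum layer (not specified in the abstract)
dev = qml.device("lightning.qubit", wires=n_qubits)

@qml.qnode(dev)
def quantum_layer(inputs, weights):
    # Angle-encode classical features (e.g. pooled backbone embeddings) on each qubit.
    for i in range(n_qubits):
        qml.RY(inputs[i], wires=i)
    # Trainable single-qubit rotations: the "RY, RX, RZ" gate configuration.
    for i in range(n_qubits):
        qml.RY(weights[i, 0], wires=i)
        qml.RX(weights[i, 1], wires=i)
        qml.RZ(weights[i, 2], wires=i)
    # Ring of CNOTs, analogous to the "RY-entangled" variant's entangling step.
    for i in range(n_qubits):
        qml.CNOT(wires=[i, (i + 1) % n_qubits])
    # Pauli-Z expectation values are returned to the classical classification head.
    return [qml.expval(qml.PauliZ(i)) for i in range(n_qubits)]

# Example call with random features and weights, just to show the shapes involved.
features = np.random.uniform(0, np.pi, size=n_qubits)
weights = np.random.uniform(0, 2 * np.pi, size=(n_qubits, 3))
print(quantum_layer(features, weights))

In a hybrid model, a QNode like this would typically be wrapped as a trainable layer (for example with qml.qnn.TorchLayer) and inserted between the backbone's pooled features and the final real/fake classifier; dropping the RX/RZ rotations or the CNOT ring reproduces the simpler RY-only and RY-entangled configurations mentioned in the abstract.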
Appears in Collections: M.E./M.Tech. Information Technology

Files in This Item:
File: MANISH KUMAR M.Tech.pdf
Size: 3.77 MB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.