Please use this identifier to cite or link to this item: http://dspace.dtu.ac.in:8080/jspui/handle/repository/21802
Full metadata record
dc.contributor.author: KUMAR, MANISH
dc.date.accessioned: 2025-07-08T08:43:10Z
dc.date.available: 2025-07-08T08:43:10Z
dc.date.issued: 2025-05
dc.identifier.uri: http://dspace.dtu.ac.in:8080/jspui/handle/repository/21802
dc.description.abstract: Face forgery detection has become increasingly critical as generative algorithms produce hyper-realistic images and videos that threaten privacy, security, and trust in digital media. Five state-of-the-art convolutional neural networks (Xception, ResNet50, EfficientNetB0, DenseNet121, and MobileNet) were benchmarked on a dataset of approximately 200,000 balanced real and fake images sourced from Flickr-Faces-HQ (FFHQ) and various AI-generated repositories. After applying data augmentation (rescaling, flips, rotations) and splitting the data into 70% training, 15% validation, and 15% test sets, each model was fine-tuned via transfer learning. Evaluation metrics included accuracy, precision, recall, F1-score, confusion matrices, and ROC-AUC. Xception achieved the highest test accuracy of 99.14%, outperforming DenseNet121 (98.67%), ResNet50 (97.92%), EfficientNetB0 (97.45%), and MobileNet (96.83%), illustrating the strength of depthwise-separable convolution blocks in revealing subtle forgery artifacts. Lightweight vision transformer architectures (DeiT, LeViT, MobileViT-XXS, and TinyViT) were also assessed, alongside three hybrid quantum-classical variants that embed parameterized quantum circuits into MobileViT-XXS and Swin-Tiny backbones. A separate 140,000-image dataset (70,000 FFHQ images and 70,000 StyleGAN-generated faces) was used, with multiple quantum gate configurations (RY only; RY with entanglement; RY, RX, and RZ) simulated with the PennyLane Lightning Qubit backend. Comparative analysis of training curves, confusion matrices, and classical performance metrics under consistent hyperparameters showed that MobileViT-XXS led the pure transformer models at 99.88% accuracy (TinyViT: 99.72%; LeViT: 99.55%; DeiT: 99.31%). Quantum-enhanced hybrids further improved detection: Swin-Tiny with RY, RX, and RZ rotations reached 97.42%, surpassing the RY-only (95.88%) and RY-entangled (96.17%) variants. These results demonstrate that transfer learning with specialized CNNs remains highly effective for deepfake detection; that compact vision transformers can match or exceed CNN performance with lower parameter counts; and that integrating quantum circuits uncovers fine-grained forgery cues, enabling real-time, resource-efficient authentication in mobile and streaming contexts. (An illustrative circuit sketch follows the metadata record below.) (en_US)
dc.language.iso: en (en_US)
dc.relation.ispartofseries: TD-8013
dc.subject: FACE FORGERY DETECTION (en_US)
dc.subject: HYBRID QUANTUM DEEP LEARNING MODELS (en_US)
dc.subject: MOBILENET (en_US)
dc.subject: MobileViT-XXS (en_US)
dc.title: FACE FORGERY DETECTION USING CLASSICAL AND HYBRID QUANTUM DEEP LEARNING MODELS (en_US)
dc.type: Thesis (en_US)
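
The abstract describes hybrid quantum-classical variants that embed parameterized quantum circuits, simulated with PennyLane's Lightning Qubit backend, into MobileViT-XXS and Swin-Tiny backbones using RY, RY-entangled, and RY/RX/RZ gate configurations. Below is a minimal, illustrative Python sketch of such a quantum layer; the qubit count, circuit depth, feature encoding, and the way the layer attaches to the classical backbone are assumptions made for illustration, not details taken from the thesis.

import pennylane as qml
from pennylane import numpy as np

n_qubits = 4  # assumed width of the quantum layer (not specified in the abstract)
dev = qml.device("lightning.qubit", wires=n_qubits)

@qml.qnode(dev)
def quantum_layer(inputs, weights):
    # Angle-encode classical features (e.g. pooled backbone embeddings) on each qubit.
    for i in range(n_qubits):
        qml.RY(inputs[i], wires=i)
    # Trainable single-qubit rotations: the "RY, RX, RZ" gate configuration.
    for i in range(n_qubits):
        qml.RY(weights[i, 0], wires=i)
        qml.RX(weights[i, 1], wires=i)
        qml.RZ(weights[i, 2], wires=i)
    # Ring of CNOTs, analogous to the "RY-entangled" variant's entangling step.
    for i in range(n_qubits):
        qml.CNOT(wires=[i, (i + 1) % n_qubits])
    # Pauli-Z expectation values are returned to the classical classification head.
    return [qml.expval(qml.PauliZ(i)) for i in range(n_qubits)]

# Example call with random features and weights, just to show the shapes involved.
features = np.random.uniform(0, np.pi, size=n_qubits)
weights = np.random.uniform(0, 2 * np.pi, size=(n_qubits, 3))
print(quantum_layer(features, weights))

In a hybrid model, a QNode like this would typically be wrapped as a trainable layer (for example with qml.qnn.TorchLayer) and inserted between the backbone's pooled features and the final real/fake classifier; dropping the RX/RZ rotations or the CNOT ring reproduces the simpler RY-only and RY-entangled configurations mentioned in the abstract.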
Appears in Collections: M.E./M.Tech. Information Technology

Files in This Item:
File: MANISH KUMAR M.Tech.pdf
Size: 3.77 MB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.