Please use this identifier to cite or link to this item:
http://dspace.dtu.ac.in:8080/jspui/handle/repository/21239
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | DAGAR, DEEPAK | - |
dc.date.accessioned | 2024-12-13T05:09:10Z | - |
dc.date.available | 2024-12-13T05:09:10Z | - |
dc.date.issued | 2024-09 | - |
dc.identifier.uri | http://dspace.dtu.ac.in:8080/jspui/handle/repository/21239 | - |
dc.description.abstract | In recent years, the development of deep learning methods, particularly Generative Adversarial Networks (GANs) and Variational Auto-encoders (VAEs), has resulted in fabricated content that is more realistic and credible to the human eye. Deepfake is an emergent deep learning technology that enables the production of synthetic content that is both highly realistic and credible. On the one hand, Deepfake has facilitated cutting-edge applications in a variety of industries, including advertising, the creative arts, and film production. Conversely, it presents a threat to a variety of Multimedia Information Retrieval Systems (MIPR), including speech and face recognition systems, and has broader societal implications in the dissemination of misleading information. This thesis highlights the importance of developing robust systems that can identify potentially harmful manipulations in deepfake multimedia content by harnessing the capabilities of deep learning algorithms. The objective of this study is to employ the potential of deep learning to effectively identify and mitigate several types of deepfake manipulation, which pose a significant threat to individuals, society, nations, and enterprises alike. The proposed deep-learning-based detection methods aim to ensure the dependable and precise detection of manipulated deepfake content, considering that social media platforms are the primary means of exchanging information. Consequently, this will foster a more dependable and trustworthy digital ecosystem. This thesis addresses the difficulty of detecting deepfake manipulation by introducing four innovative deep-learning architectures and a unique collection of diverse manipulation videos that facilitates the training of deepfake detection models. The first two models, namely Tex-ViT and Tex-Net, focus on the issue of deepfake manipulation detection.
Deepfake manipulations can be misused in a variety of ways and pose a significant threat to individuals, society, nations, and enterprises alike. Both Tex-ViT and Tex-Net use texture as a feature, together with a cross-attention mechanism, to learn powerful feature representations. Tex-ViT uses Gram matrices for texture feature representation, while Tex-Net uses a combination of Gram matrices and Local Binary Patterns (LBP). The rest of the architecture is the same in both models: traditional ResNet features are combined with a texture module that operates concurrently on ResNet segments before each down-sampling step. This module feeds the dual branches of a cross-attention vision transformer, which uses them for final classification. The models' generalizability is illustrated through experimentation on a variety of categories of FF++ and GAN dataset images in cross-domain contexts. The investigations were conducted on the Celeb-DF, FF++, and DFDCPreview datasets, which were subjected to a variety of post-processing techniques, including compression, noise addition, and blurring. The experimental results demonstrate that the proposed models outperform current state-of-the-art approaches. Next, a diverse deepfake manipulation video dataset named Div-DF is proposed to assist the training of various detection methods. The dataset consists of 150 authentic videos featuring celebrities from different fields, as well as 250 deepfake videos: 100 face-swap videos, 100 facial reenactment videos, and 50 lip-sync videos. The deepfake videos are created by combining the Face-Swap GAN (FSGAN) and Wav2Lip approaches, both advanced techniques. The third model, a deepfake video detection approach, integrates Xception and LSTM pretrained models with channel and spatial attention mechanisms (CBAM).
The Xception model employs depthwise separable convolutions to capture latent spatial artifacts, while the LSTM model captures the distinctions between the modified sequences. This hybrid ensemble learns spatial and temporal distortions across multiple dimensions, making it a powerful tool for identifying deepfake content. The model was tested on the proposed dataset, demonstrating its improved extraction capabilities. Lastly, a deepfake manipulation localization method is proposed. It is a dual-branch model driven by an attention mechanism that combines handcrafted noise features and CNNs in an encoder-decoder (ED). The model feeds noise features to one branch and RGB to the other before passing them to the ED architecture for semantic learning, deploying skip connections to retain spatial information. Additionally, the architecture employs channel-spatial attention to enhance and refine the feature representations. Extensive experimentation was conducted on the shallowfake datasets (CASIA, COVERAGE, COLUMBIA, NIST16) and the deepfake dataset FaceForensics++ (FF++) to showcase its superior feature extraction capabilities and performance compared to a variety of baseline models, with an AUC score exceeding 99%. The model is comparatively light, with 38 million parameters, and significantly surpasses existing State-of-the-Art (SoTA) models. | en_US |
dc.language.iso | en | en_US |
dc.relation.ispartofseries | TD-7614; | - |
dc.subject | DEEPFAKE DETECTION | en_US |
dc.subject | MULTIMEDIA DATA | en_US |
dc.subject | DEVELOPMENT OF FRAMEWORK | en_US |
dc.subject | MIPR | en_US |
dc.title | DEVELOPMENT OF FRAMEWORK FOR DEEPFAKE DETECTION IN MULTIMEDIA DATA | en_US |
dc.type | Thesis | en_US |
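The abstract describes Tex-ViT and Tex-Net as building on Gram matrices and Local Binary Patterns (LBP) as texture descriptors. As a rough illustration of these two standard descriptors only (a minimal NumPy sketch, not the thesis's actual implementation; the feature-map shape and the normalisation are assumptions):

```python
import numpy as np

def gram_matrix(feature_map):
    """Channel-wise Gram matrix, a standard texture descriptor.

    feature_map: (C, H, W) activations, e.g. the output of a ResNet stage.
    Returns a (C, C) matrix of channel correlations, normalised by H*W.
    """
    c, h, w = feature_map.shape
    f = feature_map.reshape(c, h * w)
    return f @ f.T / (h * w)

def lbp_code(patch):
    """8-neighbour Local Binary Pattern code for a 3x3 grayscale patch.

    Each neighbour >= centre contributes one bit, giving a code in 0..255.
    """
    centre = patch[1, 1]
    ring = [patch[0, 0], patch[0, 1], patch[0, 2], patch[1, 2],
            patch[2, 2], patch[2, 1], patch[2, 0], patch[1, 0]]
    return sum(1 << i for i, v in enumerate(ring) if v >= centre)
```

In both descriptors the spatial layout of the input is deliberately discarded: the Gram matrix keeps only channel co-occurrence statistics and LBP keeps only local intensity ordering, which is why they are commonly used as texture (rather than shape) features.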
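The third model in the abstract attaches channel and spatial attention (CBAM) to Xception/LSTM features. A minimal NumPy sketch of the CBAM idea — a channel gate computed from pooled statistics, followed by a spatial gate — with the shared-MLP reduction and the 7x7 convolution of the full module omitted for brevity (the weight matrix `w` here is a stand-in assumption):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cbam_like(x, w):
    """Simplified CBAM-style gating on a (C, H, W) feature map.

    w: (C, C) weight matrix standing in for CBAM's shared MLP.
    """
    # Channel attention: gate channels using avg- and max-pooled statistics.
    avg = x.mean(axis=(1, 2))
    mx = x.max(axis=(1, 2))
    gate_c = sigmoid(w @ avg + w @ mx)                 # shape (C,)
    x = x * gate_c[:, None, None]
    # Spatial attention: gate locations using channel-pooled statistics
    # (the real module applies a 7x7 conv here, omitted in this sketch).
    gate_s = sigmoid(x.mean(axis=0) + x.max(axis=0))   # shape (H, W)
    return x * gate_s[None, :, :]
```

Because both gates lie in (0, 1), the module can only suppress, never amplify, activations; its value comes from learning *which* channels and locations to keep, which is the refinement role the abstract assigns to channel-spatial attention.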
Appears in Collections: | Ph.D. Information Technology |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Deepak Dagar Ph.D..pdf |  | 6.6 MB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.