Please use this identifier to cite or link to this item: http://dspace.dtu.ac.in:8080/jspui/handle/repository/20829
Title: DECODING DEEPFAKES: A DEEP LEARNING APPROACH TO UNVEILING SYNTHETIC MEDIA
Authors: KULKARNI, SARTHAK
Keywords: DECODING DEEPFAKES
DEEP LEARNING
SYNTHETIC MEDIA
VISION TRANSFORMER (ViT)
CNN
Issue Date: May-2024
Series/Report no.: TD-7358;
Abstract: Identifying deepfakes is essential to combat misinformation, protect personal and national security, and prevent financial fraud and reputational harm. The increasing sophistication of deepfake technology heightens these threats and demands advanced detection methods. Convolutional Neural Networks (CNNs) have long been the standard deep-learning models for image classification. Deeptrace's report "The State of Deepfakes" catalogued over 14,000 deepfake videos online; the vast majority were pornographic, and the remainder showed a high potential to cause political and social disruption. As AI advances, deepfakes grow far more realistic and more accessible to make, so countering them requires a sustained detection effort that combines technological, legal, and educational components. The report stresses that the threat is new and constantly evolving, and that risks such as fraud and misinformation arising from malicious deepfakes call for comprehensive action. The Vision Transformer (ViT) offers an alternative formulation of image processing: an image is treated as a sequence of patches, and self-attention models global relations among them. CNN architectures, by contrast, rely on local receptive fields and build up features hierarchically. Self-attention is believed to let ViTs learn long-range relationships and contextual information more effectively, yielding better performance across a broader range of image-recognition tasks; ViTs are also more robust, flexible, and scalable than CNNs and accommodate a more comprehensive variety of visual inputs. This work applies a ViT model to deepfake detection. The model's performance is evaluated with standard metrics: accuracy, precision, recall, and F1-score. These metrics show how well the ViT model distinguishes a real image from a deepfake.
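
The abstract's core technical idea, treating an image as a sequence of patches and letting self-attention relate them globally, can be made concrete with a minimal sketch. PyTorch is assumed here (the record does not name the thesis's framework), and the patch size, embedding dimension, encoder depth, and two-class head are illustrative choices, not the thesis's actual configuration.

    import torch
    import torch.nn as nn

    class PatchEmbedding(nn.Module):
        """Split an image into fixed-size patches and linearly embed each one."""
        def __init__(self, img_size=224, patch_size=16, in_ch=3, dim=768):
            super().__init__()
            self.num_patches = (img_size // patch_size) ** 2
            # A strided convolution cuts patches and projects them in one step.
            self.proj = nn.Conv2d(in_ch, dim, kernel_size=patch_size, stride=patch_size)

        def forward(self, x):
            x = self.proj(x)                      # (B, dim, H/P, W/P)
            return x.flatten(2).transpose(1, 2)   # (B, num_patches, dim)

    class TinyViT(nn.Module):
        """Patch embedding + [CLS] token + Transformer encoder + binary head."""
        def __init__(self, img_size=224, patch_size=16, dim=768, depth=4,
                     heads=8, num_classes=2):
            super().__init__()
            self.patches = PatchEmbedding(img_size, patch_size, 3, dim)
            self.cls_token = nn.Parameter(torch.zeros(1, 1, dim))
            self.pos_embed = nn.Parameter(
                torch.zeros(1, self.patches.num_patches + 1, dim))
            layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                               batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
            self.head = nn.Linear(dim, num_classes)  # real vs. deepfake

        def forward(self, x):
            tokens = self.patches(x)
            cls = self.cls_token.expand(x.shape[0], -1, -1)
            tokens = torch.cat([cls, tokens], dim=1) + self.pos_embed
            tokens = self.encoder(tokens)   # self-attention over all patches
            return self.head(tokens[:, 0])  # classify from the [CLS] token

    model = TinyViT()
    logits = model(torch.randn(2, 3, 224, 224))  # two dummy RGB images
    print(logits.shape)                          # torch.Size([2, 2])

Every encoder layer attends over all patches at once, which is exactly the long-range, global-context property the abstract contrasts with the local receptive fields of CNNs.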
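The evaluation metrics the abstract names (accuracy, precision, recall, F1-score) can be computed with scikit-learn; the labels below are made-up placeholders for illustration, not results from the thesis.

    from sklearn.metrics import (accuracy_score, precision_score,
                                 recall_score, f1_score)

    # Hypothetical labels: 1 = deepfake, 0 = real.
    y_true = [1, 0, 1, 1, 0, 0, 1, 0]
    y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

    print("accuracy :", accuracy_score(y_true, y_pred))
    print("precision:", precision_score(y_true, y_pred))
    print("recall   :", recall_score(y_true, y_pred))
    print("f1-score :", f1_score(y_true, y_pred))

Precision and recall matter here because the two error types differ in cost: a real image flagged as fake (hurting precision) and a deepfake passed as real (hurting recall) have very different consequences.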
URI: http://dspace.dtu.ac.in:8080/jspui/handle/repository/20829
Appears in Collections: M.E./M.Tech. Information Technology

Files in This Item:
File: Sarthak Kulkarni M.Tech..pdf  |  Size: 4.35 MB  |  Format: Adobe PDF

