DEEP LEARNING FRAMEWORKS FOR FACE ANTI-SPOOFING

ANTIL, AASHANIA

Please use this identifier to cite or link to this item: http://dspace.dtu.ac.in:8080/jspui/handle/repository/22546

Title:	DEEP LEARNING FRAMEWORKS FOR FACE ANTI-SPOOFING
Authors:	ANTIL, AASHANIA
Keywords:	DEEP LEARNING FRAMEWORKS; FACE ANTI-SPOOFING; SPATIAL ADAPTIVE BLOCK (SAB); ELBP
Issue Date:	Dec-2025
Series/Report no.:	TD-8462
Abstract:	With the rapid integration of facial recognition (FR) systems into access control, banking, and mobile authentication, the risk of face spoofing, also known as presentation attacks (PAs), has grown significantly. These attacks, carried out using printed images, replayed videos, or 3D masks, threaten the security and reliability of biometric authentication systems. The increasing sophistication of spoofing techniques, combined with the low cost and easy availability of generative tools, underscores the urgent need for robust Face Anti-Spoofing (FAS) or Presentation Attack Detection (PAD) mechanisms. Although several approaches have been proposed, many existing solutions struggle to generalize under challenging conditions involving varied lighting, spoofing materials, backgrounds, and image/video quality. To address these challenges, this thesis proposes a suite of deep learning-based frameworks that are robust, interpretable, and generalizable for real-world face anti-spoofing. The research is structured around four complementary solutions, each targeting a specific dimension of the problem: texture-based learning, multi-modal fusion, spatio-temporal modeling, and generative learning.
The first solution introduces a two-stream hybrid framework that fuses handcrafted and deep features to improve spoof detection accuracy. It combines Multi-Level Extended Local Binary Patterns (ELBP), which capture fine-grained texture information, with a modified Xception network enhanced by Squeeze-and-Excitation (SE) blocks for channel-wise feature reweighting without increasing complexity. This design balances expressive power and computational efficiency, enabling the model to handle diverse spoofing conditions and maintain generalization across datasets.
The second solution, MF2ShrT, addresses multi-modal fusion using Vision Transformers (ViTs). It uses overlapping patches to emphasize local contextual cues and introduces SharLViT, a shared-layer transformer backbone that improves feature representation while reducing parameter count. A novel T-Encoder-based Hybrid Feature Block is employed to mine inter-modal dependencies across RGB, depth, and IR streams. The Adaptive Weighted Fusion and Classification Block (AWFCB) then learns to dynamically combine these features, emphasizing salient cues while suppressing redundant information, which yields a flexible and accurate spoof detection system.
The third solution focuses on the temporal dimension of FAS by proposing Bi-STAM, a Bi-Directional Spatio-Temporal Adaptive Modeling framework. To capture motion inconsistencies and subtle dynamics in video-based attacks, it introduces two key components: a Temporal Adaptive Block (TAB) to balance motion and static information, and a Spatial Adaptive Block (SAB) to enhance texture representation while filtering noise. Their outputs are fused by a Feature Aggregation Block (FAB) to yield a unified spatio-temporal representation, significantly boosting generalization and performance on video-based spoof detection tasks.
The fourth solution, PolarSentinelGAN, presents a novel generative adversarial framework that enhances spoof classification through depth map generation. By fusing RGB and Multi-Scale Retinex with Color Preservation (MSRCP) inputs, the model uses Dual Polarized Attention (DPAttn) to focus on discriminative regions. A dedicated Feed Forward Block (FFB) within the generator facilitates the transmission of rich features, while optimized latent variables improve generalization across attack types and datasets.
All four frameworks are extensively evaluated using standard intra- and cross-dataset testing protocols on public benchmarks, and are further supported with explainability techniques such as class activation mapping and feature occlusion testing. The results demonstrate strong real-time performance, robustness, and scalability. The methodologies proposed in this thesis make substantial contributions toward the development of next-generation face anti-spoofing systems. They not only address key challenges in generalization, efficiency, and interpretability, but also pave the way for practical deployment in critical domains such as finance, border security, surveillance, and consumer electronics. Future directions include exploring federated learning, privacy-aware architectures, and continual domain adaptation to further enhance system reliability in dynamic real-world environments.
URI:	http://dspace.dtu.ac.in:8080/jspui/handle/repository/22546
Appears in Collections:	Ph.D. Electronics & Communication Engineering

Files in This Item:

File	Description	Size	Format
AASHANIA ANTIL Ph.d..pdf		7.27 MB	Adobe PDF
AASHANIA ANTIL Plag..pdf		6.51 MB	Adobe PDF
