Please use this identifier to cite or link to this item:
http://dspace.dtu.ac.in:8080/jspui/handle/repository/21800
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | KUMAR, SUJEET | - |
dc.date.accessioned | 2025-07-08T08:42:36Z | - |
dc.date.available | 2025-07-08T08:42:36Z | - |
dc.date.issued | 2025-05 | - |
dc.identifier.uri | http://dspace.dtu.ac.in:8080/jspui/handle/repository/21800 | - |
dc.description.abstract | This comparative study investigates five models—Support Vector Machine with Histogram of Oriented Gradients (SVM with HOG), Custom Convolutional Neural Network (Custom CNN), LeNet-5, VGG16, and MobileNetV2—for classifying seven facial emotions (Angry, Disgust, Fear, Happy, Sad, Surprise, Neutral) on the CK+48 and FER2013 datasets. The analysis assesses accuracy, F1-scores, and computational efficiency, tackling FER2013’s class imbalance (547 Disgust vs. 8,989 Happy samples) and noise. MobileNetV2 led FER2013 performance with 67.82% accuracy (F1-score: ~0.66), utilizing focal loss, Cutout, and Mixup to boost Disgust’s F1-score (~0.60). With ~2.4 million parameters and ~3-hour training, it suits real-time applications such as mobile mental health monitoring or driver safety systems. Custom CNN achieved 99.32% accuracy (F1-score: ~0.99) on CK+48, leveraging the dataset’s 981 high-quality, balanced images, making it ideal for controlled settings such as psychological research labs. VGG16 attained 67% accuracy (F1-score: ~0.64) on FER2013, benefiting from transfer learning but hindered by overfitting due to ~14.7 million parameters and ~4-hour training. SVM with HOG scored 64.86% accuracy, offering speed (~10 minutes of training) and noise robustness (~1.5% accuracy drop with Gaussian noise) but limited by handcrafted features. LeNet-5, with 49.47% accuracy (F1-score: ~0.45), struggled with FER2013’s noise and imbalance, highlighting the inadequacy of shallow models. FER2013’s low resolution (48x48) and imbalance caused errors in Disgust and Fear (F1-scores: ~0.50–0.60), driven by scarce samples and visual similarities (e.g., Fear misclassified as Sad/Surprise). The study emphasizes the importance of dataset quality, model complexity, and optimization strategies for effective FER. Future research should explore diverse datasets (e.g., AffectNet), Vision Transformers, video-based FER with 3D-CNNs, and ethical considerations such as bias mitigation and federated learning to ensure fairness and enhance applications in healthcare, education, and human-machine interaction. | en_US |
dc.language.iso | en | en_US |
dc.relation.ispartofseries | TD-8011; | - |
dc.subject | MACHINE LEARNING | en_US |
dc.subject | DEEP LEARNING MODELS | en_US |
dc.subject | FACIAL EMOTION RECOGNITION | en_US |
dc.subject | CNN | en_US |
dc.title | A COMPARATIVE STUDY OF MACHINE LEARNING AND DEEP LEARNING MODELS FOR FACIAL EMOTION RECOGNITION | en_US |
dc.type | Thesis | en_US |
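The abstract above attributes MobileNetV2's handling of FER2013's class imbalance to focal loss, Cutout, and Mixup. The following is a minimal, self-contained sketch of how those three pieces typically fit into a single training step; the backbone call, the hyperparameters (focal gamma, Mixup alpha, Cutout patch size), and the random stand-in batch are illustrative assumptions, not details taken from the thesis itself.

```python
# Illustrative sketch only: focal loss + Cutout + Mixup for an imbalanced
# 7-class facial-emotion problem with a MobileNetV2 backbone.
# Hyperparameters are placeholders, not the thesis's reported settings.
import torch
import torch.nn.functional as F
from torchvision.models import mobilenet_v2

NUM_CLASSES = 7  # Angry, Disgust, Fear, Happy, Sad, Surprise, Neutral


def focal_loss(logits, targets, gamma=2.0, alpha=None):
    """Focal loss for class-index targets: down-weights easy examples."""
    log_probs = F.log_softmax(logits, dim=1)
    ce = F.nll_loss(log_probs, targets, weight=alpha, reduction="none")
    pt = log_probs.exp().gather(1, targets.unsqueeze(1)).squeeze(1)
    return ((1.0 - pt) ** gamma * ce).mean()


def cutout(images, size=12):
    """Zero a random square patch in each image (Cutout regularization)."""
    _, _, h, w = images.shape
    for img in images:
        cy, cx = torch.randint(h, (1,)).item(), torch.randint(w, (1,)).item()
        y1, y2 = max(0, cy - size // 2), min(h, cy + size // 2)
        x1, x2 = max(0, cx - size // 2), min(w, cx + size // 2)
        img[:, y1:y2, x1:x2] = 0.0
    return images


def mixup(images, targets, alpha=0.2):
    """Convex-combine a batch with a shuffled copy of itself (Mixup)."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(images.size(0))
    mixed = lam * images + (1.0 - lam) * images[perm]
    return mixed, targets, targets[perm], lam


model = mobilenet_v2(num_classes=NUM_CLASSES)  # ~2.4M-parameter backbone
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Stand-in batch: FER2013 images are 48x48 grayscale, replicated to 3 channels.
images = torch.rand(16, 3, 48, 48)
labels = torch.randint(0, NUM_CLASSES, (16,))

images = cutout(images)
images, y_a, y_b, lam = mixup(images, labels)

logits = model(images)
# Mixup loss: interpolate the focal loss of the two label sets.
loss = lam * focal_loss(logits, y_a) + (1.0 - lam) * focal_loss(logits, y_b)
loss.backward()
optimizer.step()
print(f"training-step loss: {loss.item():.4f}")
```

In practice the same step would run over DataLoader batches of the real FER2013 images, and the optional `alpha` weight tensor in `focal_loss` could carry per-class weights to further compensate for the Disgust class's scarcity.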
Appears in Collections: | M.E./M.Tech. Information Technology |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
SUJEET KUMAR M.Tech.pdf | | 2.47 MB | Adobe PDF | View/Open |