Please use this identifier to cite or link to this item: http://dspace.dtu.ac.in:8080/jspui/handle/repository/22767
Title: INTEGRATED DATA ENVELOPMENT ANALYSIS - ML FRAMEWORK FOR GLOBAL UNIVERSITY EFFICIENCY ANALYSIS
Authors: GOYAL, MEHAK
Keywords: DATA ENVELOPMENT ANALYSIS; ML FRAMEWORK; GLOBAL UNIVERSITY EFFICIENCY ANALYSIS
Issue Date: May-2025
Series/Report no.: TD-8687
Abstract: This thesis evaluates the performance of 800 universities worldwide by integrating Data Envelopment Analysis (DEA) with machine learning techniques. Using data from the 2016 global rankings, the study applies input-oriented CCR, BCC, and NIRS DEA models to measure technical, scale, and overall efficiency. The analysis treats the student-to-staff ratio as the input and employs teaching, research, citation, industry income, and international outlook scores as outputs. The DEA results indicate significant scope for efficiency improvement, with a mean overall (CCR) efficiency of approximately 0.108 and a mean technical (BCC) efficiency of 0.189. A predominant finding is that 86.75% of universities exhibit Increasing Returns to Scale (IRS), suggesting that most operate below their optimal scale. Sensitivity analysis, conducted by altering the output specification, showed that although absolute efficiency scores and RTS distributions changed, the relative rankings of universities remained considerably robust (Spearman rank correlation of approximately 0.81 for BCC scores). K-Means clustering (K=2, determined via Silhouette analysis) grouped universities based on contextual variables (location, student numbers, female-male ratio, international student percentage), identifying a large primary cluster and a very small cluster of distinct mega-scale institutions. DEA performed within these clusters yielded improved relative efficiency scores, especially for the smaller cluster, when universities were benchmarked against more homogeneous peers. Finally, tuned Random Forest, LightGBM, and Gradient Boosting regression models were developed to explain technical efficiency. LightGBM performed best, achieving an R-squared of approximately 0.4725 in predicting BCC scores. Key contextual drivers identified were total student numbers, location, and percentage of international students.
This multi-stage approach provides a nuanced understanding of university performance, offering actionable insights for strategic planning and policy development in the higher education sector.
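The input-oriented CCR, BCC, and NIRS models named in the abstract can each be written as a small linear program per university (decision-making unit, DMU). The sketch below is illustrative only and is not the thesis's code: the function names `dea_input_oriented` and `classify_rts`, the toy data, and the choice of SciPy's `linprog` are all assumptions. It solves the standard envelopment form, derives scale efficiency as CCR divided by BCC, and labels returns to scale by comparing the three scores.

```python
import numpy as np
from scipy.optimize import linprog


def dea_input_oriented(X, Y, rts="crs"):
    """Input-oriented DEA efficiency for every DMU (envelopment form).

    X: (n, m) inputs, Y: (n, s) outputs.
    rts: "crs" (CCR), "vrs" (BCC) or "nirs" (non-increasing returns).
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n, m = X.shape
    s = Y.shape[1]
    # Decision variables: [theta, lambda_1, ..., lambda_n]; minimise theta.
    cost = np.r_[1.0, np.zeros(n)]
    lam_sum = np.r_[0.0, np.ones(n)]
    scores = np.empty(n)
    for o in range(n):
        # Inputs:  sum_j lambda_j * x_ij - theta * x_io <= 0
        a_in = np.hstack([-X[o][:, None], X.T])
        # Outputs: -sum_j lambda_j * y_rj <= -y_ro
        a_out = np.hstack([np.zeros((s, 1)), -Y.T])
        a_ub = np.vstack([a_in, a_out])
        b_ub = np.r_[np.zeros(m), -Y[o]]
        a_eq = b_eq = None
        if rts == "vrs":      # BCC: lambdas sum to exactly 1
            a_eq, b_eq = lam_sum[None, :], [1.0]
        elif rts == "nirs":   # NIRS: lambdas sum to at most 1
            a_ub = np.vstack([a_ub, lam_sum])
            b_ub = np.r_[b_ub, 1.0]
        res = linprog(cost, A_ub=a_ub, b_ub=b_ub, A_eq=a_eq, b_eq=b_eq,
                      bounds=[(0, None)] * (n + 1), method="highs")
        scores[o] = res.fun
    return scores


def classify_rts(ccr, bcc, nirs, tol=1e-6):
    """Label each DMU as CRS, IRS or DRS by comparing the three scores."""
    labels = []
    for c, b, r in zip(ccr, bcc, nirs):
        if abs(c - b) <= tol:
            labels.append("CRS")   # scale efficient
        elif abs(r - c) <= tol:
            labels.append("IRS")   # NIRS equals CCR but not BCC
        else:
            labels.append("DRS")   # NIRS equals BCC but not CCR
    return labels


# Hypothetical toy data (NOT from the thesis): one input, one output, 4 DMUs.
X = np.array([[2.0], [4.0], [8.0], [6.0]])
Y = np.array([[1.0], [4.0], [6.0], [3.0]])

ccr = dea_input_oriented(X, Y, "crs")    # overall efficiency
bcc = dea_input_oriented(X, Y, "vrs")    # technical efficiency
nirs = dea_input_oriented(X, Y, "nirs")
scale = ccr / bcc                        # scale efficiency
rts = classify_rts(ccr, bcc, nirs)

for i in range(len(X)):
    print(f"DMU {i + 1}: CCR={ccr[i]:.3f} BCC={bcc[i]:.3f} "
          f"scale={scale[i]:.3f} RTS={rts[i]}")
```

On this toy data only the second DMU is CCR-efficient, while the first and fourth are classified as IRS because their NIRS score matches the CCR score rather than the BCC score. With a single input such as the student-to-staff ratio and five output scores, `X` would have one column and `Y` five.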
URI: http://dspace.dtu.ac.in:8080/jspui/handle/repository/22767
Appears in Collections: M Sc Applied Maths

Files in This Item:
File                     Size        Format
MEHAK GOYAL M.Sc..pdf    692.67 kB   Adobe PDF
MEHAK GOYAL Plag.pdf     697.22 kB   Adobe PDF

