Please use this identifier to cite or link to this item:
http://dspace.dtu.ac.in:8080/jspui/handle/repository/21795
Title: | ENHANCING MEDICAL QUESTION-ANSWERING THROUGH LOCAL FINE-TUNING OF SMALL LANGUAGE MODELS |
Authors: | PRAKASH, ARUNAV |
Keywords: | MEDICAL QUESTION-ANSWERING; LOCAL FINE-TUNING; SMALL LANGUAGE MODELS; LLaMA |
Issue Date: | May-2025 |
Series/Report no.: | TD-8006 |
Abstract: | This thesis investigates how parameter-efficient fine-tuning techniques can bring powerful language models into everyday clinical environments without sacrificing performance or patient privacy. We begin by selecting a representative subset (16,412 pairs) of the MedQuAD medical question-answer dataset and adapt two LLaMA variants, a 3-billion-parameter model using LoRA adapters and an 8-billion-parameter model with 4-bit QLoRA quantization, entirely on a single RTX 4060 GPU with 8 GB VRAM. Training and inference both complete in under five hours, demonstrating that consumer-grade hardware can support domain-specific LLMs when only lightweight adapters are updated. To evaluate model outputs, we develop a multi-axis scoring framework (relevance, accuracy, conciseness, and completeness) whose scores are produced automatically by a locally hosted LLaMA 3.1 8B judge via Ollama. This structured, human-aligned approach reveals clear differences: the 8B QLoRA model consistently outperforms its 3B counterpart across all four dimensions (mean score 7.16 vs. 6.96), while traditional overlap metrics like ROUGE fail to capture these gains. We show that ROUGE’s reliance on n-gram matching penalizes valid paraphrases and richer contextual detail, making it an unreliable proxy for clinical language quality. Our contributions include a reproducible, on-device pipeline for fine-tuning and evaluation, compelling evidence that aggressive quantization need not compromise model expressivity, and a practical blueprint for deploying privacy-preserving medical chatbots in resource-constrained settings. We conclude by outlining future directions (ensemble judging, expanded empathy and readability metrics, dynamic adapter libraries, and hybrid human-in-the-loop workflows) to further bridge the gap between scalable automation and clinical safety. |
URI: | http://dspace.dtu.ac.in:8080/jspui/handle/repository/21795 |
Appears in Collections: | M.E./M.Tech. Computer Engineering |
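The abstract's point that n-gram overlap penalizes valid paraphrases can be illustrated with a minimal sketch. This is not the thesis's evaluation code: the function `rouge_n_f1` is a simplified, hypothetical stand-in for ROUGE-N (whitespace tokens, no stemming), and the three sentences are invented for illustration, not drawn from MedQuAD.

```python
from collections import Counter


def rouge_n_f1(reference: str, candidate: str, n: int = 1) -> float:
    """Simplified ROUGE-N: F1 over clipped n-gram overlap.

    Assumption: lowercase whitespace tokenization, no stemming or
    stopword handling, unlike full ROUGE implementations.
    """
    def ngrams(text: str) -> Counter:
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    ref, cand = ngrams(reference), ngrams(candidate)
    overlap = sum((ref & cand).values())  # clipped counts of shared n-grams
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Invented example sentences (not from the thesis or its dataset).
reference = "take the tablet twice a day with food"
verbatim = "take the tablet twice a day with food"
paraphrase = "swallow one pill every twelve hours alongside a meal"

score_verbatim = rouge_n_f1(reference, verbatim)      # identical wording
score_paraphrase = rouge_n_f1(reference, paraphrase)  # similar meaning, different words

print(f"verbatim:   {score_verbatim:.3f}")
print(f"paraphrase: {score_paraphrase:.3f}")
```

The paraphrase shares only the token "a" with the reference, so its unigram score collapses even though a reader would judge the two instructions as roughly equivalent, which is the kind of gap a rubric-based judge is meant to close.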
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
ARUNAV PRAKASH M.Tech..pdf | | 1.54 MB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.