Please use this identifier to cite or link to this item:
http://dspace.dtu.ac.in:8080/jspui/handle/repository/21795
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | PRAKASH, ARUNAV | - |
dc.date.accessioned | 2025-07-08T08:41:26Z | - |
dc.date.available | 2025-07-08T08:41:26Z | - |
dc.date.issued | 2025-05 | - |
dc.identifier.uri | http://dspace.dtu.ac.in:8080/jspui/handle/repository/21795 | - |
dc.description.abstract | This thesis investigates how parameter-efficient fine-tuning techniques can bring powerful language models into everyday clinical environments without sacrificing performance or patient privacy. We begin by selecting a representative subset (16,412 pairs) of the MedQuAD medical question-answer dataset and adapt two LLaMA variants entirely on a single RTX 4060 GPU with 8 GB of VRAM: a 3-billion-parameter model using LoRA adapters and an 8-billion-parameter model with 4-bit QLoRA quantization. Training and inference both complete in under five hours, demonstrating that consumer-grade hardware can support domain-specific LLMs when only lightweight adapters are updated. To evaluate model outputs, we develop a multi-axis scoring framework covering relevance, accuracy, conciseness, and completeness, with scores produced automatically by a locally hosted LLaMA 3.1 8B judge via Ollama. This structured, human-aligned approach reveals clear differences: the 8B QLoRA model consistently outperforms its 3B counterpart across all four dimensions (mean score 7.16 vs. 6.96), while traditional overlap metrics such as ROUGE fail to capture these gains. We show that ROUGE's reliance on n-gram matching penalizes valid paraphrases and richer contextual detail, making it an unreliable proxy for clinical language quality. Our contributions include a reproducible, on-device pipeline for fine-tuning and evaluation (sketched below this metadata record), compelling evidence that aggressive quantization need not compromise model expressivity, and a practical blueprint for deploying privacy-preserving medical chatbots in resource-constrained settings. We conclude by outlining future directions: ensemble judging, expanded empathy and readability metrics, dynamic adapter libraries, and hybrid human-in-the-loop workflows, all aimed at further bridging the gap between scalable automation and clinical safety. | en_US |
dc.language.iso | en | en_US |
dc.relation.ispartofseries | TD-8006; | - |
dc.subject | MEDICAL QUESTION-ANSWERING | en_US |
dc.subject | LOCAL FINE-TUNING | en_US |
dc.subject | SMALL LANGUAGE MODELS | en_US |
dc.subject | LLaMA | en_US |
dc.title | ENHANCING MEDICAL QUESTION-ANSWERING THROUGH LOCAL FINE-TUNING OF SMALL LANGUAGE MODELS | en_US |
dc.type | Thesis | en_US |
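The fine-tuning setup the abstract describes (4-bit QLoRA adapters on a LLaMA base model, trained on a single 8 GB GPU) can be sketched as follows. This is a minimal illustration using the Hugging Face transformers, peft, and bitsandbytes libraries; the model ID, adapter rank, and target modules are assumptions for illustration, not the thesis's actual configuration.

```python
# Minimal QLoRA setup sketch. Model ID, rank, and target modules are
# illustrative assumptions, not the thesis's exact configuration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # 4-bit base weights (QLoRA)
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16
    bnb_4bit_use_double_quant=True,         # nested quantization saves VRAM
)

model_id = "meta-llama/Meta-Llama-3.1-8B"   # assumed base model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)

lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,  # illustrative hyperparameters
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```

Because only the low-rank adapter matrices are trainable while the 4-bit base weights stay frozen, the optimizer state stays small enough to fit alongside the model in 8 GB of VRAM, which is what makes the single-RTX 4060 setup described in the abstract feasible.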
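Similarly, the multi-axis judging step, in which a locally hosted LLaMA 3.1 8B scores each answer via Ollama, might look like the sketch below. The prompt wording and the 0-10 scale are assumptions; only the four axes and the judge model come from the abstract.

```python
# Sketch of multi-axis LLM-judge scoring through Ollama's local REST API.
# Prompt wording and 0-10 scale are assumptions; the four axes are from
# the abstract.
import json
import requests

JUDGE_PROMPT = """You are grading a medical QA system.
Question: {question}
Reference answer: {reference}
Model answer: {answer}
Score the model answer from 0-10 on each axis and reply with JSON only:
{{"relevance": _, "accuracy": _, "conciseness": _, "completeness": _}}"""

def judge(question: str, reference: str, answer: str) -> dict:
    """Return per-axis scores from the locally hosted judge model."""
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "llama3.1:8b",  # judge model pulled via `ollama pull`
            "prompt": JUDGE_PROMPT.format(
                question=question, reference=reference, answer=answer
            ),
            "format": "json",        # ask Ollama to constrain output to JSON
            "stream": False,
        },
        timeout=120,
    )
    resp.raise_for_status()
    return json.loads(resp.json()["response"])

# Usage: scores = judge(q, ref, generated)
#        mean_score = sum(scores.values()) / len(scores)
```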
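Finally, the abstract's claim that ROUGE's n-gram matching penalizes valid paraphrases is easy to demonstrate with the rouge-score package. The example sentences below are invented for illustration and are not from the thesis.

```python
# Demonstration that ROUGE-L rewards verbatim overlap over clinically
# equivalent paraphrase. Example sentences are invented for illustration.
from rouge_score import rouge_scorer  # pip install rouge-score

reference = "Take the antibiotic twice daily with food for ten days."
paraphrase = ("The antibiotic should be taken with meals, "
              "two times a day, over a ten-day course.")
verbatim = "Take the antibiotic twice daily with food."  # drops the duration

scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)
for name, candidate in [("paraphrase", paraphrase),
                        ("verbatim-but-incomplete", verbatim)]:
    f1 = scorer.score(reference, candidate)["rougeL"].fmeasure
    print(f"{name}: ROUGE-L F1 = {f1:.2f}")
# The clinically equivalent paraphrase scores lower than the incomplete
# verbatim copy, illustrating why n-gram overlap is a poor quality proxy.
```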
Appears in Collections: M.E./M.Tech. Computer Engineering
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
ARUNAV PRAKASH M.Tech..pdf |  | 1.54 MB | Adobe PDF |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.