Please use this identifier to cite or link to this item:
http://dspace.dtu.ac.in:8080/jspui/handle/repository/21783
Title: | FINE TUNING LLMs FOR CONTEXT-AWARE DIALOGUE SUMMARIZATION AND HUMAN ALIGNED RESPONSES VIA RLHF |
Authors: | KUMAR, GAPESH |
Keywords: | FINE TUNING LLMs; DIALOGUE SUMMARIZATION; HUMAN ALIGNED RESPONSES; RLHF; LLM |
Issue Date: | May-2025 |
Series/Report no.: | TD-7993 |
Abstract: | Optimizing a large language model (LLM) for a specific task relies on fine-tuning, which saves resources. Instruction tuning, that is, training a model on pairs of instructions and completions, teaches it to follow human directions correctly. Full fine-tuning remains computationally expensive, however. A number of recent parameter-efficient fine-tuning (PEFT) methods help address this problem. Aligning model outputs with human preferences nevertheless remains very hard. In this work, we applied instruction fine-tuning and PEFT techniques such as Low-Rank Adaptation (LoRA) to fine-tune pre-trained LLMs on a given task, using structured training data and efficiently tuning only a portion of the model parameters. To ensure contextual appropriateness while improving the alignment of responses with human expectations, we incorporated Reinforcement Learning from Human Feedback (RLHF) during fine-tuning. Our results indicate that PEFT approaches significantly reduce computational and memory expense without any loss in performance, while instruction tuning improves the model's adherence to the task. RLHF also prevents the model from producing out-of-context responses, thus ensuring that responses are consistent and human-aligned. The observations from this work show that highly specialized and resource-efficient LLMs can be built by combining PEFT, instruction tuning, and RLHF. These methods offer a sensible and scalable way to fine-tune, enhancing the usefulness and flexibility of LLMs for a wide range of other applications. |
URI: | http://dspace.dtu.ac.in:8080/jspui/handle/repository/21783 |
Appears in Collections: | M.E./M.Tech. Computer Engineering |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
GAPESH KUMAR M.Tech..pdf | | 515.93 kB | Adobe PDF |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.