Please use this identifier to cite or link to this item:
http://dspace.dtu.ac.in:8080/jspui/handle/repository/20887
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | JAIN, MIHIKA | - |
dc.contributor.author | RAO, SOMYA | - |
dc.date.accessioned | 2024-09-02T04:51:13Z | - |
dc.date.available | 2024-09-02T04:51:13Z | - |
dc.date.issued | 2024-05 | - |
dc.identifier.uri | http://dspace.dtu.ac.in:8080/jspui/handle/repository/20887 | - |
dc.description.abstract | Generative AI is a branch of artificial intelligence dedicated to creating new content, including text, images, audio, and video, by learning from existing data. Although still in its nascent stages, this technology is quickly reshaping the way we create and engage with content, and it has the potential to revolutionize multiple industries. Its versatility is evident in its diverse applications, ranging from writing and music to data production. Generative AI is poised to bring further advancements across various sectors, which calls for a thorough understanding of its ethical considerations and risk management. Lengthy, unstructured PDFs create information overload, making it time-consuming and difficult to extract and comprehend key details from a document. Summarizing and querying PDFs addresses this problem. This study summarizes PDF documents of varying lengths and retrieves desired information from them in a concise and coherent form, using Generative AI and the LangChain framework, and then assesses the quality of the generated summaries and the relevance of the query outputs. To summarize PDFs, either the Stuff Document Chain method or the Map-Reduce method is employed, depending on the length of the PDF. The Stuff Document Chain method suits small PDFs (preferably 1 to 2 pages), so we used it to summarize a single-page PDF. To summarize a 4-page PDF, we applied the Map-Reduce method, which is suitable for large documents that exceed the token limit of a single prompt. We then queried the PDFs using the LangChain framework to extract the desired answers. LangChain facilitated the extraction of key information and the generation of concise summaries; during querying, it enabled efficient processing of user queries and retrieval of relevant information from the PDFs. Applying these techniques yielded summaries of the respective PDFs and answers to the various queries. We then evaluated the relevance of the query results and the effectiveness and quality of the summaries. The summarization techniques condensed the content of the PDFs while retaining crucial details, producing well-structured summaries that gave a comprehensive overview of each document's content. The responses to the queries were highly pertinent and directly addressed the questions posed. | en_US |
dc.language.iso | en | en_US |
dc.relation.ispartofseries | TD-7405; | - |
dc.subject | GENERATIVE ARTIFICIAL INTELLIGENCE | en_US |
dc.subject | QUERYING OF PDF | en_US |
dc.subject | STUFF DOCUMENT CHAIN METHOD | en_US |
dc.subject | MAP-REDUCE METHOD | en_US |
dc.subject | LANGCHAIN | en_US |
dc.title | SUMMARIZATION AND QUERYING OF PDF USING GENERATIVE ARTIFICIAL INTELLIGENCE | en_US |
dc.type | Thesis | en_US |
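
The abstract describes choosing between a "stuff" strategy (one prompt for the whole document) and a map-reduce strategy (summarize parts, then summarize the summaries) based on document length. The sketch below illustrates only that selection logic and is not the thesis authors' code. It deliberately avoids LangChain calls: `llm` is a hypothetical callable standing in for any model call, `count_tokens` is a crude word-count estimate rather than a real tokenizer, and the `token_limit` default is an arbitrary assumption.

```python
from typing import Callable, List

# Hypothetical stand-in for a model call: takes a prompt, returns text.
LLM = Callable[[str], str]


def count_tokens(text: str) -> int:
    """Crude token estimate (whitespace-separated words); not a real tokenizer."""
    return len(text.split())


def stuff_summarize(pages: List[str], llm: LLM) -> str:
    """'Stuff' strategy: put every page into a single prompt and summarize once."""
    prompt = "Summarize the following text:\n\n" + "\n\n".join(pages)
    return llm(prompt)


def map_reduce_summarize(pages: List[str], llm: LLM) -> str:
    """Map-reduce strategy: summarize each page, then summarize the summaries."""
    # Map step: one model call per page.
    partial = [llm("Summarize the following text:\n\n" + page) for page in pages]
    # Reduce step: one final call over the concatenated partial summaries.
    combined = "\n\n".join(partial)
    return llm("Combine these summaries into one summary:\n\n" + combined)


def summarize(pages: List[str], llm: LLM, token_limit: int = 3000) -> str:
    """Use 'stuff' when the whole document fits in one prompt, else map-reduce."""
    total = sum(count_tokens(page) for page in pages)
    if total <= token_limit:
        return stuff_summarize(pages, llm)
    return map_reduce_summarize(pages, llm)
```

With this structure a single-page document costs one model call, while an n-page document that exceeds the limit costs n + 1 calls. A fuller version would also need to split pages that individually exceed the limit, which this sketch does not attempt.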
Appears in Collections: | M Sc Applied Maths |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Mihika & Somya M.Sc..pdf | | 1.4 MB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.