Please use this identifier to cite or link to this item: http://dspace.dtu.ac.in:8080/jspui/handle/repository/18060
Full metadata record
DC Field | Value | Language
dc.contributor.author | MITTAL, SHUBHAM | -
dc.date.accessioned | 2020-12-28T06:18:59Z | -
dc.date.available | 2020-12-28T06:18:59Z | -
dc.date.issued | 2020-07 | -
dc.identifier.uri | http://dspace.dtu.ac.in:8080/jspui/handle/repository/18060 | -
dc.description.abstract | Comprehensive knowledge of base pairing in RNA secondary structure would usher in novel insights and a greater understanding of the role of RNAs in the regulation of cellular processes and in disease, which could then be tackled in a more holistic manner. In pursuit of this objective, probabilistic models, especially those employing machine learning, have come to dominate RNA secondary structure prediction, proving better than earlier tools based on comparative sequence analysis or on folding algorithms that employ thermodynamic and stochastic parameter schemes. This study aims to develop a machine learning technique that improves on the previously developed models, which have accelerated research in RNA secondary structure prediction over the past two decades. The proposed model consists of Embedding, CNN, and Bidirectional GRU layers, which together prove effective for site accessibility estimation. The Gated Recurrent Units (GRUs) are noteworthy because their gating mitigates the vanishing gradient problem, allowing information from previous and distant time steps to inform each prediction. Data were collected from the RNA STRAND database (4666 experimentally determined RNA structures) and the Comparative RNA Web (CRW) Site (17032 structures obtained through comparative sequence analysis). From these, 4400 structures were curated after cleaning and clustering with CD-Hit. The model was trained, validated, and tested on divisions of this data to give a ROC curve, with a sensitivity of 0.75 and a precision of 0.78, higher than the best of the compared state-of-the-art RNA structure prediction models by 11% and 31%, respectively. The ROC values for class 0 (‘bound’ residues) and class 1 (‘free’ residues) were 0.90 and 0.90, respectively, indicating high accuracy in site accessibility prediction.
An elaboration on RNA types, their functions, and their functional mechanisms in disease is intended to give the reader the prerequisite knowledge to appreciate the importance of uncovering structural information about RNA. In addition, a review of earlier and alternative RNA structure prediction techniques and models from the literature is included to convey the history and scope of RNA structure prediction. | en_US
dc.language.iso | en | en_US
dc.relation.ispartofseries | TD-4916; | -
dc.subject | RNA SECONDARY STRUCTURE | en_US
dc.subject | PREDICTION TOOL | en_US
dc.subject | DEEP LEARNING | en_US
dc.title | A NOVEL RNA SECONDARY STRUCTURE SITE ACCESSIBILITY PREDICTION TOOL USING DEEP LEARNING | en_US
dc.type | Thesis | en_US
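
The abstract reports sensitivity and precision for per-residue 'bound' (class 0) versus 'free' (class 1) labels. As a minimal sketch of how such figures are conventionally computed, the snippet below uses the standard definitions (sensitivity = TP / (TP + FN), precision = TP / (TP + FP)). The function name, the choice of class 1 as the positive class, and the example labels are assumptions made for illustration; they are not taken from the thesis.

```python
def sensitivity_precision(y_true, y_pred, positive=1):
    """Return (sensitivity, precision) for one class of per-residue labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # Guard against empty denominators (no positives in truth or prediction).
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return sensitivity, precision


# Hypothetical labels for an 8-nucleotide sequence (0 = bound, 1 = free).
y_true = [1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
sens, prec = sensitivity_precision(y_true, y_pred)
print(f"sensitivity={sens:.2f} precision={prec:.2f}")
```

Passing `positive=0` would score the 'bound' class instead; which class the thesis treats as positive is not stated in this record.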
Appears in Collections: M.E./M.Tech. Bio Tech

Files in This Item:
File | Description | Size | Format
M.Tech. ShubhamMittal.pdf | | 2.87 MB | Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.