Please use this identifier to cite or link to this item: http://dspace.dtu.ac.in:8080/jspui/handle/repository/21784
Title: LYNX-RNA: A NEXTFLOW-BASED MODULAR RNA-SEQ AND MACHINE LEARNING PIPELINE FOR BIOMARKER DISCOVERY WITH LLM-SUMMARIZED REPORT GENERATION IN IMMUNE THROMBOCYTOPENIA
Authors: SHARMA, DEVANSHI
Keywords: RNA-SEQ PIPELINE
LYNX-RNA
BIOMARKER DISCOVERY
IMMUNE THROMBOCYTOPENIA (ITP)
GENE EXPRESSION ANALYSIS
LARGE LANGUAGE MODEL (LLM)
WORKFLOW AUTOMATION
TRANSCRIPTOMICS
NATURAL LANGUAGE REPORTING
WGCNA
XGBOOST
Issue Date: May-2025
Series/Report no.: TD-7994
Abstract: The increasing complexity of RNA-seq data requires analysis pipelines that are robust, scalable, and interpretable. LYNX-RNA (Language-augmented Yield for Nextflow based RNA eXpression analysis) is a modular, Nextflow-based workflow that automates reproducible, end-to-end RNA-seq analysis, from raw FASTQ files to biological insights. It integrates standard tools for quality control, alignment, quantification, and differential gene expression (DGE) analysis, along with advanced modules for weighted gene co-expression network analysis (WGCNA), protein-protein interaction (PPI) network modeling, and GO/KEGG enrichment. Key features are a built-in machine learning module (Random Forest and XGBoost) for predictive biomarker discovery and an LLM-powered reporting system that generates natural language summaries of results. To identify differentially expressed genes (DEGs), data from treated immune thrombocytopenia (ITP) patients (GSE112278) were compared against external healthy controls (GSE251778) using Welch’s t-test with FDR correction. These DEGs were used to train a classifier that achieved a ROC AUC of 0.937, indicating high predictive performance. Notably, top predicted DEGs such as EPB42, TNS1, and HAGH overlapped with WGCNA-derived hub genes, reinforcing their biological relevance. The pipeline supports deployment in low-resource environments (≤24 GB RAM), is compatible with Conda, Docker, and HPC systems, and includes a Python-based CLI for user accessibility. We applied LYNX-RNA to a longitudinal ITP dataset spanning control, pre-treatment, and post-treatment stages, uncovering dynamic gene signatures and potential immune-metabolic biomarkers. In summary, LYNX-RNA bridges key gaps in usability, scalability, and interpretability in transcriptomic workflows. It serves as a versatile, automation-ready platform for both academic research and translational applications in systems immunology, precision medicine, and biomarker discovery.
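The DEG step named in the abstract (Welch's t-test followed by FDR correction) can be illustrated with a minimal sketch. This is not the LYNX-RNA implementation: the function names `benjamini_hochberg` and `welch_fdr` are hypothetical, the Benjamini-Hochberg procedure is an assumed choice (the abstract does not say which FDR method was used), the 0.05 threshold is an assumption, and the data are synthetic rather than GSE112278 or GSE251778.

```python
import numpy as np
from scipy import stats


def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values (assumed FDR method)."""
    p = np.asarray(pvals, dtype=float)
    n = p.size
    order = np.argsort(p)
    # Scale each sorted p-value by n / rank.
    ranked = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working down from the largest p-value.
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(ranked, 0.0, 1.0)
    return adjusted


def welch_fdr(case, control, alpha=0.05):
    """Per-gene Welch's t-test with FDR control.

    Rows are genes, columns are samples. Returns raw p-values,
    adjusted p-values, and a boolean mask of genes passing alpha.
    """
    # equal_var=False selects Welch's unequal-variance t-test.
    _, p = stats.ttest_ind(case, control, axis=1, equal_var=False)
    q = benjamini_hochberg(p)
    return p, q, q < alpha


if __name__ == "__main__":
    # Synthetic expression matrices: 50 genes, 8 cases, 6 controls.
    rng = np.random.default_rng(0)
    case = rng.normal(0.0, 1.0, size=(50, 8))
    control = rng.normal(0.0, 1.0, size=(50, 6))
    case[0] += 10.0  # spike one gene so it is clearly differential
    p, q, is_deg = welch_fdr(case, control)
    print("genes flagged at FDR 0.05:", np.flatnonzero(is_deg))
```

Welch's test is a reasonable fit when the two groups come from separate studies, as here, because it does not assume equal variances or equal group sizes.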
URI: http://dspace.dtu.ac.in:8080/jspui/handle/repository/21784
Appears in Collections: M.E./M.Tech. Bio Tech

Files in This Item:
File: DEVANSHI SHARMA M.Tech.pdf
Size: 2.65 MB
Format: Adobe PDF

