Please use this identifier to cite or link to this item: http://dspace.dtu.ac.in:8080/jspui/handle/repository/13516
Title: FINDING SEQUENTIAL PATTERNS FROM BIOLOGICAL SEQUENCES
Authors: KUMAR, PRADEEP
Keywords: Biological
Issue Date: 13-Nov-2006
Series/Report no.: TD-262
Abstract: Bioinformatics has become very popular. Most tasks in bioinformatics involve searching biological databases. Biological data records are very large, and the number of records in these databases increases every year, so efficient searching techniques for biological databases are needed. Biosequences typically have a small alphabet, a long length, and patterns containing gaps of arbitrary size. Mining frequent patterns in such sequences faces a different type of explosion than in transaction sequences. In this project report, we study how this explosion affects classic sequential pattern mining and present a scalable two-phase algorithm to deal with it. We propose a new algorithm, the Two-Phase Searching Algorithm (2-PSA), that incorporates reliability and efficiency. The first phase, the “Segment Phase”, searches for short patterns containing no gaps, called segments. This phase is efficient. The second phase “Pattern Phase” s...
Description: ME THESIS
URI: http://dspace.dtu.ac.in:8080/jspui/handle/repository/13516
Appears in Collections: M.E./M.Tech. Computer Technology & Applications
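
The Segment Phase described in the abstract looks for short, gapless patterns that occur frequently. The sketch below is only an illustration of that idea, not the thesis's 2-PSA: it assumes that a segment's support is the number of sequences containing it at least once, and the names `frequent_segments`, `length`, and `min_support` are hypothetical.

```python
from collections import Counter


def frequent_segments(sequences, length, min_support):
    """Return gapless substrings of the given length that occur in at
    least `min_support` of the input sequences (assumed support measure)."""
    counts = Counter()
    for seq in sequences:
        # Collect each distinct segment once per sequence, so a segment
        # repeated inside one sequence does not inflate its support.
        seen = {seq[i:i + length] for i in range(len(seq) - length + 1)}
        counts.update(seen)
    return {seg: n for seg, n in counts.items() if n >= min_support}


if __name__ == "__main__":
    # Toy DNA-like input over the small alphabet {A, C, G, T}.
    toy = ["ACGTAC", "TACGA", "GGACG"]
    print(frequent_segments(toy, length=3, min_support=2))
```

Counting per sequence rather than per occurrence is a design choice made here for simplicity; the thesis may define support differently.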

Files in This Item:
File: PardeepKumar.pdf
Size: 821.93 kB
Format: Adobe PDF

