FREQUENCY BASED AUTHOR-TOPIC MODEL FOR INFORMATION DISCOVERY

GUPTA, DEEPAK

Please use this identifier to cite or link to this item: http://dspace.dtu.ac.in:8080/jspui/handle/repository/660

Full metadata record

DC Field	Value	Language
dc.contributor.author	GUPTA, DEEPAK	-
dc.date.accessioned	2011-01-03T10:36:45Z	-
dc.date.available	2011-01-03T10:36:45Z	-
dc.date.issued	2011-01-03	-
dc.identifier.uri	http://dspace.dtu.ac.in:8080/jspui/handle/repository/660	-
dc.description	ME THESIS	en_US
dc.description.abstract	For any work of literature, a fundamental issue is to identify the individual(s) who wrote it and, conversely, to identify all of the works that belong to a given individual, the individual who writes many papers on the same topic, or the topics that an author works on. Information extraction techniques (such as author name and topic recognition) have long been used to extract useful pieces of information from text. The types of information to be extracted are generally fixed and well defined. However, in some cases the user's goal is more abstract and the information types cannot be narrowly defined. For example, a reader of online user reviews typically has the goal of making a good choice and is interested in learning about the different aspects of the relation between a topic and an author (e.g., the famous authors of a topic, or an author's papers and research field). Some of these aspects may be known to the reader, while others may need to be discovered from the inherent text structure of a large collection. Even for the known aspects (such as “author name” and “topic”), the challenge is to recognize various hidden aspects, such as the number of papers written by an author, the author's research field, and the author's popularity. In this thesis, we model the author-topic information discovery system as topics with identifiable word distributions across documents. We review several probabilistic graphical models (such as Latent Dirichlet Allocation) and propose a new model based on the frequency of words within a document. We also provide a case study of a probability-based author-topic model developed for information discovery.	en_US
dc.language.iso	en	en_US
dc.relation.ispartofseries	TD 721;80	-
dc.subject	INFORMATION	en_US
dc.subject	INFORMATION DISCOVERY	en_US
dc.title	FREQUENCY BASED AUTHOR-TOPIC MODEL FOR INFORMATION DISCOVERY	en_US
Appears in Collections:	M.E./M.Tech. Computer Technology & Applications

Files in This Item:

File	Description	Size	Format
ME THESIS.doc		2.67 MB	Microsoft Word
