Extracting Topic Words

2007/04/04

Today I wrote a program for extracting topic words. Its goal is to detect the topic words of a paper by referring to the ACM categories. The topic words need some post-processing. For example, a paper may belong to several categories, so I assign each paper to every one of its categories. In addition, for consistency, I stem each topic word, so that two labels such as "Information Retrieval" and "Information retrieval" are treated as one category.
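The two post-processing steps could look roughly like the sketch below. It is only an illustration, not my actual program: the names crude_stem, normalize and group_by_category are made up for this example, the sample papers are invented, and crude_stem is a very rough suffix stripper standing in for a real stemmer such as Porter's. Lowercasing is what merges the two spellings of "Information Retrieval" here; the stemming merges word variants such as "system" and "systems".

```python
from collections import defaultdict


def crude_stem(word):
    """Rough suffix stripper (assumption: a stand-in for a real stemmer)."""
    for suffix in ("ing", "es", "s"):
        # Only strip when at least three characters remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize(label):
    """Lowercase a category label and stem each of its words."""
    return " ".join(crude_stem(w) for w in label.lower().split())


def group_by_category(papers):
    """Assign each paper to every one of its (normalized) categories."""
    groups = defaultdict(list)
    for title, categories in papers:
        # The set avoids listing a paper twice under the same category
        # when two of its labels normalize to the same key.
        for key in {normalize(c) for c in categories}:
            groups[key].append(title)
    return dict(groups)


# Hypothetical sample data, for illustration only.
papers = [
    ("Paper A", ["Information Retrieval", "Database Management"]),
    ("Paper B", ["Information retrieval"]),
]
groups = group_by_category(papers)
```

With this sample input, Paper A appears under two categories, and both papers end up under the single key "information retrieval".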
