Setting the Assembly’s Homework & Studying the Papers

2007/03/28

Today, I chose two papers that are relevant to my research. The titles are “Improved Automatic Keyword Extraction Given More Linguistic Knowledge” and “A study on automatically extracted keywords in text categorization”, respectively.
The main point of the first one is that keyword extraction improves when linguistic knowledge is added to the representation, rather than relying on statistics alone. The authors compared three different approaches for selecting candidate terms (n-grams, noun phrases, and POS tag sequences) and adopted four features (term frequency, collection frequency, relative position of the first occurrence, and the POS tags assigned to the term).
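
To make the features concrete for myself, here is a minimal sketch of how three of them could be computed for n-gram candidates. It is my own illustration, not the paper's implementation: the function names `ngrams` and `candidate_features` are hypothetical, I assume collection frequency means the number of documents containing the term, and the POS tag feature is left out because it needs a tagger.

```python
from collections import Counter


def ngrams(tokens, max_n=3):
    """Yield (start_index, term) for every n-gram of 1..max_n tokens."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield i, " ".join(tokens[i:i + n])


def candidate_features(doc_tokens, collection, max_n=3):
    """Return {term: features} for the n-gram candidates of one document.

    `collection` is a list of tokenized documents. Collection frequency is
    assumed here to be the number of documents that contain the term.
    """
    coll_freq = Counter()
    for tokens in collection:
        # A set counts each term at most once per document.
        coll_freq.update({term for _, term in ngrams(tokens, max_n)})

    features = {}
    for start, term in ngrams(doc_tokens, max_n):
        if term not in features:
            features[term] = {
                "tf": 0,                                 # term frequency in this document
                "cf": coll_freq[term],                   # collection frequency
                "first_pos": start / len(doc_tokens),    # relative position of first occurrence
            }
        features[term]["tf"] += 1
    return features


if __name__ == "__main__":
    doc = "keyword extraction improves keyword search".split()
    other = "text categorization uses keyword features".split()
    feats = candidate_features(doc, [doc, other], max_n=2)
    for term, f in sorted(feats.items()):
        print(term, f)
```

In a real setup, these values would be fed to a classifier together with the POS tag sequence of each candidate, and the tokens would probably be stemmed first.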

My opinion: Because the texts are short, they keep the terms that appear only once, but they do not consider how each term is distributed within the text.
