Journal Article
Academic year | 99 |
---|---|
Semester | 1 |
Publication date | 2010-10-01 |
Title | The Chinese Text Categorization System with Category Priorities |
Title (other language) | |
Authors | Keh, Huan-Chao; Chiang, Ding-An; Hsu, Chih-Cheng; Huang, Hui-Hua |
Affiliation | Department of Computer Science and Information Engineering, Tamkang University |
Publisher | Oulu: Academy Publisher |
Source, volume (issue), pages | Journal of Software 5(10), pp.1137-1143 |
Abstract | The process of text categorization involves some understanding of the content of the documents and/or some previous knowledge of the categories. For the content of the documents, we use a filtering measure for feature selection in our Chinese text categorization system. We modify the formula of Term Frequency-Inverse Document Frequency (TF-IDF) to strengthen important keywords’ weights and weaken unimportant keywords’ weights. For the knowledge of the categories, we use category priority to represent the relationship between two different categories. Consequently, the experimental results show that our method can effectively not only decrease noise text but also increase the accuracy rate and recall rate of text categorization. |
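Keywords | text categorization; feature selection; filtering measure; text mining |
Language | en |
ISSN | 1796-217X |
Journal type | International |
Indexed in | EI |
Industry-academia collaboration | |
Corresponding author | Keh, Huan-Chao |
Peer review | |
Country | FIN |
Open call for papers | |
Publication format | Print |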
Related links | Institutional repository link ( http://tkuir.lib.tku.edu.tw:8080/dspace/handle/987654321/54954 ) |