JCF: Joint Coarse- and Fine-Grained Similarity Comparison for Plagiarism Detection Based on NLP | |
---|---|
Academic year | 111 |
Semester | 2 |
Publication date | 2023-06-24 |
Title | JCF: Joint Coarse- and Fine-Grained Similarity Comparison for Plagiarism Detection Based on NLP |
Title (other language) | |
Authors | C. Y. Chang; S.-J. Jhang; S.-J. Wu; D. S. Roy |
Affiliation | |
Publisher | |
Source title, volume/issue, pages | Journal of Supercomputing 80(1), p.363-394 |
Abstract | Document similarity recognition is one of the most important problems in natural language processing. This paper proposes a plagiarism comparison mechanism called JCF. First, the TF–IDF scheme is applied to build a bag of words that represents the common features of all documents. Next, plagiarism comparison is carried out in a coarse-grained manner, which speeds up the similarity comparison. Finally, the most similar documents are compared in detail using a fine-grained approach. In addition, JCF detects plagiarism at both the syntax level and a semantic-like level. To prevent distortion of the similarity comparison, this paper further develops a similarity restoration approach so that JCF achieves both speed and accuracy. Performance studies confirm that the proposed JCF outperforms existing studies in terms of precision, recall and F1 score. |
Keywords | Natural language processing; TF–IDF; Word2Vec; Coarse and fine grained; Document similarity |
Language | en_US |
ISSN | 1573-0484; 0920-8542 |
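Journal type | International |
Indexed in | SCI |
Industry-academia collaboration | |
Corresponding author | |
Peer review | No |
Country | USA |
Open call for papers | |
Publication format | Electronic, Print |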
Related links | Institutional repository link ( http://tkuir.lib.tku.edu.tw:8080/dspace/handle/987654321/125169 ) |
SDGs | Decent work and economic growth; Industry, innovation and infrastructure |
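
The abstract describes a two-stage comparison: a fast coarse-grained pass over whole documents, then a detailed fine-grained pass over the closest matches. The sketch below illustrates that general idea only and is not the authors' JCF implementation. The function names, the smoothed IDF formula, the sentence-level granularity of the fine stage, and the toy corpus are all assumptions made for illustration. The paper's similarity restoration step and its Word2Vec-based semantic-like comparison are not reproduced here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic word tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(token_lists):
    """Build one sparse TF-IDF vector (a dict) per token list."""
    n = len(token_lists)
    df = Counter(term for tokens in token_lists for term in set(tokens))
    # Smoothed IDF (an assumption): terms found everywhere keep a small weight.
    idf = {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}
    vectors = []
    for tokens in token_lists:
        tf = Counter(tokens)
        vectors.append({t: (c / len(tokens)) * idf[t] for t, c in tf.items()} if tokens else {})
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def coarse_candidates(query, corpus, top_k=2):
    """Coarse stage: rank whole documents by TF-IDF cosine similarity to the query."""
    vectors = tfidf_vectors([tokenize(query)] + [tokenize(doc) for doc in corpus])
    scores = [(cosine(vectors[0], vec), idx) for idx, vec in enumerate(vectors[1:])]
    return [idx for _, idx in sorted(scores, reverse=True)[:top_k]]


def split_sentences(text):
    """Naive sentence splitter on '.', '!' and '?' (an assumption)."""
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]


def fine_score(query, candidate):
    """Fine stage: mean, over query sentences, of the best-matching candidate sentence."""
    q_sents, c_sents = split_sentences(query), split_sentences(candidate)
    if not q_sents or not c_sents:
        return 0.0
    vectors = tfidf_vectors([tokenize(s) for s in q_sents + c_sents])
    q_vecs, c_vecs = vectors[:len(q_sents)], vectors[len(q_sents):]
    return sum(max(cosine(q, c) for c in c_vecs) for q in q_vecs) / len(q_vecs)


# Toy corpus, invented for illustration only.
corpus = [
    "The cat sat on the mat. It purred quietly.",
    "Stock markets fell sharply on Monday. Investors were worried.",
    "A cat sat on the mat. Dogs barked outside.",
]

if __name__ == "__main__":
    query = corpus[0]
    for idx in coarse_candidates(query, corpus, top_k=2):
        print(idx, round(fine_score(query, corpus[idx]), 3))
```

Restricting the more expensive sentence-level pass to the `top_k` documents returned by the coarse pass is what keeps the overall comparison fast. The trade-off is that a document filtered out in the coarse stage is never examined in detail.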