Feature Selection for Identifying Protein Disordered Regions | |
---|---|
Academic year | 98 |
Semester | 2 |
Publication date | 2010-04-01 |
Title | Feature Selection for Identifying Protein Disordered Regions |
Title (other language) | |
Authors | Hsu, Hui-Huang; Hsieh, Cheng-Wei |
Affiliation | Department of Computer Science and Information Engineering, Tamkang University |
Publisher | Singapore: World Scientific Publishing Co. Pte. Ltd. |
Source, volume/issue, pages | Biomedical Engineering: Applications, Basis and Communications 22(2), pp.119-125 |
Abstract | Determining the structure of a protein is not an easy task; it usually involves a time-consuming and costly process in the wet lab. Using computational methods to predict a protein's tertiary structure from its primary structure (the amino acid sequence) is therefore desirable. Disordered regions are segments of a protein that do not have a fixed conformation, which makes structure prediction harder. These disordered regions are also functionally important for a protein. In this research, we aim to identify such regions, with a focus on selecting a proper feature set. Three methods, namely F-score, information gain (IG), and k-medoids clustering, are used for feature selection. The support vector machine (SVM) is then used for classification. The results show that classification accuracy can be raised with a smaller feature set. The k-medoids clustering feature selection reduces the number of features from 440 to 150 and improves the accuracy from 84.66% to 86.81% in five-fold cross-validation. It also performs more stably than F-score and IG. |
Keywords | Disordered protein region; k-Medoids clustering; Feature selection; Proteomics |
Language | en |
ISSN | 1016-2372 1793-7132 |
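Journal type | International |
Indexed in | SCI EI |
Industry-academia collaboration | |
Corresponding author | Hsu, Hui-Huang |
Peer reviewed | Yes |
Country | SGP |
Open call for papers | |
Publication format | Print |
Related links |
Institutional repository link ( http://tkuir.lib.tku.edu.tw:8080/dspace/handle/987654321/59913 ) |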
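
The abstract names F-score as one of the three feature selection methods but does not give its formula. As a rough illustration only, the sketch below assumes the commonly used two-class definition (squared distance of each class mean from the overall mean, divided by the sum of the per-class sample variances) and ranks features by it. The function names `f_scores` and `top_k_features` and the toy data are hypothetical and are not taken from the paper.

```python
from statistics import mean, variance


def f_scores(rows, labels):
    """Return one F-score per feature column for a two-class dataset.

    Assumed definition (not confirmed by the record above):
    between-class separation of the means divided by the sum of the
    within-class sample variances.
    """
    pos = [r for r, y in zip(rows, labels) if y == 1]
    neg = [r for r, y in zip(rows, labels) if y != 1]
    scores = []
    for j in range(len(rows[0])):
        col_all = [r[j] for r in rows]
        col_pos = [r[j] for r in pos]
        col_neg = [r[j] for r in neg]
        overall = mean(col_all)
        between = (mean(col_pos) - overall) ** 2 + (mean(col_neg) - overall) ** 2
        # statistics.variance is the sample variance (divides by n - 1)
        within = variance(col_pos) + variance(col_neg)
        scores.append(between / within if within > 0 else 0.0)
    return scores


def top_k_features(rows, labels, k):
    """Return the indices of the k features with the highest F-score."""
    scores = f_scores(rows, labels)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


# Hypothetical toy data: feature 0 separates the classes, feature 1 does not.
rows = [
    [0.9, 0.2],
    [1.1, 0.8],
    [1.0, 0.5],
    [0.1, 0.4],
    [-0.1, 0.9],
    [0.0, 0.3],
]
labels = [1, 1, 1, 0, 0, 0]

selected = top_k_features(rows, labels, 1)
print(selected)
```

In a pipeline like the one the abstract describes, the selected column indices would then be used to build the reduced feature set passed to the classifier; each class needs at least two samples for the sample variance to be defined.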