Faculty Record Search | Category: Conference Paper | Faculty: 王彥雯 WANG, CHARLOTTE

Title: Active Learning with Sequential Sampling and Dimension Reduction for Analyzing Large-Scale Datasets
Academic year: 105
Semester: 1
Publication date: 2016/12/09
Work title: Active Learning with Sequential Sampling and Dimension Reduction for Analyzing Large-Scale Datasets
Work title (other language):
Authors: Wang, Charlotte; Chang, Yuan-chin Ivan
Affiliated unit:
Publisher:
Conference name: 2016 CSA & NCCU Joint Statistical Meetings
Conference location: National Chengchi University (國立政治大學)
Abstract: Active learning is a kind of semi-supervised learning in which the learning algorithm can interactively query for the labels/classes of new subjects. When labeling subjects is expensive, as in money laundering detection and disease screening, active learning can reduce cost because only the selected subjects need to be examined and labeled. For large-scale datasets, the large sample size and high dimension pose challenges for both analysis and computation. In this talk, we present an active learning algorithm for analyzing large-scale datasets. The proposed method is based on a logistic regression model with a modified iterative algorithm for parameter estimation, which makes it more computationally efficient without sacrificing too much statistical efficiency. In addition, shrinkage estimation and subject clustering are used to select effective variables and to reduce subject-searching time. From the perspectives of uncertainty sampling and the precision of parameter estimates, we search the representatives of subject clusters and select useful samples based on the concept of sequential D-optimal design. Real data applications and simulations are used to evaluate the performance of the proposed active learning algorithm.
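Keywords: active learning; clustering; D-optimal design; sequential sampling
Language: Chinese
Indexed in:
Conference type: Domestic
On-campus venue:
Conference dates: 2016/12/09~2016/12/10
Corresponding author:
Country: Republic of China
Open call for papers:
Publication format:
Source: no proceeding
Related links: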
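
The sample-selection idea named in the abstract (a logistic regression model combined with sequential D-optimal design) can be illustrated with a minimal sketch. This is not the authors' algorithm: the record gives no implementation details, and the sketch leaves out the shrinkage estimation, subject clustering, and modified iterative estimator that the abstract mentions. It assumes a plain ridge-stabilised Newton-Raphson fit and scores each unlabeled subject by how much it would increase the determinant of the Fisher information matrix, using the identity det(M + w x xᵀ) = det(M)(1 + w xᵀ M⁻¹ x). All function names, parameters, and defaults (`fit_logistic`, `d_optimal_scores`, `active_learn`, `ridge`, `n_init`, `n_queries`) are hypothetical.

```python
import numpy as np


def _sigmoid(z):
    # Clip the linear predictor to avoid overflow in exp().
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))


def fit_logistic(X, y, n_iter=25, ridge=1e-2):
    """Newton-Raphson fit of a logistic regression.

    The small ridge term is an assumption of this sketch; it keeps the
    Hessian invertible when few labels are available.
    """
    d = X.shape[1]
    beta = np.zeros(d)
    for _ in range(n_iter):
        p = _sigmoid(X @ beta)
        w = p * (1.0 - p)
        grad = X.T @ (y - p) - ridge * beta
        hess = (X * w[:, None]).T @ X + ridge * np.eye(d)
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def d_optimal_scores(X_lab, X_pool, beta, ridge=1e-2):
    """Score pool subjects by w * x^T M^{-1} x.

    M is the (ridge-stabilised) Fisher information of the labeled set and
    w = p(1 - p) is largest where the model is most uncertain, so the
    score mixes uncertainty sampling with the D-optimality criterion.
    """
    d = X_lab.shape[1]
    p_lab = _sigmoid(X_lab @ beta)
    M = (X_lab * (p_lab * (1.0 - p_lab))[:, None]).T @ X_lab + ridge * np.eye(d)
    M_inv = np.linalg.inv(M)
    p_pool = _sigmoid(X_pool @ beta)
    w_pool = p_pool * (1.0 - p_pool)
    leverage = np.einsum("ij,jk,ik->i", X_pool, M_inv, X_pool)
    return w_pool * leverage


def active_learn(X, y_oracle, n_init=10, n_queries=20, seed=0):
    """Sequentially label the subject with the highest D-optimality score.

    y_oracle stands in for the expensive labeling step; only the entries
    of selected subjects are ever read.
    """
    rng = np.random.default_rng(seed)
    labelled = [int(i) for i in rng.choice(len(X), size=n_init, replace=False)]
    for _ in range(n_queries):
        beta = fit_logistic(X[labelled], y_oracle[labelled])
        pool = np.setdiff1d(np.arange(len(X)), labelled)
        scores = d_optimal_scores(X[labelled], X[pool], beta)
        labelled.append(int(pool[np.argmax(scores)]))
    beta = fit_logistic(X[labelled], y_oracle[labelled])
    return beta, labelled
```

In this sketch every query scans the whole unlabeled pool, which is the cost that the clustering step in the abstract is meant to reduce: scoring only cluster representatives would shrink the search, though how the authors do this is not stated in the record.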