Faculty Data Search | Category: Journal Article | Faculty: 周建興 Chien-hsing Chou

Title: A machine-learning approach for analyzing document layout structures with two reading orders
Academic year: 97
Semester: 1
Publication date: 2008/10/01
Work title: A machine-learning approach for analyzing document layout structures with two reading orders
Work title (other language):
Authors: Wu, Chung-Chih; Chou, Chien-Hsing; Chang, Fu
Affiliation: 淡江大學電機工程學系 (Department of Electrical Engineering, Tamkang University)
Publisher: Kidlington: Pergamon
Journal, volume (issue), pages: Pattern Recognition 41(10), pp. 3200-3213
Abstract: The purpose of document layout analysis is to locate textlines and text regions in document images, mostly via a series of split-or-merge operations. Before applying such an operation, however, it is necessary to examine the context to decide whether the place chosen for the operation is appropriate. We thus view document layout analysis as a matter of solving a series of binary decision problems, such as whether or not to apply a split-or-merge operation to a chosen place. To solve these problems, we use support vector machines to learn whether or not to apply the previously mentioned operations from training documents in which all textlines and text regions have been located and their identities labeled. The proposed approach is very effective for analyzing documents that allow both horizontal and vertical reading orders. When applied to a test data set composed of eight types of layout structure, the approach's accuracy rates for identifying textlines and text regions are 98.83% and 96.72%, respectively.
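Keywords: Binary decision; Document layout analysis; Reading order; Support vector machine; Taboo box; Textline; Text region
Language: English
ISSN: 0031-3203
Journal type: International
Indexed in:
Industry-academia collaboration:
Corresponding author: Chang, Fu
Peer review system:
Country: United Kingdom
Open call for papers:
Publication format: Print
Related links: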