MOSA: Matrix Optimized Self-Attention Hardware Accelerator for Mobile Device | |
---|---|
Academic year | 113 |
Semester | 1 |
Publication date | 2025-01-19 |
Title | MOSA: Matrix Optimized Self-Attention Hardware Accelerator for Mobile Device |
Title (other language) | |
Authors | Yang-Rwei Chang; Hsuan-Fu Chen; Horng-Yuan Shih |
Affiliated unit | |
Publisher | |
Conference name | 2025 International Conference on Electronics, Information, and Communication (ICEIC) |
Conference location | Osaka, Japan |
Abstract | The Self-Attention mechanism, which lies at the core of Transformer architectures, plays a vital role in capturing long-range dependencies. However, its high computational complexity and significant memory requirements pose major challenges for resource-constrained hardware such as mobile devices. In particular, frequent memory accesses and inefficient matrix multiplication operations often result in performance bottlenecks. Therefore, the development of dedicated hardware accelerators for Self-Attention, focusing on optimizing matrix operations and reducing memory usage, is essential for improving AI processing efficiency on mobile devices. This paper presents a Self-Attention hardware accelerator designed specifically for mobile devices. By optimizing the matrix multiplication process, the accelerator effectively reduces data transmission and memory access frequency, thereby lowering the overall computational complexity. It also reuses intermediate computation results, minimizing frequent memory read and write operations, which significantly reduces memory bandwidth requirements. This data reuse strategy not only cuts down on redundant computations but also greatly enhances computational efficiency. In addition, the accelerator eliminates the need for Key Transpose operations, further simplifying certain computational steps. Compared to traditional algorithms, the idle time is reduced by 66.5%, and signal line transmission costs are reduced by approximately 80%. With an optimized hardware architecture, mobile devices can efficiently support complex deep learning applications, such as speech processing and image recognition. |
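Keywords | |
Language | en_US |
Indexed in | |
Conference type | International |
On-campus seminar venue | None |
Conference dates | 2025-01-19 to 2025-01-22 |
Corresponding author | |
Country | JPN |
Open call for papers | |
Publication format | |
Source | |
Related links | Institutional repository link ( http://tkuir.lib.tku.edu.tw:8080/dspace/handle/987654321/128112 ) |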
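
The abstract states that the accelerator avoids the Key Transpose step and reuses intermediate results. The record does not describe the hardware dataflow, so the following is only a minimal software sketch of the underlying arithmetic, not the authors' design: each entry of the score matrix is the dot product of a row of Q with a row of K, so K can be read in its stored row order and no transposed copy is needed. The function names `attention_no_transpose` and `attention_reference` are hypothetical and introduced here for illustration.

```python
import math


def attention_no_transpose(Q, K, V):
    """Scaled dot-product attention that never materializes K^T.

    Q is n x d, K is m x d, V is m x p, all given as lists of rows.
    Illustrative sketch only; not the accelerator described in the paper.
    """
    d = len(Q[0])
    scale = 1.0 / math.sqrt(d)
    out = []
    for q in Q:
        # One row of Q K^T, built from dot products with the rows of K.
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in K]
        # Numerically stable softmax over that row.
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Use the weights right away to accumulate the output row,
        # so the full score matrix is never stored.
        out.append([sum(w * v[c] for w, v in zip(weights, V))
                    for c in range(len(V[0]))])
    return out


def attention_reference(Q, K, V):
    """Textbook version with an explicit transpose, for comparison."""
    d = len(Q[0])
    Kt = [list(col) for col in zip(*K)]  # explicit K^T, shape d x m
    out = []
    for q in Q:
        scores = [sum(q[t] * Kt[t][j] for t in range(d)) / math.sqrt(d)
                  for j in range(len(K))]
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        out.append([sum(exps[j] / total * V[j][c] for j in range(len(V)))
                    for c in range(len(V[0]))])
    return out
```

Both functions return the same values up to floating-point rounding; the first simply reads K in row order and consumes each score row as soon as it is produced, which is the kind of data reuse the abstract alludes to.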