
Vision transformer-based recognition of diabetic retinopathy grade.

Literature Details

Resource type:
PubMed classification:

Indexing: ◇ EI

Affiliations: [1]School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou, China [2]School of Traditional Chinese Medicine, Jinan University, Guangzhou, China [3]Huidong People's Hospital, Huizhou, China
Source:
ISSN:

Keywords: Diabetic retinopathy; Deep learning; Vision Transformer; Multi-head attention

Abstract:
In the domain of natural language processing, Transformers are recognized as state-of-the-art models which, in contrast to typical convolutional neural networks (CNNs), do not rely on convolution layers. Instead, Transformers employ multi-head attention mechanisms as their main building block to capture long-range contextual relations between image pixels. Until recently, CNNs dominated the deep learning solutions for diabetic retinopathy grade recognition. However, spurred by the advantages of Transformers, we propose a Transformer-based method suited to recognizing the grade of diabetic retinopathy.

The purposes of this work were to (i) demonstrate that a pure attention mechanism is suitable for diabetic retinopathy grade recognition and (ii) demonstrate that Transformers can replace traditional CNNs for this task.

This paper proposes a Vision Transformer-based method to recognize the grade of diabetic retinopathy. Fundus images are subdivided into non-overlapping patches, which are flattened into sequences and undergo a linear and positional embedding process to preserve positional information. The resulting sequence is passed through several multi-head attention layers to generate the final representation. In the classification stage, the first token of the sequence is fed into a softmax classification layer to produce the recognition output.

The dataset for training and testing comprises fundus images of different resolutions, subdivided into patches. We benchmark our method against current CNNs and extreme learning machines and achieve competitive performance. Specifically, the proposed deep learning architecture attains an accuracy of 91.4%, specificity = 0.977 (95% CI 0.951-1), precision = 0.928 (95% CI 0.852-1), sensitivity = 0.926 (95% CI 0.863-0.989), quadratic weighted kappa = 0.935, and AUC = 0.986.

Our comparative experiments against current methods show that our model is competitive and highlight that an attention mechanism based on a Vision Transformer model is promising for the diabetic retinopathy grade recognition task. This article is protected by copyright. All rights reserved.
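The pipeline described in the abstract (non-overlapping patches, flattening and linear embedding, positional embeddings, stacked multi-head attention layers, and a softmax classifier applied to the first token) corresponds to a standard Vision Transformer. The sketch below is a minimal PyTorch illustration of that pipeline, not the authors' implementation: the image size, patch size, embedding dimension, depth, head count, and the assumption of five diabetic retinopathy grades are illustrative choices, not values reported in the paper.

```python
# Minimal Vision Transformer sketch of the pipeline in the abstract.
# All hyperparameters are illustrative assumptions, not the paper's settings.
import torch
import torch.nn as nn


class MinimalViT(nn.Module):
    def __init__(self, image_size=224, patch_size=16, in_channels=3,
                 embed_dim=256, depth=6, num_heads=8, num_classes=5):
        super().__init__()
        num_patches = (image_size // patch_size) ** 2

        # Patch embedding: a strided convolution is equivalent to cutting the
        # image into non-overlapping patches and applying a shared linear layer.
        self.patch_embed = nn.Conv2d(in_channels, embed_dim,
                                     kernel_size=patch_size, stride=patch_size)

        # Learnable class token and positional embeddings (preserve patch order).
        self.cls_token = nn.Parameter(torch.zeros(1, 1, embed_dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, num_patches + 1, embed_dim))

        # Stack of Transformer encoder blocks (multi-head self-attention + MLP).
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=num_heads, dim_feedforward=4 * embed_dim,
            batch_first=True, norm_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=depth)

        # Classification head applied to the class-token representation.
        self.norm = nn.LayerNorm(embed_dim)
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, x):                        # x: (B, 3, H, W) fundus images
        x = self.patch_embed(x)                  # (B, embed_dim, H/P, W/P)
        x = x.flatten(2).transpose(1, 2)         # (B, num_patches, embed_dim)
        cls = self.cls_token.expand(x.size(0), -1, -1)
        x = torch.cat([cls, x], dim=1) + self.pos_embed
        x = self.encoder(x)                      # multi-head attention layers
        return self.head(self.norm(x[:, 0]))     # first (class) token -> grade logits


if __name__ == "__main__":
    model = MinimalViT()
    dummy = torch.randn(2, 3, 224, 224)          # two dummy fundus images
    probs = torch.softmax(model(dummy), dim=-1)  # per-grade probabilities
    print(probs.shape)                           # torch.Size([2, 5])
```

The softmax is applied outside the model (in the loss during training or at inference), which mirrors the classification stage described in the abstract while keeping the head a plain linear layer.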

Funding:
Language:
PubMed ID:
Chinese Academy of Sciences (CAS) journal ranking:
Publication-year [2020] edition:
Major category | Zone 3, Medicine
Subcategory | Zone 3, Nuclear Medicine
Latest [2025] edition:
Major category | Zone 3, Medicine
Subcategory | Zone 3, Nuclear Medicine
First author:
First author affiliation: [1]School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou, China
Corresponding author:
Corresponding author affiliation: [1]School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou, China
Recommended citation (GB/T 7714):
APA:
MLA:

