Kernel Attention Transformer (KAT) for Histopathology Whole Slide Image Classification
Yushan Zheng*, Jun Li, Jun Shi, Fengying Xie, Zhiguo Jiang
Medical Image Computing and Computer Assisted Intervention (MICCAI), 2022
Abstract
BibTeX
Code
The Transformer has been widely used in histopathology whole slide image (WSI) classification for purposes such as tumor grading and prognosis analysis. However, the token-wise self-attention and positional embedding strategy of the common Transformer limit its effectiveness and efficiency when applied to gigapixel histopathology images. In this paper, we propose a kernel attention Transformer (KAT) for histopathology WSI classification. Information transmission among the tokens is achieved by cross-attention between the tokens and a set of kernels related to a set of positional anchors on the WSI. Compared to the common Transformer structure, the proposed KAT better describes the hierarchical context information of the local regions of the WSI while maintaining a lower computational complexity. The proposed method was evaluated on a gastric dataset with 2040 WSIs and an endometrial dataset with 2560 WSIs, and was compared with 5 state-of-the-art methods. The experimental results demonstrate that the proposed KAT is effective and efficient in the task of histopathology WSI classification and is superior to the state-of-the-art methods.
@inproceedings{zheng2022kernel,
author = {Yushan Zheng and Jun Li and Jun Shi and Fengying Xie and Zhiguo Jiang},
title = {Kernel Attention Transformer (KAT) for Histopathology Whole Slide Image Classification},
booktitle = {Medical Image Computing and Computer Assisted Intervention
-- MICCAI 2022},
year = {2022}
}
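The core idea of KAT can be illustrated with a minimal sketch (all names and shapes here are hypothetical, not the paper's implementation): instead of token-wise self-attention, each patch token cross-attends to a small set of kernels tied to positional anchors, so the cost scales as O(N·K) rather than O(N²) in the number of tokens.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def kernel_cross_attention(tokens, kernels):
    """Tokens attend to a small set of kernels instead of to each other.
    tokens:  (N, d) patch-token features extracted from a WSI
    kernels: (K, d) kernels tied to positional anchors, with K << N
    Cost is O(N*K) instead of the O(N^2) of token-wise self-attention."""
    d = tokens.shape[1]
    scores = tokens @ kernels.T / np.sqrt(d)   # (N, K) similarity to each kernel
    weights = softmax(scores, axis=-1)         # each token's attention over kernels
    return weights @ kernels                   # (N, d) aggregated token update

rng = np.random.default_rng(0)
tokens = rng.normal(size=(1000, 64))   # e.g. 1000 patch embeddings from one slide
kernels = rng.normal(size=(16, 64))    # 16 positional-anchor kernels
out = kernel_cross_attention(tokens, kernels)
print(out.shape)  # (1000, 64)
```

With 1000 tokens and 16 kernels, the attention matrix has 16,000 entries instead of the 1,000,000 a full self-attention map would need, which is what makes the approach tractable for gigapixel slides.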
Lesion-Aware Contrastive Representation Learning For Histopathology Whole Slide Images Analysis
Jun Li, Yushan Zheng*, Kun Wu, Jun Shi*, Fengying Xie, Zhiguo Jiang
Medical Image Computing and Computer Assisted Intervention (MICCAI), 2022
Abstract
BibTeX
Local representation learning has been a key challenge in improving the performance of histopathology whole slide image (WSI) analysis. Previous representation learning methods mostly follow the supervised learning paradigm. However, manual annotation for large-scale WSIs is time-consuming and labor-intensive. Hence, self-supervised contrastive learning has recently attracted intensive attention. Contrastive learning relies on the design of informative positive and negative pairs, but common contrastive learning methods treat each sample as its own class, which leads to the class collision problem, especially in the domain of histopathology image analysis. In this paper, we propose a novel contrastive representation learning framework called Lesion-Aware Contrastive Learning (LACL) for histopathology whole slide image analysis. We build a module called the lesion queue based on the memory bank structure to store the representations of different classes of WSIs, which allows the model to selectively sample negative pairs during training. Moreover, we design a queue refinement strategy to purify the representations stored in the lesion queue. The experimental results demonstrate that LACL achieves the best performance in histopathology image representation learning on different datasets and outperforms state-of-the-art methods under different WSI classification benchmarks.
@inproceedings{li2022lesion,
author = {Jun Li and Yushan Zheng and Kun Wu and Jun Shi and Fengying Xie and Zhiguo Jiang},
title = {Lesion-Aware Contrastive Representation Learning For Histopathology Whole Slide Images Analysis},
booktitle = {Medical Image Computing and Computer Assisted Intervention
-- MICCAI 2022},
year = {2022}
}
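The lesion-queue idea can be sketched as follows (a simplified illustration with hypothetical names, not the paper's code): one FIFO memory bank per WSI class, with negatives for an anchor drawn only from the queues of *other* classes, so two samples of the same lesion type are never pushed apart and class collision is avoided.

```python
import numpy as np
from collections import deque

class LesionQueue:
    """Simplified sketch: one bounded FIFO queue of feature vectors per class."""
    def __init__(self, num_classes, maxlen=256):
        self.queues = [deque(maxlen=maxlen) for _ in range(num_classes)]

    def enqueue(self, feature, label):
        """Push a representation into the queue of its (pseudo-)class."""
        self.queues[label].append(feature)

    def sample_negatives(self, anchor_label, k, rng):
        """Draw negatives only from classes other than the anchor's class."""
        pool = [f for c, q in enumerate(self.queues)
                if c != anchor_label for f in q]
        idx = rng.choice(len(pool), size=min(k, len(pool)), replace=False)
        return np.stack([pool[i] for i in idx])

rng = np.random.default_rng(0)
bank = LesionQueue(num_classes=3)
for _ in range(50):                                 # fill with toy features
    bank.enqueue(rng.normal(size=8), int(rng.integers(3)))
negs = bank.sample_negatives(anchor_label=0, k=10, rng=rng)
print(negs.shape)
```

The queue refinement strategy described in the abstract would additionally filter or replace stale entries; that step is omitted here for brevity.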
Multi-Frame Super-Resolution With Raw Images Via Modified Deformable Convolution
Gongzhe Li, Linwei Qiu, Haopeng Zhang, Fengying Xie, Zhiguo Jiang
IEEE ICASSP, 2022
Abstract
BibTeX
In this paper, we propose a novel model for multi-frame super-resolution, which leverages multiple RAW images and yields a super-resolved RGB image. To handle the pixel misalignment in burst photography, we apply a refined Pyramid Cascading and Deformable Convolution (PCD) feature alignment module. A new 3D deformable convolution fusion module is then proposed to merge the information from all frames adaptively. In addition, we employ an encoder-decoder network to restore color and details in sRGB space after super-resolving the images in linear space. Extensive experiments demonstrate the superiority of our architecture and the strength of multi-frame super-resolution with RAW images.
@inproceedings{li2022multi,
author={Li, Gongzhe and Qiu, Linwei and Zhang, Haopeng and Xie, Fengying and Jiang, Zhiguo},
booktitle={ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
title={Multi-Frame Super-Resolution With Raw Images Via Modified Deformable Convolution},
year={2022},
pages={2155-2159},
doi={10.1109/ICASSP43922.2022.9747407}
}
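The sampling step at the heart of deformable-convolution alignment can be sketched in a few lines (a bare-bones illustration with hypothetical names, assuming per-pixel offsets are already predicted by some network): each pixel of a neighbouring frame's feature map is resampled at a fractional offset via bilinear interpolation, warping it toward the reference frame.

```python
import numpy as np

def bilinear_sample(feat, y, x):
    """Sample a 2-D feature map at a fractional location (y, x)."""
    h, w = feat.shape
    y0, x0 = max(int(np.floor(y)), 0), max(int(np.floor(x)), 0)
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    wy, wx = y - y0, x - x0
    return ((1 - wy) * (1 - wx) * feat[y0, x0] + (1 - wy) * wx * feat[y0, x1]
            + wy * (1 - wx) * feat[y1, x0] + wy * wx * feat[y1, x1])

def align_frame(feat, offsets):
    """Warp one frame's features toward the reference frame.
    feat:    (H, W) feature map of a neighbouring burst frame
    offsets: (H, W, 2) predicted per-pixel (dy, dx) sampling offsets"""
    h, w = feat.shape
    out = np.empty_like(feat)
    for i in range(h):
        for j in range(w):
            dy, dx = offsets[i, j]
            out[i, j] = bilinear_sample(feat, i + dy, j + dx)
    return out

feat = np.arange(16, dtype=float).reshape(4, 4)
offsets = np.zeros((4, 4, 2))                # zero offsets: identity warp
assert np.allclose(align_frame(feat, offsets), feat)
```

In the actual PCD module this sampling is done per deformable-convolution kernel tap and repeated across pyramid levels; the sketch above only shows the single-level, single-tap warp.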
A Two-Stage Shake-Shake Network for Long-tailed Recognition of SAR Aerial View Objects
Gongzhe Li, Linpeng Pan, Linwei Qiu, Zhiwen Tan, Fengying Xie, Haopeng Zhang*
18th IEEE Workshop on Perception Beyond the Visible Spectrum (PBVS Workshop 2022) in conjunction with CVPR 2022
PDF
Abstract
BibTeX
Code
Synthetic Aperture Radar (SAR) has received increasing attention due to its complementary strength in capturing significant information in the remote sensing area. However, for the Aerial View Object Classification (AVOC) task, SAR images still suffer from the long-tailed distribution of the aerial view objects. This disparity limits the performance of classification methods, especially for data-sensitive deep learning models. In this paper, we propose a two-stage shake-shake network to tackle the long-tailed learning problem. Specifically, it decouples the learning procedure into a representation learning stage and a classification learning stage. Moreover, we apply test time augmentation (TTA) and classification with alternating normalization (CAN) to improve the accuracy. In the PBVS 2022 Multi-modal Aerial View Object Classification Challenge Track 1, our method achieves 21.82% and 27.97% accuracy in the development phase and testing phase respectively, ranking in the top tier among all the participants.
@inproceedings{li2022two,
author={Li, Gongzhe and Pan, Linpeng and Qiu, Linwei and Tan, Zhiwen and Xie, Fengying and Zhang, Haopeng},
booktitle={18th IEEE Workshop on Perception Beyond the Visible Spectrum (PBVS Workshop 2022) in conjunction with CVPR 2022},
title={A Two-Stage Shake-Shake Network for Long-tailed Recognition of SAR Aerial View Objects},
year={2022},
pages={245-256},
}
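One of the inference-time pieces, test time augmentation, is simple enough to sketch (a toy illustration with hypothetical names; the real pipeline uses a trained shake-shake network and SAR-appropriate augmentations): class probabilities are averaged over several augmented views of the same input before the final decision.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def predict_with_tta(model, image, augmentations):
    """Average class probabilities over augmented views of one input.
    model:         callable mapping an image to class logits
    augmentations: callables (e.g. flips/rotations) applied per view"""
    probs = [softmax(model(aug(image))) for aug in augmentations]
    return np.mean(probs, axis=0)

# Toy stand-in model: logits = correlations with 5 class templates.
rng = np.random.default_rng(0)
templates = rng.normal(size=(5, 8, 8))
model = lambda img: np.array([(t * img).sum() for t in templates])
augs = [lambda x: x, np.fliplr, np.flipud, lambda x: np.rot90(x, 2)]
p = predict_with_tta(model, rng.normal(size=(8, 8)), augs)
print(p)  # averaged probability vector over the 5 classes
```

Averaging over views smooths out orientation-dependent prediction noise, which is particularly relevant for SAR chips whose targets have no canonical orientation.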
Few-Shot Multi-Class Ship Detection in Remote Sensing Images Using Attention Feature Map and Multi-Relation Detector
Haopeng Zhang*, Xingyu Zhang, Gang Meng, Chen Guo, Zhiguo Jiang
Remote Sensing, 2022
PDF
Abstract
BibTeX
Monitoring and identification of ships in remote sensing images is of great significance for port management, marine traffic, marine security, etc. However, due to the small size of ships and complex backgrounds, ship detection in remote sensing images is still a challenging task. Currently, deep-learning-based detection models need large amounts of data and manual annotation, while training data containing ships in remote sensing images may be available only in limited quantities. To solve this problem, we propose a few-shot multi-class ship detection algorithm with an attention feature map and multi-relation detector (AFMR) for remote sensing images. We use the basic framework of You Only Look Once (YOLO) and employ an attention feature map module to enhance the features of the target. In addition, a multi-relation head module is used to optimize the detection head of YOLO. Extensive experiments on the publicly available HRSC2016 dataset and the self-constructed REMEX-FSSD dataset validate that our method achieves good detection performance.
@article{zhang2022few,
author = {Zhang, Haopeng and Zhang, Xingyu and Meng, Gang and Guo, Chen and Jiang, Zhiguo},
title = {Few-Shot Multi-Class Ship Detection in Remote Sensing Images Using Attention Feature Map and Multi-Relation Detector},
journal = {Remote Sensing},
volume = {14},
year = {2022},
number = {12},
article-number = {2790},
url = {https://www.mdpi.com/2072-4292/14/12/2790},
issn = {2072-4292},
doi = {10.3390/rs14122790}
}
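The general shape of an attention feature map module can be sketched as follows (a generic spatial-attention sketch under assumed names and pooling choices, not the paper's AFMR module): pool the backbone features across channels, squash to (0, 1), and reweight every channel so target-like locations are emphasised before the detection head.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def spatial_attention(feat):
    """Generic spatial attention over a (C, H, W) backbone feature map:
    channel-wise average and max pooling produce an (H, W) saliency map,
    which reweights every channel of the input."""
    avg = feat.mean(axis=0)            # (H, W) channel-average pooling
    mx = feat.max(axis=0)              # (H, W) channel-max pooling
    attn = sigmoid(avg + mx)           # (H, W) attention map in (0, 1)
    return feat * attn[None, :, :]     # broadcast over channels

rng = np.random.default_rng(0)
feat = rng.normal(size=(16, 32, 32))   # toy backbone features
out = spatial_attention(feat)
print(out.shape)  # (16, 32, 32)
```

Because the attention map is shared across channels, small targets that activate many channels at the same location get boosted coherently, which is the intuition behind enhancing ship features before the YOLO head.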