Transformer-based Segmentation

人工智能炼丹师
2021-09-02

Unet-family Transformer Models (Medical Image Segmentation)

  1. Combine global (self-attention) and local (Unet) characteristics to build a segmentation network
  2. How to make segmentation work on small datasets, i.e. effectively train a Transformer model with a large number of parameters
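The global side of point 1 is plain scaled dot-product self-attention, in which every token attends to every other token. A minimal numpy sketch (single head, no batching; the weight names and shapes are illustrative assumptions, not taken from any of the papers below):

```python
import numpy as np

def self_attention(x, Wq, Wk, Wv):
    # x: (N, C) token features; Wq/Wk/Wv: (C, C) projection weights
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(x.shape[-1])       # (N, N) pairwise similarities
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)      # softmax over keys
    return attn @ v  # each token aggregates context from all tokens
```

Because the (N, N) attention matrix grows quadratically with the number of tokens, models like Swin restrict attention to local windows, which motivates the designs discussed next.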

Swin-Unet: Unet-like Pure Transformer for Medical Image Segmentation [Code]

1. Metric:
2. Motivation: advantages of the Swin Transformer: handles long sequences; attention within windows + information exchange across windows. Advantages of UNet: local information, shortcut (skip) connections
3. Main Contributions:

  • Based on Swin Transformer block, we build a symmetric Encoder-Decoder architecture with skip connections. In the encoder, self-attention from local to global is realized; in the decoder, the global features are up-sampled to the input resolution for corresponding pixel-level segmentation prediction.
  • A patch expanding layer is developed to achieve up-sampling and feature dimension increase without using convolution or interpolation operation.
  • It is found in the experiment that skip connection is also effective for Transformer, so a pure Transformer-based U-shaped Encoder-Decoder architecture with skip connection is finally constructed, named Swin-Unet.

4. Model Structure:
5. Take-Home Message: the patch expanding layer as the up-sampling method
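The patch expanding layer from the take-home message can be sketched as follows: a linear layer first doubles the channel dimension, then the expanded channels are rearranged into 2×2 spatial neighborhoods, doubling resolution while halving channels. A minimal numpy sketch under assumed (H, W, C) shapes; the official code performs a similar rearrangement on batched tensors:

```python
import numpy as np

def patch_expand(x, W_expand):
    # x: (H, W, C) feature map; W_expand: (C, 2C) linear weight
    H, W, C = x.shape
    x = x @ W_expand                         # (H, W, 2C): expand channels
    # split the 2C channels into a 2x2 spatial block of C/2 channels each
    x = x.reshape(H, W, 2, 2, C // 2)
    x = x.transpose(0, 2, 1, 3, 4)           # (H, 2, W, 2, C/2)
    return x.reshape(2 * H, 2 * W, C // 2)   # 2x resolution, C/2 channels
```

Net effect: resolution ×2, channels ÷2, the mirror image of Swin's patch merging, with no convolution or interpolation involved.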

Medical Transformer: Gated Axial-Attention for Medical Image Segmentation [Code]

Mainstream Semantic Segmentation Transformer Models

SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers [Code]

MaskFormer: Per-Pixel Classification is Not All You Need for Semantic Segmentation [Code]
