Transformer-based Segmentation

人工智能炼丹师
2021-09-02

Unet-family Transformer Models (Medical Image Segmentation)

  1. Combine global (self-attention) and local (Unet) characteristics to build a segmentation network
  2. How to make segmentation work on small datasets, i.e. effectively train a Transformer model with a large number of parameters
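The global side of point 1 is plain scaled dot-product self-attention, in which every token attends to every other token. A minimal numpy sketch (single head, no batching; the weight names and shapes are illustrative assumptions, not taken from any of the papers below):

```python
import numpy as np

def self_attention(x, Wq, Wk, Wv):
    # x: (N, C) token features; Wq/Wk/Wv: (C, C) projection weights
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(x.shape[-1])       # (N, N) pairwise similarities
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)      # softmax over keys
    return attn @ v  # each token aggregates context from all tokens
```

Because the (N, N) attention matrix grows quadratically with the number of tokens, models like Swin restrict attention to local windows, which motivates the designs discussed next.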

Swin-Unet: Unet-like Pure Transformer for Medical Image Segmentation [Code]

1. Metric:
2. Motivation: advantages of the Swin Transformer: handles long sequences; attention within windows + information exchange across windows. Advantages of UNet: local information, shortcut (skip) connections
3. Main Contributions:

  • Based on Swin Transformer block, we build a symmetric Encoder-Decoder architecture with skip connections. In the encoder, self-attention from local to global is realized; in the decoder, the global features are up-sampled to the input resolution for corresponding pixel-level segmentation prediction.
  • A patch expanding layer is developed to achieve up-sampling and feature dimension increase without using convolution or interpolation operation.
  • It is found in the experiment that skip connection is also effective for Transformer, so a pure Transformer-based U-shaped Encoder-Decoder architecture with skip connection is finally constructed, named Swin-Unet.

4. Model Structure:
5. Take-Home Message: the patch expanding layer as the up-sampling method
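The patch expanding layer from the take-home message can be sketched as follows: a linear layer first doubles the channel dimension, then the expanded channels are rearranged into 2×2 spatial neighborhoods, doubling resolution while halving channels. A minimal numpy sketch under assumed (H, W, C) shapes; the official code performs a similar rearrangement on batched tensors:

```python
import numpy as np

def patch_expand(x, W_expand):
    # x: (H, W, C) feature map; W_expand: (C, 2C) linear weight
    H, W, C = x.shape
    x = x @ W_expand                         # (H, W, 2C): expand channels
    # split the 2C channels into a 2x2 spatial block of C/2 channels each
    x = x.reshape(H, W, 2, 2, C // 2)
    x = x.transpose(0, 2, 1, 3, 4)           # (H, 2, W, 2, C/2)
    return x.reshape(2 * H, 2 * W, C // 2)   # 2x resolution, C/2 channels
```

Net effect: resolution ×2, channels ÷2, the mirror image of Swin's patch merging, with no convolution or interpolation involved.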

Medical Transformer: Gated Axial-Attention for Medical Image Segmentation [Code]

Mainstream Semantic Segmentation Transformer Models

SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers [Code]

MaskFormer: Per-Pixel Classification is Not All You Need for Semantic Segmentation [Code]
