Task | Papers | |
Pretraining | | |
relationship with CNN | 2020/1/10 On The Relationship Between Self-Attention And Convolutional Layers | |
classification | 2020/4/28 Exploring Self-Attention For Image Recognition; 2020/10/22 An Image Is Worth 16x16 Words: Transformers For Image Recognition At Scale | |
object detection | 2020/5/28 End-To-End Object Detection With Adaptive Clustering Transformer; 2020/10/8 Deformable DETR: Deformable Transformers For End-To-End Object Detection; 2020/11/18 ACT (End-To-End Object Detection With Adaptive Clustering Transformer) | |
image GPT | 2018/6/15 Image Transformer; 2019/4/23 Generating Long Sequences With Sparse Transformers; 2020/1/10 Generative Pretraining From Pixels | |
segmentation | 2020/9/23 Hamming OCR: A Locality Sensitive Hashing Neural Network For Scene Text Recognition; 2020/11/14 ActBERT: Learning Global-Local Video-Text Representations; 2020/12/1 MaX-DeepLab: End-To-End Panoptic Segmentation With Mask Transformers | |
video | CVPR 2018 End-To-End Dense Video Captioning With Masked Transformer; 2020/11/4 Foley Music: Learning To Generate Music From Videos; 2020/12/4 End-To-End Video Instance Segmentation With Transformers | |
lane detection | 2020/7/14 PolyLaneNet: Lane Estimation Via Deep Polynomial Regression | |
vision model | | |