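[Original] PSPNet in DL: A Detailed Illustrated Guide to the PSPNet Algorithm, Covering Its Introduction (Paper Overview), Architecture, and Example Applications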

处女座的程序猿 2021-09-28

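Related articles
PSPNet in DL: A Detailed Illustrated Guide to the PSPNet Algorithm, Covering Its Introduction (Paper Overview), Architecture, and Example Applications
PSPNet in DL: The PSPNet Architecture in Detail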

Introduction to the PSPNet Algorithm (Paper Overview)

To be updated……

Abstract
Scene parsing is challenging for unrestricted open vocabulary and diverse scenes. In this paper, we exploit the capability of global context information by different-region-based context aggregation through our pyramid pooling module together with the proposed pyramid scene parsing network (PSPNet). Our global prior representation is effective to produce good quality results on the scene parsing task, while PSPNet provides a superior framework for pixel-level prediction. The proposed approach achieves state-of-the-art performance on various datasets. It came first in ImageNet scene parsing challenge 2016, PASCAL VOC 2012 benchmark and Cityscapes benchmark. A single PSPNet yields the new record of mIoU accuracy 85.4% on PASCAL VOC 2012 and accuracy 80.2% on Cityscapes.
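To make the pyramid pooling idea in the abstract concrete, here is a minimal pure-Python sketch on a single-channel feature map. It pools the map into coarse grids, upsamples each grid back to the input size, and stacks the results with the original map. The bin sizes (1, 2, 3, 6) follow the paper; everything else is a simplifying assumption: the function names are hypothetical, nearest-neighbour upsampling stands in for the bilinear interpolation used in the paper, and the 1×1 convolutions that reduce channels after each pooling level are omitted.

```python
from typing import List, Sequence

Grid = List[List[float]]


def adaptive_avg_pool(feature: Grid, bins: int) -> Grid:
    """Average-pool a 2-D map into a bins x bins grid."""
    h, w = len(feature), len(feature[0])
    pooled = []
    for i in range(bins):
        # Row span of bin i: floor(i*h/bins) up to ceil((i+1)*h/bins).
        r0, r1 = (i * h) // bins, -((-(i + 1) * h) // bins)
        row = []
        for j in range(bins):
            c0, c1 = (j * w) // bins, -((-(j + 1) * w) // bins)
            cells = [feature[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(cells) / len(cells))
        pooled.append(row)
    return pooled


def upsample_nearest(pooled: Grid, h: int, w: int) -> Grid:
    """Nearest-neighbour upsampling (an assumption; the paper uses bilinear)."""
    bins = len(pooled)
    return [[pooled[(r * bins) // h][(c * bins) // w] for c in range(w)]
            for r in range(h)]


def pyramid_pooling(feature: Grid, bin_sizes: Sequence[int] = (1, 2, 3, 6)) -> List[Grid]:
    """Return the input map plus one upsampled context map per pyramid level."""
    h, w = len(feature), len(feature[0])
    channels = [feature]
    for bins in bin_sizes:
        channels.append(upsample_nearest(adaptive_avg_pool(feature, bins), h, w))
    return channels


# Toy 6x6 feature map holding the values 0..35.
feature = [[float(r * 6 + c) for c in range(6)] for r in range(6)]
stacked = pyramid_pooling(feature)
print(len(stacked))      # 5: the original map plus four pyramid levels
print(stacked[1][0][0])  # 17.5: the 1x1 level is the global mean everywhere
```

The coarsest level carries a single global average to every pixel, while the finer levels carry averages over progressively smaller sub-regions, which is how the module mixes context from different region sizes.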
Concluding Remarks
We have proposed an effective pyramid scene parsing network for complex scene understanding. The global pyramid pooling feature provides additional contextual information. We have also provided a deeply supervised optimization strategy for ResNet-based FCN network. We hope the implementation details publicly available can help the community adopt these useful strategies for scene parsing and semantic segmentation and advance related techniques.
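The "deeply supervised optimization strategy" mentioned above amounts to adding an auxiliary classification loss on an intermediate ResNet stage to the main loss at training time. The sketch below shows only that weighted sum, in pure Python. The function names and the flat per-pixel layout are hypothetical, and the default weight of 0.4 is the value I recall the paper settling on; treat it as a tunable assumption.

```python
import math
from typing import Sequence


def cross_entropy(logits: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Mean softmax cross-entropy; logits[i] holds the class scores of pixel i."""
    total = 0.0
    for scores, label in zip(logits, labels):
        m = max(scores)  # subtract the max for numerical stability
        log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
        total += log_sum - scores[label]
    return total / len(labels)


def deeply_supervised_loss(main_logits, aux_logits, labels, aux_weight=0.4):
    """Main loss plus a down-weighted auxiliary loss (aux_weight is an assumption)."""
    return (cross_entropy(main_logits, labels)
            + aux_weight * cross_entropy(aux_logits, labels))


# Two pixels, two classes: a confident main head and an uninformative auxiliary head.
labels = [0, 1]
main_logits = [[4.0, 0.0], [0.0, 4.0]]
aux_logits = [[0.0, 0.0], [0.0, 0.0]]
print(round(deeply_supervised_loss(main_logits, aux_logits, labels), 4))
```

The auxiliary branch is only a training aid: at inference time it is dropped and only the main head's prediction is used.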

Paper
Hengshuang Zhao, Jianping Shi, Xiaojuan Qi, Xiaogang Wang, Jiaya Jia.
Pyramid Scene Parsing Network. CVPR 2017.
https:///abs/1612.01105

0. Experimental Results

1. Experiments

The authors run experiments on three different datasets:

ImageNet scene parsing challenge 2016
PASCAL VOC 2012 semantic segmentation
Cityscapes, an urban scene understanding dataset

2. Performance of PSPNet with different pre-trained ResNets on the ADE20K validation set

Performance improves steadily as the ResNet backbone gets deeper; of course, the deeper the network, the higher its complexity.
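Since the abstract reports its headline numbers as mIoU, a small worked example of that metric may help when reading such comparisons. This is a minimal sketch on flat lists of per-pixel class ids, with hypothetical function names. Benchmark evaluations typically accumulate intersections and unions over the whole validation set before averaging, which this toy version does not do.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean intersection-over-union across classes that appear in pred or target."""
    ious = []
    for cls in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == cls and t == cls)
        union = sum(1 for p, t in zip(pred, target) if p == cls or t == cls)
        if union:  # skip classes absent from both prediction and ground truth
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


def pixel_accuracy(pred: Sequence[int], target: Sequence[int]) -> float:
    """Fraction of pixels whose predicted class matches the ground truth."""
    return sum(1 for p, t in zip(pred, target) if p == t) / len(target)


# Six pixels, three classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(round(mean_iou(pred, target, 3), 3))      # 0.5  (per-class IoU: 1/3, 2/3, 1/2)
print(round(pixel_accuracy(pred, target), 3))   # 0.667
```

Note that the two metrics can diverge: pixel accuracy is dominated by large classes, while mIoU weights every class equally.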
Visual improvements on ADE20K: because it uses global context information, PSPNet produces more accurate and detailed results.