TensorFlow人工智能引擎入门教程之十最强网络 RSNN深度残差网络平均准确率96

雪柳花明 2017-04-18

展开全文

在第六届 ImageNet 图像识别挑战赛上，微软研究院在多个类别的比赛中取得了第一名的成绩。比赛结果显示，微软的技术水平远远超越了 Google、Intel、高通、腾讯以及一众创业公司和科研实验室。

这个叫做「图像识别的深度残差学习」的获胜项目由微软研究员何恺明、张祥雨、任少卿和孙剑共同完成。根据微软博客显示，有关该成果的细节将会在后续的论文中详细介绍。

该技术的显著意义主要在于其复杂性。

http:///pdf/1512.03385v1.pdf

因为传统的多层网络随着层数增多，导致残差加大。所以为了防止这个问题，我们把多个网络看成一个单元，单元计算后将上次的产生的残差记入并记入下一次单元计算，举个例子，小明拿出100块买了一件1 元的 2 元的 6元的东西，但是老板没有1块零钱，但是小明可能会继续买，所以买了1 2 6 元后当做10元我买了3次，那么实际上相当于每一次老板还欠小明1块，总共3块，所以把这个3块加入到下一次计算的里面呢，比如小明下次买了个2元 5元的东西那么实际上就是3 2 5 记入下一次网络，总之我的理解就是把每一次计算后得到的残差作为作为一层网络来替代，也就是说把残差用网络替代，就好像我们用wx+b 替代y 一样，实际的值与真实的值有误差所以如果我们把这个误差记入下一次wx+b来替代，最后是不是可以保证中间每一层 wx+b 被抵押消除了。大概是这样的，这是我的理解，网上也没有任何资料指出，个人看官方论文有感，如果有什么不对请指正。

看看官方samples的关键代码。他使用3 3 3 的卷积核，三次卷积之后产生的残差记入下一次卷积，看net+conv 然后继续wx+b 参数了残差之后继续把新产生的残差记入下下次wx+b

2015 MSRE大赛第一名准确率最高的深度学习网络，也是至今为止准确率最高的网络幸运的是在好的训练集情况下大概结果对大多数训练得到的准确率96-99.9之间。rsnn152

152的太长了，这里贴出一个rsnn50 准确率 92-99

name: "ResNet-50"
input: "data"
input_dim: 1
input_dim: 3
input_dim: 224
input_dim: 224

layer {
	bottom: "data"
	top: "conv1"
	name: "conv1"
	type: "Convolution"
	convolution_param {
		num_output: 64
		kernel_size: 7
		pad: 3
		stride: 2
	}
}

layer {
	bottom: "conv1"
	top: "conv1"
	name: "bn_conv1"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "conv1"
	top: "conv1"
	name: "scale_conv1"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "conv1"
	top: "conv1"
	name: "conv1_relu"
	type: "ReLU"
}

layer {
	bottom: "conv1"
	top: "pool1"
	name: "pool1"
	type: "Pooling"
	pooling_param {
		kernel_size: 3
		stride: 2
		pool: MAX
	}
}

layer {
	bottom: "pool1"
	top: "res2a_branch1"
	name: "res2a_branch1"
	type: "Convolution"
	convolution_param {
		num_output: 256
		kernel_size: 1
		pad: 0
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res2a_branch1"
	top: "res2a_branch1"
	name: "bn2a_branch1"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res2a_branch1"
	top: "res2a_branch1"
	name: "scale2a_branch1"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "pool1"
	top: "res2a_branch2a"
	name: "res2a_branch2a"
	type: "Convolution"
	convolution_param {
		num_output: 64
		kernel_size: 1
		pad: 0
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res2a_branch2a"
	top: "res2a_branch2a"
	name: "bn2a_branch2a"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res2a_branch2a"
	top: "res2a_branch2a"
	name: "scale2a_branch2a"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res2a_branch2a"
	top: "res2a_branch2a"
	name: "res2a_branch2a_relu"
	type: "ReLU"
}

layer {
	bottom: "res2a_branch2a"
	top: "res2a_branch2b"
	name: "res2a_branch2b"
	type: "Convolution"
	convolution_param {
		num_output: 64
		kernel_size: 3
		pad: 1
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res2a_branch2b"
	top: "res2a_branch2b"
	name: "bn2a_branch2b"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res2a_branch2b"
	top: "res2a_branch2b"
	name: "scale2a_branch2b"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res2a_branch2b"
	top: "res2a_branch2b"
	name: "res2a_branch2b_relu"
	type: "ReLU"
}

layer {
	bottom: "res2a_branch2b"
	top: "res2a_branch2c"
	name: "res2a_branch2c"
	type: "Convolution"
	convolution_param {
		num_output: 256
		kernel_size: 1
		pad: 0
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res2a_branch2c"
	top: "res2a_branch2c"
	name: "bn2a_branch2c"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res2a_branch2c"
	top: "res2a_branch2c"
	name: "scale2a_branch2c"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res2a_branch1"
	bottom: "res2a_branch2c"
	top: "res2a"
	name: "res2a"
	type: "Eltwise"
}

layer {
	bottom: "res2a"
	top: "res2a"
	name: "res2a_relu"
	type: "ReLU"
}

layer {
	bottom: "res2a"
	top: "res2b_branch2a"
	name: "res2b_branch2a"
	type: "Convolution"
	convolution_param {
		num_output: 64
		kernel_size: 1
		pad: 0
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res2b_branch2a"
	top: "res2b_branch2a"
	name: "bn2b_branch2a"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res2b_branch2a"
	top: "res2b_branch2a"
	name: "scale2b_branch2a"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res2b_branch2a"
	top: "res2b_branch2a"
	name: "res2b_branch2a_relu"
	type: "ReLU"
}

layer {
	bottom: "res2b_branch2a"
	top: "res2b_branch2b"
	name: "res2b_branch2b"
	type: "Convolution"
	convolution_param {
		num_output: 64
		kernel_size: 3
		pad: 1
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res2b_branch2b"
	top: "res2b_branch2b"
	name: "bn2b_branch2b"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res2b_branch2b"
	top: "res2b_branch2b"
	name: "scale2b_branch2b"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res2b_branch2b"
	top: "res2b_branch2b"
	name: "res2b_branch2b_relu"
	type: "ReLU"
}

layer {
	bottom: "res2b_branch2b"
	top: "res2b_branch2c"
	name: "res2b_branch2c"
	type: "Convolution"
	convolution_param {
		num_output: 256
		kernel_size: 1
		pad: 0
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res2b_branch2c"
	top: "res2b_branch2c"
	name: "bn2b_branch2c"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res2b_branch2c"
	top: "res2b_branch2c"
	name: "scale2b_branch2c"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res2a"
	bottom: "res2b_branch2c"
	top: "res2b"
	name: "res2b"
	type: "Eltwise"
}

layer {
	bottom: "res2b"
	top: "res2b"
	name: "res2b_relu"
	type: "ReLU"
}

layer {
	bottom: "res2b"
	top: "res2c_branch2a"
	name: "res2c_branch2a"
	type: "Convolution"
	convolution_param {
		num_output: 64
		kernel_size: 1
		pad: 0
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res2c_branch2a"
	top: "res2c_branch2a"
	name: "bn2c_branch2a"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res2c_branch2a"
	top: "res2c_branch2a"
	name: "scale2c_branch2a"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res2c_branch2a"
	top: "res2c_branch2a"
	name: "res2c_branch2a_relu"
	type: "ReLU"
}

layer {
	bottom: "res2c_branch2a"
	top: "res2c_branch2b"
	name: "res2c_branch2b"
	type: "Convolution"
	convolution_param {
		num_output: 64
		kernel_size: 3
		pad: 1
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res2c_branch2b"
	top: "res2c_branch2b"
	name: "bn2c_branch2b"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res2c_branch2b"
	top: "res2c_branch2b"
	name: "scale2c_branch2b"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res2c_branch2b"
	top: "res2c_branch2b"
	name: "res2c_branch2b_relu"
	type: "ReLU"
}

layer {
	bottom: "res2c_branch2b"
	top: "res2c_branch2c"
	name: "res2c_branch2c"
	type: "Convolution"
	convolution_param {
		num_output: 256
		kernel_size: 1
		pad: 0
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res2c_branch2c"
	top: "res2c_branch2c"
	name: "bn2c_branch2c"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res2c_branch2c"
	top: "res2c_branch2c"
	name: "scale2c_branch2c"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res2b"
	bottom: "res2c_branch2c"
	top: "res2c"
	name: "res2c"
	type: "Eltwise"
}

layer {
	bottom: "res2c"
	top: "res2c"
	name: "res2c_relu"
	type: "ReLU"
}

layer {
	bottom: "res2c"
	top: "res3a_branch1"
	name: "res3a_branch1"
	type: "Convolution"
	convolution_param {
		num_output: 512
		kernel_size: 1
		pad: 0
		stride: 2
		bias_term: false
	}
}

layer {
	bottom: "res3a_branch1"
	top: "res3a_branch1"
	name: "bn3a_branch1"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res3a_branch1"
	top: "res3a_branch1"
	name: "scale3a_branch1"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res2c"
	top: "res3a_branch2a"
	name: "res3a_branch2a"
	type: "Convolution"
	convolution_param {
		num_output: 128
		kernel_size: 1
		pad: 0
		stride: 2
		bias_term: false
	}
}

layer {
	bottom: "res3a_branch2a"
	top: "res3a_branch2a"
	name: "bn3a_branch2a"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res3a_branch2a"
	top: "res3a_branch2a"
	name: "scale3a_branch2a"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res3a_branch2a"
	top: "res3a_branch2a"
	name: "res3a_branch2a_relu"
	type: "ReLU"
}

layer {
	bottom: "res3a_branch2a"
	top: "res3a_branch2b"
	name: "res3a_branch2b"
	type: "Convolution"
	convolution_param {
		num_output: 128
		kernel_size: 3
		pad: 1
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res3a_branch2b"
	top: "res3a_branch2b"
	name: "bn3a_branch2b"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res3a_branch2b"
	top: "res3a_branch2b"
	name: "scale3a_branch2b"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res3a_branch2b"
	top: "res3a_branch2b"
	name: "res3a_branch2b_relu"
	type: "ReLU"
}

layer {
	bottom: "res3a_branch2b"
	top: "res3a_branch2c"
	name: "res3a_branch2c"
	type: "Convolution"
	convolution_param {
		num_output: 512
		kernel_size: 1
		pad: 0
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res3a_branch2c"
	top: "res3a_branch2c"
	name: "bn3a_branch2c"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res3a_branch2c"
	top: "res3a_branch2c"
	name: "scale3a_branch2c"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res3a_branch1"
	bottom: "res3a_branch2c"
	top: "res3a"
	name: "res3a"
	type: "Eltwise"
}

layer {
	bottom: "res3a"
	top: "res3a"
	name: "res3a_relu"
	type: "ReLU"
}

layer {
	bottom: "res3a"
	top: "res3b_branch2a"
	name: "res3b_branch2a"
	type: "Convolution"
	convolution_param {
		num_output: 128
		kernel_size: 1
		pad: 0
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res3b_branch2a"
	top: "res3b_branch2a"
	name: "bn3b_branch2a"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res3b_branch2a"
	top: "res3b_branch2a"
	name: "scale3b_branch2a"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res3b_branch2a"
	top: "res3b_branch2a"
	name: "res3b_branch2a_relu"
	type: "ReLU"
}

layer {
	bottom: "res3b_branch2a"
	top: "res3b_branch2b"
	name: "res3b_branch2b"
	type: "Convolution"
	convolution_param {
		num_output: 128
		kernel_size: 3
		pad: 1
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res3b_branch2b"
	top: "res3b_branch2b"
	name: "bn3b_branch2b"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res3b_branch2b"
	top: "res3b_branch2b"
	name: "scale3b_branch2b"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res3b_branch2b"
	top: "res3b_branch2b"
	name: "res3b_branch2b_relu"
	type: "ReLU"
}

layer {
	bottom: "res3b_branch2b"
	top: "res3b_branch2c"
	name: "res3b_branch2c"
	type: "Convolution"
	convolution_param {
		num_output: 512
		kernel_size: 1
		pad: 0
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res3b_branch2c"
	top: "res3b_branch2c"
	name: "bn3b_branch2c"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res3b_branch2c"
	top: "res3b_branch2c"
	name: "scale3b_branch2c"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res3a"
	bottom: "res3b_branch2c"
	top: "res3b"
	name: "res3b"
	type: "Eltwise"
}

layer {
	bottom: "res3b"
	top: "res3b"
	name: "res3b_relu"
	type: "ReLU"
}

layer {
	bottom: "res3b"
	top: "res3c_branch2a"
	name: "res3c_branch2a"
	type: "Convolution"
	convolution_param {
		num_output: 128
		kernel_size: 1
		pad: 0
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res3c_branch2a"
	top: "res3c_branch2a"
	name: "bn3c_branch2a"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res3c_branch2a"
	top: "res3c_branch2a"
	name: "scale3c_branch2a"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res3c_branch2a"
	top: "res3c_branch2a"
	name: "res3c_branch2a_relu"
	type: "ReLU"
}

layer {
	bottom: "res3c_branch2a"
	top: "res3c_branch2b"
	name: "res3c_branch2b"
	type: "Convolution"
	convolution_param {
		num_output: 128
		kernel_size: 3
		pad: 1
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res3c_branch2b"
	top: "res3c_branch2b"
	name: "bn3c_branch2b"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res3c_branch2b"
	top: "res3c_branch2b"
	name: "scale3c_branch2b"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res3c_branch2b"
	top: "res3c_branch2b"
	name: "res3c_branch2b_relu"
	type: "ReLU"
}

layer {
	bottom: "res3c_branch2b"
	top: "res3c_branch2c"
	name: "res3c_branch2c"
	type: "Convolution"
	convolution_param {
		num_output: 512
		kernel_size: 1
		pad: 0
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res3c_branch2c"
	top: "res3c_branch2c"
	name: "bn3c_branch2c"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res3c_branch2c"
	top: "res3c_branch2c"
	name: "scale3c_branch2c"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res3b"
	bottom: "res3c_branch2c"
	top: "res3c"
	name: "res3c"
	type: "Eltwise"
}

layer {
	bottom: "res3c"
	top: "res3c"
	name: "res3c_relu"
	type: "ReLU"
}

layer {
	bottom: "res3c"
	top: "res3d_branch2a"
	name: "res3d_branch2a"
	type: "Convolution"
	convolution_param {
		num_output: 128
		kernel_size: 1
		pad: 0
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res3d_branch2a"
	top: "res3d_branch2a"
	name: "bn3d_branch2a"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res3d_branch2a"
	top: "res3d_branch2a"
	name: "scale3d_branch2a"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res3d_branch2a"
	top: "res3d_branch2a"
	name: "res3d_branch2a_relu"
	type: "ReLU"
}

layer {
	bottom: "res3d_branch2a"
	top: "res3d_branch2b"
	name: "res3d_branch2b"
	type: "Convolution"
	convolution_param {
		num_output: 128
		kernel_size: 3
		pad: 1
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res3d_branch2b"
	top: "res3d_branch2b"
	name: "bn3d_branch2b"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res3d_branch2b"
	top: "res3d_branch2b"
	name: "scale3d_branch2b"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res3d_branch2b"
	top: "res3d_branch2b"
	name: "res3d_branch2b_relu"
	type: "ReLU"
}

layer {
	bottom: "res3d_branch2b"
	top: "res3d_branch2c"
	name: "res3d_branch2c"
	type: "Convolution"
	convolution_param {
		num_output: 512
		kernel_size: 1
		pad: 0
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res3d_branch2c"
	top: "res3d_branch2c"
	name: "bn3d_branch2c"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res3d_branch2c"
	top: "res3d_branch2c"
	name: "scale3d_branch2c"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res3c"
	bottom: "res3d_branch2c"
	top: "res3d"
	name: "res3d"
	type: "Eltwise"
}

layer {
	bottom: "res3d"
	top: "res3d"
	name: "res3d_relu"
	type: "ReLU"
}

layer {
	bottom: "res3d"
	top: "res4a_branch1"
	name: "res4a_branch1"
	type: "Convolution"
	convolution_param {
		num_output: 1024
		kernel_size: 1
		pad: 0
		stride: 2
		bias_term: false
	}
}

layer {
	bottom: "res4a_branch1"
	top: "res4a_branch1"
	name: "bn4a_branch1"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res4a_branch1"
	top: "res4a_branch1"
	name: "scale4a_branch1"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res3d"
	top: "res4a_branch2a"
	name: "res4a_branch2a"
	type: "Convolution"
	convolution_param {
		num_output: 256
		kernel_size: 1
		pad: 0
		stride: 2
		bias_term: false
	}
}

layer {
	bottom: "res4a_branch2a"
	top: "res4a_branch2a"
	name: "bn4a_branch2a"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res4a_branch2a"
	top: "res4a_branch2a"
	name: "scale4a_branch2a"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res4a_branch2a"
	top: "res4a_branch2a"
	name: "res4a_branch2a_relu"
	type: "ReLU"
}

layer {
	bottom: "res4a_branch2a"
	top: "res4a_branch2b"
	name: "res4a_branch2b"
	type: "Convolution"
	convolution_param {
		num_output: 256
		kernel_size: 3
		pad: 1
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res4a_branch2b"
	top: "res4a_branch2b"
	name: "bn4a_branch2b"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res4a_branch2b"
	top: "res4a_branch2b"
	name: "scale4a_branch2b"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res4a_branch2b"
	top: "res4a_branch2b"
	name: "res4a_branch2b_relu"
	type: "ReLU"
}

layer {
	bottom: "res4a_branch2b"
	top: "res4a_branch2c"
	name: "res4a_branch2c"
	type: "Convolution"
	convolution_param {
		num_output: 1024
		kernel_size: 1
		pad: 0
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res4a_branch2c"
	top: "res4a_branch2c"
	name: "bn4a_branch2c"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res4a_branch2c"
	top: "res4a_branch2c"
	name: "scale4a_branch2c"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res4a_branch1"
	bottom: "res4a_branch2c"
	top: "res4a"
	name: "res4a"
	type: "Eltwise"
}

layer {
	bottom: "res4a"
	top: "res4a"
	name: "res4a_relu"
	type: "ReLU"
}

layer {
	bottom: "res4a"
	top: "res4b_branch2a"
	name: "res4b_branch2a"
	type: "Convolution"
	convolution_param {
		num_output: 256
		kernel_size: 1
		pad: 0
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res4b_branch2a"
	top: "res4b_branch2a"
	name: "bn4b_branch2a"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res4b_branch2a"
	top: "res4b_branch2a"
	name: "scale4b_branch2a"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res4b_branch2a"
	top: "res4b_branch2a"
	name: "res4b_branch2a_relu"
	type: "ReLU"
}

layer {
	bottom: "res4b_branch2a"
	top: "res4b_branch2b"
	name: "res4b_branch2b"
	type: "Convolution"
	convolution_param {
		num_output: 256
		kernel_size: 3
		pad: 1
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res4b_branch2b"
	top: "res4b_branch2b"
	name: "bn4b_branch2b"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res4b_branch2b"
	top: "res4b_branch2b"
	name: "scale4b_branch2b"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res4b_branch2b"
	top: "res4b_branch2b"
	name: "res4b_branch2b_relu"
	type: "ReLU"
}

layer {
	bottom: "res4b_branch2b"
	top: "res4b_branch2c"
	name: "res4b_branch2c"
	type: "Convolution"
	convolution_param {
		num_output: 1024
		kernel_size: 1
		pad: 0
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res4b_branch2c"
	top: "res4b_branch2c"
	name: "bn4b_branch2c"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res4b_branch2c"
	top: "res4b_branch2c"
	name: "scale4b_branch2c"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res4a"
	bottom: "res4b_branch2c"
	top: "res4b"
	name: "res4b"
	type: "Eltwise"
}

layer {
	bottom: "res4b"
	top: "res4b"
	name: "res4b_relu"
	type: "ReLU"
}

layer {
	bottom: "res4b"
	top: "res4c_branch2a"
	name: "res4c_branch2a"
	type: "Convolution"
	convolution_param {
		num_output: 256
		kernel_size: 1
		pad: 0
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res4c_branch2a"
	top: "res4c_branch2a"
	name: "bn4c_branch2a"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res4c_branch2a"
	top: "res4c_branch2a"
	name: "scale4c_branch2a"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res4c_branch2a"
	top: "res4c_branch2a"
	name: "res4c_branch2a_relu"
	type: "ReLU"
}

layer {
	bottom: "res4c_branch2a"
	top: "res4c_branch2b"
	name: "res4c_branch2b"
	type: "Convolution"
	convolution_param {
		num_output: 256
		kernel_size: 3
		pad: 1
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res4c_branch2b"
	top: "res4c_branch2b"
	name: "bn4c_branch2b"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res4c_branch2b"
	top: "res4c_branch2b"
	name: "scale4c_branch2b"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res4c_branch2b"
	top: "res4c_branch2b"
	name: "res4c_branch2b_relu"
	type: "ReLU"
}

layer {
	bottom: "res4c_branch2b"
	top: "res4c_branch2c"
	name: "res4c_branch2c"
	type: "Convolution"
	convolution_param {
		num_output: 1024
		kernel_size: 1
		pad: 0
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res4c_branch2c"
	top: "res4c_branch2c"
	name: "bn4c_branch2c"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res4c_branch2c"
	top: "res4c_branch2c"
	name: "scale4c_branch2c"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res4b"
	bottom: "res4c_branch2c"
	top: "res4c"
	name: "res4c"
	type: "Eltwise"
}

layer {
	bottom: "res4c"
	top: "res4c"
	name: "res4c_relu"
	type: "ReLU"
}

layer {
	bottom: "res4c"
	top: "res4d_branch2a"
	name: "res4d_branch2a"
	type: "Convolution"
	convolution_param {
		num_output: 256
		kernel_size: 1
		pad: 0
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res4d_branch2a"
	top: "res4d_branch2a"
	name: "bn4d_branch2a"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res4d_branch2a"
	top: "res4d_branch2a"
	name: "scale4d_branch2a"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res4d_branch2a"
	top: "res4d_branch2a"
	name: "res4d_branch2a_relu"
	type: "ReLU"
}

layer {
	bottom: "res4d_branch2a"
	top: "res4d_branch2b"
	name: "res4d_branch2b"
	type: "Convolution"
	convolution_param {
		num_output: 256
		kernel_size: 3
		pad: 1
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res4d_branch2b"
	top: "res4d_branch2b"
	name: "bn4d_branch2b"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res4d_branch2b"
	top: "res4d_branch2b"
	name: "scale4d_branch2b"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res4d_branch2b"
	top: "res4d_branch2b"
	name: "res4d_branch2b_relu"
	type: "ReLU"
}

layer {
	bottom: "res4d_branch2b"
	top: "res4d_branch2c"
	name: "res4d_branch2c"
	type: "Convolution"
	convolution_param {
		num_output: 1024
		kernel_size: 1
		pad: 0
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res4d_branch2c"
	top: "res4d_branch2c"
	name: "bn4d_branch2c"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res4d_branch2c"
	top: "res4d_branch2c"
	name: "scale4d_branch2c"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res4c"
	bottom: "res4d_branch2c"
	top: "res4d"
	name: "res4d"
	type: "Eltwise"
}

layer {
	bottom: "res4d"
	top: "res4d"
	name: "res4d_relu"
	type: "ReLU"
}

layer {
	bottom: "res4d"
	top: "res4e_branch2a"
	name: "res4e_branch2a"
	type: "Convolution"
	convolution_param {
		num_output: 256
		kernel_size: 1
		pad: 0
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res4e_branch2a"
	top: "res4e_branch2a"
	name: "bn4e_branch2a"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res4e_branch2a"
	top: "res4e_branch2a"
	name: "scale4e_branch2a"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res4e_branch2a"
	top: "res4e_branch2a"
	name: "res4e_branch2a_relu"
	type: "ReLU"
}

layer {
	bottom: "res4e_branch2a"
	top: "res4e_branch2b"
	name: "res4e_branch2b"
	type: "Convolution"
	convolution_param {
		num_output: 256
		kernel_size: 3
		pad: 1
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res4e_branch2b"
	top: "res4e_branch2b"
	name: "bn4e_branch2b"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res4e_branch2b"
	top: "res4e_branch2b"
	name: "scale4e_branch2b"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res4e_branch2b"
	top: "res4e_branch2b"
	name: "res4e_branch2b_relu"
	type: "ReLU"
}

layer {
	bottom: "res4e_branch2b"
	top: "res4e_branch2c"
	name: "res4e_branch2c"
	type: "Convolution"
	convolution_param {
		num_output: 1024
		kernel_size: 1
		pad: 0
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res4e_branch2c"
	top: "res4e_branch2c"
	name: "bn4e_branch2c"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res4e_branch2c"
	top: "res4e_branch2c"
	name: "scale4e_branch2c"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res4d"
	bottom: "res4e_branch2c"
	top: "res4e"
	name: "res4e"
	type: "Eltwise"
}

layer {
	bottom: "res4e"
	top: "res4e"
	name: "res4e_relu"
	type: "ReLU"
}

layer {
	bottom: "res4e"
	top: "res4f_branch2a"
	name: "res4f_branch2a"
	type: "Convolution"
	convolution_param {
		num_output: 256
		kernel_size: 1
		pad: 0
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res4f_branch2a"
	top: "res4f_branch2a"
	name: "bn4f_branch2a"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res4f_branch2a"
	top: "res4f_branch2a"
	name: "scale4f_branch2a"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res4f_branch2a"
	top: "res4f_branch2a"
	name: "res4f_branch2a_relu"
	type: "ReLU"
}

layer {
	bottom: "res4f_branch2a"
	top: "res4f_branch2b"
	name: "res4f_branch2b"
	type: "Convolution"
	convolution_param {
		num_output: 256
		kernel_size: 3
		pad: 1
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res4f_branch2b"
	top: "res4f_branch2b"
	name: "bn4f_branch2b"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res4f_branch2b"
	top: "res4f_branch2b"
	name: "scale4f_branch2b"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res4f_branch2b"
	top: "res4f_branch2b"
	name: "res4f_branch2b_relu"
	type: "ReLU"
}

layer {
	bottom: "res4f_branch2b"
	top: "res4f_branch2c"
	name: "res4f_branch2c"
	type: "Convolution"
	convolution_param {
		num_output: 1024
		kernel_size: 1
		pad: 0
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res4f_branch2c"
	top: "res4f_branch2c"
	name: "bn4f_branch2c"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res4f_branch2c"
	top: "res4f_branch2c"
	name: "scale4f_branch2c"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res4e"
	bottom: "res4f_branch2c"
	top: "res4f"
	name: "res4f"
	type: "Eltwise"
}

layer {
	bottom: "res4f"
	top: "res4f"
	name: "res4f_relu"
	type: "ReLU"
}

layer {
	bottom: "res4f"
	top: "res5a_branch1"
	name: "res5a_branch1"
	type: "Convolution"
	convolution_param {
		num_output: 2048
		kernel_size: 1
		pad: 0
		stride: 2
		bias_term: false
	}
}

layer {
	bottom: "res5a_branch1"
	top: "res5a_branch1"
	name: "bn5a_branch1"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res5a_branch1"
	top: "res5a_branch1"
	name: "scale5a_branch1"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res4f"
	top: "res5a_branch2a"
	name: "res5a_branch2a"
	type: "Convolution"
	convolution_param {
		num_output: 512
		kernel_size: 1
		pad: 0
		stride: 2
		bias_term: false
	}
}

layer {
	bottom: "res5a_branch2a"
	top: "res5a_branch2a"
	name: "bn5a_branch2a"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res5a_branch2a"
	top: "res5a_branch2a"
	name: "scale5a_branch2a"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res5a_branch2a"
	top: "res5a_branch2a"
	name: "res5a_branch2a_relu"
	type: "ReLU"
}

layer {
	bottom: "res5a_branch2a"
	top: "res5a_branch2b"
	name: "res5a_branch2b"
	type: "Convolution"
	convolution_param {
		num_output: 512
		kernel_size: 3
		pad: 1
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res5a_branch2b"
	top: "res5a_branch2b"
	name: "bn5a_branch2b"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res5a_branch2b"
	top: "res5a_branch2b"
	name: "scale5a_branch2b"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res5a_branch2b"
	top: "res5a_branch2b"
	name: "res5a_branch2b_relu"
	type: "ReLU"
}

layer {
	bottom: "res5a_branch2b"
	top: "res5a_branch2c"
	name: "res5a_branch2c"
	type: "Convolution"
	convolution_param {
		num_output: 2048
		kernel_size: 1
		pad: 0
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res5a_branch2c"
	top: "res5a_branch2c"
	name: "bn5a_branch2c"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res5a_branch2c"
	top: "res5a_branch2c"
	name: "scale5a_branch2c"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res5a_branch1"
	bottom: "res5a_branch2c"
	top: "res5a"
	name: "res5a"
	type: "Eltwise"
}

layer {
	bottom: "res5a"
	top: "res5a"
	name: "res5a_relu"
	type: "ReLU"
}

layer {
	bottom: "res5a"
	top: "res5b_branch2a"
	name: "res5b_branch2a"
	type: "Convolution"
	convolution_param {
		num_output: 512
		kernel_size: 1
		pad: 0
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res5b_branch2a"
	top: "res5b_branch2a"
	name: "bn5b_branch2a"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res5b_branch2a"
	top: "res5b_branch2a"
	name: "scale5b_branch2a"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res5b_branch2a"
	top: "res5b_branch2a"
	name: "res5b_branch2a_relu"
	type: "ReLU"
}

layer {
	bottom: "res5b_branch2a"
	top: "res5b_branch2b"
	name: "res5b_branch2b"
	type: "Convolution"
	convolution_param {
		num_output: 512
		kernel_size: 3
		pad: 1
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res5b_branch2b"
	top: "res5b_branch2b"
	name: "bn5b_branch2b"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res5b_branch2b"
	top: "res5b_branch2b"
	name: "scale5b_branch2b"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res5b_branch2b"
	top: "res5b_branch2b"
	name: "res5b_branch2b_relu"
	type: "ReLU"
}

layer {
	bottom: "res5b_branch2b"
	top: "res5b_branch2c"
	name: "res5b_branch2c"
	type: "Convolution"
	convolution_param {
		num_output: 2048
		kernel_size: 1
		pad: 0
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res5b_branch2c"
	top: "res5b_branch2c"
	name: "bn5b_branch2c"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res5b_branch2c"
	top: "res5b_branch2c"
	name: "scale5b_branch2c"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res5a"
	bottom: "res5b_branch2c"
	top: "res5b"
	name: "res5b"
	type: "Eltwise"
}

layer {
	bottom: "res5b"
	top: "res5b"
	name: "res5b_relu"
	type: "ReLU"
}

layer {
	bottom: "res5b"
	top: "res5c_branch2a"
	name: "res5c_branch2a"
	type: "Convolution"
	convolution_param {
		num_output: 512
		kernel_size: 1
		pad: 0
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res5c_branch2a"
	top: "res5c_branch2a"
	name: "bn5c_branch2a"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res5c_branch2a"
	top: "res5c_branch2a"
	name: "scale5c_branch2a"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res5c_branch2a"
	top: "res5c_branch2a"
	name: "res5c_branch2a_relu"
	type: "ReLU"
}

layer {
	bottom: "res5c_branch2a"
	top: "res5c_branch2b"
	name: "res5c_branch2b"
	type: "Convolution"
	convolution_param {
		num_output: 512
		kernel_size: 3
		pad: 1
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res5c_branch2b"
	top: "res5c_branch2b"
	name: "bn5c_branch2b"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res5c_branch2b"
	top: "res5c_branch2b"
	name: "scale5c_branch2b"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res5c_branch2b"
	top: "res5c_branch2b"
	name: "res5c_branch2b_relu"
	type: "ReLU"
}

layer {
	bottom: "res5c_branch2b"
	top: "res5c_branch2c"
	name: "res5c_branch2c"
	type: "Convolution"
	convolution_param {
		num_output: 2048
		kernel_size: 1
		pad: 0
		stride: 1
		bias_term: false
	}
}

layer {
	bottom: "res5c_branch2c"
	top: "res5c_branch2c"
	name: "bn5c_branch2c"
	type: "BatchNorm"
	batch_norm_param {
		use_global_stats: true
	}
}

layer {
	bottom: "res5c_branch2c"
	top: "res5c_branch2c"
	name: "scale5c_branch2c"
	type: "Scale"
	scale_param {
		bias_term: true
	}
}

layer {
	bottom: "res5b"
	bottom: "res5c_branch2c"
	top: "res5c"
	name: "res5c"
	type: "Eltwise"
}

layer {
	bottom: "res5c"
	top: "res5c"
	name: "res5c_relu"
	type: "ReLU"
}

layer {
	bottom: "res5c"
	top: "pool5"
	name: "pool5"
	type: "Pooling"
	pooling_param {
		kernel_size: 7
		stride: 1
		pool: AVE
	}
}

layer {
	bottom: "pool5"
	top: "fc1000"
	name: "fc1000"
	type: "InnerProduct"
	inner_product_param {
		num_output: 1000
	}
}

layer {
	bottom: "fc1000"
	top: "prob"
	name: "prob"
	type: "Softmax"
}

下面贴出一个非常简单的10层的残差网络，真实环境下请用res50 res152

import tensorflow as tf
from collections import namedtuple
from math import sqrt
import input_data
def conv2d(x, n_filters,
           k_h=5, k_w=5,
           stride_h=2, stride_w=2,
           stddev=0.02,
           activation=lambda x: x,
           bias=True,
           padding='SAME',
           name="Conv2D"):
  
    with tf.variable_scope(name):
        w = tf.get_variable(
            'w', [k_h, k_w, x.get_shape()[-1], n_filters],
            initializer=tf.truncated_normal_initializer(stddev=stddev))
        conv = tf.nn.conv2d(
            x, w, strides=[1, stride_h, stride_w, 1], padding=padding)
        if bias:
            b = tf.get_variable(
                'b', [n_filters],
                initializer=tf.truncated_normal_initializer(stddev=stddev))
            conv = conv + b
        return activation(conv)


def linear(x, n_units, scope=None, stddev=0.02,
           activation=lambda x: x):
    
    shape = x.get_shape().as_list()

    with tf.variable_scope(scope or "Linear"):
        matrix = tf.get_variable("Matrix", [shape[1], n_units], tf.float32,
                                 tf.random_normal_initializer(stddev=stddev))
        return activation(tf.matmul(x, matrix))
# %%
def residual_network(x, n_outputs,
                     activation=tf.nn.relu):
  
    # %%
    LayerBlock = namedtuple(
        'LayerBlock', ['num_repeats', 'num_filters', 'bottleneck_size'])
    blocks = [LayerBlock(3, 128, 32),
              LayerBlock(3, 256, 64),
              LayerBlock(3, 512, 128),
              LayerBlock(3, 1024, 256)]

    # %%
    input_shape = x.get_shape().as_list()
    if len(input_shape) == 2:
        ndim = int(sqrt(input_shape[1]))
        if ndim * ndim != input_shape[1]:
            raise ValueError('input_shape should be square')
        x = tf.reshape(x, [-1, ndim, ndim, 1])

    # %%
    # First convolution expands to 64 channels and downsamples
    net = conv2d(x, 64, k_h=7, k_w=7,
                 name='conv1',
                 activation=activation)

    # %%
    # Max pool and downsampling
    net = tf.nn.max_pool(
        net, [1, 3, 3, 1], strides=[1, 2, 2, 1], padding='SAME')

    # %%
    # Setup first chain of resnets
    net = conv2d(net, blocks[0].num_filters, k_h=1, k_w=1,
                 stride_h=1, stride_w=1, padding='VALID', name='conv2')

    # %%
    # Loop through all res blocks
    for block_i, block in enumerate(blocks):
        for repeat_i in range(block.num_repeats):

            name = 'block_%d/repeat_%d' % (block_i, repeat_i)
            conv = conv2d(net, block.bottleneck_size, k_h=1, k_w=1,
                          padding='VALID', stride_h=1, stride_w=1,
                          activation=activation,
                          name=name + '/conv_in')

            conv = conv2d(conv, block.bottleneck_size, k_h=3, k_w=3,
                          padding='SAME', stride_h=1, stride_w=1,
                          activation=activation,
                          name=name + '/conv_bottleneck')

            conv = conv2d(conv, block.num_filters, k_h=1, k_w=1,
                          padding='VALID', stride_h=1, stride_w=1,
                          activation=activation,
                          name=name + '/conv_out')

            net = conv + net
        try:
            # upscale to the next block size
            next_block = blocks[block_i + 1]
            net = conv2d(net, next_block.num_filters, k_h=1, k_w=1,
                         padding='SAME', stride_h=1, stride_w=1, bias=False,
                         name='block_%d/conv_upscale' % block_i)
        except IndexError:
            pass

    # %%
    net = tf.nn.avg_pool(net,
                         ksize=[1, net.get_shape().as_list()[1],
                                net.get_shape().as_list()[2], 1],
                         strides=[1, 1, 1, 1], padding='VALID')
    net = tf.reshape(
        net,
        [-1, net.get_shape().as_list()[1] *
         net.get_shape().as_list()[2] *
         net.get_shape().as_list()[3]])

    net = linear(net, n_outputs, activation=tf.nn.softmax)

    # %%
    return net


def rsnn():
    """Test the resnet on MNIST."""
    

    mnist = input_data.read_data_sets('/tmp/data/', one_hot=True)
    x = tf.placeholder(tf.float32, [None, 784])
    y = tf.placeholder(tf.float32, [None, 10])
    y_pred = residual_network(x, 10)

    # %% Define loss/eval/training functions
    cross_entropy = -tf.reduce_sum(y * tf.log(y_pred))
    optimizer = tf.train.AdamOptimizer().minimize(cross_entropy)

    # %% Monitor accuracy
    correct_prediction = tf.equal(tf.argmax(y_pred, 1), tf.argmax(y, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, 'float'))

    # %% We now create a new session to actually perform the initialization the
    # variables:
    sess = tf.Session()
    sess.run(tf.initialize_all_variables())

    # %% We'll train in minibatches and report accuracy:
    batch_size = 50
    n_epochs = 5
    for epoch_i in range(n_epochs):
        # Training
        train_accuracy = 0
        for batch_i in range(mnist.train.num_examples // batch_size):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            train_accuracy += sess.run([optimizer, accuracy], feed_dict={
                x: batch_xs, y: batch_ys})[1]
        train_accuracy /= (mnist.train.num_examples // batch_size)

        # Validation
        valid_accuracy = 0
        for batch_i in range(mnist.validation.num_examples // batch_size):
            batch_xs, batch_ys = mnist.validation.next_batch(batch_size)
            valid_accuracy += sess.run(accuracy,
                                       feed_dict={
                                           x: batch_xs,
                                           y: batch_ys
                                       })
        valid_accuracy /= (mnist.validation.num_examples // batch_size)
        print('epoch:', epoch_i, ', train:',
              train_accuracy, ', valid:', valid_accuracy)


if __name__ == '__main__':
    rsnn()