Caffe Example 10: LeNet Through the Python Interface

mscdj 2016-10-09


This example uses Python to write the prototxt files for a Caffe model. First load the required modules, changing the path to match your own setup:

import os
os.chdir('/home/lml/caffe-master/')
import sys
sys.path.insert(0, './python')
import caffe

from pylab import *
%matplotlib inline

Next, run the LeNet example that ships with Caffe (make sure the data has been downloaded and converted to the right format):

# Download and prepare data
!data/mnist/get_mnist.sh
!examples/mnist/create_mnist.sh

Downloading...
--2015-09-21 09:11:03-- http://yann./exdb/mnist/train-images-idx3-ubyte.gz
Resolving yann. (yann.)... 128.122.47.89
Connecting to yann. (yann.)|128.122.47.89|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 9912422 (9.5M) [application/x-gzip]
Saving to: "train-images-idx3-ubyte.gz"
100%[======================================>] 9,912,422 1.39MB/s in 12s
2015-09-21 09:11:16 (775 KB/s) - "train-images-idx3-ubyte.gz" saved [9912422/9912422]
--2015-09-21 09:11:16-- http://yann./exdb/mnist/train-labels-idx1-ubyte.gz
Resolving yann. (yann.)... 128.122.47.89
Connecting to yann. (yann.)|128.122.47.89|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 28881 (28K) [application/x-gzip]
Saving to: "train-labels-idx1-ubyte.gz"
100%[======================================>] 28,881 47.4KB/s in 0.6s
2015-09-21 09:11:18 (47.4 KB/s) - "train-labels-idx1-ubyte.gz" saved [28881/28881]
--2015-09-21 09:11:18-- http://yann./exdb/mnist/t10k-images-idx3-ubyte.gz
Resolving yann. (yann.)... 128.122.47.89
Connecting to yann. (yann.)|128.122.47.89|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1648877 (1.6M) [application/x-gzip]
Saving to: "t10k-images-idx3-ubyte.gz"
100%[======================================>] 1,648,877 452KB/s in 3.6s
2015-09-21 09:11:22 (452 KB/s) - "t10k-images-idx3-ubyte.gz" saved [1648877/1648877]
--2015-09-21 09:11:22-- http://yann./exdb/mnist/t10k-labels-idx1-ubyte.gz
Resolving yann. (yann.)... 128.122.47.89
Connecting to yann. (yann.)|128.122.47.89|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4542 (4.4K) [application/x-gzip]
Saving to: "t10k-labels-idx1-ubyte.gz"
100%[======================================>] 4,542 13.2KB/s in 0.3s
2015-09-21 09:11:23 (13.2 KB/s) - "t10k-labels-idx1-ubyte.gz" saved [4542/4542]
Unzipping... Done.
Creating lmdb... Done.

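Two more files are needed:
- the net prototxt, which defines the network architecture and points to the corresponding training/test data
- the solver prototxt, which defines the learning parameters

Start with the network, writing it in Python code that emits Caffe's protobuf model format. This network reads pre-generated data in LMDB format; it could also read directly from an ndarray using MemoryDataLayer.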

from caffe import layers as L
from caffe import params as P

def lenet(lmdb, batch_size):
    # our version of LeNet: a series of linear and simple nonlinear transformations
    n = caffe.NetSpec()
    n.data, n.label = L.Data(batch_size=batch_size, backend=P.Data.LMDB, source=lmdb,
                             transform_param=dict(scale=1./255), ntop=2)
    n.conv1 = L.Convolution(n.data, kernel_size=5, num_output=20, weight_filler=dict(type='xavier'))
    n.pool1 = L.Pooling(n.conv1, kernel_size=2, stride=2, pool=P.Pooling.MAX)
    n.conv2 = L.Convolution(n.pool1, kernel_size=5, num_output=50, weight_filler=dict(type='xavier'))
    n.pool2 = L.Pooling(n.conv2, kernel_size=2, stride=2, pool=P.Pooling.MAX)
    n.ip1 = L.InnerProduct(n.pool2, num_output=500, weight_filler=dict(type='xavier'))
    n.relu1 = L.ReLU(n.ip1, in_place=True)
    n.ip2 = L.InnerProduct(n.relu1, num_output=10, weight_filler=dict(type='xavier'))
    n.loss = L.SoftmaxWithLoss(n.ip2, n.label)
    return n.to_proto()

with open('examples/mnist/lenet_auto_train.prototxt', 'w') as f:
    f.write(str(lenet('examples/mnist/mnist_train_lmdb', 64)))

with open('examples/mnist/lenet_auto_test.prototxt', 'w') as f:
    f.write(str(lenet('examples/mnist/mnist_test_lmdb', 100)))

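The network is written to disk using Google's protobuf library. The resulting model definition is somewhat verbose, but it is human-readable and can be read, written, and modified directly. Here is the automatically generated training net: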

!cat examples/mnist/lenet_auto_train.prototxt

layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  transform_param {
    scale: 0.00392156862745
  }
  data_param {
    source: "examples/mnist/mnist_train_lmdb"
    batch_size: 64
    backend: LMDB
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 20
    kernel_size: 5
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  convolution_param {
    num_output: 50
    kernel_size: 5
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool2"
  top: "ip1"
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
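
Because the generated file is plain text, it can be edited with ordinary string tools. The following is a minimal sketch, not part of the original example: it changes `batch_size` in a prototxt fragment with a regular expression. The helper name `set_batch_size` is hypothetical. For anything beyond a quick edit, parsing the file into `caffe_pb2.NetParameter` with `google.protobuf.text_format` is likely the more robust route.

```python
import re

# A fragment copied from the generated training net shown above.
PROTOTXT = '''layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  data_param {
    source: "examples/mnist/mnist_train_lmdb"
    batch_size: 64
    backend: LMDB
  }
}
'''


def set_batch_size(text, new_size):
    """Return `text` with every `batch_size: N` entry set to `new_size`.

    Hypothetical helper for illustration only: it works on the raw text
    and does not validate the result against Caffe's schema.
    """
    return re.sub(r'(batch_size:\s*)\d+', r'\g<1>%d' % new_size, text)


edited = set_batch_size(PROTOTXT, 32)
print(edited)
```

Everything except the `batch_size` line is left untouched, so the edited text can be written back to disk in the same way as the generated file.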

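Now look at the learning parameters, which are also stored in a prototxt file. The solver uses SGD (stochastic gradient descent) with momentum, weight decay, and a specific learning rate schedule: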

!cat examples/mnist/lenet_auto_solver.prototxt

# The train/test net protocol buffer definition
train_net: "examples/mnist/lenet_auto_train.prototxt"
test_net: "examples/mnist/lenet_auto_test.prototxt"
# test_iter specifies how many forward passes the test should carry out.
# In the case of MNIST, we have test batch size 100 and 100 test iterations,
# covering the full 10,000 testing images.
test_iter: 100
# Carry out testing every 500 training iterations.
test_interval: 500
# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
# The learning rate policy
lr_policy: "inv"
gamma: 0.0001
power: 0.75
# Display every 100 iterations
display: 100
# The maximum number of iterations
max_iter: 10000
# snapshot intermediate results
snapshot: 5000
snapshot_prefix: "examples/mnist/lenet"
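
To get a feel for the `inv` policy, the sketch below evaluates the learning rate at a few iterations. It assumes the formula given in the comments of Caffe's `caffe.proto`, namely `base_lr * (1 + gamma * iter) ^ (-power)`. The function name `inv_lr` is made up for this illustration.

```python
def inv_lr(iteration, base_lr=0.01, gamma=0.0001, power=0.75):
    """Learning rate under the "inv" policy (assumed formula, see above).

    Defaults are the values from lenet_auto_solver.prototxt.
    """
    return base_lr * (1.0 + gamma * iteration) ** (-power)


for it in (0, 100, 1000, 5000, 10000):
    print('iter %5d  lr = %.6f' % (it, inv_lr(it)))
```

With these settings the rate decays slowly: by iteration 10000 it is `0.01 * 2 ** -0.75`, roughly 0.0059.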

Select the GPU and load the solver. SGD is used here, but Adagrad and Nesterov's accelerated gradient are also available:

caffe.set_device(0)
caffe.set_mode_gpu()
solver = caffe.SGDSolver('examples/mnist/lenet_auto_solver.prototxt')

Inspect the dimensions of the intermediate features (blobs) and the parameters (params) to get a better feel for the network structure:

# each output is (batch size, feature dim, spatial dim)
[(k, v.data.shape) for k, v in solver.net.blobs.items()]

[('data', (64, 1, 28, 28)),
 ('label', (64,)),
 ('conv1', (64, 20, 24, 24)),
 ('pool1', (64, 20, 12, 12)),
 ('conv2', (64, 50, 8, 8)),
 ('pool2', (64, 50, 4, 4)),
 ('ip1', (64, 500)),
 ('ip2', (64, 10)),
 ('loss', ())]

# just print the weight sizes (not biases)
[(k, v[0].data.shape) for k, v in solver.net.params.items()]

[('conv1', (20, 1, 5, 5)),
 ('conv2', (50, 20, 5, 5)),
 ('ip1', (500, 800)),
 ('ip2', (10, 500))]
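
The blob and weight shapes listed above follow from simple arithmetic on the layer settings. The sketch below reproduces them in plain Python. The helper names `conv_out` and `pool_out` are hypothetical, and the pooling helper assumes sizes that divide evenly, which holds for this network.

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution layer."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel, stride):
    """Spatial output size of a pooling layer.

    Assumption: the input divides evenly, as it does here; Caffe rounds
    up otherwise, which this sketch does not model.
    """
    return (size - kernel) // stride + 1


shapes = {}
s = 28                                   # MNIST images are 28x28
s = conv_out(s, kernel=5)
shapes['conv1'] = (20, s, s)
s = pool_out(s, kernel=2, stride=2)
shapes['pool1'] = (20, s, s)
s = conv_out(s, kernel=5)
shapes['conv2'] = (50, s, s)
s = pool_out(s, kernel=2, stride=2)
shapes['pool2'] = (50, s, s)

# ip1 sees pool2 flattened, which is the second dimension of its weights.
ip1_inputs = 50 * s * s

for name in ('conv1', 'pool1', 'conv2', 'pool2'):
    print(name, shapes[name])
print('ip1 inputs:', ip1_inputs)
```

The value of `ip1_inputs` matches the 800 in the `ip1` weight shape `(500, 800)`.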

Before training, check that everything has loaded as expected. Run a forward pass on the training and test nets to make sure they contain data:

solver.net.forward()  # train net
solver.test_nets[0].forward()  # test net (there can be more than one)

{'loss': array(2.394181489944458, dtype=float32)}

Display the first eight training images and their labels:

# we use a little trick to tile the first eight images
imshow(solver.net.blobs['data'].data[:8, 0].transpose(1, 0, 2).reshape(28, 8*28), cmap='gray')
print(solver.net.blobs['label'].data[:8])

[ 5. 0. 4. 1. 9. 2. 1. 3.]

[Figure: the first eight training images tiled in a row]
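
The tiling trick used for the plot is easier to follow on a tiny array. This sketch assumes NumPy is available and uses eight made-up 2×3 "images" in place of the MNIST data; it checks that the transpose-and-reshape gives the same result as placing the images side by side.

```python
import numpy as np

# Stand-in for solver.net.blobs['data'].data[:8, 0]: eight tiny 2x3 "images".
images = np.arange(8 * 2 * 3).reshape(8, 2, 3)

# Reorder axes to (row, image, column), then merge the image and column axes
# so each output row holds that row of all eight images in sequence.
tiled = images.transpose(1, 0, 2).reshape(2, 8 * 3)

# The same layout built explicitly by stacking the images horizontally.
side_by_side = np.hstack(list(images))

print(tiled.shape)
print(np.array_equal(tiled, side_by_side))
```

For MNIST the same steps turn a `(8, 28, 28)` array into a single `(28, 8*28)` image.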

Now display the first eight test images and their labels:

imshow(solver.test_nets[0].blobs['data'].data[:8, 0].transpose(1, 0, 2).reshape(28, 8*28), cmap='gray')
print(solver.test_nets[0].blobs['label'].data[:8])

[ 7. 2. 1. 0. 4. 1. 4. 9.]

[Figure: the first eight test images tiled in a row]

Both the training and test data appear to have loaded correctly, and the labels match the images. Now run a single minibatch step of SGD and see what happens:

solver.step(1)
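
As a rough picture of what one such step does to a weight, here is a minimal sketch of momentum SGD with L2 weight decay for a single scalar weight. It assumes the update rule described in Caffe's solver documentation and ignores per-layer multipliers such as `lr_mult` and `decay_mult`. The function name `sgd_step` and the gradient values are made up.

```python
def sgd_step(weight, grad, velocity, lr=0.01, momentum=0.9, weight_decay=0.0005):
    """One momentum-SGD update with L2 weight decay for a scalar weight.

    Assumed rule (a sketch, not Caffe's actual code):
        v <- momentum * v + lr * (grad + weight_decay * w)
        w <- w - v
    Defaults are base_lr, momentum and weight_decay from the solver file.
    """
    velocity = momentum * velocity + lr * (grad + weight_decay * weight)
    weight = weight - velocity
    return weight, velocity


w, v = 1.0, 0.0
for grad in (0.5, 0.5, 0.5):        # made-up gradient values
    w, v = sgd_step(w, grad, v)
    print('w = %.6f  v = %.6f' % (w, v))
```

With a constant gradient the velocity keeps growing from step to step, which is the effect momentum is meant to have.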

Let's check the updates to the first-layer filters after propagation, shown as a 4×5 grid of the twenty 5×5 filters:

imshow(solver.net.params['conv1'][0].diff[:, 0].reshape(4, 5, 5, 5).transpose(0, 2, 1, 3).reshape(4*5, 5*5), cmap='gray')

[Figure: updates to the 20 first-layer filters, shown as a 4×5 grid]

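Let the network run for a while. This process is the same as training through the caffe binary, but because the loop is controlled from Python, other things can be done along the way, such as defining a custom stopping criterion or changing the solving process by updating the net inside the loop: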

%%time
niter = 200
test_interval = 25
# losses will also be stored in the log
train_loss = zeros(niter)
test_acc = zeros(int(np.ceil(niter / test_interval)))
output = zeros((niter, 8, 10))

# the main solver loop
for it in range(niter):
    solver.step(1)  # SGD by Caffe

    # store the train loss
    train_loss[it] = solver.net.blobs['loss'].data

    # store the output on the first test batch
    # (start the forward pass at conv1 to avoid loading new data)
    solver.test_nets[0].forward(start='conv1')
    output[it] = solver.test_nets[0].blobs['ip2'].data[:8]

    # run a full test every so often
    # (Caffe can also do this for us and write to a log, but we show here
    #  how to do it directly in Python, where more complicated things are easier.)
    if it % test_interval == 0:
        print('Iteration %d testing...' % it)
        correct = 0
        for test_it in range(100):
            solver.test_nets[0].forward()
            correct += sum(solver.test_nets[0].blobs['ip2'].data.argmax(1)
                           == solver.test_nets[0].blobs['label'].data)
        test_acc[it // test_interval] = correct / 1e4

Iteration 0 testing...
Iteration 25 testing...
Iteration 50 testing...
Iteration 75 testing...
Iteration 100 testing...
Iteration 125 testing...
Iteration 150 testing...
Iteration 175 testing...
CPU times: user 2.29 s, sys: 588 ms, total: 2.88 s
Wall time: 2.05 s
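
The custom stopping criterion mentioned before the training loop could look like the following. This is a minimal sketch under stated assumptions: the function name `should_stop`, the `patience` and `min_delta` values, and the accuracy numbers are all made up, and the check would be called right after each full test pass.

```python
def should_stop(acc_history, patience=3, min_delta=0.001):
    """Return True when the last `patience` evaluations did not beat the
    best earlier accuracy by at least `min_delta`."""
    if len(acc_history) <= patience:
        return False
    best_before = max(acc_history[:-patience])
    recent_best = max(acc_history[-patience:])
    return recent_best < best_before + min_delta


# Made-up accuracies standing in for the periodic test results.
fake_test_acc = [0.10, 0.55, 0.80, 0.90, 0.9003, 0.9001, 0.9004, 0.95]

history = []
for step, acc in enumerate(fake_test_acc):
    history.append(acc)
    if should_stop(history):
        print('stopping after evaluation %d' % step)
        break
```

In the real loop the `break` would leave the `for it in range(niter)` loop, so no further `solver.step(1)` calls are made once the test accuracy has plateaued.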

Next, plot the training loss and the test accuracy:

_, ax1 = subplots()
ax2 = ax1.twinx()
ax1.plot(arange(niter), train_loss)
ax2.plot(test_interval * arange(len(test_acc)), test_acc, 'r')
ax1.set_xlabel('iteration')
ax1.set_ylabel('train loss')
ax2.set_ylabel('test accuracy')

[Figure: training loss and test accuracy versus iteration]

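The loss drops quickly and converges (apart from some stochasticity), while the accuracy rises correspondingly. Since the results on the first test batch were saved, we can watch how the prediction scores evolve. The x axis is time (iteration), the y axis is each possible label, and brightness indicates confidence: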

for i in range(8):
    figure(figsize=(2, 2))
    imshow(solver.test_nets[0].blobs['data'].data[i, 0], cmap='gray')
    figure(figsize=(10, 2))
    imshow(output[:50, i].T, interpolation='nearest', cmap='gray')
    xlabel('iteration')
    ylabel('label')

[Figure: raw ip2 scores over iterations for each of the first eight test digits]

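The last digit, a 9, turns out to be the most error-prone and gets confused with 4. Note that these are raw output scores rather than softmax probabilities. The following shows the network's confidence after applying the softmax: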

for i in range(8):
    figure(figsize=(2, 2))
    imshow(solver.test_nets[0].blobs['data'].data[i, 0], cmap='gray')
    figure(figsize=(10, 2))
    imshow(exp(output[:50, i].T) / exp(output[:50, i].T).sum(0), interpolation='nearest', cmap='gray')
    xlabel('iteration')
    ylabel('label')

[Figure: softmax probabilities over iterations for each of the first eight test digits]
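
The plotting code above computes the softmax by dividing `exp` of the scores by its sum, which is fine for moderate values. As a side note, here is a small pure-Python sketch of the same computation with the usual max-subtraction for numerical stability. The score values are made up and the function is an illustration, not something the original example uses.

```python
import math


def softmax(scores):
    """Softmax of a list of raw scores, shifted by the max for stability."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


raw = [1.0, 2.0, 3.0]            # made-up scores for three classes
probs = softmax(raw)
print(probs)

# Scores this large would overflow a direct math.exp(); shifting avoids that,
# and the result is the same because softmax ignores a constant offset.
big = softmax([1000.0, 1001.0, 1002.0])
print(big)
```

Both calls return the same probabilities, since the two score lists differ only by a constant.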