# Torch7 Learning — the nn Package

## Overview

When learning the Torch7 framework, the natural starting point is the basic `torch` package, which my previous two posts, Torch7 Learning (I) and Torch7 Learning (II), covered in some detail. Next up is the neural-network package `nn`. Once you have worked through `nn`, you will be able to read the code of papers built on the Torch framework, and you will already have the basic skills to write your own networks in it.

## Goal

After reading this post you should be able to run the simplest "automatic" training loop. (⊙o⊙)… saying it like that makes me sound like a textbook author or a teacher.

Of course, the above only covers how to build a network; how do you train it? First you need a loss function (a criterion), and then a trainer that uses it to update the network's parameters.

## Meet Module

What is a "module"? Think of a neural network as being assembled from many small blocks. Some blocks have parameters, such as `Linear` or convolution modules; others do not, such as `Max` or pooling modules. Every block has an input and an output. For the parameterized blocks we can compute `dLoss_dParams` — note that `dLoss_dParams` actually covers two gradients, `dLoss_dWeight` and `dLoss_dBias`. Parameter-free blocks have no `dLoss_dParams`, but both kinds have `dLoss_dInput` and `dLoss_dOutput`. This is quite similar in spirit to MatConvNet; if this is unclear, see "Notes on MatConvNet (I) — Overview".

`Module` is an abstract class with four main methods — `forward(input)`, `backward(input, gradOutput)`, `zeroGradParameters()`, and `updateParameters(learningRate)` — and only two member variables: `output` and `gradInput`.

## The simplest automatic (SGD) training

This kind of training is very easy, but it places some requirements on the dataset. Once those are met, you can use the simplest automatic mode:

```lua
criterion = nn.MSECriterion()
trainer = nn.StochasticGradient(mlp, criterion)
-- set the trainer's params
trainer.learningRate = 0.001
trainer.maxIteration = 5
-- train
trainer:train(dataset)
```
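The four methods and two state variables described above can be sketched with a `Linear` module. This is a minimal illustration, not part of the post's training script; the random tensors stand in for real data and gradients:

```lua
require 'nn'

-- a parameterized module: Linear carries weight and bias gradients
local m = nn.Linear(10, 3)
local input = torch.randn(10)

local output = m:forward(input)        -- fills m.output (dimension 3)
local gradOutput = torch.randn(3)      -- stand-in for dLoss_dOutput from the layer above
local gradInput = m:backward(input, gradOutput)
                                       -- fills m.gradInput (dLoss_dInput) and
                                       -- accumulates dLoss_dWeight / dLoss_dBias
m:updateParameters(0.01)               -- params = params - 0.01 * accumulated grads
m:zeroGradParameters()                 -- clear the accumulators before the next sample
```

A parameter-free module such as `nn.ReLU()` supports the same `forward`/`backward` calls; for it, `updateParameters` and `zeroGradParameters` are simply no-ops.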
## Training a neural network in five steps!

```lua
require 'nn'
require 'cunn'
require 'torch'
require 'cutorch'

-- load data
trainset = torch.load('cifar10-train.t7')
testset = torch.load('cifar10-test.t7')

setmetatable(trainset,
    {__index = function(t, i)
        return {t.data[i], t.label[i]}
    end}
);
function trainset:size()
    return self.data:size(1)
end

-- transfer data to GPU
-- Lua cannot handle ByteTensor
trainset.data = trainset.data:cuda()
trainset.label = trainset.label:cuda()

-- normalize
meanv = {}
stdv = {}
for i = 1, 3 do
    meanv[i] = trainset.data[{ {}, {i}, {}, {} }]:mean()
    trainset.data[{ {}, {i}, {}, {} }]:add(-meanv[i])
    print('mean' .. i .. '_' .. meanv[i])
    stdv[i] = trainset.data[{ {}, {i}, {}, {} }]:std()
    trainset.data[{ {}, {i}, {}, {} }]:div(stdv[i])
    print('std' .. i .. '_' .. stdv[i])  -- the original printed meanv[i] here by mistake
end

------------------ Define Network ------------------
net = nn.Sequential()
net:add(nn.SpatialConvolution(3, 6, 5, 5)) -- 3 input image channels, 6 output channels, 5x5 convolution kernel
net:add(nn.ReLU())                         -- non-linearity
net:add(nn.SpatialMaxPooling(2, 2, 2, 2))  -- a max-pooling operation over 2x2 windows
net:add(nn.SpatialConvolution(6, 16, 5, 5))
net:add(nn.ReLU())                         -- non-linearity
net:add(nn.SpatialMaxPooling(2, 2, 2, 2))
net:add(nn.View(16*5*5))        -- reshapes a 3D tensor of 16x5x5 into a 1D tensor of 16*5*5
net:add(nn.Linear(16*5*5, 120)) -- fully connected layer (matrix multiplication between input and weights)
net:add(nn.ReLU())              -- non-linearity
net:add(nn.Linear(120, 84))
net:add(nn.ReLU())              -- non-linearity
net:add(nn.Linear(84, 10))      -- 10 is the number of outputs of the network (here, 10 classes)
net:add(nn.LogSoftMax())
print(net)

---------------- Define criterion -----------------
criterion = nn.ClassNLLCriterion()

--------- Transfer net and criterion to GPU -------
net = net:cuda()
criterion = criterion:cuda()

--------- StochasticGradient with Criterion -------
timer = torch.Timer()
trainer = nn.StochasticGradient(net, criterion)
trainer.learningRate = 0.001
trainer.maxIteration = 5
trainer:train(trainset)

---------------------- Test ------------------------
-- First, normalize the testset with the training statistics
for i = 1, 3 do
    testset.data[{ {}, {i}, {}, {} }]:add(-meanv[i])
    testset.data[{ {}, {i}, {}, {} }]:div(stdv[i])
end
testset.data = testset.data:cuda()
testset.label = testset.label:cuda()

-- calculate per-class accuracies
accuracies = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
for i = 1, 10000 do
    predict = net:forward(testset.data[i])  -- the original read from trainset here by mistake
    sampleLabel = testset.label[i]
    local confidences, indice = torch.sort(predict, true)
    if indice[1] == sampleLabel then
        accuracies[sampleLabel] = accuracies[sampleLabel] + 1
    end
end
-- start, end, step -- different from Matlab, which uses start, step, end
for i = 1, 10, 1 do
    print('accuracies: ' .. i .. '_' .. (accuracies[i]) / 10 .. '%')
end
print('Elapsed time:' .. timer:time().real .. ' seconds')
```

Note: when using the GPU, three things must all be converted: the dataset, the net, and the criterion. Otherwise you will run into problems.
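The per-class loop above can also be condensed into a single overall test accuracy. This is a hedged sketch reusing the `net` and `testset` names from the script above, after the testset has been normalized and moved to the GPU:

```lua
-- overall accuracy over the 10,000 test samples
local correct = 0
for i = 1, 10000 do
    local groundtruth = testset.label[i]
    local prediction = net:forward(testset.data[i])
    local confidences, indices = torch.sort(prediction, true)  -- true = descending
    if indices[1] == groundtruth then
        correct = correct + 1
    end
end
print(string.format('overall accuracy: %.2f%%', 100 * correct / 10000))
```

Sorting the log-probabilities in descending order and taking `indices[1]` gives the predicted class, exactly as in the per-class version.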
The results:

```
mean1_125.83173370361
std1_125.83173370361
mean2_123.26066589355
std2_123.26066589355
mean3_114.0306854248
std3_114.0306854248
nn.Sequential {
  [input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> (9) -> (10) -> (11) -> (12) -> (13) -> output]
  (1): nn.SpatialConvolution(3 -> 6, 5x5)
  (2): nn.ReLU
  (3): nn.SpatialMaxPooling(2x2, 2,2)
  (4): nn.SpatialConvolution(6 -> 16, 5x5)
  (5): nn.ReLU
  (6): nn.SpatialMaxPooling(2x2, 2,2)
  (7): nn.View(400)
  (8): nn.Linear(400 -> 120)
  (9): nn.ReLU
  (10): nn.Linear(120 -> 84)
  (11): nn.ReLU
  (12): nn.Linear(84 -> 10)
  (13): nn.LogSoftMax
}
# StochasticGradient: training
# current error = 2.2138581074953
# current error = 1.8626037351131
# current error = 1.6799513678432
# current error = 1.5545474862456
# current error = 1.4643953397989
# StochasticGradient: you have reached the maximum number of iterations
# training error = 1.4643953397989
accuracies: 1_34.2%
accuracies: 2_67%
accuracies: 3_30.6%
accuracies: 4_36.6%
accuracies: 5_31.6%
accuracies: 6_25%
accuracies: 7_68.2%
accuracies: 8_58%
accuracies: 9_72.7%
accuracies: 10_57.6%
Elapsed time:42.098989009857seconds
```

(Note that in this run each `std` line echoes the corresponding `mean` value, because the script printed `meanv[i]` in the std line instead of `stdv[i]`.)
At this point you should be able to run the simplest automatic SGD training. Later posts will use the `optim` package for a more professional, more customizable way of training.
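As a taste of what that looks like, here is a minimal sketch of a single `optim.sgd` step. It assumes the `net` and `criterion` from this post and a single hypothetical `(input, target)` pair; a real training loop would iterate this over minibatches:

```lua
require 'optim'

-- flatten all learnable parameters and their gradients into two vectors
local params, gradParams = net:getParameters()
local optimState = {learningRate = 0.001}

-- closure evaluating loss and gradient at the current parameters
local function feval(p)
    gradParams:zero()                       -- clear accumulated gradients
    local output = net:forward(input)
    local loss = criterion:forward(output, target)
    local gradOutput = criterion:backward(output, target)
    net:backward(input, gradOutput)         -- fills gradParams
    return loss, gradParams
end

optim.sgd(feval, params, optimState)        -- one SGD update in place
```

Unlike `nn.StochasticGradient`, this gives you full control over batching, learning-rate schedules, and the choice of optimizer (`optim.adam`, `optim.adagrad`, and so on).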