How to Recognize Handwritten Digits (MNIST) with a Convolutional Neural Network (CNN)?
A few days ago I used a CNN to recognize the handwritten digit set, and then noticed Kaggle has a competition for it that has been running for over a year, with 1179 valid submissions so far and a top score of 100%. I gave it a try with Keras. Starting with the simplest MLP I got 98.19% accuracy, then kept improving until I reached 99.78%. Seeing that first place is 100% broke my heart a little, so I made yet another revision. Here I record my best results so far; I'll update this post if I improve them.
Everyone should be familiar with the handwritten digit set by now; this program is the "Hello World" of learning a new language, or the "WordCount" of MapReduce :) so I won't introduce it at length. Here is a quick look:
# Author: Charlotte
# Plot mnist dataset
from keras.datasets import mnist
import matplotlib.pyplot as plt
# load the MNIST dataset
(X_train, y_train), (X_test, y_test) = mnist.load_data()
# plot 4 images as gray scale
plt.subplot(221)
plt.imshow(X_train[0], cmap=plt.get_cmap('PuBuGn_r'))
plt.subplot(222)
plt.imshow(X_train[1], cmap=plt.get_cmap('PuBuGn_r'))
plt.subplot(223)
plt.imshow(X_train[2], cmap=plt.get_cmap('PuBuGn_r'))
plt.subplot(224)
plt.imshow(X_train[3], cmap=plt.get_cmap('PuBuGn_r'))
# show the plot
plt.show()
Figure: the first four training digits, plotted in grayscale.
1. The Baseline Version
At first I didn't plan to use a CNN because it is time-consuming; I wanted to see how far simpler algorithms could go. I had previously run classical machine-learning algorithms on this data, and the best was an SVM at 96.8% (default parameters, untuned), so this time I started with a neural network. The baseline version uses a Multi-Layer Perceptron (MLP). The network structure is simple: input ---> hidden ---> output. The hidden layer uses the rectified linear unit (ReLU), and the output layer uses softmax for multi-class classification.
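To make the two activations concrete, here is a minimal NumPy sketch (an illustration, not the Keras implementation) of what ReLU and softmax compute:

```python
import numpy as np

def relu(x):
    # rectified linear unit: keep positive values, zero out the rest
    return np.maximum(0, x)

def softmax(x):
    # subtract the max before exponentiating for numerical stability;
    # outputs are positive and sum to 1, so they act as class probabilities
    e = np.exp(x - np.max(x))
    return e / e.sum()

logits = np.array([2.0, 1.0, -3.0])
hidden = relu(logits)    # array([2., 1., 0.])
probs = softmax(hidden)  # the largest logit gets the largest probability
```

In the MLP below the hidden Dense layer applies ReLU elementwise, and the final Dense layer feeds its outputs through softmax so they can be compared against the one-hot labels.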
Network structure:
Code:
# coding:utf-8
# Baseline MLP for MNIST dataset
import numpy
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.utils import np_utils

seed = 7
numpy.random.seed(seed)
# load data
(X_train, y_train), (X_test, y_test) = mnist.load_data()

num_pixels = X_train.shape[1] * X_train.shape[2]
X_train = X_train.reshape(X_train.shape[0], num_pixels).astype('float32')
X_test = X_test.reshape(X_test.shape[0], num_pixels).astype('float32')

X_train = X_train / 255
X_test = X_test / 255

# one hot encode the outputs
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
num_classes = y_test.shape[1]

# MLP model
def baseline_model():
    model = Sequential()
    model.add(Dense(num_pixels, input_dim=num_pixels, init='normal', activation='relu'))
    model.add(Dense(num_classes, init='normal', activation='softmax'))
    model.summary()
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

# build the model
model = baseline_model()

# Fit the model
model.fit(X_train, y_train, validation_data=(X_test, y_test), nb_epoch=10, batch_size=200, verbose=2)

# Final evaluation
scores = model.evaluate(X_test, y_test, verbose=0)
print("Baseline Error: %.2f%%" % (100 - scores[1] * 100))  # print the error rate
Result:
Layer (type)                     Output Shape          Param #     Connected to
================================================================================
dense_1 (Dense)                  (None, 784)           615440      dense_input_1[0][0]
________________________________________________________________________________
dense_2 (Dense)                  (None, 10)            7850        dense_1[0][0]
================================================================================
Total params: 623290
________________________________________________________________________________
Train on 60000 samples, validate on 10000 samples
Epoch 1/10
3s - loss: 0.2791 - acc: 0.9203 - val_loss: 0.1420 - val_acc: 0.9579
Epoch 2/10
3s - loss: 0.1122 - acc: 0.9679 - val_loss: 0.0992 - val_acc: 0.9699
Epoch 3/10
3s - loss: 0.0724 - acc: 0.9790 - val_loss: 0.0784 - val_acc: 0.9745
Epoch 4/10
3s - loss: 0.0509 - acc: 0.9853 - val_loss: 0.0774 - val_acc: 0.9773
Epoch 5/10
3s - loss: 0.0366 - acc: 0.9898 - val_loss: 0.0626 - val_acc: 0.9794
Epoch 6/10
3s - loss: 0.0265 - acc: 0.9930 - val_loss: 0.0639 - val_acc: 0.9797
Epoch 7/10
3s - loss: 0.0185 - acc: 0.9956 - val_loss: 0.0611 - val_acc: 0.9811
Epoch 8/10
3s - loss: 0.0150 - acc: 0.9967 - val_loss: 0.0616 - val_acc: 0.9816
Epoch 9/10
4s - loss: 0.0107 - acc: 0.9980 - val_loss: 0.0604 - val_acc: 0.9821
Epoch 10/10
4s - loss: 0.0073 - acc: 0.9988 - val_loss: 0.0611 - val_acc: 0.9819
Baseline Error: 1.81%
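As a sanity check on the summary table, the parameter counts can be recomputed by hand: a fully connected (Dense) layer with n_in inputs and n_out units has n_in x n_out weights plus n_out biases.

```python
def dense_params(n_in, n_out):
    # one weight per input-output pair, plus one bias per output unit
    return n_in * n_out + n_out

hidden = dense_params(784, 784)  # dense_1
output = dense_params(784, 10)   # dense_2
print(hidden, output, hidden + output)  # 615440 7850 623290, matching the summary
```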
The result is quite good: 98.19% accuracy, an error rate of only 1.81%, after just ten epochs. At this point I still hadn't thought of using a CNN; instead I wondered whether 100 epochs would do better. So I ran 100 epochs, with this result:
Epoch 100/100
8s - loss: 4.6181e-07 - acc: 1.0000 - val_loss: 0.0982 - val_acc: 0.9854
Baseline Error: 1.46%
As the result shows, 100 epochs only improved accuracy by 0.35% and still didn't break 99% (note that training accuracy hit 1.0000 while validation stalled), so I turned to a CNN.
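Instead of guessing between 10 and 100 epochs, one can watch the validation loss and stop once it stops improving. A pure-Python sketch of this "patience" idea (the loss values below are made up for illustration; Keras offers an EarlyStopping callback for the real thing):

```python
# made-up validation losses, one per epoch
val_losses = [0.142, 0.099, 0.078, 0.077, 0.063, 0.064, 0.061, 0.062, 0.060, 0.061]

patience = 3  # tolerate this many epochs without improvement before stopping
best_epoch, best_loss, waited = 0, float('inf'), 0
for epoch, loss in enumerate(val_losses, start=1):
    if loss < best_loss:
        best_epoch, best_loss, waited = epoch, loss, 0  # new best: reset the counter
    else:
        waited += 1
        if waited >= patience:
            break  # validation loss has stopped improving
```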
2. A Simple CNN
Keras's CNN module is quite complete. Since the focus here is on results, I won't expand on CNN fundamentals.
Network structure:
Code:
# coding:utf-8
# Simple CNN
import numpy
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import Flatten
from keras.layers.convolutional import Convolution2D
from keras.layers.convolutional import MaxPooling2D
from keras.utils import np_utils

seed = 7
numpy.random.seed(seed)

# load data
(X_train, y_train), (X_test, y_test) = mnist.load_data()
# reshape to be [samples][channels][width][height]
X_train = X_train.reshape(X_train.shape[0], 1, 28, 28).astype('float32')
X_test = X_test.reshape(X_test.shape[0], 1, 28, 28).astype('float32')

# normalize inputs from 0-255 to 0-1
X_train = X_train / 255
X_test = X_test / 255

# one hot encode outputs
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
num_classes = y_test.shape[1]

# define a simple CNN model
def baseline_model():
    # create model
    model = Sequential()
    model.add(Convolution2D(32, 5, 5, border_mode='valid', input_shape=(1, 28, 28), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.2))
    model.add(Flatten())
    model.add(Dense(128, activation='relu'))
    model.add(Dense(num_classes, activation='softmax'))
    # Compile model
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

# build the model
model = baseline_model()

# Fit the model
model.fit(X_train, y_train, validation_data=(X_test, y_test), nb_epoch=10, batch_size=128, verbose=2)

# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("CNN Error: %.2f%%" % (100 - scores[1] * 100))
Result:
________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to
================================================================================
convolution2d_1 (Convolution2D)  (None, 32, 24, 24)    832         convolution2d_input_1[0][0]
________________________________________________________________________________
maxpooling2d_1 (MaxPooling2D)    (None, 32, 12, 12)    0           convolution2d_1[0][0]
________________________________________________________________________________
dropout_1 (Dropout)              (None, 32, 12, 12)    0           maxpooling2d_1[0][0]
________________________________________________________________________________
flatten_1 (Flatten)              (None, 4608)          0           dropout_1[0][0]
________________________________________________________________________________
dense_1 (Dense)                  (None, 128)           589952      flatten_1[0][0]
________________________________________________________________________________
dense_2 (Dense)                  (None, 10)            1290        dense_1[0][0]
================================================================================
Total params: 592074
________________________________________________________________________________
Train on 60000 samples, validate on 10000 samples
Epoch 1/10
32s - loss: 0.2412 - acc: 0.9318 - val_loss: 0.0754 - val_acc: 0.9766
Epoch 2/10
32s - loss: 0.0726 - acc: 0.9781 - val_loss: 0.0534 - val_acc: 0.9829
Epoch 3/10
32s - loss: 0.0497 - acc: 0.9852 - val_loss: 0.0391 - val_acc: 0.9858
Epoch 4/10
32s - loss: 0.0413 - acc: 0.9870 - val_loss: 0.0432 - val_acc: 0.9854
Epoch 5/10
34s - loss: 0.0323 - acc: 0.9897 - val_loss: 0.0375 - val_acc: 0.9869
Epoch 6/10
36s - loss: 0.0281 - acc: 0.9909 - val_loss: 0.0424 - val_acc: 0.9864
Epoch 7/10
36s - loss: 0.0223 - acc: 0.9930 - val_loss: 0.0328 - val_acc: 0.9893
Epoch 8/10
36s - loss: 0.0198 - acc: 0.9939 - val_loss: 0.0381 - val_acc: 0.9880
Epoch 9/10
36s - loss: 0.0156 - acc: 0.9954 - val_loss: 0.0347 - val_acc: 0.9884
Epoch 10/10
36s - loss: 0.0141 - acc: 0.9955 - val_loss: 0.0318 - val_acc: 0.9893
CNN Error: 1.07%
In these logs, loss and acc are the training-set results, while val_loss and val_acc are the validation-set results. The outcome is good: a 1.07% error rate, a 0.39% improvement over the 100-epoch MLP (1.46%). This CNN is still fairly simple, though; if we add a few more layers and make it a bit more complex, can the result improve further?
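The shapes in the summary above follow from simple arithmetic: a 5x5 'valid' convolution shrinks each spatial side by 4, and 2x2 max pooling halves it. A quick check:

```python
def valid_conv(size, kernel):
    # 'valid' mode: no padding, so the output shrinks by kernel - 1
    return size - kernel + 1

def max_pool(size, stride=2):
    return size // stride

side = valid_conv(28, 5)  # 28 -> 24, as in convolution2d_1's output
side = max_pool(side)     # 24 -> 12, as in maxpooling2d_1's output
flat = 32 * side * side   # 32 feature maps of 12x12 -> 4608 flattened features
print(side, flat)         # 12 4608
```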
3. Larger CNN
This time I added a few more convolutional layers. Code:
# Larger CNN
import numpy
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import Flatten
from keras.layers.convolutional import Convolution2D
from keras.layers.convolutional import MaxPooling2D
from keras.utils import np_utils

seed = 7
numpy.random.seed(seed)
# load data
(X_train, y_train), (X_test, y_test) = mnist.load_data()
# reshape to be [samples][channels][width][height]
X_train = X_train.reshape(X_train.shape[0], 1, 28, 28).astype('float32')
X_test = X_test.reshape(X_test.shape[0], 1, 28, 28).astype('float32')
# normalize inputs from 0-255 to 0-1
X_train = X_train / 255
X_test = X_test / 255
# one hot encode outputs
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
num_classes = y_test.shape[1]
# define the larger model
def larger_model():
    # create model
    model = Sequential()
    model.add(Convolution2D(30, 5, 5, border_mode='valid', input_shape=(1, 28, 28), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Convolution2D(15, 3, 3, activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.2))
    model.add(Flatten())
    model.add(Dense(128, activation='relu'))
    model.add(Dense(50, activation='relu'))
    model.add(Dense(num_classes, activation='softmax'))
    # Compile model
    model.summary()
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
# build the model
model = larger_model()
# Fit the model (10 epochs, matching the logs below)
model.fit(X_train, y_train, validation_data=(X_test, y_test), nb_epoch=10, batch_size=200, verbose=2)
# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("Large CNN Error: %.2f%%" % (100 - scores[1] * 100))
Result:
________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to
================================================================================
convolution2d_1 (Convolution2D)  (None, 30, 24, 24)    780         convolution2d_input_1[0][0]
________________________________________________________________________________
maxpooling2d_1 (MaxPooling2D)    (None, 30, 12, 12)    0           convolution2d_1[0][0]
________________________________________________________________________________
convolution2d_2 (Convolution2D)  (None, 15, 10, 10)    4065        maxpooling2d_1[0][0]
________________________________________________________________________________
maxpooling2d_2 (MaxPooling2D)    (None, 15, 5, 5)      0           convolution2d_2[0][0]
________________________________________________________________________________
dropout_1 (Dropout)              (None, 15, 5, 5)      0           maxpooling2d_2[0][0]
________________________________________________________________________________
flatten_1 (Flatten)              (None, 375)           0           dropout_1[0][0]
________________________________________________________________________________
dense_1 (Dense)                  (None, 128)           48128       flatten_1[0][0]
________________________________________________________________________________
dense_2 (Dense)                  (None, 50)            6450        dense_1[0][0]
________________________________________________________________________________
dense_3 (Dense)                  (None, 10)            510         dense_2[0][0]
================================================================================
Total params: 59933
________________________________________________________________________________
Train on 60000 samples, validate on 10000 samples
Epoch 1/10
34s - loss: 0.3789 - acc: 0.8796 - val_loss: 0.0811 - val_acc: 0.9742
Epoch 2/10
34s - loss: 0.0929 - acc: 0.9710 - val_loss: 0.0462 - val_acc: 0.9854
Epoch 3/10
35s - loss: 0.0684 - acc: 0.9786 - val_loss: 0.0376 - val_acc: 0.9869
Epoch 4/10
35s - loss: 0.0546 - acc: 0.9826 - val_loss: 0.0332 - val_acc: 0.9890
Epoch 5/10
35s - loss: 0.0467 - acc: 0.9856 - val_loss: 0.0289 - val_acc: 0.9897
Epoch 6/10
35s - loss: 0.0402 - acc: 0.9873 - val_loss: 0.0291 - val_acc: 0.9902
Epoch 7/10
34s - loss: 0.0369 - acc: 0.9880 - val_loss: 0.0233 - val_acc: 0.9924
Epoch 8/10
36s - loss: 0.0336 - acc: 0.9894 - val_loss: 0.0258 - val_acc: 0.9913
Epoch 9/10
39s - loss: 0.0317 - acc: 0.9899 - val_loss: 0.0219 - val_acc: 0.9926
Epoch 10/10
40s - loss: 0.0268 - acc: 0.9916 - val_loss: 0.0220 - val_acc: 0.9919
Large CNN Error: 0.81%
Not bad; accuracy is now 99.19%.
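Interestingly, this deeper CNN not only beats the baseline MLP on accuracy but also uses far fewer parameters: 59,933 versus the MLP's 623,290. The counts in the summary can be reproduced from the layer shapes (a conv layer has filters x (kh x kw x in_channels) weights plus one bias per filter):

```python
def conv_params(filters, kh, kw, in_channels):
    # one kernel per (filter, input channel) pair, plus one bias per filter
    return filters * (kh * kw * in_channels) + filters

def dense_params(n_in, n_out):
    return n_in * n_out + n_out

total = (conv_params(30, 5, 5, 1)     # convolution2d_1: 780
         + conv_params(15, 3, 3, 30)  # convolution2d_2: 4065
         + dense_params(375, 128)     # dense_1: 48128
         + dense_params(128, 50)      # dense_2: 6450
         + dense_params(50, 10))      # dense_3: 510
print(total)                          # 59933, matching the summary
```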
4. The Final Version
The network structure is unchanged; I only added dropout after each layer, and the result improved markedly. At first I ran 500 epochs, which nearly killed my machine and overfit badly; since the results were already very good by epoch 69, I settled on 69 epochs.
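What Dropout(0.4) does during training is zero out 40% of activations at random and rescale the survivors so the expected activation is unchanged ("inverted dropout"); at test time all units are kept. A NumPy sketch of the mechanism (an illustration of the idea, not Keras's actual code):

```python
import numpy as np

rng = np.random.RandomState(7)

def dropout(x, rate, training=True):
    if not training:
        # test time: keep every unit, no rescaling needed
        return x
    keep_prob = 1.0 - rate
    mask = rng.binomial(1, keep_prob, size=x.shape)  # 1 = keep, 0 = drop
    return x * mask / keep_prob  # rescale so the expected output equals the input

acts = np.ones(1000)
dropped = dropout(acts, rate=0.4)  # ~40% zeros, survivors scaled to 1/0.6
```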
# Larger CNN for the MNIST Dataset
import numpy
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import Flatten
from keras.layers.convolutional import Convolution2D
from keras.layers.convolutional import MaxPooling2D
from keras.utils import np_utils
import matplotlib.pyplot as plt
from keras.constraints import maxnorm
from keras.optimizers import SGD
# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)
# load data
(X_train, y_train), (X_test, y_test) = mnist.load_data()
# reshape to be [samples][channels][width][height]
X_train = X_train.reshape(X_train.shape[0], 1, 28, 28).astype('float32')
X_test = X_test.reshape(X_test.shape[0], 1, 28, 28).astype('float32')
# normalize inputs from 0-255 to 0-1
X_train = X_train / 255
X_test = X_test / 255
# one hot encode outputs
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
num_classes = y_test.shape[1]

# define the larger model, now with dropout after every layer
def larger_model():
    # create model
    model = Sequential()
    model.add(Convolution2D(30, 5, 5, border_mode='valid', input_shape=(1, 28, 28), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.4))
    model.add(Convolution2D(15, 3, 3, activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.4))
    model.add(Flatten())
    model.add(Dense(128, activation='relu'))
    model.add(Dropout(0.4))
    model.add(Dense(50, activation='relu'))
    model.add(Dropout(0.4))
    model.add(Dense(num_classes, activation='softmax'))
    # Compile model
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

# build the model
model = larger_model()
# Fit the model (69 epochs, matching the logs below)
model.fit(X_train, y_train, validation_data=(X_test, y_test), nb_epoch=69, batch_size=200, verbose=2)
# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("Large CNN Error: %.2f%%" % (100 - scores[1] * 100))
Result:
________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to
================================================================================
convolution2d_1 (Convolution2D)  (None, 30, 24, 24)    780         convolution2d_input_1[0][0]
________________________________________________________________________________
maxpooling2d_1 (MaxPooling2D)    (None, 30, 12, 12)    0           convolution2d_1[0][0]
________________________________________________________________________________
convolution2d_2 (Convolution2D)  (None, 15, 10, 10)    4065        maxpooling2d_1[0][0]
________________________________________________________________________________
maxpooling2d_2 (MaxPooling2D)    (None, 15, 5, 5)      0           convolution2d_2[0][0]
________________________________________________________________________________
dropout_1 (Dropout)              (None, 15, 5, 5)      0           maxpooling2d_2[0][0]
________________________________________________________________________________
flatten_1 (Flatten)              (None, 375)           0           dropout_1[0][0]
________________________________________________________________________________
dense_1 (Dense)                  (None, 128)           48128       flatten_1[0][0]
________________________________________________________________________________
dense_2 (Dense)                  (None, 50)            6450        dense_1[0][0]
________________________________________________________________________________
dense_3 (Dense)                  (None, 10)            510         dense_2[0][0]
================================================================================
Total params: 59933
________________________________________________________________________________
Train on 60000 samples, validate on 10000 samples
Epoch 1/69
34s - loss: 0.4248 - acc: 0.8619 - val_loss: 0.0832 - val_acc: 0.9746
Epoch 2/69
35s - loss: 0.1147 - acc: 0.9638 - val_loss: 0.0518 - val_acc: 0.9831
Epoch 3/69
35s - loss: 0.0887 - acc: 0.9719 - val_loss: 0.0452 - val_acc: 0.9855
...
Epoch 66/69
38s - loss: 0.0134 - acc: 0.9955 - val_loss: 0.0211 - val_acc: 0.9943
Epoch 67/69
38s - loss: 0.0114 - acc: 0.9960 - val_loss: 0.0171 - val_acc: 0.9950
Epoch 68/69
38s - loss: 0.0116 - acc: 0.9959 - val_loss: 0.0192 - val_acc: 0.9956
Epoch 69/69
38s - loss: 0.0132 - acc: 0.9969 - val_loss: 0.0188 - val_acc: 0.9978
Large CNN Error: 0.22%

real    41m47.350s
user    157m51.145s
sys     6m5.829s
This is the best result so far: 99.78%. There is still plenty of room for improvement; I'll update this post when the accuracy goes up.
Summary:
1. CNNs really do have an edge over a plain MLP for image recognition, and over classical machine-learning algorithms as well (though random forests have also achieved very good results).
2. Deepening the network, i.e. adding a few more convolutional layers, helps accuracy, but it also slows training down considerably.
3. Adding dropout in the right places improves accuracy.
4. As for the best activation function: fine, I'll just say it, pick ReLU. Its ability to mitigate vanishing gradients is reason enough, and it also trains fast; I'll cover its other advantages in a dedicated post.
5. More epochs is not always better; too many can overfit. Plot your own convergence curve (in Keras, via the history returned by fit) to see whether the algorithm converges or diverges.
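On point 5: in Keras, history = model.fit(...) returns a History object whose history['loss'] and history['val_loss'] lists are exactly these curves. A minimal sketch of reading such a curve without plotting (the loss values below are made up for illustration): the model starts overfitting once training loss keeps falling while validation loss rises.

```python
# made-up loss curves for illustration; in Keras these would come from
# history.history['loss'] and history.history['val_loss']
train_loss = [0.38, 0.09, 0.07, 0.05, 0.04, 0.03, 0.02, 0.015, 0.012, 0.010]
val_loss   = [0.08, 0.05, 0.04, 0.033, 0.030, 0.029, 0.031, 0.034, 0.036, 0.040]

# epoch (1-based) with the lowest validation loss -- the point to stop at
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1
# if the final validation loss sits above the minimum, training ran too long
overfitting = val_loss[-1] > min(val_loss)
print(best_epoch, overfitting)  # 6 True
```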