LIBSVM Study Notes (Part 2)

Z2ty6osc12zs6c 2018-05-18


Reference: http://www./thread-12379-1-1.html

The usage notes for LIBSVM (MATLAB version) are in the README file under the \libsvm-3.22\libsvm-3.22\matlab\ directory.

1. Function usage

(1) model = svmtrain(training_label_vector, training_instance_matrix [, 'libsvm_options']);

 
     — training_label_vector:
            An m by 1 vector of training labels (type must be double).
     — training_instance_matrix:
            An m by n matrix of m training instances with n features.
            It can be dense or sparse (type must be double).
     — libsvm_options:
            A string of training options in the same format as that of LIBSVM.
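
Both inputs must be of type double, so data loaded as integers or in single precision needs converting before the call. The following is a minimal sketch of that preparation. The variable names raw_labels and raw_features are made up for illustration, and the svmtrain call is left commented out because it needs the LIBSVM MEX files on the path.

```matlab
% Hypothetical raw data: integer labels and single-precision features.
raw_labels   = int32([1; 1; -1; -1]);
raw_features = single([176 70; 180 80; 161 45; 163 47]);

% The MATLAB interface expects double for both arguments.
training_label_vector    = double(raw_labels);
training_instance_matrix = double(raw_features);

% A sparse matrix is accepted as well.
training_instance_sparse = sparse(training_instance_matrix);

% One label per instance: both inputs must have m rows.
same_rows = (size(training_label_vector, 1) == size(training_instance_matrix, 1));

% model = svmtrain(training_label_vector, training_instance_matrix);
```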

(2)[predicted_label,accuracy,decision_values/prob_estimates]= svmpredict(testing_label_vector, testing_instance_matrix, model [, 'libsvm_options']);
(3)[predicted_label] = svmpredict(testing_label_vector, testing_instance_matrix,model [, 'libsvm_options']);

 
  — testing_label_vector:
            An m by 1 vector of prediction labels. If labels of test
            data are unknown, simply use any random values. (type must be double)
  — testing_instance_matrix:
            An m by n matrix of m testing instances with n features.
            It can be dense or sparse. (type must be double)
  — model:
            The output of svmtrain.
  — libsvm_options:
            A string of testing options in the same format as that of LIBSVM.
(3) Returned model structure

Returned Model Structure
========================

The 'svmtrain' function returns a model which can be used for future prediction. It is a structure and is organized as [Parameters, nr_class, totalSV, rho, Label, sv_indices, ProbA, ProbB, nSV, sv_coef, SVs]:

        -Parameters: parameters
        -nr_class: number of classes; = 2 for regression/one-class svm
        -totalSV: total #SV
        -rho: -b of the decision function(s) wx+b
        -Label: label of each class; empty for regression/one-class SVM
        -sv_indices: values in [1,...,num_training_data] to indicate SVs in the training set
        -ProbA: pairwise probability information; empty if -b 0 or in one-class SVM
        -ProbB: pairwise probability information; empty if -b 0 or in one-class SVM
        -nSV: number of SVs for each class; empty for regression/one-class SVM
        -sv_coef: coefficients for SVs in decision functions
        -SVs: support vectors


(4) Prediction results

Result of Prediction
====================

The function 'svmpredict' has three outputs. The first one, predicted_label, is a vector of predicted labels. The second output,
accuracy, is a vector including accuracy (for classification), mean squared error, and squared correlation coefficient (for regression).
The third is a matrix containing decision values or probability estimates (if '-b 1' is specified). If k is the number of classes
in training data, for decision values, each row includes results of predicting k(k-1)/2 binary-class SVMs. For classification, k = 1 is a
special case. Decision value +1 is returned for each testing instance, instead of an empty vector. For probabilities, each row contains k values
indicating the probability that the testing instance is in each class. Note that the order of classes here is the same as 'Label' field
in the model structure.
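
To make the k(k-1)/2 layout concrete, the sketch below enumerates the class pairs for k = 3 and applies a simple majority vote to one row of decision values. The pair ordering (1,2), (1,3), (2,3) over the entries of 'Label', the sign convention (a positive value votes for the first class of the pair), and the numbers in dec_row are assumptions made for illustration; check them against the LIBSVM documentation before relying on them.

```matlab
labels  = [3; 1; 2];           % hypothetical content of model.Label
k       = numel(labels);
pairs   = nchoosek(1:k, 2);    % rows: [1 2; 1 3; 2 3]
dec_row = [0.7, 0.4, -1.1];    % made-up decision values, one per pair

% Each binary classifier casts one vote.
votes = zeros(k, 1);
for p = 1:size(pairs, 1)
    if dec_row(p) > 0
        winner = pairs(p, 1);
    else
        winner = pairs(p, 2);
    end
    votes(winner) = votes(winner) + 1;
end

% The class with the most votes wins.
[~, best] = max(votes);
predicted = labels(best);
```

With these numbers the first class in 'Label' collects two of the three votes, so the predicted label is 3.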

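2. Worked examples

(1) Example 1

Classification with libsvm is straightforward: given a feature matrix and a label vector, you train a classification model (model) and then use that model to predict classes.

What is a feature matrix, and what is a label? Here is about the plainest example possible:

Suppose a class has two boys (boy 1, boy 2) and two girls (girl 1, girl 2), where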

Boy 1: height 176 cm, weight 70 kg;
Boy 2: height 180 cm, weight 80 kg;
Girl 1: height 161 cm, weight 45 kg;
Girl 2: height 163 cm, weight 47 kg;

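(1) Define boys as 1 and girls as -1, and put the data above into the matrix data:

	data = [176 70;180 80;161 45;163 47];

Store the class labels (1, -1) in label:

	label = [1;1;-1;-1];

The matrix data is then the feature matrix: its 4 rows are the 4 samples and its 2 columns are the 2 features. label is the label vector, whose values 1 and -1 stand for the two classes, boys and girls. The label values only need to tell the classes apart; any numeric values will do.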

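(2) Back to the main topic. With the feature matrix data and the label vector label, a classification model can be built with libsvm. In brief:

	model = svmtrain(label,data);

With model we can make predictions. Suppose a new student joins the class, 190 cm tall and weighing 85 kg, and we want to infer the label (boy [1] or girl [-1]) from this information alone.

Let testdata = [190 85]; since the true label is unknown, assume it is -1, that is, testdatalabel = -1 (assuming 1 works just as well).

(3) Then use libsvm to predict whether the new student is a boy or a girl:

	[predictlabel,accuracy] = svmpredict(testdatalabel,testdata,model)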

(4) Now run the data and code above as a whole:

	data = [176 70;180 80;161 45;163 47];
	label = [1;1;-1;-1];
	model = svmtrain(label,data);
	testdata = [190 85];
	testdatalabel = -1;
	[predictlabel,accuracy] = svmpredict(testdatalabel,testdata,model);
	predictlabel

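The result of this run (not reproduced here) led to the following change.

Modified code, which requests only the predicted label, as in usage (3) of section 1: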

	data = [176 70;180 80;161 45;163 47];
	label = [1;1;-1;-1];
	model = svmtrain(label,data);
	testdata = [190 85];
	testdatalabel = -1;
	[predictlabel] = svmpredict(testdatalabel,testdata,model);
	predictlabel

Result:

	Accuracy = 0% (0/1) (classification)
	predictlabel =
	     1
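
The 0% accuracy does not mean the model is bad: svmpredict compares its prediction with the label passed in, and here that label was an arbitrary guess of -1. The sketch below reproduces the figure by hand. It uses only the numbers shown above and does not call LIBSVM.

```matlab
testdatalabel = -1;   % the assumed (placeholder) label
predictlabel  = 1;    % the prediction reported above

% Accuracy is the share of predictions that equal the supplied labels.
n_correct = sum(predictlabel == testdatalabel);
n_total   = numel(testdatalabel);
acc       = 100 * n_correct / n_total;

fprintf('Accuracy = %g%% (%d/%d)\n', acc, n_correct, n_total);
```

Had the placeholder label been 1 instead, the same prediction would be reported as 100% (1/1).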

(2) Example 2

Next, test with heart_scale, the sample data set that ships with the libsvm toolbox:

  %% HowToClassifyUsingLibsvm
  % by faruto @ faruto's Studio~
  % http://blog.sina.com.cn/faruto
  % Email:faruto@163.com
  % http://www.MATLABsky.com
  % http://www.
  % http://video.
  % last modified by 2010.12.27
  %% a little cleanup
  tic;
  close all;
  clear;
  clc;
  format compact;
  %% 
 
  % Load the data first
  load heart_scale;
  data = heart_scale_inst;
  label = heart_scale_label;
 
  % Use the first 200 samples as the training set and the last 70 as the test set
  ind = 200;
  traindata = data(1:ind,:);
  trainlabel = label(1:ind,:);
  testdata = data(ind+1:end,:);
  testlabel = label(ind+1:end,:);
 
  % Build the classification model from the training set
  model = svmtrain(trainlabel,traindata,'-s 0 -t 2 -c 1.2 -g 2.8');
 
  % Inspect the classification model
  model
  Parameters = model.Parameters
  Label = model.Label
  nr_class = model.nr_class
  totalSV = model.totalSV
  nSV = model.nSV 
 
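  % Check how the model classifies the training set
  % (three outputs requested; the usage in section 1 lists only the one- and three-output forms)
  [ptrain,acctrain,dectrain] = svmpredict(trainlabel,traindata,model);
 
  % Predict the labels of the test set
  [ptest,acctest,dectest] = svmpredict(testlabel,testdata,model);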
 
  %%
  toc;

Output:

model = 
    Parameters: [5x1 double]
      nr_class: 2
       totalSV: 197
           rho: 0.0583
         Label: [2x1 double]
    sv_indices: [197x1 double]
         ProbA: []
         ProbB: []
           nSV: [2x1 double]
       sv_coef: [197x1 double]
           SVs: [197x13 double]
Parameters =
         0
    2.0000
    3.0000
    2.8000
         0
Label =
     1
    -1
nr_class =
     2
totalSV =
   197
nSV =
    89
   108
Accuracy = 99.5% (199/200) (classification)
Accuracy = 68.5714% (48/70) (classification)
Elapsed time is 0.009619 seconds.

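A brief note on the meaning of the input options:

	-s svm_type: type of SVM (default 0)
		0 -- C-SVC
		1 -- nu-SVC
		2 -- one-class SVM
		3 -- epsilon-SVR
		4 -- nu-SVR
	-t kernel_type: type of kernel function (default 2)
		0 -- linear: u'*v
		1 -- polynomial: (gamma*u'*v + coef0)^degree
		2 -- RBF: exp(-gamma*|u-v|^2)
		3 -- sigmoid: tanh(gamma*u'*v + coef0)
	-g gamma: gamma in the kernel function (for the polynomial/RBF/sigmoid kernels)
	-c cost: the parameter C of C-SVC, epsilon-SVR, and nu-SVR (default 1)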
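
The four kernel formulas can be written as MATLAB anonymous functions, which makes it easy to evaluate them by hand. This is an illustrative sketch, not LIBSVM's own code: the handle names are made up, u and v are taken as row vectors (so the inner product is u*v' rather than u'*v), and the option values are arbitrary examples.

```matlab
gamma  = 2.8;   % -g
coef0  = 0;     % -r
degree = 3;     % -d

% u and v are row vectors, so u * v' is their inner product.
linear_k  = @(u, v) u * v';
poly_k    = @(u, v) (gamma * (u * v') + coef0) ^ degree;
rbf_k     = @(u, v) exp(-gamma * sum((u - v) .^ 2));
sigmoid_k = @(u, v) tanh(gamma * (u * v') + coef0);

u = [1 0];
v = [0 1];

k_lin  = linear_k(u, v);   % orthogonal vectors give 0
k_rbf  = rbf_k(u, v);      % squared distance is 2, so exp(-2.8 * 2)
k_same = rbf_k(u, u);      % identical vectors give 1
k_poly = poly_k(u, u);     % (2.8 * 1 + 0)^3
k_sig  = sigmoid_k(u, v);  % tanh(0) = 0
```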

First, look at what model.Parameters holds:

	>> model.Parameters
	ans =
	         0
	    2.0000
	    3.0000
	    2.8000
	         0

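Key point:

The entries of model.Parameters mean, from top to bottom:

	-s svm_type: type of SVM (default 0)
	-t kernel_type: type of kernel function (default 2)
	-d degree: degree in the kernel function (polynomial kernel only) (default 3)
	-g gamma: gamma in the kernel function (polynomial/RBF/sigmoid kernels) (default 1/num_features, the reciprocal of the number of features)
	-r coef0: coef0 in the kernel function (polynomial/sigmoid kernels) (default 0)

So in this example model.Parameters tells us that -s is 0, -t is 2, -d is 3, -g is 2.8 (the value we passed in), and -r is 0.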
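
Since the vector is positional, it can help to unpack it into named fields. The sketch below does that for the values of this example. The struct info and the two name tables are hypothetical helpers, and the tables cover only the types listed in these notes.

```matlab
params = [0; 2; 3; 2.8; 0];   % the model.Parameters vector of this example

% Name tables indexed by option value + 1 (MATLAB indices start at 1).
svm_type_names    = {'C-SVC', 'nu-SVC', 'one-class SVM', 'epsilon-SVR', 'nu-SVR'};
kernel_type_names = {'linear', 'polynomial', 'RBF', 'sigmoid'};

info = struct();
info.svm_type    = svm_type_names{params(1) + 1};      % -s
info.kernel_type = kernel_type_names{params(2) + 1};   % -t
info.degree      = params(3);                          % -d
info.gamma       = params(4);                          % -g
info.coef0       = params(5);                          % -r

disp(info);
```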

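A short note on libsvm options:

Options in libsvm can be combined freely according to what the chosen SVM type and kernel function support. An option that the chosen kernel or SVM type does not use has no effect, because the program ignores it; if an option that is needed is set incorrectly, its default value is used instead.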

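(1) model.Label and model.nr_class

	>> model.Label
	ans =
	     1
	    -1
	>> model.nr_class
	ans =
	     2

Key point:

model.Label lists the class labels present in the data set, here 1 and -1;

model.nr_class is the number of classes in the data set, here two (binary classification).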

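(2) model.totalSV and model.nSV

The values in this part and the ones below come from a model trained on all 270 samples (the code for it is given near the end of these notes), which is why they differ from the 197 support vectors shown earlier.

	>> model.totalSV
	ans =
	   259
	>> model.nSV
	ans =
	   118
	   141

Key point:

model.totalSV is the total number of support vectors, here 259;

model.nSV is the number of support vectors per class: here 118 for the samples labeled 1 and 141 for the samples labeled -1.

Note: the order of model.nSV corresponds to the order of model.Label.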
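
Because nSV follows the order of Label, the per-class counts can be turned into row ranges of sv_coef and SVs. The sketch below uses a small made-up model struct rather than a trained one, and it assumes that the support vectors are stored class by class in the order of model.Label; treat that layout as an assumption to verify on a real model.

```matlab
% Toy stand-in for a trained model (all values are made up).
model = struct();
model.Label   = [1; -1];
model.nSV     = [2; 3];
model.totalSV = 5;
model.sv_coef = [0.5; 1.0; -0.4; -0.6; -0.5];

% The per-class counts must add up to the total.
assert(sum(model.nSV) == model.totalSV);

last  = cumsum(model.nSV);       % last row of each class block
first = last - model.nSV + 1;    % first row of each class block

for c = 1:numel(model.Label)
    rows = first(c):last(c);
    fprintf('label %2d: %d SVs, rows %d-%d\n', ...
            model.Label(c), numel(rows), rows(1), rows(end));
end
```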

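(3) model.ProbA and model.ProbB

These two fields are not covered here. They are only filled in when the -b option is used, and they serve probability estimation.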

-b probability_estimates: whether to train a SVC or SVR model for probability estimates, 0 or 1 (default 0)

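(4) model.sv_coef, model.SVs, and model.rho

	sv_coef: [259x1 double]
	    SVs: [259x13 double]
	model.rho = 0.0514

Key point:

model.sv_coef is a 259-by-1 matrix holding the coefficients of the 259 support vectors in the decision function;
model.SVs is a 259-by-13 sparse matrix holding the 259 support vectors;
model.rho is the negative of the constant term in the decision function (-b).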

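First, what does the final classification decision function look like for a model obtained with -s 0 (C-SVC)?

If anything about the C-SVC model is unclear, see the PDF file libsvm_library.pdf.

The final decision function is:

	f(x) = \operatorname{sgn}\left( \sum_{i=1}^{n} w_i K(x_i, x) + b \right)

Since we use the RBF kernel (option -t 2 above), the decision function here becomes:

	f(x) = \operatorname{sgn}\left( \sum_{i=1}^{n} w_i \exp\left( -\gamma \| x_i - x \|^2 \right) + b \right)

where \| x_i - x \| is the 2-norm (Euclidean) distance.

In these formulas:

b is -model.rho (a scalar), that is, b = -model.rho;
n is the number of support vectors, n = model.totalSV (a scalar);

for each i:
w_i = model.sv_coef(i), the coefficient of the i-th support vector (a scalar);
x_i = model.SVs(i,:), the i-th support vector (a 1-by-13 row vector);

x is the sample whose label is to be predicted (a 1-by-13 row vector);
gamma is the value of the -g option.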

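Now let us build the decision function above ourselves from the information in model:

	%% DecisionFunction
	function plabel = DecisionFunction(x,model)
	% Predict the label of one sample x (row vector) from a C-SVC RBF model.
 
	gamma = model.Parameters(4);                     % value of the -g option
	RBF = @(u,v)( exp(-gamma.*sum( (u-v).^2) ) );    % RBF kernel
 
	len = length(model.sv_coef);
	y = 0;
 
	% Sum the weighted kernel values over all support vectors
	for i = 1:len
    	u = model.SVs(i,:);
    	y = y + model.sv_coef(i)*RBF(u,x);
	end
	b = -model.rho;
	y = y + b;
 
	% The sign of y gives the label (labels are 1 and -1 in this example)
	if y >= 0
    	plabel = 1;
	else
    	plabel = -1;
	end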
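
The loop over support vectors can also be written in vectorized form. The sketch below is one way to do it, shown on a small made-up model struct so that it runs without LIBSVM; the field values are assumptions for illustration, and bsxfun is used so the subtraction also works on MATLAB versions without implicit expansion.

```matlab
% Toy stand-in for a trained RBF model (all values are made up).
model = struct();
model.Parameters = [0; 2; 3; 0.5; 0];
model.SVs        = sparse([0 0; 1 1; 2 0]);
model.sv_coef    = [1.0; -0.6; -0.4];
model.rho        = 0.1;

x = [0.2 0.1];   % sample to classify (row vector)

gamma = model.Parameters(4);
diffs = bsxfun(@minus, full(model.SVs), x);    % one row per support vector
kvals = exp(-gamma * sum(diffs .^ 2, 2));      % RBF kernel values, n-by-1
y     = model.sv_coef' * kvals - model.rho;    % decision value, b = -rho

% Same sign rule as in the function above.
if y >= 0
    plabel = 1;
else
    plabel = -1;
end
```

For a few hundred support vectors the difference in speed is small; the main benefit is that the code mirrors the formula term by term.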

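With this decision function we can predict sample labels ourselves. The model and PredictLabel used here are the ones produced by the code in the next listing, trained and evaluated on all 270 samples:

	%%
	plabel = zeros(270,1);
	for i = 1:270
    	x = data(i,:);
    	plabel(i,1) = DecisionFunction(x,model);
	end
 
	%% Check that the labels from our decision function match those from svmpredict
	flag = sum(plabel == PredictLabel)
	over = 1;

The result is flag = 270, so the hand-built decision function is correct: it yields the same predicted labels as svmpredict. In fact, this is roughly how svmpredict works underneath.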

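Finally, let us look at what the values returned by svmpredict mean.

In the following code:

	%% 
	% Load the data first
	load heart_scale;
	data = heart_scale_inst;
	label = heart_scale_label;
	% Build the classification model
	model = svmtrain(label,data,'-s 0 -t 2 -c 1.2 -g 2.8');
	model
	% Check how the model classifies the training set
	% (three outputs requested; the usage in section 1 lists only the one- and three-output forms)
	[PredictLabel,accuracy,decvalues] = svmpredict(label,data,model);
	accuracy

Running it shows: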

model = 
       Parameters: [5x1 double]
       nr_class: 2
       totalSV: 259
         rho: 0.0514
        Label: [2x1 double]
        ProbA: []
        ProbB: []
         nSV: [2x1 double]
       sv_coef: [259x1 double]
         SVs: [259x13 double]
Accuracy = 99.6296% (269/270) (classification)
accuracy =
   99.6296
   0.0148
   0.9851

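A word on the meaning of the three entries of the returned accuracy vector.

Key point:

From top to bottom, the entries of accuracy are:

classification accuracy (the metric used for classification problems);

mean squared error (MSE) [a metric used for regression problems];

squared correlation coefficient (r2) [a metric used for regression problems].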
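
The three numbers can be reproduced by hand, which also shows why the last two are non-trivial even for a classification run. The sketch below does not use the heart_scale data: it assumes a hypothetical set of 270 labels with a 120/150 class split and exactly one wrong prediction, and it uses the usual Pearson-style formula for the squared correlation coefficient.

```matlab
% Hypothetical labels: 120 positive and 150 negative samples,
% with exactly one positive sample predicted as negative.
y = [ones(120, 1); -ones(150, 1)];   % true labels
v = y;
v(1) = -1;                           % predicted labels, one mistake

l   = numel(y);
acc = 100 * sum(v == y) / l;         % classification accuracy in percent
mse = sum((v - y) .^ 2) / l;         % one error of size 2, so 4/270

% Squared correlation coefficient between predictions and targets.
num = (l * sum(v .* y) - sum(v) * sum(y)) ^ 2;
den = (l * sum(v .^ 2) - sum(v) ^ 2) * (l * sum(y .^ 2) - sum(y) ^ 2);
r2  = num / den;

fprintf('%.4f\n%.4f\n%.4f\n', acc, mse, r2);
```

Under these assumptions the sketch prints 99.6296, 0.0148 and 0.9851. With labels of 1 and -1, each misclassified sample contributes (1 - (-1))^2 = 4 to the squared error, which is where 4/270 ≈ 0.0148 comes from.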