Although the title says SURF, the walkthrough here is actually based on the document "Notes on the OpenSURF Library".
Download: http://opensurf1./files/OpenSURFcpp.zip
The code first defines #define PROCEDURE 1 and then enters the main() function:
int main(void)
{
  if (PROCEDURE == 1) return mainImage();
  if (PROCEDURE == 2) return mainVideo();
  if (PROCEDURE == 3) return mainMatch();
  if (PROCEDURE == 4) return mainMotionPoints();
  if (PROCEDURE == 5) return mainStaticMatch();
  if (PROCEDURE == 6) return mainKmeans();
}
We take static-image interest-point detection (case 1) as our example of how the SURF algorithm extracts interest points.
int mainImage(void)
{
  IpVec ipts;
  IplImage *img = cvLoadImage("imgs/sf.jpg");

  clock_t start = clock();
  surfDetDes(img, ipts, false, 5, 4, 2, 0.0004f);
  clock_t end = clock();

  std::cout << "OpenSURF found: " << ipts.size() << " interest points" << std::endl;
  std::cout << "OpenSURF took: " << float(end - start) / CLOCKS_PER_SEC << " seconds" << std::endl;

  drawIpoints(img, ipts);
  showImage(img);

  return 0;
}
Explanation:
IpVec is defined in ipoint.h: typedef std::vector<Ipoint> IpVec;
In other words, IpVec is a vector whose elements are Ipoint objects. Ipoint is a class, also defined in ipoint.h; the author uses it like a struct, with every member public.
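For reference, the Ipoint members this walkthrough touches look roughly like this (a sketch paraphrased from ipoint.h; the exact field list in the source may differ slightly):

// Approximate shape of the Ipoint class; see ipoint.h for the real declaration
class Ipoint {
public:
  float x, y;            // interpolated position in the original image
  float scale;           // scale at which the point was detected
  float orientation;     // dominant orientation, in radians
  int laplacian;         // sign of the Laplacian (Dxx + Dyy)
  float descriptor[64];  // the 64-dimensional SURF descriptor
};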
The clock() calls measure the program's running time: one records the start time, one the end time, and the difference is printed, i.e. the time surfDetDes() needs to extract the features.
drawIpoints() draws the interest-point markers on top of img; plainly put, it is the function that renders the feature points.
Now, let's go inside surfDetDes() and have a look.
surfDetDes is defined in surflib.h:
inline void surfDetDes(IplImage *img,
                       std::vector<Ipoint> &ipts,
                       bool upright = false,
                       int octaves = OCTAVES,
                       int intervals = INTERVALS,
                       int init_sample = INIT_SAMPLE,
                       float thres = THRES)
{
  IplImage *int_img = Integral(img);
  FastHessian fh(int_img, ipts, octaves, intervals, init_sample, thres);
  fh.getIpoints();
  Surf des(int_img, ipts);
  des.getDescriptors(upright);
  cvReleaseImage(&int_img);
}
In short, surfDetDes() lays out the concrete steps:
1. Compute the integral image int_img;
2. Construct the FastHessian object fh for the Hessian responses;
3. Use fh to extract the interest points;
4. Build the SURF descriptors and associate them with the interest points;
5. Release the integral image.
① Computing the integral image
First, find Integral() in integral.cpp:
IplImage *Integral(IplImage *source)
{
  // Convert the input to single-channel grayscale
  IplImage *img = getGray(source);
  IplImage *int_img = cvCreateImage(cvGetSize(img), IPL_DEPTH_32F, 1);

  int height = img->height;
  int width  = img->width;
  int step   = img->widthStep / sizeof(float);
  float *data   = (float *) img->imageData;
  float *i_data = (float *) int_img->imageData;

  // First row: a plain 1-D running sum
  float rs = 0.0f;
  for (int j = 0; j < width; j++)
  {
    rs += data[j];
    i_data[j] = rs;
  }

  // Remaining rows: running row sum plus the integral value directly above
  for (int i = 1; i < height; ++i)
  {
    rs = 0.0f;
    for (int j = 0; j < width; ++j)
    {
      rs += data[i*step + j];
      i_data[i*step + j] = rs + i_data[(i-1)*step + j];
    }
  }

  cvReleaseImage(&img);
  return int_img;
}
1. First the input is converted to a grayscale image, and an image of the same size as the gray image is created to hold the integral image int_img.
2. Next the image information is read: the size (height and width) and the data pointers of the gray image and the integral image, data and i_data. (Note that the pixel type here is float.)
3. The first row of integral values is computed first; it is just a 1-D cumulative sum. The remaining values are obtained iteratively via i_data[i*step+j] = rs + i_data[(i-1)*step+j]: if the current point is (i0, j0), rs is the sum of all pixels in row i0 with column index j <= j0, and i_data[(i-1)*step+j] is the integral value of the point directly above. (The recurrence is spelled out right after this list.) [In the original figure: black marks the current point i_data[i*step+j], green marks i_data[(i-1)*step+j], and rs corresponds to the horizontal strip of pixels in the same row as the black point.]
4. Finally the gray image is released and the integral image is returned.
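Spelled out, the recurrence the two loops implement is (p the gray image, ii the integral image, out-of-range terms taken as 0):

$$\mathrm{rs}(i,j) = \mathrm{rs}(i,j-1) + p(i,j), \qquad ii(i,j) = ii(i-1,j) + \mathrm{rs}(i,j),$$

so that $ii(i,j) = \sum_{i' \le i,\ j' \le j} p(i',j')$.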
A related function is BoxIntegral() in integral.h:

inline float BoxIntegral(IplImage *img, int row, int col, int rows, int cols)

The parameters are the source (integral) image; row and col, the coordinates of the corner point A; and rows and cols, the height and width of the box. Using the integral image, the sum S of all pixel values inside a box with corners

A  B
C  D

(A top-left, B top-right, C bottom-left, D bottom-right) is S = int_img(D) + int_img(A) - int_img(B) - int_img(C).
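A quick standalone illustration of that identity (demo code with plain arrays instead of IplImage; nothing here is OpenSURF API):

#include <cstdio>

// Builds a small integral image with the same recurrence as Integral(),
// then checks the box-sum identity S = ii(D) + ii(A) - ii(B) - ii(C).
int main()
{
  const int W = 6, H = 5;
  float p[H][W], ii[H][W];

  for (int i = 0; i < H; ++i)
    for (int j = 0; j < W; ++j)
      p[i][j] = float(i + j);  // arbitrary test pattern

  for (int i = 0; i < H; ++i) {
    float rs = 0.f;
    for (int j = 0; j < W; ++j) {
      rs += p[i][j];
      ii[i][j] = rs + (i > 0 ? ii[i-1][j] : 0.f);
    }
  }

  // Sum of the 2x3 box covering rows 2..3, cols 1..3:
  // A = (1,0), B = (1,3), C = (3,0), D = (3,3)
  float S = ii[3][3] + ii[1][0] - ii[1][3] - ii[3][0];

  float ref = 0.f;  // brute-force check
  for (int i = 2; i <= 3; ++i)
    for (int j = 1; j <= 3; ++j)
      ref += p[i][j];

  printf("box sum = %.1f, brute force = %.1f\n", S, ref);  // both print 27.0
  return 0;
}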
② Computing the Hessian responses
FastHessian, the class that computes the Hessian matrix responses, is declared in fasthessian.h and implemented in fasthessian.cpp.
class FastHessian {
public:
  FastHessian(std::vector<Ipoint> &ipts,
              const int octaves = OCTAVES,
              const int intervals = INTERVALS,
              const int init_sample = INIT_SAMPLE,
              const float thres = THRES);

  FastHessian(IplImage *img,
              std::vector<Ipoint> &ipts,
              const int octaves = OCTAVES,
              const int intervals = INTERVALS,
              const int init_sample = INIT_SAMPLE,
              const float thres = THRES);

  ~FastHessian();

  void saveParameters(const int octaves,
                      const int intervals,
                      const int init_sample,
                      const float thres);

  void setIntImage(IplImage *img);

  void getIpoints();

private:
  void buildResponseMap();
  void buildResponseLayer(ResponseLayer *r);

  int isExtremum(int r, int c, ResponseLayer *t, ResponseLayer *m, ResponseLayer *b);
  void interpolateExtremum(int r, int c, ResponseLayer *t, ResponseLayer *m, ResponseLayer *b);
  void interpolateStep(int r, int c, ResponseLayer *t, ResponseLayer *m, ResponseLayer *b,
                       double* xi, double* xr, double* xc);
  CvMat* deriv3D(int r, int c, ResponseLayer *t, ResponseLayer *m, ResponseLayer *b);
  CvMat* hessian3D(int r, int c, ResponseLayer *t, ResponseLayer *m, ResponseLayer *b);

  IplImage *img;
  int i_width, i_height;

  std::vector<Ipoint> &ipts;
  std::vector<ResponseLayer *> responseMap;

  int octaves;
  int intervals;
  int init_sample;
  float thresh;
};
The public section declares two constructors, one with and one without the source-image parameter, followed by the destructor ~FastHessian and the remaining member functions. Their implementations in fasthessian.cpp are explained one by one below.
Both constructors call saveParameters(octaves, intervals, init_sample, thresh) to set the parameters for building the pyramid; the constructor that takes an image additionally calls setIntImage(img) to set the current image.
void FastHessian::saveParameters(const int octaves, const int intervals,
                                 const int init_sample, const float thresh)
{
  this->octaves     = (octaves > 0 && octaves <= 4 ? octaves : OCTAVES);
  this->intervals   = (intervals > 0 && intervals <= 4 ? intervals : INTERVALS);
  this->init_sample = (init_sample > 0 && init_sample <= 6 ? init_sample : INIT_SAMPLE);
  this->thresh      = (thresh >= 0 ? thresh : THRES);
}

void FastHessian::setIntImage(IplImage *img)
{
  this->img = img;
  i_height = img->height;
  i_width  = img->width;
}
Since the header file already defines

static const int OCTAVES = 5;
static const int INTERVALS = 4;
static const float THRES = 0.0004f;
static const int INIT_SAMPLE = 2;
saveParameters therefore just validates the parameters, substituting the defaults whenever a value is out of range.
FastHessian::getIpoints() extracts the interest points:
void FastHessian::getIpoints()
{
  static const int filter_map[OCTAVES][INTERVALS] = {{0,1,2,3}, {1,3,4,5}, {3,5,6,7},
                                                     {5,7,8,9}, {7,9,10,11}};
  ipts.clear();
  buildResponseMap();
  ...
}
It first initializes filter_map and clears ipts, the vector that records the interest points.
buildResponseMap() then creates the response layers (the box-filter approximations of the Gaussian second derivatives); the filter sizes match those given in the paper exactly:
// Oct1: 9, 15, 21, 27  // Oct2: 15, 27, 39, 51  // Oct3: 27, 51, 75, 99  // Oct4: 51, 99, 147, 195  // Oct5: 99, 195, 291, 387
These are the template (filter) sizes of each octave; the spacing between sizes doubles from octave to octave: 6, 12, 24, 48, 96.
Each element of the responseMap vector is a ResponseLayer(int width, int height, int step, int filter), defined in responsemap.h, where width and height are the actual image dimensions divided by step (step starts at init_sample = 2), and filter is the filter's side length.
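Incidentally, the size table follows the closed form size = 3·(2^octave·interval + 1) for octave = 1..5 and interval = 1..4. A throwaway loop (demo code, not part of OpenSURF) that regenerates the table:

#include <cstdio>

// Regenerates the SURF filter-size table: 3 * (2^o * i + 1). Demo only.
int main()
{
  for (int o = 1; o <= 5; ++o) {
    printf("Oct%d:", o);
    for (int i = 1; i <= 4; ++i)
      printf(" %d", 3 * ((1 << o) * i + 1));
    printf("\n");
  }
  return 0;
}

Once the layers are allocated, buildResponseLayer(responseMap[i]) processes the image and stores the results in each layer's responses and laplacian arrays: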
void FastHessian::buildResponseLayer(ResponseLayer *rl)
{
  float *responses = rl->responses;
  unsigned char *laplacian = rl->laplacian;
  int step = rl->step;              // step size for this filter
  int b = (rl->filter - 1) / 2;     // border for this filter
  int l = rl->filter / 3;           // lobe for this filter (filter size / 3)
  int w = rl->filter;               // filter size
  float inverse_area = 1.f/(w*w);   // normalization factor
  float Dxx, Dyy, Dxy;

  for (int r, c, ar = 0, index = 0; ar < rl->height; ++ar)
  {
    for (int ac = 0; ac < rl->width; ++ac, index++)
    {
      // map layer coordinates to image coordinates
      r = ar * step;
      c = ac * step;

      // compute the response components
      Dxx = BoxIntegral(img, r - l + 1, c - b, 2*l - 1, w)
          - BoxIntegral(img, r - l + 1, c - l / 2, 2*l - 1, l)*3;
      Dyy = BoxIntegral(img, r - b, c - l + 1, w, 2*l - 1)
          - BoxIntegral(img, r - l / 2, c - l + 1, l, 2*l - 1)*3;
      Dxy = + BoxIntegral(img, r - l, c + 1, l, l)
            + BoxIntegral(img, r + 1, c - l, l, l)
            - BoxIntegral(img, r - l, c - l, l, l)
            - BoxIntegral(img, r + 1, c + 1, l, l);

      // normalize the filter responses with respect to the filter size
      Dxx *= inverse_area;
      Dyy *= inverse_area;
      Dxy *= inverse_area;

      // determinant of the approximated Hessian, and sign of the Laplacian
      responses[index] = (Dxx * Dyy - 0.81f * Dxy * Dxy);
      laplacian[index] = (Dxx + Dyy >= 0 ? 1 : 0);

#ifdef RL_DEBUG
      rl->coords.push_back(std::make_pair<int,int>(r,c));
#endif
    }
  }
}
The computation of Dxy and Dyy is illustrated in the figures of the original post; Dxx (the transpose of Dyy) is left for the reader. [Here w = 9 and l = 9/3 = 3; in the figure, the overall computed region is w tall and 2*l - 1 wide.]
[Figure caption: the dot is the current point. For Dxy, the integral over the red square regions (1) minus the integral over the green square regions (2) gives Dxy*w^2. For Dyy, the green region (2) minus 2x the red region (1), equivalently the whole strip minus 3x the red region, gives Dyy*w^2.]
The results are stored in the layer's responses and laplacian 1-D arrays, whose size is ResponseLayer->width * ResponseLayer->height.
This yields, for every pixel of every layer, det(H_approx) = Dxx*Dyy - (0.9*Dxy)^2.
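In matrix form (as in Bay's SURF paper, where the weight 0.9 compensates for the box-filter approximation, hence the constant 0.81f = 0.9^2 in the code):

$$H_{\mathrm{approx}} = \begin{pmatrix} D_{xx} & D_{xy} \\ D_{xy} & D_{yy} \end{pmatrix}, \qquad \det(H_{\mathrm{approx}}) = D_{xx}D_{yy} - (0.9\,D_{xy})^2$$

With these determinant responses stored for every layer, the next step is to decide whether each point is an extremum.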
③ Finding the extrema among the det(H_approx) values computed for each layer in the previous step.
Back in getIpoints() in fasthessian.cpp, the code walks each octave (octaves in total) and takes every run of 3 adjacent layers (each octave has 4 layers, giving 2 such triples per octave):
ResponseLayer *b, *m, *t;
for (int o = 0; o < octaves; ++o) for (int i = 0; i <= 1; ++i)
{
  b = responseMap.at(filter_map[o][i]);
  m = responseMap.at(filter_map[o][i+1]);
  t = responseMap.at(filter_map[o][i+2]);

  for (int r = 0; r < t->height; ++r)
  {
    for (int c = 0; c < t->width; ++c)
    {
      if (isExtremum(r, c, t, m, b))
      {
        interpolateExtremum(r, c, t, m, b);
      }
    }
  }
}
The test relies on the helper functions isExtremum() and interpolateExtremum(); both live in the same cpp file and are analyzed below. (To see how filter_map drives the selection: for octave 0 the triples (b, m, t) are layers (0,1,2) and (1,2,3); for octave 1 they are (1,3,4) and (3,4,5), reusing the layers shared with the previous octave rather than recomputing them.)
int FastHessian::isExtremum(int r, int c, ResponseLayer *t, ResponseLayer *m, ResponseLayer *b)
{
  // bounds check
  int layerBorder = (t->filter + 1) / (2 * t->step);
  if (r <= layerBorder || r >= t->height - layerBorder ||
      c <= layerBorder || c >= t->width - layerBorder)
    return 0;

  // the candidate point in the middle layer must exceed the threshold
  float candidate = m->getResponse(r, c, t);
  if (candidate < thresh)
    return 0;

  // compare against the 26 neighbours in the 3x3x3 neighbourhood
  for (int rr = -1; rr <= 1; ++rr)
  {
    for (int cc = -1; cc <= 1; ++cc)
    {
      if (t->getResponse(r+rr, c+cc) >= candidate ||
          ((rr != 0 || cc != 0) && m->getResponse(r+rr, c+cc, t) >= candidate) ||
          b->getResponse(r+rr, c+cc, t) >= candidate)
        return 0;
    }
  }

  return 1;
}
Note carefully: the layers differ in size, so before comparing responses they must be brought to a common scale. getResponse() handles this: it first computes the scale ratio between the two layers. If the ratio is 2, say, and the current point in the top layer is (20, 25), then the point to compare against in the layer below is (40, 50), not (20, 25).
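The lookup in question (ResponseLayer::getResponse, sketched from memory from the OpenSURF source; treat the details as approximate):

// Reads this layer's response at the position corresponding to (row, column) in src's grid
inline float getResponse(unsigned int row, unsigned int column, ResponseLayer *src)
{
  int scale = this->width / src->width;  // scale ratio between the two layers
  return responses[(scale * row) * width + (scale * column)];
}

Now, if the current point survives as an extremum, interpolateExtremum() is called: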
void FastHessian::interpolateExtremum(int r, int c, ResponseLayer *t, ResponseLayer *m, ResponseLayer *b)
{
  // step distance between adjacent filters
  int filterStep = (m->filter - b->filter);
  assert(filterStep > 0 && t->filter - m->filter == m->filter - b->filter);

  // offsets to the true location of the extremum
  double xi = 0, xr = 0, xc = 0;
  interpolateStep(r, c, t, m, b, &xi, &xr, &xc);

  // keep the point only if it is sufficiently close to the true extremum
  if (fabs(xi) < 0.5f && fabs(xr) < 0.5f && fabs(xc) < 0.5f)
  {
    Ipoint ipt;
    ipt.x = static_cast<float>((c + xc) * t->step);
    ipt.y = static_cast<float>((r + xr) * t->step);
    ipt.scale = static_cast<float>((0.1333f) * (m->filter + xi * filterStep));
    ipt.laplacian = static_cast<int>(m->getLaplacian(r,c,t));
    ipts.push_back(ipt);
  }
}
This function uses interpolation to compute the precise position of the extremum in the original image, saving the result in ipt.
Opening interpolateStep: deriv3D computes the first derivatives of the response at the current point along the three directions (column, row, scale), and hessian3D computes the corresponding 3x3 second-derivative (Hessian) matrix; from these, the three offsets xi, xr, xc are obtained.
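The quantity being solved for is the standard sub-pixel/sub-scale refinement: fit a quadratic to the response D around the candidate and step to its stationary point,

$$\hat{x} = -\left(\frac{\partial^2 D}{\partial x^2}\right)^{-1}\frac{\partial D}{\partial x},$$

which is exactly what the cvGEMM(H_inv, dD, -1, ...) call below computes; interpolateExtremum() then keeps the point only when all three offset components are below 0.5.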
void FastHessian::interpolateStep(int r, int c, ResponseLayer *t, ResponseLayer *m, ResponseLayer *b,
                                  double* xi, double* xr, double* xc)
{
  CvMat *dD, *H, *H_inv, X;
  double x[3] = { 0 };

  dD = deriv3D(r, c, t, m, b);
  H = hessian3D(r, c, t, m, b);
  H_inv = cvCreateMat(3, 3, CV_64FC1);
  cvInvert(H, H_inv, CV_SVD);
  cvInitMatHeader(&X, 3, 1, CV_64FC1, x, CV_AUTOSTEP);
  cvGEMM(H_inv, dD, -1, NULL, 0, &X, 0);

  cvReleaseMat(&dD);
  cvReleaseMat(&H);
  cvReleaseMat(&H_inv);

  *xi = x[2];
  *xr = x[1];
  *xc = x[0];
}
Next, descriptors are extracted around the interest points.
④ Extracting the descriptors
Go to surf.cpp and find getDescriptors(). Since upright was initialized to false, each interest point is assigned a dominant orientation and the descriptor is extracted from the correspondingly rotated neighbourhood; if upright were true, the descriptor would be extracted from the axis-aligned neighbourhood directly.
void Surf::getDescriptors(bool upright)
{
  // check there are Ipoints to describe
  if (!ipts.size()) return;

  int ipts_size = (int)ipts.size();

  if (upright)
  {
    // U-SURF loop: just get the descriptors
    for (int i = 0; i < ipts_size; ++i)
    {
      index = i;
      getDescriptor(true);
    }
  }
  else
  {
    // main SURF loop: assign an orientation, then get the descriptor
    for (int i = 0; i < ipts_size; ++i)
    {
      index = i;
      getOrientation();
      getDescriptor(false);
    }
  }
}
In fact, the only difference between the two branches is the extra getOrientation() call before extracting the descriptor when upright is false. The upright variant ignores rotation and suits image pairs that differ only in scale; the other variant adds rotation invariance and is more general.
A few points:
1. As the paper proposes, samples are taken on a disc of radius 6 (in units of the scale), which contains 109 sample points in total; you can verify the count yourself, see the sketch after this list.
2. Each sample's direction comes from haarX and haarY, the analogues of dx and dy in a gradient; getAngle then maps them to an angle θ distributed over 0~2π, rather than a plain arctan, whose principal range spans only π.
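A minimal standalone check of the 109-point count in point 1 (demo only; the strict i*i + j*j < 36 test matches, as far as I recall, the sampling condition in the OpenSURF source):

#include <cstdio>

// Counts integer offsets (i, j) in [-6, 6]^2 with i^2 + j^2 < 36. Demo only.
int main()
{
  int count = 0;
  for (int i = -6; i <= 6; ++i)
    for (int j = -6; j <= 6; ++j)
      if (i*i + j*j < 36)
        ++count;
  printf("%d sample points\n", count);  // prints 109
  return 0;
}

The haarX, haarY and getAngle functions referenced in point 2: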
inline float Surf::haarX(int row, int column, int s)
{
  return BoxIntegral(img, row-s/2, column, s, s/2)
    - 1 * BoxIntegral(img, row-s/2, column-s/2, s, s/2);
}

inline float Surf::haarY(int row, int column, int s)
{
  return BoxIntegral(img, row, column-s/2, s/2, s)
    - 1 * BoxIntegral(img, row-s/2, column-s/2, s/2, s);
}

float Surf::getAngle(float X, float Y)
{
  if (X > 0 && Y >= 0)
    return atan(Y/X);
  if (X < 0 && Y >= 0)
    return pi - atan(-Y/X);
  if (X < 0 && Y < 0)
    return pi + atan(Y/X);
  if (X > 0 && Y < 0)
    return 2*pi - atan(-Y/X);
  return 0;
}
The figure in the original post shows the Haar wavelets: black regions weight -1 and white regions weight +1, i.e. each response is the integral over the white region minus the integral over the black region.
In getOrientation(), resX and resY are the haarX and haarY responses computed above, each multiplied by a Gaussian weight from the gauss25 table.
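The elided part of getOrientation() below (the "......") is the loop that fills resX, resY and Ang; sketched from memory from the OpenSURF source, it looks roughly like this (treat the details as approximate):

// Sampling loop at the top of Surf::getOrientation(), sketched from the OpenSURF source
const int s = fRound(scale), r = fRound(ipt->y), c = fRound(ipt->x);
const int id[] = {6, 5, 4, 3, 2, 1, 0, 1, 2, 3, 4, 5, 6};  // folds offsets onto the 7x7 gauss25 table

// calculate haar responses for points within radius of 6*scale
int idx = 0;
for (int i = -6; i <= 6; ++i)
{
  for (int j = -6; j <= 6; ++j)
  {
    if (i*i + j*j < 36)  // the 109-point disc
    {
      gauss = static_cast<float>(gauss25[id[i+6]][id[j+6]]);
      resX[idx] = gauss * haarX(r + j*s, c + i*s, 4*s);
      resY[idx] = gauss * haarY(r + j*s, c + i*s, 4*s);
      Ang[idx]  = getAngle(resX[idx], resY[idx]);
      ++idx;
    }
  }
}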
void Surf::getOrientation()
{
  ......
  // slide a pi/3 window around the feature point
  for (ang1 = 0; ang1 < 2*pi; ang1 += 0.15f) {
    ang2 = (ang1 + pi/3.0f > 2*pi ? ang1 - 5.0f*pi/3.0f : ang1 + pi/3.0f);
    sumX = sumY = 0.f;

    for (unsigned int k = 0; k < Ang.size(); ++k)
    {
      const float &ang = Ang[k];

      // determine whether this sample falls inside the window
      if (ang1 < ang2 && ang1 < ang && ang < ang2)
      {
        sumX += resX[k];
        sumY += resY[k];
      }
      else if (ang2 < ang1 &&
               ((ang > 0 && ang < ang2) || (ang > ang1 && ang < 2*pi)))
      {
        sumX += resX[k];
        sumY += resY[k];
      }
    }

    // the longest summed vector over all windows gives the dominant direction
    if (sumX*sumX + sumY*sumY > max)
    {
      max = sumX*sumX + sumY*sumY;
      orientation = getAngle(sumX, sumY);
    }
  }

  // assign the orientation of the dominant response vector
  ipt->orientation = orientation;
}
All right, time to finally extract the descriptors ~.~
void Surf::getDescriptor(bool bUpright)
{
  int y, x, sample_x, sample_y, count = 0;
  int i = 0, ix = 0, j = 0, jx = 0, xs = 0, ys = 0;
  float scale, *desc, dx, dy, mdx, mdy, co, si;
  float gauss_s1 = 0.f, gauss_s2 = 0.f;
  float rx = 0.f, ry = 0.f, rrx = 0.f, rry = 0.f, len = 0.f;
  float cx = -0.5f, cy = 0.f;  // subregion centres for the 4x4 Gaussian weighting

  Ipoint *ipt = &ipts[index];
  scale = ipt->scale;
  x = fRound(ipt->x);
  y = fRound(ipt->y);
  desc = ipt->descriptor;

  if (bUpright)
  {
    co = 1;
    si = 0;
  }
  else
  {
    co = cos(ipt->orientation);
    si = sin(ipt->orientation);
  }

  i = -8;

  // calculate the descriptor for this interest point
  while (i < 12)
  {
    j = -8;
    i = i - 4;

    cx += 1.f;
    cy = -0.5f;

    while (j < 12)
    {
      dx = dy = mdx = mdy = 0.f;
      cy += 1.f;

      j = j - 4;

      ix = i + 5;
      jx = j + 5;

      // subregion centre, mapped through the rotation
      xs = fRound(x + (-jx*scale*si + ix*scale*co));
      ys = fRound(y + ( jx*scale*co + ix*scale*si));

      for (int k = i; k < i + 9; ++k)
      {
        for (int l = j; l < j + 9; ++l)
        {
          // coordinates of the sample point on the rotated axis
          sample_x = fRound(x + (-l*scale*si + k*scale*co));
          sample_y = fRound(y + ( l*scale*co + k*scale*si));

          // Gaussian-weighted x and y Haar responses
          gauss_s1 = gaussian(xs - sample_x, ys - sample_y, 2.5f*scale);
          rx = haarX(sample_y, sample_x, 2*fRound(scale));
          ry = haarY(sample_y, sample_x, 2*fRound(scale));

          // responses rotated into the interest point's frame
          rrx = gauss_s1*(-rx*si + ry*co);
          rry = gauss_s1*( rx*co + ry*si);

          dx += rrx;
          dy += rry;
          mdx += fabs(rrx);
          mdy += fabs(rry);
        }
      }

      // add this subregion's values to the descriptor vector
      gauss_s2 = gaussian(cx - 2.0f, cy - 2.0f, 1.5f);

      desc[count++] = dx*gauss_s2;
      desc[count++] = dy*gauss_s2;
      desc[count++] = mdx*gauss_s2;
      desc[count++] = mdy*gauss_s2;

      len += (dx*dx + dy*dy + mdx*mdx + mdy*mdy) * gauss_s2*gauss_s2;

      j += 9;
    }
    i += 9;
  }

  // convert to a unit vector
  len = sqrt(len);
  for (int i = 0; i < 64; ++i)
    desc[i] /= len;
}
At the top of each pass of the outer loops, i and j each take the four values -8, -3, 2, 7, so together they index a 4x4 grid of subregions. Each sample is mapped back to its position in the original image through the rotation

xs = fRound(x + (-jx*scale*si + ix*scale*co));
ys = fRound(y + ( jx*scale*co + ix*scale*si));
co = cos(ipt->orientation);
si = sin(ipt->orientation);

[ipt->orientation is the interest point's dominant orientation.]
There are 4x4 subregions, each sampled over a 9x9 block of 81 points (so adjacent subregions overlap), and each subregion accumulates its sums of dx, dy, |dx| and |dy| (weighted by the Gaussian factors gauss_s1 per sample and gauss_s2 per subregion), giving a final descriptor of 4x4x4 = 64 dimensions.
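Summing up, each of the 16 subregions contributes a 4-vector, and the final descriptor is their concatenation, normalized to unit length:

$$v_{\text{sub}} = \Bigl(\sum dx,\ \sum dy,\ \sum\lvert dx\rvert,\ \sum\lvert dy\rvert\Bigr), \qquad \text{desc} = \frac{(v_1,\dots,v_{16})}{\lVert(v_1,\dots,v_{16})\rVert_2}\in\mathbb{R}^{64}.$$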
drawIpoints(img, ipts) back in mainImage() in main.cpp should need no further explanation.
That took a whole day, but it's finally done~~
Here's a screenshot~~
[Screenshot of the detected interest points, not reproduced here.]
Thanks, everyone~
Please credit blue_lg when reposting: http://www.cnblogs.com/blue-lg/archive/2012/07/20/2600436.html