![](http://image60.360doc.com/DownloadImg/2013/04/1214/31585220_4.png)
Introduction
The contents presented here build on a previous article (see Introduction to OpenCV: Playing and Manipulating Video files), but I tried to organize them so that they can be understood without reading part one of this tutorial. In fact, the current article might even be simpler than its predecessor and would possibly have made a better part one. Still, I am optimistic that the interested reader will not be puzzled by this and will consider it a minor problem.
As already indicated by the title, the article mainly concerns the Open Source Computer Vision Library (OpenCV), which is a software platform that provides a great number of high level programming tools for loading, saving, depicting and manipulating images and videos. It is, of course, impossible to describe every aspect of this library in a brief tutorial like this; therefore I will only discuss a small selection of topics. If you are in need of a more extensive overview, if you are interested in mathematical backgrounds or if you are looking for more details on the topics touched on here, you might find the answers you are yearning for in a book on OpenCV like, for example, “Learning OpenCV” by Gary Bradski and Adrian Kaehler, which is where part of my “wisdom” comes from.
The OpenCV function calls are pure C code, but I packed them into two C++ classes (one for image operations, and one for video operations inheriting the methods of the image class). Some people might consider this mixture of programming conventions a faux pas. I, however, regard it as a good way to keep the code tidy and have most of it at a particular location. Because the OpenCV library only offers a limited set of possibilities to create a graphical interface, I built a Windows GUI around the OpenCV code (by creating a Win32 program in Visual C++ 2010 Express Edition).
If you want to turn the presented code into an executable program, you will have to install the OpenCV libraries (get the latest version at e.g. http://opencv.) and include the correct lib files and headers for your program. I give a stepwise description of all this at the top of the UsingOpenCV.cpp source code file. Graphical illustrations can be very helpful in finding one’s way through this and, luckily, further explanations that also offer screenshots can be found on the web (e.g. by searching for ‘Using OpenCV in VC++2010’).
In the first part of this tutorial I will give a basic description of how to handle events (in particular mouse events) in OpenCV, then there will be some words on a selection of OpenCV commands which can be used to manipulate images, and finally I will show how to turn visual input into video formats that can be read by most standard video players. I am not an expert in the mathematical details of the presented contents. What I can give is a more or less superficial overview of how the presented functions work and how they can be used in a program.
Processing mouse events with the help of OpenCV
![](http://image60.360doc.com/DownloadImg/2013/04/1214/31585220_5.png)
If you press a key or a mouse button or simply move the mouse pointer over a window, the message loop in a Win32 program processes such events and makes them accessible to the programmer. For those who are experienced in developing interactive software, handling such messages is daily business. Since OpenCV has its own commands to create windows, it also offers its own procedures to set up message loops for such windows. In most cases the OpenCV specific code for event handling is easy to implement and should be preferred to processing messages coming from OpenCV windows via the standard message loop (something I have not tried out, but which surely can be done). The code below concerns mouse events, but for the sake of completeness I also briefly discuss the handling of keyboard input.
Keyboard events can be processed very easily. The command cvWaitKey(timespan); waits a certain time for a key to be pressed and returns this key as an integer value (ASCII code). Therefore, if you want to process keyboard events, set up a while loop, insert the command line key = cvWaitKey(timespan); and then process keyboard input by checking the value of the variable key, for example, if(key == 'q'){ do something }. Please note that cvWaitKey(); also makes the program wait for a specified period of time (e.g. cvWaitKey(100) makes the program wait for 100 milliseconds). This is necessary, for example, to process images at a certain frame rate (find examples in the previous article and below where I discuss the code on how to save a video).
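The polling loop described above can be sketched roughly as follows. To keep the sketch independent of an OpenCV installation, the key source is passed in as a function; in a real program it would simply wrap cvWaitKey(timespan). The names KeySource and run_key_loop are made up for this illustration, and the sketch relies on cvWaitKey returning a negative value when no key is pressed within the given time.

```cpp
#include <functional>

// Hypothetical stand-in for cvWaitKey(int): waits up to `timespan_ms`
// milliseconds and returns the key code, or a negative value on timeout.
using KeySource = std::function<int(int)>;

// Polls the key source until 'q' is pressed or `max_frames` iterations
// have run. Returns the number of iterations that completed normally.
int run_key_loop(const KeySource& wait_key, int timespan_ms, int max_frames)
{
    int processed = 0;
    while (processed < max_frames)
    {
        // ... grab and display one frame here ...
        int key = wait_key(timespan_ms);  // also acts as the per-frame delay
        if (key == 'q')
            break;                        // user asked to quit
        ++processed;
    }
    return processed;
}
```

In a program linked against OpenCV, the call could look like run_key_loop([](int ms) { return cvWaitKey(ms); }, 100, total_frames); where total_frames is whatever frame count the program has at hand.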
Mouse events require more attention, although they are not very difficult to implement either. They consist of two parts:
First, you have to invoke cvSetMouseCallback(const char* window_name, CvMouseCallback my_Mouse_Handler, void* param); in order to register a callback. The first argument of this function is the name of the window to which the callback is attached (a window created with cvNamedWindow("window name",0);). The second argument is the callback function itself, and the third argument is a pointer to data that is handed on to the callback, for instance, the image to which the callback is applied.
Afterwards, you have to set up a mouse handler function (the second argument in cvSetMouseCallback). This function, which I named my_Mouse_Handler(int events, int x, int y, int flags, void* param) in my program, takes 5 arguments.
The first and most important argument is an integer variable that can have one of the following values (ranging from 0 to 9): CV_EVENT_MOUSEMOVE (= mouse pointer moves over specified window), CV_EVENT_LBUTTONDOWN (= left mouse button is pressed), CV_EVENT_RBUTTONDOWN (= right mouse button is pressed), CV_EVENT_MBUTTONDOWN (= middle mouse button is pressed), CV_EVENT_LBUTTONUP, CV_EVENT_RBUTTONUP, CV_EVENT_MBUTTONUP (= events that occur after the corresponding button has been released), CV_EVENT_LBUTTONDBLCLK, CV_EVENT_RBUTTONDBLCLK, and CV_EVENT_MBUTTONDBLCLK (= when a user double clicks the corresponding button).
The second and the third argument of the callback function are the x (= horizontal) and the y (= vertical) position of the mouse-pointer with the upper left corner of a window being the reference point (0,0).
The fourth argument is useful if you want to access additional information during a mouse event. CV_EVENT_FLAG_LBUTTON, CV_EVENT_FLAG_RBUTTON, and CV_EVENT_FLAG_MBUTTON indicate whether the user is holding down the corresponding button. This might be needed if you want to know if a button is pressed while the mouse pointer moves (e.g., drag and drop operations). CV_EVENT_FLAG_CTRLKEY, CV_EVENT_FLAG_SHIFTKEY, and CV_EVENT_FLAG_ALTKEY indicate whether the Ctrl, Shift, or Alt key is pressed during a mouse event.
The final argument is a void pointer for any additional information that will be needed. In the code example below I use this argument to obtain a pointer to the image the event handler is operating on.
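To make the role of the five arguments more tangible, here is a minimal sketch of a handler with the same signature as an OpenCV mouse callback. It compiles without the OpenCV headers because it uses local copies of the numeric values that the C API assigns to the event and flag constants; DragState and on_mouse are names invented for this illustration and are not part of the program that accompanies this article.

```cpp
// Local copies of the values OpenCV's C API assigns to these constants,
// so that the sketch compiles without the OpenCV headers.
enum { EVENT_MOUSEMOVE = 0, EVENT_LBUTTONDOWN = 1, EVENT_LBUTTONUP = 4 };
enum { FLAG_LBUTTON = 1, FLAG_CTRLKEY = 8 };

// Hypothetical state object handed to the callback through `param`.
struct DragState
{
    bool dragging = false;
    int start_x = 0, start_y = 0;  // where the left button went down
    int last_x = 0, last_y = 0;    // most recent position while dragging
    bool ctrl_held = false;        // Ctrl state during the last drag move
};

// Same signature as an OpenCV mouse callback.
void on_mouse(int event, int x, int y, int flags, void* param)
{
    DragState* state = static_cast<DragState*>(param);
    switch (event)
    {
    case EVENT_LBUTTONDOWN:
        state->dragging = true;
        state->start_x = state->last_x = x;
        state->start_y = state->last_y = y;
        break;
    case EVENT_MOUSEMOVE:
        // flags is a bit field, so single bits are tested with & rather than ==
        if (flags & FLAG_LBUTTON)
        {
            state->last_x = x;
            state->last_y = y;
            state->ctrl_held = (flags & FLAG_CTRLKEY) != 0;
        }
        break;
    case EVENT_LBUTTONUP:
        state->dragging = false;
        break;
    }
}
```

With OpenCV available, such a handler would be registered with something like cvSetMouseCallback("choose area", on_mouse, &state); so that the address of the state object arrives in param on every event.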
Using the code
As already mentioned above, I packed the OpenCV specific code into two classes. The first class contains some methods for image operations and the mouse handler. The second one inherits the methods of the image class but also contains code for processing videos. Please note that within a class a callback function and its variables have to be defined as static.
The program I wrote works on videos. It provides access to the video data; then it loads the first frame of the video and presents it in a window of its own. Mouse operations for which the program implements a handler are done on the image shown in this window.
The most important steps are:
- Capture video file with cvCreateFileCapture(); by invoking Get_Video_from_File(char* file_name); which I defined in the Video_OP class. Please find the contents of this method in the following code sample. It should also give you a “feeling” of how to use some OpenCV commands (like cvNamedWindow();, for instance).
bool Video_OP::Get_Video_from_File(char* file_name)
{
if(!file_name)
return false;
my_p_capture = cvCreateFileCapture(file_name);
if (!my_p_capture) return false;
this->my_grabbed_frame = cvQueryFrame(my_p_capture);
this->captured_size.width = (int)cvGetCaptureProperty(my_p_capture,
CV_CAP_PROP_FRAME_WIDTH);
this->captured_size.height = (int)cvGetCaptureProperty(my_p_capture,
CV_CAP_PROP_FRAME_HEIGHT);
cvNamedWindow("choose area",CV_WINDOW_AUTOSIZE);
cvShowImage("choose area",my_grabbed_frame);
this->my_total_frame = (int) cvGetCaptureProperty(my_p_capture,CV_CAP_PROP_FRAME_COUNT);
this->Set_Mouse_Callback_for_Image(this->my_grabbed_frame);
return true;
}
After capturing a frame and setting up the mouse callback, mouse events can be processed in the my_Mouse_Handler(); function of the program’s Image_OP class. The following code sample does not give a description of all possible mouse events. It only presents mouse events that are needed to draw a rectangle onto an image. Please note that the static variables for the static method my_Mouse_Handler(); have to be defined outside the class.
void Image_OP::my_Mouse_Handler(int events, int x, int y, int flags, void* param)
{
IplImage *img_orig;
IplImage *img_clone;
img_orig = (IplImage*) param;
int x_ROI =0, y_ROI =0 , wi_ROI =0, he_ROI =0;
switch(events)
{
case CV_EVENT_LBUTTONDOWN:
{
my_point = cvPoint(x, y);
}
break;
case CV_EVENT_MOUSEMOVE:
{
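if(flags & CV_EVENT_FLAG_LBUTTON) // flags is a bit field: test the bit instead of comparing with ==
{
img_clone = cvCloneImage(img_orig);
cvRectangle(img_clone, my_point,cvPoint(x,y),
CV_RGB(0,255,0),1,8,0);
cvShowImage("choose area",img_clone);
cvReleaseImage(&img_clone); // the window keeps its own copy, so the clone can be freed
}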
}
break;
case CV_EVENT_LBUTTONUP:
{
img_clone = cvCloneImage(img_orig);
if(my_point.x > x)
{
x_ROI = x;
wi_ROI = my_point.x - x;
}
else
{
x_ROI = my_point.x;
wi_ROI = x - my_point.x;
}
if(my_point.y > y)
{
y_ROI = y;
he_ROI = my_point.y - y;
}
else
{
y_ROI = my_point.y;
he_ROI = y - my_point.y;
}
my_ROI.x = x_ROI;
my_ROI.y = y_ROI;
my_ROI.width = wi_ROI;
my_ROI.height = he_ROI;
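cvSetImageROI(img_clone,cvRect(x_ROI,
y_ROI,wi_ROI, he_ROI));
cvNot(img_clone, img_clone); // invert the selected region to highlight it
cvResetImageROI(img_clone);
cvShowImage("choose area", img_clone);
cvReleaseImage(&img_clone); // free the clone once it has been displayed
}
break;
}
}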
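The if/else cascade in the CV_EVENT_LBUTTONUP branch turns two arbitrary corner points into a rectangle with a top-left origin and a non-negative size. The same logic can be written more compactly, as in the following sketch. Rect and rect_from_corners are names chosen for this illustration only; Rect merely mirrors the four fields of OpenCV's CvRect.

```cpp
#include <algorithm>
#include <cstdlib>

// Hypothetical plain rectangle type (CvRect has the same four fields).
struct Rect { int x, y, width, height; };

// Builds a rectangle from the point where the drag started (x0, y0)
// and the point where the button was released (x1, y1), regardless of
// the direction in which the user dragged the mouse.
Rect rect_from_corners(int x0, int y0, int x1, int y1)
{
    Rect r;
    r.x = std::min(x0, x1);        // left edge
    r.y = std::min(y0, y1);        // top edge
    r.width = std::abs(x1 - x0);
    r.height = std::abs(y1 - y0);
    return r;
}
```

Note that a plain click without any mouse movement yields a rectangle of width and height 0, a case that a program may want to check before using the rectangle as a region of interest.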
A selection of OpenCV functions for processing images
The second part of this tutorial mainly concerns some (mostly) simple OpenCV commands to process images. When using sophisticated methods (like optical flow; see the first part of this tutorial) to detect or trace motion, it often gives better results to “smooth” the images (or processed frames) first, in order to iron out outliers produced by noise and camera artifacts.
OpenCV offers five different basic smoothing operations, which can be invoked by the command cvSmooth(IplImage* source, IplImage* destination, int smooth_type, int param1 = 3, int param2 = 0, double param3 = 0, double param4 = 0);. The first two arguments represent the input and the output image. More interesting is the third parameter, which takes one of five different values (these also determine the meaning of the parameters param1 to param4). In the following part I give an overview of the possible values for parameter three. If you need more details, please consult a book (like ‘Learning OpenCV’ by Gary Bradski and Adrian Kaehler) or an expert article on this.
The smooth_type CV_BLUR, for instance, calculates the mean color values of all pixels within an area around a central pixel (area specified by param1 and param2).
CV_BLUR_NO_SCALE does the same as CV_BLUR, but there is no division to create an average.
CV_MEDIAN performs a similar operation, with the only exception that it calculates the median value over the specified area.
CV_GAUSSIAN is more complicated and does smoothing operations based on the Gaussian function (= normal distribution). param1 and param2 again define the area to which the algorithm is applied. param3 is the sigma value of the Gaussian function (it is calculated automatically if not specified). If a value for param4 is given, it is used as the sigma value in the vertical direction, and param3 then applies to the horizontal direction only.
CV_BILATERAL is similar to Gaussian smoothing, but weights more similar pixels more highly than less similar ones.
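The difference between the first three smoothing types can be illustrated on a single pixel window. The following sketch is not OpenCV's implementation; it uses a made-up Gray structure instead of IplImage and simply skips pixels outside the image (an assumption made here for brevity, OpenCV treats the border differently).

```cpp
#include <algorithm>
#include <vector>

// Row-major grayscale image; a hypothetical stand-in for IplImage.
struct Gray
{
    int width, height;
    std::vector<int> px;
    int at(int x, int y) const { return px[y * width + x]; }
};

// Collects the values of the size x size window centred on (cx, cy).
// Pixels outside the image are skipped.
std::vector<int> window(const Gray& img, int cx, int cy, int size)
{
    std::vector<int> values;
    const int r = size / 2;
    for (int y = cy - r; y <= cy + r; ++y)
        for (int x = cx - r; x <= cx + r; ++x)
            if (x >= 0 && x < img.width && y >= 0 && y < img.height)
                values.push_back(img.at(x, y));
    return values;
}

// Sum over the window without division (the idea behind CV_BLUR_NO_SCALE).
int window_sum(const Gray& img, int cx, int cy, int size)
{
    int sum = 0;
    for (int v : window(img, cx, cy, size))
        sum += v;
    return sum;
}

// Mean over the window (the idea behind CV_BLUR).
int window_mean(const Gray& img, int cx, int cy, int size)
{
    const int count = static_cast<int>(window(img, cx, cy, size).size());
    return window_sum(img, cx, cy, size) / count;
}

// Median over the window (the idea behind CV_MEDIAN).
int window_median(const Gray& img, int cx, int cy, int size)
{
    std::vector<int> values = window(img, cx, cy, size);
    const std::size_t mid = values.size() / 2;
    std::nth_element(values.begin(), values.begin() + mid, values.end());
    return values[mid];
}
```

For a 3x3 window in which eight pixels have the value 10 and the central pixel is a noisy 100, the sum is 180 and the mean is 20, whereas the median is 10. This shows why a median filter is often a good choice for removing isolated outliers.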
In the code samples that are part of this article, only one of the above “smoothing” functions is implemented. The method Blur(int square_size, IplImage*, IplImage*) (see image on top of page) of the Image_OP class carries out a simple blur based on the mean of a square area of pixels. How changing the size of this area affects the “blur” can be demonstrated by compiling the source code that comes with this tutorial. Just load a movie, select the option button “Blur” and move the bar of the trackbar control. Attention: If you intend to use other types of smoothing functions (like CV_GAUSSIAN), problems might occur, because such functions do not accept all values that are returned by the trackbar.
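One likely cause of the problems mentioned above is that the median and Gaussian modes expect an odd window size, whereas a trackbar reports every integer in its range, including 0 and even numbers. A small helper like the following could be used to sanitize the trackbar position before it is passed on; the name odd_kernel_size is invented for this sketch and does not appear in the accompanying source code.

```cpp
// Maps a raw trackbar position to the nearest positive odd window size
// that is not smaller than the position itself.
int odd_kernel_size(int trackbar_pos)
{
    if (trackbar_pos < 1)
        return 1;                      // smallest valid window
    return (trackbar_pos % 2 == 0) ? trackbar_pos + 1 : trackbar_pos;
}
```

A trackbar position of 4 would thus be turned into a window size of 5, while 7 would be passed on unchanged.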
Dilate and Erode
![](http://image60.360doc.com/DownloadImg/2013/04/1214/31585220_7.png)
Another way of removing noise from an image, but also of isolating or joining disparate regions, is based on dilation and erosion. For both kinds of transformations OpenCV offers corresponding functions (cvDilate() and cvErode()). These functions have a kernel (a small square or circle with an anchor point in the center of this area) running over an image. While this happens, the maximal (= dilation) or minimal (= erosion) pixel value under the kernel is computed and the pixel of the image under the anchor point is replaced by this maximum or minimum.
Because both functions perform similar tasks, they take the same arguments. For this reason I will discuss them for cvErode(IplImage* src, IplImage* dest, IplConvKernel* kernel = NULL, int iterations = 1); only. The first two arguments are the source and the destination image, the third argument is a pointer to an IplConvKernel structure, and the last argument is the number of iterations performed by the algorithm. Creating your own kernel using the IplConvKernel structure will not be discussed here; for this reason the standard kernel (a 3x3 square) will be used.
Again, both functions are implemented as methods of the Image_OP class and linked to the behavior of the main window’s trackbar control. Just load a video and click on the option button Erode (or on the button Dilate). Moving the bar of the trackbar will then change the parameter iter (= iterations) of the Image_OP::Erode() or the Image_OP::Dilate() method. Depending on which of the two options you have chosen, the images will show expanded dark regions (Erode) or expanded bright regions (Dilate).
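What the two operations do with the standard 3x3 kernel can be reproduced in a few lines of plain C++. The following is only a sketch of the principle, not the OpenCV implementation: Gray, morph3x3 and morph are hypothetical names, and pixels outside the image are simply ignored.

```cpp
#include <algorithm>
#include <vector>

// Row-major grayscale image; a hypothetical stand-in for IplImage.
struct Gray { int width, height; std::vector<int> px; };

// One pass of a 3x3 maximum (dilate) or minimum (erode) filter.
// Neighbours outside the image are ignored.
Gray morph3x3(const Gray& src, bool dilate)
{
    Gray dst = src;
    for (int y = 0; y < src.height; ++y)
        for (int x = 0; x < src.width; ++x)
        {
            int best = src.px[y * src.width + x];
            for (int dy = -1; dy <= 1; ++dy)
                for (int dx = -1; dx <= 1; ++dx)
                {
                    const int nx = x + dx, ny = y + dy;
                    if (nx < 0 || nx >= src.width || ny < 0 || ny >= src.height)
                        continue;
                    const int v = src.px[ny * src.width + nx];
                    best = dilate ? std::max(best, v) : std::min(best, v);
                }
            dst.px[y * src.width + x] = best;  // pixel under the anchor point
        }
    return dst;
}

// Repeats the filter `iterations` times, like the last argument of cvErode.
Gray morph(Gray img, bool dilate, int iterations)
{
    for (int i = 0; i < iterations; ++i)
        img = morph3x3(img, dilate);
    return img;
}
```

Applied to a black 5x5 image with a single white pixel in the center, one dilation grows the white spot to a 3x3 block and a second one fills the whole image, whereas a single erosion removes the spot completely.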
Drawing contours
![](http://image60.360doc.com/DownloadImg/2013/04/1214/31585220_8.png)
In this section I present some code that is able to extract the contours of images. In OpenCV, contours are represented as sequences of points that form a curve. To extract these point locations, OpenCV provides the function cvFindContours(IplImage*, CvMemStorage*, CvSeq**, int headerSize, CvContourRetrievalMode, CvChainApproxMethod).
The first argument should be an 8-bit single channel image that will be interpreted as a binary image (all nonzero pixels are 1). The second argument is a linked list of memory blocks that is used to handle dynamic memory allocation. The third argument represents a pointer to the linked list in which the found points (contours) are stored.
The next arguments are optional and will not be discussed here in great detail, because they are not used in the code sample. The fourth argument can be simply set to sizeof(CvContour). The fifth argument encompasses four options: CV_RETR_EXTERNAL = extracts only the extreme outer contours; CV_RETR_LIST = is the standard option and extracts all contours; CV_RETR_CCOMP = extracts contours and organizes them in a two level hierarchy; CV_RETR_TREE = produces a hierarchy of nested contours. The sixth argument determines how the contours are approximated (please look this up in a book on OpenCV).
The individual steps that need to be carried out to display the contours of an image can be found in the method Image_OP::Draw_Contours() (see below). Similar to methods discussed before, one of the method’s arguments (here: the first argument defining the threshold) is linked to the trackbar of the program’s main window.
Using the code
void Image_OP::Draw_Contours(int threshold, IplImage* orig_img, IplImage* manipulated_img)
{
CvMemStorage* mem_storage = cvCreateMemStorage(0);
CvSeq* contours = 0;
// cvFindContours needs an 8-bit single channel image
IplImage* gray_img = cvCreateImage(cvSize(orig_img->width,orig_img->height),
IPL_DEPTH_8U,1);
int found_contours = 0;
cvNamedWindow("contours only");
cvCvtColor(orig_img, gray_img, CV_RGB2GRAY);
// pixels above the threshold become 255, all others 0
cvThreshold(gray_img, gray_img, threshold, 255, CV_THRESH_BINARY);
found_contours = cvFindContours(gray_img, mem_storage, &contours);
cvZero(gray_img); // clear the image before the contours are drawn into it
if(contours)
{
cvDrawContours(gray_img, contours, cvScalarAll(255), cvScalarAll(255), 100);
}
cvShowImage("contours only", gray_img);
cvReleaseImage(&gray_img); // cvReleaseImage expects the address of the pointer
cvReleaseMemStorage(&mem_storage);
}
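The two preparatory ideas behind Draw_Contours(), thresholding and finding the pixels at which foreground meets background, can be tried out without OpenCV. The sketch below is a deliberately rough stand-in: it marks boundary pixels by inspecting the four direct neighbours and does not implement the border-following algorithm that cvFindContours uses, nor does it return ordered point sequences. Gray, threshold_binary and boundary_pixels are names made up for this illustration.

```cpp
#include <vector>

// Row-major grayscale image; a hypothetical stand-in for IplImage.
struct Gray { int width, height; std::vector<int> px; };

// Binary threshold in the spirit of CV_THRESH_BINARY:
// values above `thresh` become `max_value`, all others become 0.
Gray threshold_binary(const Gray& src, int thresh, int max_value)
{
    Gray dst = src;
    for (int& v : dst.px)
        v = (v > thresh) ? max_value : 0;
    return dst;
}

// Marks every foreground pixel that has at least one background pixel
// (or the image border) among its four direct neighbours.
Gray boundary_pixels(const Gray& bin)
{
    Gray out{bin.width, bin.height, std::vector<int>(bin.px.size(), 0)};
    auto is_set = [&bin](int x, int y) {
        if (x < 0 || x >= bin.width || y < 0 || y >= bin.height)
            return false;  // outside the image counts as background
        return bin.px[y * bin.width + x] != 0;
    };
    for (int y = 0; y < bin.height; ++y)
        for (int x = 0; x < bin.width; ++x)
        {
            const bool interior = is_set(x - 1, y) && is_set(x + 1, y) &&
                                  is_set(x, y - 1) && is_set(x, y + 1);
            if (is_set(x, y) && !interior)
                out.px[y * bin.width + x] = 255;
        }
    return out;
}
```

For a bright 3x3 square on a dark 5x5 background, thresholding yields nine foreground pixels, of which the eight outer ones are marked as boundary while the central pixel is not.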
Saving motion data as video file
Contents of this section are strongly linked to the contents presented in my previous tutorial on OpenCV. The basic structure for the code sample below can already be found there (see the Video_OP::Play_Video() method).
For this reason I keep the introduction to this topic very short. I just want to say some words on the FourCC notation, which was developed to identify data formats and is widely used to access AVI video codecs. The OpenCV macro CV_FOURCC provides this functionality and takes a four character code that denotes a particular codec (e.g., CV_FOURCC('D','I','V','X')). A prerequisite for applying CV_FOURCC successfully is, of course, that the corresponding video codec is installed on the machine you are using.
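Behind the scenes a FourCC is nothing more than four characters packed into one 32-bit integer, with the first character in the least significant byte. The following sketch reproduces this packing in plain C++ and also shows the reverse direction, which can be handy for printing a codec code in readable form. make_fourcc and fourcc_to_string are names chosen for this illustration; they are not OpenCV functions.

```cpp
#include <cstdint>
#include <string>

// Packs four characters into one 32-bit code, first character in the
// least significant byte (the layout the CV_FOURCC macro produces).
std::uint32_t make_fourcc(char c1, char c2, char c3, char c4)
{
    return  static_cast<std::uint32_t>(static_cast<unsigned char>(c1))
         | (static_cast<std::uint32_t>(static_cast<unsigned char>(c2)) << 8)
         | (static_cast<std::uint32_t>(static_cast<unsigned char>(c3)) << 16)
         | (static_cast<std::uint32_t>(static_cast<unsigned char>(c4)) << 24);
}

// Unpacks a 32-bit code into its four characters again.
std::string fourcc_to_string(std::uint32_t code)
{
    std::string s(4, ' ');
    for (int i = 0; i < 4; ++i)
        s[i] = static_cast<char>((code >> (8 * i)) & 0xFF);
    return s;
}
```

For example, make_fourcc('M','J','P','G') yields 0x47504A4D, and feeding this value to fourcc_to_string returns "MJPG" again.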
Using the code
- Capture video file by invoking this->Get_Video_from_File(char* file_name);
- Invoke Video_OP::Write_Video(int from, int to, char* path); (see code below).
- Create a video writer by invoking cvCreateVideoWriter(path, CV_FOURCC('M','J','P','G'), fps, size);
- Set up a loop to process successive frames (or images) of the video file.
- Grab frames by calling cvQueryFrame(CvCapture*);
- Add frames (= images) to the video file by calling cvWriteFrame(CvVideoWriter*, IplImage*);
- Define the delay of presentation by using cvWaitKey(int); (here: for demonstration purposes only)
void Video_OP::Write_Video(int from, int to, char* path)
{
this->my_on_off = true;
int key = 0;
int frame_counter = from;
int fps = this->Get_Frame_Rate();
if( fps <= 0 ) return; // avoids a division by zero in cvWaitKey(1000 / fps) below
cvNamedWindow( "write to avi", CV_WINDOW_AUTOSIZE );
this->Go_to_Frame(from);
// assumption: the output uses the frame size stored by Get_Video_from_File()
CvVideoWriter *video_writer = cvCreateVideoWriter(path,
CV_FOURCC('M','J','P','G'),fps,this->captured_size);
while(this->my_on_off == true && frame_counter <= to)
{
this->my_grabbed_frame = cvQueryFrame(this->my_p_capture);
if( !this->my_grabbed_frame ) break; // end of video reached
cvWriteFrame(video_writer,my_grabbed_frame);
cvShowImage( "write to avi" ,my_grabbed_frame);
frame_counter++;
key = cvWaitKey(1000 /fps); // delay between frames in milliseconds
if (key == 'q') break;
}
cvReleaseCapture( &my_p_capture );
cvDestroyWindow( "write to avi");
cvReleaseVideoWriter(&video_writer);
}
...
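Two small details of the loop above deserve some care: the delay handed to cvWaitKey and the range of frames that is written. A frame rate of 0 would lead to a division by zero, and a computed delay of 0 would make cvWaitKey wait for a key press indefinitely. The following helpers sketch one way of guarding against both; frame_delay_ms, FrameRange and clamp_range are hypothetical names, and the fallback of 40 ms (25 frames per second) is merely an assumed default.

```cpp
#include <algorithm>

// Delay in milliseconds between two frames. Falls back to `fallback_ms`
// when no usable frame rate is available and never returns 0, because
// a delay of 0 makes cvWaitKey block until a key is pressed.
int frame_delay_ms(int fps, int fallback_ms = 40)
{
    if (fps <= 0)
        return fallback_ms;
    return std::max(1, 1000 / fps);
}

// Inclusive range of frame indices; empty when from > to.
struct FrameRange { int from, to; };

// Restricts a requested range to the frames that actually exist.
FrameRange clamp_range(int from, int to, int total_frames)
{
    FrameRange r;
    r.from = std::max(0, from);
    r.to = std::min(to, total_frames - 1);
    return r;
}
```

With these helpers, a video reported at 25 frames per second results in a delay of 40 ms, and a request for the frames -5 to 500 of a video with 100 frames is reduced to the frames 0 to 99.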
Additional points of interest
Most of the methods and operations that have been introduced here can be used in combination. This means that image operations that will be performed on the first frame of a video file can be confined to the region that has been selected with the mouse. In addition, these manipulations will be applied to all frames of a video if you click on the button ‘GO’ of the program’s main window.
There are methods in the source code files that have not been discussed here. For example, the Video_OP class contains a method that turns a movie into single images and a method that does quite the opposite, namely turning single images into a movie. If you try to do the latter, you will also find some code that demonstrates how to retrieve the files of a folder by invoking the Win32 API functions FindFirstFile() and FindNextFile().
OpenCV offers its own code to create a trackbar (or a slider) and to set up a message handler for it. I preferred to use the Win32 GUI trackbars instead, because it seemed more convenient to me. Still, you find some code in the source code files that shows how to use OpenCV’s own trackbar control. As a side issue the program and its source code files also demonstrate how buttons, sliders, textfields, and option buttons can be placed onto a window and used in a Win32 program.
There is no guarantee that the presented code is bug-free (and not all exceptions are handled), but I hope it is helpful for somebody who is looking for guidance on the topics discussed here.