Saturday, February 04, 2017

Compiling OpenCV with CUDA (GPU) using Visual Studio


I have a tendency to choose the exact wrong thing every time when given a choice; I have sometimes wondered why.  A good thing with doing things almost wrong is that you get to learn about things.I have a feeling that doing things wrong and getting feedback and correcting is somehow fundamental in the way learning process happens.

I usually start learning a technology or language by jumping right in doing things wrong and learning on the way; if you are like me, then this will save you some time and some hair pulling.

Before we start, just a very short introduciton into the why part,trust me just the bare essentials.

OpenCV operates on images , which in computers (at least the ones we have now) is stored a pixel matrices. Various algorithms that opencv provides, for example for object detection for example does a lot of matrix operations. These operations are 'embarrassingly parallel' -data parallel and could be speeded if executed in the GPU.

Now NVDIA GPU have an  parallel programming  API called CUDA which can help in speeding up matrix multiplication. And OpenCV has support for the same; to use it however you need to compile OpenCV with CUDA. CUDA is NVDIA proprietary and it would work with only NVDIA  GPUs.

There is an open API which should work with different typed of GPU cards and that is OpenCL. However it may not be that  tuned for a particular card through. OpenCV has support for OpenCL too; however we will for now use CUDA.

Finally one more thing; CUDA uses BLAS libraries. The CUDA SDK provided by NVDIA has the cublas libraries for it. Don't ask me why I chose to compile OpenBLAS for it; as I said before CMake gives a lot of choices and if you don't know as much as above, you are sure to do some totally unnecessary but very instructive things.

Okay now to to the how;on Windows

First check if your PC or laptop has an NVDIA card. The easiest to do is via dxdiag windows utility


Now see if your card supports CUDA. There is a good utility from TechPowerUp GPU-Z that  will show this information among other like GPU load etc; which will help later to see if the programs are really using the GPU; Or you could check the NVDIA website for the Card and see if it supports; I guess most cards do; or you could check the very detailed page in wikipedia which lists the various generations of the processors  https://en.wikipedia.org/wiki/CUDA



Next step is to download the CUDA SDK from NVDIA.https://developer.nvidia.com/cuda-downloads; If you have a 64 bit system download the 64 bit SDK. Choose defaults and install it.

Then download the OpenCV source code from GIT and download CMake tool. You need to download MS Visual Studio Community edition for C++ compiler.

The main thing in correct compilation is to choose the right settings in CMake; First these are the minimum WITH variables needed to be configured



Miss few or mess with few and you will have lot of errors coming. I tried guessing and removed and got lot of errors while running the program. WITH_CUDA is madatory; If you need to see the videos image in GUI make sure to select WIN32UI and FFMPEG. I am still not sure if some are needed or why they are here. Please don't feel appaled; I learn this way; I have no clue initially and I learn to figure it out the hard way. It is something to do with being stupid.Why I removed the defaults was to cut down on the compile time from better half of the day to something more reasonable.

 Then I found that the best way to reduce the compile time was to limit the architecture to the number I though the GPU card was supporting. In my case for GeForce GT 720M card in the CUDA wiki page the architecture code name was Fermi and compute capability was given as 2.1 . That did not work; so I gave 2.0 and I found compile time decreased considerably.


After that you Configure and make sure you select the 64 bit Visual Studio Compiler. Select 32 bit or do some other mistake and you will be led to lot of Configurations erros



If that is the case CMake will automatically select the 64 bit libraries from the CUDA SDK. Else it will try to take the 32 bit libraies and you may get configuration error about BLAS



With that you may be able to compile your OpenCV program . Note that when I used the default ARCH_BIN setting which goes all the way from 1 to 5 I got some linker errors -

Severity Code Description Project File Line Suppression State
Error LNK2019 unresolved external symbol __cudaRegisterLinkedBinary_54_tmpxft_000028d8_00000000_15_gpu_mat_compute_37_cpp1_ii_71482d89 referenced in function "void __cdecl __sti____cudaRegisterAll_54_tmpxft_000028d8_00000000_15_gpu_mat_compute_37_cpp1_ii_71482d89(void)" (?__sti____cudaRegisterAll_54_tmpxft_000028d8_00000000_15_gpu_mat_compute_37_cpp1_ii_71482d89@@YAXXZ) opencv_core D:\build\opencv2\modules\core\cuda_compile_generated_gpu_mat.cu.obj 1


For your program using  the above built OpenCV usually most of the libraries  given below are needed. If your build of OpenCV is proper you would get these many dlls in the output folder. If some are missing try to build it from Visual Studio


opencv_calib3d320.lib
opencv_core320.lib
opencv_features2d320.lib
opencv_flann320.lib
opencv_highgui320.lib
opencv_imgcodecs320.lib
opencv_imgproc320.lib
opencv_ml320.lib
opencv_objdetect320.lib
opencv_shape320.lib
opencv_ts320.lib
opencv_video320.lib
opencv_videoio320.lib
opencv_cudaimgproc320.lib
opencv_cudaarithm320.lib
opencv_cudabgsegm320.lib
opencv_cudacodec320.lib
opencv_cudaimgproc320.lib
opencv_cudalegacy320.lib
opencv_cudaobjdetect320.lib
opencv_cudawarping320.lib
opencv_cudev320.lib
opencv_cudafilters320.lib

If you get include errors see the link

http://answers.opencv.org/question/29885/does-opencv_moduleshpp-exist-in-opencv248/

Finally check with GPU-Z and see if running the program is really using the GPU


Note for building your OpenCV solutions (1) using these libs and the following headers have to be added to OpenCV

(1) People detection example - https://gist.github.com/alexcpn/aeb8a4b8304639d8f91cc2fbc0c1c7df

Include Directories

C/C++ --> General --> Additional Include Directories

- D:\opencv\modules\calib3d\include;D:\opencv\modules\videoio\include;D:\opencv\modules\video\include;D:\opencv\modules\imgcodecs\include;D:\opencv\modules\cudaoptflow\include;D:\opencv\modules\cudastereo\include;D:\build\opencv4;d:\opencv\modules\core\include;D:\opencv\modules\cudawarping\include;D:\opencv\include;D:\opencv\modules\cudaobjdetect\include;D:\opencv\modules\cudaimgproc\include;D:\opencv\modules\imgproc\include;D:\opencv\modules\highgui\include;D:\opencv\modules\objdetect\include;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\include

Note opencv2/opencv_modules.hpp is from the opencv build folder d:\build\opencv4\opencv2
and not from opencv git source; This dir should be in include path
 (d:\build\opencv4\ is the output directory specified in CMake)




Libs
Linker-->Input--> Additional Dependencies --> opencv_calib3d320.lib;opencv_core320.lib;opencv_features2d320.lib;opencv_flann320.lib;opencv_highgui320.lib;opencv_imgcodecs320.lib;opencv_imgproc320.lib;opencv_ml320.lib;opencv_objdetect320.lib;opencv_shape320.lib;opencv_ts320.lib;opencv_video320.lib;opencv_videoio320.lib;opencv_cudaimgproc320.lib;opencv_cudaarithm320.lib;opencv_cudabgsegm320.lib;opencv_cudacodec320.lib;opencv_cudalegacy320.lib;opencv_cudaobjdetect320.lib;opencv_cudawarping320.lib;opencv_cudev320.lib;opencv_cudafilters320.lib;%(AdditionalDependencies)

Lib Directories : - D:\build\opencv4\lib\Release

Here is what I did to install the latest OpenCV in an x86 664 bit machine running Ubuntu

     sudo apt-get install -y build-essential cmake
//video codecs; these many are not given in opencv site but got this from some other blog; I am not sure what is the bare minimum
     sudo apt-get install -y libdc1394-22-dev libavcodec-dev libavformat-dev libswscale-dev libtheora-dev libvorbis-dev libxvidcore-dev libx264-dev yasm libopencore-amrnb-dev libopencore-amrwb-dev libv4l-dev libxine2-dev
     sudo apt-get install -y libtbb-dev libeigen3-dev
     sudo apt-get install libavformat-dev libswscale-dev

    //  The below should be done at the begining; I did not do this and got some broken package error above; so did it; you learn the hard way :)
     sudo apt-get -y update  
     sudo apt-get -y upgrade
     sudo apt-get -y autoremove

     sudo apt-get install -y libtbb-dev libeigen3-dev
     sudo apt-get install cmake git libgtk2.0-dev pkg-config libavcodec-dev libavformat-dev libswscale-dev
     sudo apt-get install libtbb2 libtbb-dev libjpeg-dev libpng-dev libtiff-dev libjasper-dev libdc1394-22-dev
     sudo apt-get install -y qt5-default
     sudo apt-get install -y zlib1g-dev libjpeg-dev libwebp-dev libpng-dev libtiff5-dev libjasper-dev libopenexr-dev libgdal-dev




No comments:

Total Pageviews