Thursday, October 31, 2019

GPU and CUDA explorations with OpenCV


Bscancompute
    While loop 9.01159 sec.
    While loop 12.5376 sec.
    While loop 13.7008 sec.

2nd run
    While loop 9.03921 sec.
    While loop 8.88572 sec.
    While loop 8.32227 sec.

Test without FFT:
    Before while loop 0.0226007 sec.
    While loop 3.82644 sec.
    While loop 3.6609 sec.
    While loop 3.07903 sec.

                              • A simple sample program, samplecudadft, which performs a 2D DFT and its inverse, gave the following results when run more than once:
                                480x360, FFT and inverse FFT
                                DFT and inverse, with upload/dl 0.255261 sec.
                                (after running several times).

                                2400x360,
                                DFT and inverse, with upload/dl 0.261917 sec.

                                CPU
                                E:\OCT\opencvcuda\OpencvCuda\x64\Release>samplecudadft
                                DFT and inverse, on CPU 0.0144597 sec.
                                DFT and inverse, with upload/dl 0.260429 sec.
                              • The bottleneck seems to be FFT "planning" rather than upload/download.

                                E:\OCT\opencvcuda\OpencvCuda\x64\Release>samplecudadft
                                DFT and inverse, on CPU 0.0146046 sec.
                                DFT and inverse, with upload/dl 0.263986 sec.
                                DFT and inverse, without upload/dl 0.260521 sec.
                                and running it a 2nd time,
                                DFT and inverse, 2nd time without upload/dl 0.00618453 sec.
                              • Even when called as a function, the DFT and inverse are quite fast after the first call.
                                DFT and inverse, as a function 0.00678208 sec.
                                DFT and inverse, as a function 0.00605592 sec.
                                DFT and inverse, as a function 0.00654344 sec.
                                DFT and inverse, as a function 0.00608254 sec.
                                DFT and inverse, as a function 0.00651554 sec.
                                DFT and inverse, as a function 0.00623649 sec.
                                DFT and inverse, as a function 0.00607195 sec.
                                DFT and inverse, as a function 0.0063154 sec.
                                DFT and inverse, as a function 0.00597894 sec.
                                DFT and inverse, as a function 0.00685649 sec.
                                DFT and inverse, as a function 0.00608895 sec.
                              • Will probably need to optimize based on
                                https://docs.opencv.org/master/dd/d3d/tutorial_gpu_basics_similarity.html
                              • Currently, with upload/download and variable assignment not yet optimized:
                                on CPU,
                                While loop 9.15065 sec.
                                on GPU,
                                While loop 9.34765 sec.
                                where variable initialization and the 64-bit to 32-bit conversion and back were included in the timing.

                                More importantly, the Bscan image reveals a bug.
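
The samplecudadft timings above differ sharply between the first call and later calls, so it may help to time each call separately rather than averaging them. The following is a minimal sketch of that idea using only the C++ standard library. The names timeEachCall, TimingSummary and summarize are hypothetical helpers, not part of samplecudadft or OpenCV, and the sketch assumes the timed function returns only after its work has actually finished (for GPU work, that would mean waiting for completion inside the function).

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: time every call of `work` on its own, so that a slow
// first call (one-time setup) is not averaged away with the later calls.
// Assumption: `work` blocks until its work is complete.
template <typename F>
std::vector<double> timeEachCall(F&& work, std::size_t calls) {
    std::vector<double> seconds;
    seconds.reserve(calls);
    for (std::size_t i = 0; i < calls; ++i) {
        const auto start = std::chrono::steady_clock::now();
        work();
        const auto stop = std::chrono::steady_clock::now();
        seconds.push_back(std::chrono::duration<double>(stop - start).count());
    }
    return seconds;
}

struct TimingSummary {
    double firstCall;    // first call, including any one-time setup
    double steadyState;  // mean of all calls after the first
    double oneTimeCost;  // firstCall minus steadyState
};

// Split a list of per-call timings into first-call and steady-state parts.
// Needs at least two timings, otherwise there is no steady state to compare.
TimingSummary summarize(const std::vector<double>& seconds) {
    assert(seconds.size() >= 2);
    const double rest =
        std::accumulate(seconds.begin() + 1, seconds.end(), 0.0);
    const double steady = rest / static_cast<double>(seconds.size() - 1);
    return {seconds.front(), steady, seconds.front() - steady};
}
```

Fed the two "without upload/dl" timings reported above (0.260521 sec for the first call, 0.00618453 sec for the second), summarize gives a one-time cost of about 0.254 sec. By comparison, the difference between the "with upload/dl" and "without upload/dl" timings in the same run is 0.263986 - 0.260521, or about 0.0035 sec, which is consistent with the observation that upload/download is not the bottleneck.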

                              2 comments:

                              1. How did you manage to actually install the binary from james? I am confused. I downloaded the file from the downloads section but I have no idea what to do with it.

                              2. "binary from james" - Installation of pre-built OpenCV on Windows is described at https://docs.opencv.org/master/d3/d52/tutorial_windows_install.html
