Building the OpenMP* Version

To build the OpenMP* version, you will modify the sample application to use OpenMP* parallelization and then compile the modified code. You will then run the application and compare the time with the baseline performance time.

  1. Remove all of the files that were created when you built the serial version by running the following command in a terminal session:

    %make clean

  2. Open the source file src/build_with_openmp/build_with_openmp.cpp in your favorite code editor.
  3. Do the following in the draw_task function:

    • Remove the comment marks from the #pragma omp for nowait schedule(dynamic) line to distribute the for loop iterations across the team of threads. With the schedule(dynamic) clause, each thread receives a small number of loop iterations, and when it finishes them it takes another small batch. This balances the load so that each thread remains busy, which improves performance. The nowait clause removes the implicit barrier at the end of the for loop.

    • Remove the comment marks from the line if ... continue; inside the parallel region. The line prevents other threads from doing work when the video is stopped.

    • Add comment marks to the line if ... return; inside the parallel region. A return statement is not allowed there because code cannot branch out of an OpenMP* parallel region.

    • Remove the comment marks from the line video->next_frame(); at the end of the parallel region. This line replaces the if ... return; line inside the parallel region that you commented out.

  4. In the thread_trace function, remove the comment marks from the #pragma omp parallel line to create a parallel region in which the draw_task function runs.

  5. Build the sample by running the following command in a terminal session:

    %make openmp

  6. Run the sample application.

Compare the time to render the image to the baseline performance time.
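The structure that steps 3 and 4 enable can be sketched in isolation. The example below is not the sample's code: render_row, draw_rows, and render are hypothetical names that stand in for the per-row work, the draw_task function, and the thread_trace function. It only illustrates how a worksharing loop in one function runs inside a parallel region created by its caller, and why continue is used instead of return.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the work done on one row of the image.
static int render_row(int y) { return y * y; }

// Plays the role of draw_task: a worksharing loop that expects to be
// called from inside a parallel region created by the caller.
void draw_rows(std::vector<int>& image, bool stopped) {
    const int rows = static_cast<int>(image.size());
    // Rows are handed out in small batches; nowait drops the barrier
    // that would otherwise follow the loop.
    #pragma omp for nowait schedule(dynamic)
    for (int y = 0; y < rows; ++y) {
        if (stopped) continue;      // skip the work; a return here would branch out of the loop
        image[y] = render_row(y);   // each iteration writes its own element
    }
}

// Plays the role of thread_trace: creates the team of threads.
std::vector<int> render(int rows, bool stopped = false) {
    assert(rows >= 0);
    std::vector<int> image(static_cast<std::size_t>(rows), -1);  // -1 marks a row never drawn
    #pragma omp parallel
    {
        draw_rows(image, stopped);
    }   // implicit barrier: all rows are complete past this point
    return image;
}
```

Calling render(8) fills every element regardless of how many threads share the loop. If the code is compiled without OpenMP* support, the pragmas are ignored and the loop runs serially with the same result.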

If you wish to set the number of threads explicitly, set the environment variable OMP_NUM_THREADS=N, where N is the number of threads. Alternatively, call the function void omp_set_num_threads(int nthreads), which is declared in omp.h. Make sure to call this function before the parallel region begins.
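As a minimal sketch of the function-call approach, the example below requests a team size and then reports how many threads the next parallel region actually used. The helper name team_size_with is hypothetical and is not part of the sample. The _OPENMP guard is an assumption added so that the code also compiles when OpenMP* support is disabled.

```cpp
#include <cassert>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: request a team size, then return the size that
// the following parallel region actually ran with.
int team_size_with(int requested) {
    assert(requested > 0);
    int team = 1;                        // reported when built without OpenMP support
#ifdef _OPENMP
    omp_set_num_threads(requested);      // called before the parallel region starts
    #pragma omp parallel
    {
        #pragma omp single
        team = omp_get_num_threads();    // written by exactly one thread
    }
#endif
    return team;
}
```

The runtime may provide fewer threads than requested, so treat the returned value as an upper-bounded result, not a guarantee. A call to omp_set_num_threads takes precedence over OMP_NUM_THREADS for the parallel regions that follow it.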

Options that use OpenMP* are available for both Intel and non-Intel microprocessors, but these options may perform more optimizations on Intel® microprocessors than they perform on non-Intel microprocessors. The list of major, user-visible OpenMP* constructs and features that may perform differently on Intel versus non-Intel microprocessors includes: