Home | Programming | C++ | BOOST | 01
RSS Githublogo Youtubelogo higgs.at kovacs.cf

without multithreading

With Boost Thread multithreaded programs can be made. A program which is not written for multithreads is always running seriell. Every step is done after the previous one is completely done. If a program is optimized for multi threads the program runs (partially) parallel.

This means we can break the program into steps and every step is done in a single thread. The threads are executed on different CPUs.

Not every problem can be broken down into threads and therefore not every program can be threaded. Running a multithread program also requires a hardware which supports this(multicore CPU).

The program below calculates prime numbers between 0 and 10000000. It consists of a function called calc_prime() which takes the number we want to check if it is a prime number. The calculation in the function should be straight forward (with the modulo-operator). If the number is a prime it will be printed.

#include <iostream>
#include <math.h>

int calc_prime(int prime_to_test){
     bool prime_flag = true;
     float prim_divide = 0;

     for (int t1 = 2 ; t1 <= sqrt(prime_to_test) ; t1++){
          prim_divide = prime_to_test%t1;

          if (prim_divide == 0 and t1 != prime_to_test){
               prime_flag = false;
               break;
          }
     }

     if (prime_flag == true){
          std::cout << prime_to_test << std::endl;
     }
}

int main(int argc, char *argv[]){
     int max_prime_number = 10000000;

     for (int t2 = 2 ; t2 < max_prime_number ; t2++){
          calc_prime(t2);
     }

     return 0;
}

In the main function the calc_prime() will be called for each number until the limit(max_prime_number) is reached.



with multithreading

While the code above is not using (multi)threads, the code below is:
Fist we will include the boost/thread libraries with #include <boost/thread.hpp >.

The function calc_prime() is the same as above!

In addition we will use a second function called void prime_threads(int tr_num, int max_prime_n, int numCPU) . This function will take three arguments:

tr_num : The number of the thread
max_prime_n : Highest number we want to check if it is prime
numCPU : Number of cores in the CPU

Inside of the function prime_threads the numbers will be checked if they are prime.
#include <boost/thread.hpp>
#include <iostream>
#include <math.h>

void wait(int seconds)
{
     boost::this_thread::sleep(boost::posix_time::seconds(seconds));
}

int calc_prime(int prime_to_test){
     bool prime_flag = true;
     float prim_divide = 0;

     for (int t1 = 2 ; t1 <= sqrt(prime_to_test) ; t1++){
          prim_divide = prime_to_test%t1;

          if (prim_divide == 0 and t1 != prime_to_test){
               prime_flag = false;
               break;
          }
     }

     if (prime_flag == true){
          std::cout << prime_to_test << std::endl;
     }
}

void prime_threads(int tr_num, int max_prime_n, int numCPU)
{
     for (int act_test_num = tr_num ; act_test_num <= max_prime_n ; act_test_num+=numCPU){
          if (act_test_num != 1)
               calc_prime(act_test_num);
     }
}

int main(int argc, char *argv[]){

     int numCPU = sysconf( _SC_NPROCESSORS_ONLN );
     int max_prime_n = 10000000;

     boost::thread tread1(prime_threads, 1, max_prime_n, numCPU);
     boost::thread tread2(prime_threads, 2, max_prime_n, numCPU);
     boost::thread tread3(prime_threads, 3, max_prime_n, numCPU);
     boost::thread tread4(prime_threads, 4, max_prime_n, numCPU);

     tread1.join();
     tread2.join();
     tread3.join();
     tread4.join();

     return 0;
}

In the main function we will determine, how many CPUs we can use with the command sysconf( _SC_NPROCESSORS_ONLN )

I had 4 CPUs available. Then we set max_prime_n to 10000000. We want to find all primes between 0 and 10000000.

Then the important (multithreading) part comes:
We will create four threads (by hand now) with the constructor boost::thread treadn where n goes from 1 to 4. We will start as many threads(Every thread needs to have a different name) as CPUs we got available! The constructor takes the name of the function we want to start in this thread, the number 1 to 4, max_prime_n and the amount of CPUs we have.

Every thread recieves a number (1,2,3 or 4). In the function prime_threads every thread starts searching primes beginning at this number. The next number checked for prime is act_test_num+=numCPU. With this we segmented the prime search! Each of the four threads is now running at the same time.

At last the command tread.join(); tells the main function to exit only if this thread is done. This command is called for every thread because we dont want the main function to exit before any thread is done!

The code without multithreading checkes every number from 0 to 10000000 - one number after all!

The multithreaded program now has four threads:

The table below illustrates the four threads and which number every thread is checking for being prime.

Thread Number checked by the tread
thread1  1  5  9 13 17
thread2  2  6 10 14 18
thread3  3  7 11 15 19
thread4  4  8 12 16 20


Without multithreading the calculation is like below:
Thread Number checked by the tread
thread1  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20

It is like we have only one thread!

If it needs more time to set up the threads than running the program without threads the compiler is bright enough to skip the threading. Therefore the number max_prime_n should be high enough :)

Under Linux the program can be compiled with g++ prime_thread.cpp -lboost_system -pthread -lboost_thread-mt and executed with ./a.out

With the command time ./a.out the running time of the program can be tracked. The picture below shows the arithmetic mean of five runs for different amount of threads.

prime_thread_5r_e.png"

The maximum time saved can be achieved with three threads. More than three threads does not speed the calculation up because more threads can't be executed on the CPUs.

Why the significant decline of running time at ~3 threads when there are 4 Cores in the CPU?

Because the OS needs some resources!

Despite this method not being the best to calculate primes, the object of multithreading can be shown quite good :)

The examples above are not optimized completely. They should give a short overview how to use the library BOOST thread and how to set up a multithread program (which runs on a multicore CPU).