User Tools

Site Tools


cs415pdc:lab3

Lab 3 - Pthreads

The programs for this lab estimate PI using the Maclaurin series, which is an infinite series approximation typically covered in calculus. The programs compute the estimate by summing the first n terms in the series where n is a command-line argument (along with the number of threads to use). Typically the programs are run like so:

program_name number_threads n

Unless otherwise specified, use n=100000000.

For all these, you need timer.h

Thread overhead

Before getting into these programs, let's see how much extra work (overhead) is required to create threads in the first place. To find out:

  1. Compile pth_do_nothing.c as described in the file header comment.
  2. Run the program on one thread: ./pth_do_nothing 1 and record the result
  3. Repeat the run for 2, 4, and 8 threads, then try 64, 128, and 512 threads. Record the results; label each result with the process count
Why oh why? Detecting race conditions
  1. Compile pth_pi.c as described in the file
  2. Run the program with 8 threads and n=100000, again but with n=1000000, and again but with n=10000000. The estimate should improve when computing more terms. Does it?
  3. Run the program with 8 threads and n=100000000 terms, record the parallel result.
  4. Do this run 5 more times. Is the parallel result consistent across all runs? Why not?

These results are consistent with a race condition - the threads are stomping all over each others computations. The next two programs attempt to fix this by establishing and enforcing a critical section.

Critical section through busy-waiting
  1. Compile both pth_pi_busy1.c and pth_pi_busy2.c
  2. Run pth_pi_busy1 with 2 threads and n=100000000 terms 3 times. Are the results consistent? How much faster (or slower) is this program than the serial (single-threaded) version?
  3. Run pth_pi_busy2 with 2 threads and n=100000000 terms. How much faster (or slower) is this program than pth_pi_busy1?
  4. Run pth_pi_busy2 for 2, 4, and 8 threads, then try 64, 128, and 512 threads. Record the results; label each result with the process count.
  5. Compute speedup for the last 6 runs, noting specifically what happens when the number of threads is greater than the number of cores.
Critical section through mutual exclusion
  1. Do the same runs as pth_pi_busy2, recording runtime and estimate.
  2. Are the estimates significantly better or worse? Give the worst error (= difference between estimate and computed pi value, to 3 non-zero significant digits).
  3. For the last 6, what is your speedup?
cs415pdc/lab3.txt · Last modified: 2017/10/17 15:25 by scarl