Recall from the previous section that our parallelDiffusion program with parameters 8000 1000 100 produced the following:
This is process: 0
After 100 time steps some results are:
actual u[1000][10] = 0.874806
computed u[1000][10] = 0.873978
This is process: 0
After 100 time steps some results are:
actual u[2000][10] = 0.874775
computed u[2000][10] = 0.848010
But wait! We've got a problem. The computed result at u[1000][10] is just fine: it agrees with the actual value and with the serialDiffusion result. But the computed result at u[2000][10] deviates from the actual value, and from the serialDiffusion program's output, by more than 3%. Something's wrong...
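To put numbers on "just fine" versus "more than 3%", here is a minimal sketch of a helper that computes the relative deviation of a computed value from a reference value. The function name percent_deviation is our own illustration and is not part of parallelDiffusion; it also assumes the reference value is nonzero.

```c
/* Hypothetical helper, not part of the tutorial's parallelDiffusion source.
 * Returns how far `computed` is from `reference`, as a percentage of
 * `reference`. Assumes `reference` is nonzero. */
double percent_deviation(double reference, double computed)
{
    double diff = computed - reference;
    if (diff < 0.0) {
        diff = -diff;            /* absolute difference */
    }
    if (reference < 0.0) {
        reference = -reference;  /* magnitude of the reference value */
    }
    return 100.0 * diff / reference;
}
```

Called with the values printed above, percent_deviation(0.874806, 0.873978) comes out under 0.1%, while percent_deviation(0.874775, 0.848010) comes out a little over 3%, which is the discrepancy we want to track down.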
This calls for debugging; we'll get to that in a second, when we try out TotalView. But since the ITA trace file parallelDiffusion.stf has already been created, let's have a look at it.
In your ~/MPI_Tutorial/ParallelDiffusion/ directory there should be a file named parallelDiffusion.stf. ITC created this trace file automatically as parallelDiffusion executed, just as it did in our previous examples.
Exercise: Do this!
As you did before, open the ITA trace file by entering:
[agopu@bc81 agopu]$ cd ~/MPI_Tutorial/ParallelDiffusion
[agopu@bc81 ParallelDiffusion]$ traceanalyzer parallelDiffusion.stf &

You will first see the ITA initial timeline for the parallelDiffusion program run on four processors. Follow the instructions explained previously in the Initial ITA timeline and whole trace and Detailed ITA timeline sections to open up the detailed ITA timeline; you should see something similar to what is shown below:
The above screen shot of the timeline is much more crowded than the ones we saw in our earlier examples. We had specified 100 time steps in this run, and you can see 100 interprocessor communication events (the vertical black lines).
Exercise: Do this!
Now try zooming in on a portion of the timeline. Point the mouse at one of the processor bars, left-click, drag horizontally across a small portion, and release. We recommend that you pick a region between two of the vertical lines, then zoom in a bit further so that you can see details at the millisecond scale. You should see an expanded timeline for the portion you chose. It should look similar to the screen shot below:
You can see that most of the time is spent on computation (the green portions of the bars, which is very good). The rest goes to MPI_Send and MPI_Recv communications between processes (the red portions of the bars, where less is better), which exchange shadow point values.
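If the idea of a shadow point exchange is hazy, the following serial sketch may help. It is not the parallelDiffusion source and it does not call MPI: two plain structs stand in for two neighbouring processes, and a simple copy stands in for the MPI_Send and MPI_Recv pair. The names Subdomain, LOCAL_N and exchange_shadow_points are hypothetical, and the layout (one-dimensional, one shadow point at each end) is an assumption made for brevity.

```c
#define LOCAL_N 4  /* hypothetical number of interior points per process */

/* Each simulated process owns interior points u[1] .. u[LOCAL_N].
 * u[0] and u[LOCAL_N + 1] are shadow points: copies of the neighbour's
 * nearest interior value, needed to update the points at the edges. */
typedef struct {
    double u[LOCAL_N + 2];
} Subdomain;

/* Serial stand-in for the exchange between two neighbouring processes.
 * In an MPI program each assignment would be a send on one process
 * matched by a receive on the other. */
void exchange_shadow_points(Subdomain *left, Subdomain *right)
{
    /* left's last interior point becomes right's left shadow point */
    right->u[0] = left->u[LOCAL_N];
    /* right's first interior point becomes left's right shadow point */
    left->u[LOCAL_N + 1] = right->u[1];
}
```

One such exchange per time step is what produces the regularly spaced communication lines in the timeline; the interior values themselves are never overwritten by the exchange.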
Exercise: Do this!
Try to play around with the interval that's profiled by using your mouse (i.e. zoom into a small interval of time to see how the trace looks).
Hopefully you are done exploring ITA. So let's turn now to TotalView to try to unearth the source of our inaccuracy.