ARC3D is one of the Perfect Club Benchmark codes. The code applies an implicit solution procedure along the lines of the grid in each of the three dimensions. For this 1-D parallelization, the solves in two of the three dimensions require no communication; the solve in the third dimension operates across the data partition and therefore requires a pipeline across the processor topology. The nature of the code and the communication startup latency of the hardware platforms can degrade parallel performance, so results are shown both with and without this pipeline. This code section could be optimized further either by increasing the number of parallel pipelines or by transposing the program data before the pipeline section; both would improve efficiency.
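The dependency that forces the pipeline can be sketched in a few lines. This is a minimal illustration, not the ARC3D source: it assumes that each implicit line solve reduces to a scalar tridiagonal system solved with the Thomas algorithm (the systems in the real code may be wider-banded or block-structured), and the names forward_block, backward_block and pipelined_line_solve are hypothetical. The "processors" are simulated sequentially; the values handed from one block to the next stand in for the messages a real run would send.

```python
def forward_block(a, b, c, d, lo, hi, incoming):
    """Forward elimination on rows lo..hi-1 owned by one processor.

    `incoming` is the (c', d') pair of row lo-1, i.e. the message from the
    upstream processor; the first processor starts from (0.0, 0.0).
    """
    cp_prev, dp_prev = incoming
    cp, dp = [], []
    for i in range(lo, hi):
        m = b[i] - a[i] * cp_prev
        cp_prev = c[i] / m
        dp_prev = (d[i] - a[i] * dp_prev) / m
        cp.append(cp_prev)
        dp.append(dp_prev)
    # The last pair is what would be sent to the downstream processor.
    return cp, dp, (cp_prev, dp_prev)


def backward_block(cp, dp, x_next):
    """Back substitution on one block; x_next comes from the downstream block."""
    x = [0.0] * len(cp)
    for k in range(len(cp) - 1, -1, -1):
        x_next = dp[k] - cp[k] * x_next
        x[k] = x_next
    # The first unknown of this block is what the upstream block needs.
    return x, x_next


def pipelined_line_solve(a, b, c, d, n_procs):
    """Solve a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i] for one grid line
    whose rows are split into n_procs contiguous blocks (a[0] = c[-1] = 0)."""
    n = len(b)
    bounds = [n * p // n_procs for p in range(n_procs + 1)]

    # Forward sweep: block p cannot start until block p-1 has finished.
    msg = (0.0, 0.0)
    factors = []
    for p in range(n_procs):
        cp, dp, msg = forward_block(a, b, c, d, bounds[p], bounds[p + 1], msg)
        factors.append((cp, dp))

    # Backward sweep: the dependency runs in the opposite direction.
    x_next = 0.0
    pieces = [None] * n_procs
    for p in reversed(range(n_procs)):
        cp, dp = factors[p]
        pieces[p], x_next = backward_block(cp, dp, x_next)
    return [v for piece in pieces for v in piece]
```

For a single line the blocks run strictly one after another, so no time is saved. Concurrency only appears when many lines are in flight at once: while a downstream processor works on one line (or batch of lines), the upstream processor starts the next. Every hand-over pays the communication startup latency, which is why the pipeline can dominate on platforms with high latency.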

Code information: 3600 lines of source and 25 subroutines

Total Parallelization Time using ParaWise: Approximately 2 hours.

User Time: Approximately 20 minutes.

  Results

Cray T3D
IBM SP2
Transtech Paramid
Parsys SN9500

  Cray T3D

2-D Partition (64x64x64 test case)

Processors   Time (secs)   Speed Up
1            103.5         -
16 (4x4)     8.8           11.7
32 (4x8)     5.2           19.8
64 (8x8)     3.3           31.2
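The Speed Up column can be related to parallel efficiency with a short calculation. The sketch below assumes that Speed Up is the single-processor time divided by the parallel time; efficiency (speed-up divided by processor count) is a derived quantity that the source does not report, and speedup_and_efficiency is a hypothetical helper name. Because the published times are rounded, recomputed speed-ups may differ slightly from the tabulated ones.

```python
def speedup_and_efficiency(t_serial, t_parallel, n_procs):
    """Return (speed-up, efficiency) for one row of a timing table."""
    speedup = t_serial / t_parallel
    return speedup, speedup / n_procs


# Cray T3D, 2-D partition, 64x64x64 test case: (processors, time in seconds)
cray_t3d = [(16, 8.8), (32, 5.2), (64, 3.3)]
t_one_processor = 103.5

for procs, secs in cray_t3d:
    s, e = speedup_and_efficiency(t_one_processor, secs, procs)
    print(f"{procs:3d} processors: speed-up {s:5.1f}, efficiency {e:4.0%}")
```

The same helper can be applied to the IBM SP2 timings, using the single-processor time that belongs to each partition.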

 

  IBM SP2

1-D Partition (64x64x64 test case)        2-D Partition (64x64x64 test case)

Processors  Time (secs)  Speed Up         Processors  Time (secs)  Speed Up
1           107.3        -                1           106.6        -
4           32.8         3.2              4 (2x2)     28.1         3.7
16          25.6         4.1              16 (4x4)    13.1         8.1

  Transtech Paramid

1-D Partition (40x33x40)

Processors   Synchronous Speed Up   Overlapping calc and comm Speed Up
1            -                      -
2            1.65 (1.8)             1.75 (1.92)
4            2.45 (2.9)             3.03 (3.59)
6            2.95 (3.7)             3.88 (4.98)
8            3.45 (4.5)             4.67 (6.58)

The results in parentheses represent the Speed Up with the pipeline removed.
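One way to read the paired figures is as the share of the attainable speed-up that the pipeline costs. The sketch below uses only the Transtech Paramid values quoted above; the metric itself and the helper name fraction_lost are assumptions made for illustration and do not come from the source.

```python
# Transtech Paramid, 1-D partition, 40x33x40:
# (processors,
#  synchronous speed-up with pipeline, synchronous speed-up without pipeline,
#  overlapped speed-up with pipeline, overlapped speed-up without pipeline)
paramid = [
    (2, 1.65, 1.8, 1.75, 1.92),
    (4, 2.45, 2.9, 3.03, 3.59),
    (6, 2.95, 3.7, 3.88, 4.98),
    (8, 3.45, 4.5, 4.67, 6.58),
]


def fraction_lost(with_pipeline, without_pipeline):
    """Share of the pipeline-free speed-up that is lost when the pipeline runs."""
    return 1.0 - with_pipeline / without_pipeline


for procs, sync, sync_free, over, over_free in paramid:
    print(
        f"{procs} processors: "
        f"synchronous loses {fraction_lost(sync, sync_free):.0%}, "
        f"overlapped loses {fraction_lost(over, over_free):.0%}"
    )
```

On these figures the loss grows with the processor count in both modes, which is consistent with the pipeline being the main target for further optimization.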

Speed Up Graph of ARC3D for a 40x33x40 problem on the Transtech Paramid.

Time Graph of ARC3D for a 40x33x40 problem on the Transtech Paramid.

  Parsys SN9500

1-D Partition (40x33x40)

Processors   Synchronous Speed Up   Overlapping calc and comm Speed Up
1            -                      -
2            1.94                   1.92
3            2.64                   2.69
4            3.24                   3.30
5            4.19                   4.28
6            4.90                   5.01

Speed Up Graph of ARC3D for a 40x23x30 problem on the Parsys SN9500.

Time Graph of ARC3D for a 40x23x30 problem on the Parsys SN9500.