Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Version

Operation

Cluster Size

Rows

Columns

Total Maps

Total Reduces

Time (seconds)

Bytes Read

Bytes Written

Trunk 718158

Mult

2 node

300

300

2

2

12 seconds

1,464,484

2,929,092

Trunk 720735

Mult

2 node

1,000

1,000

2

2

20 seconds

16,166,452

32,333,028

Trunk 722320

Mult

2 node

3,000

3,000

4

2

124 seconds

590,672,392

872,228,808

No Format
NOTE: The following numbers are obtained by using poe+ on the entire code, including minimal I/O and matrix construction.

Matrix-Matrix Multiply of 5,000 by 5,000 dense matrix

Mflip/s  Wall sec   Library
-------  --------   -------------------------------------------
 8,300       30     PESSL PDGEMM (16 processors)
 7,900       32     ScaLAPACK routine PDGEMM (16 processors)
 7,900       32     ESSL-SMP routine DGEMM (16 threads)
 7,900       32     NAG-SMP routine F01CKF (16 threads)
 1,200      213     ESSL routine DGEMM

Matrix-Matrix Multiply of 20,000 by 20,000 dense matrix

Mflip/s  Wall sec   Library and configuration
-------  --------   -------------------------------------------
158,900     100     ScaLAPACK PDGEMM (256 proc, 16 nodes) 
146,200     110     PESSL PDGEMM (256 proc, 16 nodes) 
105,400     150     ScaLAPACK PDGEMM (144 proc, 9 nodes, block 128) 
100,960     160     PESSL PDGEMM (144 proc, 9 nodes, block 128) 
 79,400     200     PESSL PDGEMM (144 proc, 9 nodes, block 1024) 
 74,800     214     ScaLAPACK PDGEMM (144 proc, 9 nodes, block 1024) 
 55,000     290     PESSL PDGEMM (64 proc, 4 nodes) 
 50,000     320     ScaLAPACK PDGEMM (64 proc, 4 nodes) 
 27,160     590     PESSL PDGEMM (32 proc, 2 nodes) 
 25,630     625     ScaLAPACK PDGEMM (32 proc, 2 nodes) 
 15,800   1,010     PESSL PDGEMM (16 Proc, 1 node)
 15,600   1,025     ScaLAPACK PDGEMM (16 Proc, 1 node)

Matrix-Matrix Multiply of Larger Dense Matrix

Gflip/s Wall sec Size    Library and configuration
------- -------- -------  -------------------------------------------
163.6   1,529   50,000  ScaLAPACK PDGEMM (256 proc, 16 nodes)
163.4   1,531   50,000  PESSL PDGEMM (256 proc, 16 nodes)
179.6  11,141  100,000  PESSL PDGEMM (256 proc, 16 nodes, 128 block)
210.7   9,495  100,000  ScaLAPACK PDGEMM (256 proc, 16 nodes, 128 block)

...