Benchmarks

These performance figures include data load and export operations.

Dependency information:

  • Hadoop 0.18.2
  • Hbase 0.18.1

Hardware information:

  • 4 Intel(R) Xeon(R) CPU 2.33GHz, SATA hard disk, Physical Memory 16,626,844 KB

  • Dense matrix add
  • Dense matrix multiply

Version       Operation  Cluster Size  Rows   Columns  Total Maps  Total Reduces  Time (seconds)  Bytes Read      Bytes Written   mapred.child.java.opts
------------  ---------  ------------  -----  -------  ----------  -------------  --------------  --------------  --------------  ----------------------
Trunk 718158  Mult       2 node        300    300      2           2              12              1,464,484       2,929,092       -Xmx200m
Trunk 720735  Mult       2 node        1,000  1,000    2           2              20              16,166,452      32,333,028      -Xmx200m
Trunk 722320  Add        2 node        3,000  3,000    4           2              298             1,053,503,366   1,575,781,107   -Xmx200m
Trunk 722320  Mult       2 node        3,000  3,000    4           2              124             590,672,392     872,228,808     -Xmx200m
Trunk 722320  Mult       2 node        5,000  5,000    50          4              912             24,434,034,076  34,631,558,186  -Xmx200m

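Version       Operation  Cluster Size  Rows   Columns  Total Maps  Total Reduces  Time (seconds)  Bytes Read   Bytes Written  mapred.child.java.opts
------------  ---------  ------------  -----  -------  ----------  -------------  --------------  -----------  -------------  ----------------------
Trunk 718158  Mult       8 node        3,000  3,000    4           4              171             590,672,392  872,228,808    -Xmx200m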
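To put the timings above on the same footing as a throughput figure, the sketch below converts each run into an approximate Mflop/s rate. It assumes the classical operation counts (2n^3 for an n by n dense multiply, n^2 for an add); these counts are an assumption, not something measured on the cluster, and the function names are illustrative only. Because the timings include data load and export, the results describe end-to-end throughput rather than pure compute speed.

```python
# Timings copied from the two tables above:
# (operation, cluster size in nodes, n for an n-by-n matrix, wall seconds).
RUNS = [
    ("Mult", 2, 300, 12),
    ("Mult", 2, 1000, 20),
    ("Add", 2, 3000, 298),
    ("Mult", 2, 3000, 124),
    ("Mult", 2, 5000, 912),
    ("Mult", 8, 3000, 171),
]


def flop_count(operation, n):
    """Assumed floating-point operation count for n-by-n dense matrices."""
    if operation == "Mult":
        return 2 * n ** 3  # n^3 multiplications plus n^3 additions
    if operation == "Add":
        return n ** 2      # one addition per element
    raise ValueError(f"unknown operation: {operation}")


def mflops(operation, n, seconds):
    """End-to-end rate in millions of operations per second."""
    return flop_count(operation, n) / seconds / 1e6


for op, nodes, n, secs in RUNS:
    rate = mflops(op, n, secs)
    print(f"{op:4} {nodes} node  {n:>5} x {n:<5} {secs:>4} s  ~{rate:.2f} Mflop/s")
```

Under these assumptions the 5,000 by 5,000 multiply works out to roughly 274 Mflop/s end to end.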

NOTE: The following numbers were obtained by running poe+ on the entire code, so they include minimal I/O and matrix construction.

Matrix-matrix multiply of a 5,000 by 5,000 dense matrix

Mflip/s  Wall sec   Library
-------  --------   -------------------------------------------
 8,300       30     PESSL PDGEMM (16 processors)
 7,900       32     ScaLAPACK routine PDGEMM (16 processors)
 7,900       32     ESSL-SMP routine DGEMM (16 threads)
 7,900       32     NAG-SMP routine F01CKF (16 threads)
 1,200      213     ESSL routine DGEMM

Matrix-matrix multiply of a 20,000 by 20,000 dense matrix

Mflip/s  Wall sec   Library and configuration
-------  --------   -------------------------------------------
158,900     100     ScaLAPACK PDGEMM (256 proc, 16 nodes) 
146,200     110     PESSL PDGEMM (256 proc, 16 nodes) 
105,400     150     ScaLAPACK PDGEMM (144 proc, 9 nodes, block 128) 
100,960     160     PESSL PDGEMM (144 proc, 9 nodes, block 128) 
 79,400     200     PESSL PDGEMM (144 proc, 9 nodes, block 1024) 
 74,800     214     ScaLAPACK PDGEMM (144 proc, 9 nodes, block 1024) 
 55,000     290     PESSL PDGEMM (64 proc, 4 nodes) 
 50,000     320     ScaLAPACK PDGEMM (64 proc, 4 nodes) 
 27,160     590     PESSL PDGEMM (32 proc, 2 nodes) 
 25,630     625     ScaLAPACK PDGEMM (32 proc, 2 nodes) 
 15,800   1,010     PESSL PDGEMM (16 proc, 1 node)
 15,600   1,025     ScaLAPACK PDGEMM (16 proc, 1 node)
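The scaling behaviour in the 20,000 by 20,000 table can be summarised as speedup and parallel efficiency relative to the 16-processor run. The sketch below does this for the PESSL PDGEMM rows; taking the 16-processor run as the baseline, and using the block 128 row for 144 processors, are choices made here for illustration.

```python
# PESSL PDGEMM wall times for the 20,000 by 20,000 multiply, from the table above:
# (processors, wall seconds). The 144-processor entry is the block 128 run.
PESSL_RUNS = [(16, 1010), (32, 590), (64, 290), (144, 160), (256, 110)]


def scaling(base_procs, base_secs, procs, secs):
    """Return (speedup, efficiency) of a run relative to a baseline run."""
    speedup = base_secs / secs
    efficiency = speedup / (procs / base_procs)
    return speedup, efficiency


base_procs, base_secs = PESSL_RUNS[0]
for procs, secs in PESSL_RUNS:
    speedup, efficiency = scaling(base_procs, base_secs, procs, secs)
    print(f"{procs:>3} proc  {secs:>5} s  speedup {speedup:5.2f}  efficiency {efficiency:6.1%}")
```

An efficiency of 100% would mean the wall time halves every time the processor count doubles.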

Matrix-matrix multiply of larger dense matrices

Gflip/s  Wall sec  Size     Library and configuration
-------  --------  -------  -------------------------------------------
163.6     1,529     50,000  ScaLAPACK PDGEMM (256 proc, 16 nodes)
163.4     1,531     50,000  PESSL PDGEMM (256 proc, 16 nodes)
179.6    11,141    100,000  PESSL PDGEMM (256 proc, 16 nodes, block 128)
210.7     9,495    100,000  ScaLAPACK PDGEMM (256 proc, 16 nodes, block 128)
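As a sanity check on the table above, the rate column can be compared with the rate implied by the matrix size and wall time. The sketch assumes that the reported Gflip/s figure corresponds to roughly 2n^3 operations per multiply; that interpretation is an assumption made here, not a statement from the source of the measurements.

```python
# Rows copied from the table above: (reported Gflip/s, wall seconds, n).
ROWS = [
    (163.6, 1529, 50_000),
    (163.4, 1531, 50_000),
    (179.6, 11141, 100_000),
    (210.7, 9495, 100_000),
]


def implied_gflops(n, seconds):
    """Rate implied by an assumed 2*n^3 operation count, in units of 1e9 per second."""
    return 2 * n ** 3 / seconds / 1e9


def relative_difference(reported, n, seconds):
    """Relative gap between the reported rate and the implied rate."""
    return abs(reported - implied_gflops(n, seconds)) / reported


for reported, secs, n in ROWS:
    implied = implied_gflops(n, secs)
    gap = relative_difference(reported, n, secs)
    print(f"n={n:>7,}  reported {reported:6.1f}  implied {implied:6.1f}  gap {gap:.2%}")
```

Under this assumption every row agrees with the implied rate to within about 0.1%.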

  • Dense LU factorization
  • Transpose
  • Matrix tridiagonalization, for eigenvalue computations of symmetric matrices.