Benchmarks

These performance figures include data load and export operations.

Dependency information:

  • Hadoop 0.18.2
  • HBase 0.18.1

Hardware information:

  • 4 Intel(R) Xeon(R) CPU 2.33GHz, SATA hard disk, Physical Memory 16,626,844 KB

  • Dense matrix add
  • Dense matrix multiply

NOTE: a 10,000 by 10,000 matrix takes 800 MB and 1 hour on a single node.
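The 800 MB figure is consistent with storing every entry as an 8-byte double. The sketch below reproduces that arithmetic; the 8-byte entry size is an assumption (the page does not state the element type), and the function name is hypothetical. It ignores any HBase or file-format overhead.

```python
def dense_matrix_bytes(rows: int, cols: int, bytes_per_entry: int = 8) -> int:
    """Raw storage for a dense matrix.

    bytes_per_entry=8 assumes double-precision entries (an assumption,
    not stated on this page). Storage overhead is not included.
    """
    return rows * cols * bytes_per_entry


size = dense_matrix_bytes(10_000, 10_000)
print(f"{size / 10**6:.0f} MB")
```

Under that assumption, the raw size grows with the product of rows and columns, so doubling both dimensions quadruples the footprint.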

Version       Operation  Cluster Size  Rows   Columns  Total Maps  Total Reduces  Time (seconds)  Bytes Written
------------  ---------  ------------  -----  -------  ----------  -------------  --------------  -------------
Trunk 712655  Add        2 node        1,000  1,000    2           2              17              66,326,104
Trunk 712658  Mult       2 node        300    300      2           2              181             5,929,512

Version       Operation  Cluster Size  Rows   Columns  Total Maps  Total Reduces  Time (seconds)  Bytes Read   Bytes Written
------------  ---------  ------------  -----  -------  ----------  -------------  --------------  -----------  -------------
Trunk 718158  Mult       2 node        300    300      2           2              12              1,464,484    2,929,092
Trunk 720735  Mult       2 node        1,000  1,000    2           2              20              16,166,452   32,333,028
Trunk 722320  Mult       2 node        3,000  3,000    4           2              124             590,672,392  872,228,808

NOTE: The following numbers were obtained by running poe+ on the entire code, so they include minimal I/O and matrix construction.

Matrix-Matrix Multiply of 5,000 by 5,000 dense matrix

Mflip/s  Wall sec   Library
-------  --------   -------------------------------------------
 8,300       30     PESSL PDGEMM (16 processors)
 7,900       32     ScaLAPACK routine PDGEMM (16 processors)
 7,900       32     ESSL-SMP routine DGEMM (16 threads)
 7,900       32     NAG-SMP routine F01CKF (16 threads)
 1,200      213     ESSL routine DGEMM

Matrix-Matrix Multiply of 20,000 by 20,000 dense matrix

Mflip/s  Wall sec   Library and configuration
-------  --------   -------------------------------------------
158,900     100     ScaLAPACK PDGEMM (256 proc, 16 nodes) 
146,200     110     PESSL PDGEMM (256 proc, 16 nodes) 
105,400     150     ScaLAPACK PDGEMM (144 proc, 9 nodes, block 128) 
100,960     160     PESSL PDGEMM (144 proc, 9 nodes, block 128) 
 79,400     200     PESSL PDGEMM (144 proc, 9 nodes, block 1024) 
 74,800     214     ScaLAPACK PDGEMM (144 proc, 9 nodes, block 1024) 
 55,000     290     PESSL PDGEMM (64 proc, 4 nodes) 
 50,000     320     ScaLAPACK PDGEMM (64 proc, 4 nodes) 
 27,160     590     PESSL PDGEMM (32 proc, 2 nodes) 
 25,630     625     ScaLAPACK PDGEMM (32 proc, 2 nodes) 
 15,800   1,010     PESSL PDGEMM (16 proc, 1 node)
 15,600   1,025     ScaLAPACK PDGEMM (16 proc, 1 node)
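One way to read the 20,000 by 20,000 table is as a scaling study. The sketch below computes speedup and parallel efficiency for the ScaLAPACK PDGEMM runs, taking the 16-processor run as the baseline. The choice of baseline and the function name are assumptions made for illustration; the 144-processor runs are left out because they use explicit block sizes that the other rows do not list.

```python
# Wall-clock seconds for the ScaLAPACK PDGEMM rows of the
# 20,000 by 20,000 table, keyed by processor count.
SCALAPACK_RUNS = {16: 1025, 32: 625, 64: 320, 256: 100}


def speedup_and_efficiency(base_procs, base_seconds, procs, seconds):
    """Return (speedup, efficiency) relative to a baseline run.

    speedup    = baseline time / observed time
    efficiency = speedup / (procs / base_procs), where 1.0 is ideal scaling
    """
    speedup = base_seconds / seconds
    ideal = procs / base_procs
    return speedup, speedup / ideal


BASE_PROCS = 16  # assumed baseline: the single-node run
for procs, seconds in sorted(SCALAPACK_RUNS.items()):
    s, e = speedup_and_efficiency(
        BASE_PROCS, SCALAPACK_RUNS[BASE_PROCS], procs, seconds
    )
    print(f"{procs:>3} proc: speedup {s:5.2f}, efficiency {e:.2f}")
```

With these inputs, the 256-processor run is 10.25 times faster than the 16-processor run, against an ideal factor of 16.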

Matrix-Matrix Multiply of Larger Dense Matrix

Gflip/s  Wall sec  Size     Library and configuration
-------  --------  -------  -------------------------------------------
163.6     1,529     50,000  ScaLAPACK PDGEMM (256 proc, 16 nodes)
163.4     1,531     50,000  PESSL PDGEMM (256 proc, 16 nodes)
179.6    11,141    100,000  PESSL PDGEMM (256 proc, 16 nodes, block 128)
210.7     9,495    100,000  ScaLAPACK PDGEMM (256 proc, 16 nodes, block 128)
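The reported rates can be sanity-checked against the wall-clock times. The sketch below assumes the conventional count of 2n³ floating-point operations for an n by n dense multiply. That is an assumption: the tables report Mflip/s and Gflip/s as measured by the tool, which may count instructions differently. The function names are hypothetical.

```python
def gemm_flops(n: int) -> int:
    """Conventional operation count for an n by n dense multiply:
    n**3 multiplications plus n**3 additions (an assumed model)."""
    return 2 * n**3


def gemm_rate(n: int, wall_seconds: float, unit: float = 1e6) -> float:
    """Estimated rate in units of `unit` operations per second
    (1e6 for Mflop/s, 1e9 for Gflop/s)."""
    return gemm_flops(n) / wall_seconds / unit


# 5,000 by 5,000 in 30 s (PESSL PDGEMM, 16 processors), in Mflop/s
print(round(gemm_rate(5_000, 30)))

# 50,000 by 50,000 in 1,529 s (ScaLAPACK PDGEMM, 256 proc), in Gflop/s
print(round(gemm_rate(50_000, 1_529, unit=1e9), 1))
```

Under this model the estimates come out close to the tabulated values, at roughly 8,333 against 8,300 Mflip/s for the first case and about 163.5 against 163.6 Gflip/s for the second.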

  • Dense LU factorization
  • Transpose
  • Matrix tridiagonalization, for eigenvalue computations of symmetric matrices.