Math Kernel Library Developer Guide
The most significant parameters in HPL.dat are P, Q, NB, and N. Specify them as follows:
P and Q - the number of rows and columns in the process grid, respectively.
P*Q must be the number of MPI processes that HPL is using.
Choose P≤Q.
NB - the block size of the data distribution.
The table below shows recommended values of NB for different Intel® processors:
Processor | NB |
---|---|
Intel® Xeon® Processor X56*/E56*/E7-*/E7*/X7* (codenamed Nehalem or Westmere) | 256 |
Intel Xeon Processor E26*/E26* v2 (codenamed Sandy Bridge or Ivy Bridge) | 256 |
Intel Xeon Processor E26* v3/E26* v4 (codenamed Haswell or Broadwell) | 192 |
Intel® Core™ i3/i5/i7-6* Processor (codenamed Skylake Client) | 192 |
Intel® Xeon Phi™ Processor 72* (codenamed Knights Landing) | 336 |
Intel Xeon Processor supporting Intel® Advanced Vector Extensions 512 (Intel® AVX-512) instructions (codenamed Skylake Server) | 384 |
N - the problem size:
For homogeneous runs, choose N divisible by NB*LCM(P,Q), where LCM is the least common multiple of the two numbers.
For heterogeneous runs, see Heterogeneous Support in the Intel® Distribution for LINPACK* Benchmark for how to choose N.
Increasing N usually increases performance, but N is bounded by the available memory. In general, you can compute the memory that each MPI process needs to store its share of the matrix (not counting internal buffers) as 8*N*N/(P*Q) bytes, where N is the problem size and P and Q are the process grid dimensions in HPL.dat. A general rule of thumb is to choose a problem size that fills about 80% of memory.
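The sizing rules above can be sketched in a few lines of Python. This is an illustrative sketch, not part of the benchmark distribution: the function names `grid_candidates`, `suggest_n`, and `matrix_bytes_per_process` are hypothetical, and the code assumes a homogeneous run in which `total_mem_bytes` is the combined memory available to all P*Q processes.

```python
import math


def grid_candidates(nprocs):
    """Return every (P, Q) with P*Q == nprocs and P <= Q."""
    return [(p, nprocs // p)
            for p in range(1, math.isqrt(nprocs) + 1)
            if nprocs % p == 0]


def suggest_n(total_mem_bytes, p, q, nb, fill=0.8):
    """Largest N divisible by NB*LCM(P,Q) whose matrix fits in fill*memory.

    The matrix holds N*N double-precision values (8 bytes each), so the
    constraint is 8*N*N <= fill * total_mem_bytes. Internal buffers are
    not counted, as in the formula above.
    """
    step = nb * math.lcm(p, q)          # math.lcm needs Python 3.9 or later
    n_max = math.isqrt(int(fill * total_mem_bytes) // 8)
    return (n_max // step) * step       # round down to a multiple of step


def matrix_bytes_per_process(n, p, q):
    """Memory each process needs for its share of the matrix, in bytes."""
    return 8 * n * n / (p * q)


if __name__ == "__main__":
    total_mem = 64 * 2**30              # assumed: 64 GiB across all processes
    nb = 192                            # from the NB table for the target CPU
    for p, q in grid_candidates(8):     # 8 MPI processes
        n = suggest_n(total_mem, p, q, nb)
        gib = matrix_bytes_per_process(n, p, q) / 2**30
        print(f"P={p} Q={q} N={n} matrix per process={gib:.2f} GiB")
```

The 80% fill factor is only the rule of thumb quoted above; lower it if the run starts swapping, because the formula does not account for internal buffers or memory used by the operating system.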