Intel® oneAPI Math Kernel Library Developer Reference - Fortran
Performs scaling and packing of the matrix into the previously allocated buffer.
call sgemm_pack (identifier, trans, m, n, k, alpha, src, ld, dest)
call dgemm_pack (identifier, trans, m, n, k, alpha, src, ld, dest)
The ?gemm_pack routine is one of a set of related routines that enable the use of internal packed storage. Call ?gemm_pack after you allocate a buffer whose size is given by ?gemm_pack_get_size. The routine scales the identified matrix by alpha and packs it into that previously allocated buffer.
Do not copy the packed matrix to a different address because the internal implementation depends on the alignment of internally-stored metadata.
The ?gemm_pack routine performs this operation:
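dest := alpha*op(src)
as part of the computation
C := alpha*op(A)*op(B) + beta*C
where:
op(X) is one of op(X) = X, op(X) = X^T, or op(X) = X^H,
alpha and beta are scalars,
src is the matrix A or the matrix B, as selected by identifier,
op(src) is an m-by-k matrix if identifier = 'A' or 'a', and a k-by-n matrix if identifier = 'B' or 'b',
dest is the internal packed storage buffer.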
For best performance, use the same number of threads for packing and for computing.
If packing for both A and B matrices, you must use the same number of threads for packing A as for packing B.
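To make the operation dest := alpha*op(src) concrete, the following is a minimal sketch of the values involved, written in plain Fortran without calling the library. The module name gemm_pack_model and the subroutine scaled_op are hypothetical and are not part of oneMKL. The sketch produces an ordinary dense array, whereas ?gemm_pack writes an internal packed layout that is not described here.

```fortran
module gemm_pack_model
  implicit none
contains
  ! Hypothetical helper, not a oneMKL routine.
  ! Computes the dense matrix res = alpha*op(src) for real data.
  ! It models only the values, not the library's internal packed layout.
  subroutine scaled_op(trans, rows, cols, alpha, src, ld, res)
    character(len=1), intent(in) :: trans
    integer, intent(in) :: rows, cols   ! shape of op(src)
    integer, intent(in) :: ld           ! leading dimension of src
    real, intent(in) :: alpha
    real, intent(in) :: src(ld, *)
    real, intent(out) :: res(rows, cols)
    integer :: i, j

    select case (trans)
    case ('N', 'n')
      ! op(src) = src
      do j = 1, cols
        do i = 1, rows
          res(i, j) = alpha * src(i, j)
        end do
      end do
    case ('T', 't', 'C', 'c')
      ! op(src) = transpose of src; for real data the conjugate
      ! transpose is the same as the transpose.
      do j = 1, cols
        do i = 1, rows
          res(i, j) = alpha * src(j, i)
        end do
      end do
    case default
      error stop 'scaled_op: invalid trans'
    end select
  end subroutine scaled_op
end module gemm_pack_model
```

For identifier = 'A', rows and cols would be m and k; for identifier = 'B', they would be k and n.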
identifier: CHARACTER*1. Specifies which matrix is to be packed:
If identifier = 'A' or 'a', the routine packs matrix A.
If identifier = 'B' or 'b', the routine packs matrix B.
trans: CHARACTER*1. Specifies the form of op(src) used in the packing:
If trans = 'N' or 'n', op(src) = src.
If trans = 'T' or 't', op(src) = src^T.
If trans = 'C' or 'c', op(src) = src^H.
m: INTEGER. Specifies the number of rows of the matrix op(A) and of the matrix C. The value of m must be at least zero.
n: INTEGER. Specifies the number of columns of the matrix op(B) and the number of columns of the matrix C. The value of n must be at least zero.
k: INTEGER. Specifies the number of columns of the matrix op(A) and the number of rows of the matrix op(B). The value of k must be at least zero.
alpha: REAL for sgemm_pack
DOUBLE PRECISION for dgemm_pack
Specifies the scalar alpha.
src: REAL for sgemm_pack
DOUBLE PRECISION for dgemm_pack
Array:
| | trans = 'N' or 'n' | trans = 'T', 't', 'C', or 'c' |
|---|---|---|
| identifier = 'A' or 'a' | Size ld*k. Before entry, the leading m-by-k part of the array src must contain the matrix A. | Size ld*m. Before entry, the leading k-by-m part of the array src must contain the matrix A. |
| identifier = 'B' or 'b' | Size ld*n. Before entry, the leading k-by-n part of the array src must contain the matrix B. | Size ld*k. Before entry, the leading n-by-k part of the array src must contain the matrix B. |
ld: INTEGER. Specifies the leading dimension of src as declared in the calling (sub)program.
| | trans = 'N' or 'n' | trans = 'T', 't', 'C', or 'c' |
|---|---|---|
| identifier = 'A' or 'a' | ld must be at least max(1, m). | ld must be at least max(1, k). |
| identifier = 'B' or 'b' | ld must be at least max(1, k). | ld must be at least max(1, n). |
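The size and leading-dimension rules for src can be collected into two small helper functions. This is a sketch only: the module gemm_pack_dims and the functions min_ld and src_size are hypothetical names introduced for illustration, not oneMKL routines, and any identifier other than 'A' or 'a' is assumed to mean matrix B.

```fortran
module gemm_pack_dims
  implicit none
contains
  ! Hypothetical helper: smallest leading dimension allowed for src.
  ! Assumes any identifier other than 'A'/'a' denotes matrix B.
  pure integer function min_ld(identifier, trans, m, n, k) result(ld)
    character(len=1), intent(in) :: identifier, trans
    integer, intent(in) :: m, n, k
    logical :: is_a, no_trans

    is_a = (identifier == 'A' .or. identifier == 'a')
    no_trans = (trans == 'N' .or. trans == 'n')
    if (is_a) then
      ! A is m-by-k when not transposed, k-by-m otherwise.
      ld = max(1, merge(m, k, no_trans))
    else
      ! B is k-by-n when not transposed, n-by-k otherwise.
      ld = max(1, merge(k, n, no_trans))
    end if
  end function min_ld

  ! Hypothetical helper: number of elements in src for a given ld,
  ! that is, ld times the number of columns of src as stored.
  pure integer function src_size(identifier, trans, m, n, k, ld) result(sz)
    character(len=1), intent(in) :: identifier, trans
    integer, intent(in) :: m, n, k, ld
    logical :: is_a, no_trans

    is_a = (identifier == 'A' .or. identifier == 'a')
    no_trans = (trans == 'N' .or. trans == 'n')
    if (is_a) then
      sz = ld * merge(k, m, no_trans)
    else
      sz = ld * merge(n, k, no_trans)
    end if
  end function src_size
end module gemm_pack_dims
```

A caller could compare its declared leading dimension against min_ld before calling ?gemm_pack, since a value that is too small violates the constraints listed above.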
dest: POINTER. Scaled and packed internal storage buffer. On exit, dest is overwritten by the matrix alpha*op(src).