BLAS-like Extensions

Intel® oneAPI Math Kernel Library provides C and Fortran routines to extend the functionality of the BLAS routines. These include routines to compute vector products, matrix-vector products, and matrix-matrix products.

Intel® oneAPI Math Kernel Library also provides routines for data manipulation, including in-place and out-of-place matrix transposition combined with simple matrix arithmetic. The supported operations are Copy As Is, Conjugate transpose, Transpose, and Conjugate. Each routine can also scale the data during the operation through the alpha and/or beta parameters, and each supports both row-major and column-major ordering.
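
To make the scaled transposition semantics concrete, the following is a minimal plain-C reference sketch of the out-of-place Transpose case for row-major storage, B := alpha * transpose(A). The function name `ref_omatcopy_trans` and its parameter list are hypothetical and chosen only for illustration. It is not the oneMKL implementation, and the actual `mkl_?omatcopy` signature should be taken from that routine's reference page.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference helper (not a oneMKL routine):
 * writes B := alpha * transpose(A) for row-major matrices.
 * A is rows x cols with leading dimension lda (lda >= cols).
 * B is cols x rows with leading dimension ldb (ldb >= rows). */
static void ref_omatcopy_trans(size_t rows, size_t cols, double alpha,
                               const double *A, size_t lda,
                               double *B, size_t ldb)
{
    assert(lda >= cols && ldb >= rows);
    for (size_t i = 0; i < rows; ++i) {
        for (size_t j = 0; j < cols; ++j) {
            /* Element (i, j) of A lands at (j, i) of B, scaled by alpha. */
            B[j * ldb + i] = alpha * A[i * lda + j];
        }
    }
}
```

Calling this helper with a 2-by-3 matrix A, `lda = 3`, `ldb = 2`, and `alpha = 2.0` fills a 3-by-2 matrix B whose rows are the doubled columns of A. The Copy As Is case would keep the index order, and the Conjugate variants apply only to complex data.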

Table “BLAS-like Extensions” lists these routines.

The <?> symbol in the routine short names is a precision prefix that indicates the data type:

s: float
d: double
c: MKL_Complex8
z: MKL_Complex16

BLAS-like Extensions
Routine	Data Types	Description
cblas_?axpby	s, d, c, z	Scales two vectors, adds them to one another, and stores the result in the second vector.
cblas_?axpy_batch cblas_?axpy_batch_strided	s, d, c, z	Computes groups of vector-scalar products added to a vector.
cblas_?dgmm_batch_strided cblas_?dgmm_batch	s, d, c, z	Computes groups of diagonal matrix-general matrix products.
cblas_?gemm_alloc	s, d	Allocates storage for a packed matrix.
cblas_?gemm_batch cblas_?gemm_batch_strided	s, d, c, z	Computes scalar-matrix-matrix products and adds the results to scalar-matrix products for groups of general matrices.
cblas_gemm_bf16bf16f32	bfloat16	Computes a matrix-matrix product with general matrices of bfloat16 data type.
cblas_gemm_bf16bf16f32_compute	bfloat16	Computes a matrix-matrix product with general matrices of bfloat16 data type where one or both input matrices are stored in a packed data structure and adds the result to a scalar-matrix product.
cblas_gemm_*	Integer	Computes a matrix-matrix product with general integer matrices.
cblas_?gemm_compute	h, s, d	Computes a matrix-matrix product with general matrices where one or both input matrices are stored in a packed data structure and adds the result to a scalar-matrix product.
cblas_gemm_*_compute	Integer	Computes a matrix-matrix product with general integer matrices where one or both input matrices are stored in a packed data structure and adds the result to a scalar-matrix product.
cblas_?gemm_pack	h, s, d	Performs scaling and packing of the matrix into the previously allocated buffer.
cblas_gemm_*_pack	Integer, bfloat16	Packs the matrix into the previously allocated buffer.
cblas_?gemm_pack_get_size	h, s, d	Returns the number of bytes required to store the packed matrix.
cblas_gemm_*_pack_get_size	Integer, bfloat16	Returns the number of bytes required to store the packed matrix.
cblas_?gemm3m	c, z	Computes a scalar-matrix-matrix product using matrix multiplications and adds the result to a scalar-matrix product.
cblas_?gemm3m_batch	c, z	Computes scalar-matrix-matrix products using matrix multiplications and adds the results to scalar-matrix products for groups of general matrices.
cblas_?gemmt	s, d, c, z	Computes a matrix-matrix product with general matrices but updates only the upper or lower triangular part of the result matrix.
cblas_?gemv_batch_strided cblas_?gemv_batch	s, d, c, z	Computes groups of matrix-vector products using general matrices.
cblas_?trsm_batch cblas_?trsm_batch_strided	s, d, c, z	Solves a triangular matrix equation for a group of matrices.
mkl_?imatcopy	s, d, c, z	Performs scaling and in-place transposition/copying of matrices.
mkl_?imatcopy_batch_strided mkl_?imatcopy_batch	s, d, c, z	Computes groups of in-place matrix copy/transposition with scaling using general matrices.
mkl_?omatadd	s, d, c, z	Performs scaling and sum of two matrices including their out-of-place transposition/copying.
mkl_?omatcopy	s, d, c, z	Performs scaling and out-of-place transposition/copying of matrices.
mkl_?omatcopy_batch_strided mkl_?omatcopy_batch	s, d, c, z	Computes groups of out-of-place matrix copy/transposition with scaling using general matrices.
mkl_?omatcopy2	s, d, c, z	Performs two-strided scaling and out-of-place transposition/copying of matrices.
mkl_jit_create_?gemm	s, d, c, z	Creates a handle on a jitter and generates a GEMM kernel that computes a scalar-matrix-matrix product and adds the result to a scalar-matrix product, with general matrices.
mkl_jit_destroy		Deletes the previously created jitter and the generated GEMM kernel.
mkl_jit_get_?gemm_ptr	s, d, c, z	Returns the GEMM kernel previously generated.
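
As an illustration of the first table entry, the sketch below spells out the operation y := a*x + b*y that `cblas_?axpby` performs, in plain C for double precision. The helper name `ref_axpby` is hypothetical, and the sketch assumes positive increments only. It is a reference for the arithmetic, not the oneMKL implementation or its exact signature.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference helper (not a oneMKL routine):
 * computes y := a*x + b*y over n elements.
 * incx and incy are the strides between consecutive elements
 * of x and y; this sketch assumes both are positive. */
static void ref_axpby(size_t n, double a, const double *x, size_t incx,
                      double b, double *y, size_t incy)
{
    assert(incx > 0 && incy > 0);
    for (size_t i = 0; i < n; ++i) {
        /* Scale both vectors and store the sum back into y. */
        y[i * incy] = a * x[i * incx] + b * y[i * incy];
    }
}
```

With x = {1, 2, 3}, y = {10, 20, 30}, a = 2, and b = 0.5, the helper overwrites y with {7, 14, 21}. The batch variants listed in the table apply the same kind of operation to groups of vectors or matrices in a single call.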