gemm

Computes a matrix-matrix product with general matrices.

Description

The gemm routines compute a scalar-matrix-matrix product and add the result to a scalar-matrix product, with general matrices. The operation is defined as:

C ← alpha*op(A)*op(B) + beta*C

where:

  • op(X) is one of op(X) = X, op(X) = X^T (transpose), or op(X) = X^H (conjugate transpose)

  • alpha and beta are scalars

  • A, B and C are matrices

  • op(A) is an m x k matrix

  • op(B) is a k x n matrix

  • C is an m x n matrix
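The definition above can be sketched as a naive reference loop. This is an illustration of the arithmetic only, not how the library computes the product: the names Op, op_at, and ref_gemm are hypothetical, the storage is assumed to be column major, and only real types are covered (conjugate transposition is omitted).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical enum for this sketch only; oneMKL uses oneapi::mkl::transpose.
enum class Op { NonTrans, Trans };

// Element (i, j) of op(X), where X is stored column major with leading dimension ldx.
template <typename T>
T op_at(const std::vector<T> &x, std::int64_t ldx, Op op, std::int64_t i, std::int64_t j) {
    return op == Op::NonTrans ? x[i + j * ldx] : x[j + i * ldx];
}

// Naive reference for C <- alpha*op(A)*op(B) + beta*C (column major, real types).
template <typename T>
void ref_gemm(Op transa, Op transb,
              std::int64_t m, std::int64_t n, std::int64_t k,
              T alpha, const std::vector<T> &a, std::int64_t lda,
              const std::vector<T> &b, std::int64_t ldb,
              T beta, std::vector<T> &c, std::int64_t ldc) {
    assert(m >= 0 && n >= 0 && k >= 0);
    for (std::int64_t j = 0; j < n; ++j) {
        for (std::int64_t i = 0; i < m; ++i) {
            T acc = T(0);
            // Dot product of row i of op(A) with column j of op(B).
            for (std::int64_t p = 0; p < k; ++p) {
                acc += op_at(a, lda, transa, i, p) * op_at(b, ldb, transb, p, j);
            }
            c[i + j * ldc] = alpha * acc + beta * c[i + j * ldc];
        }
    }
}
```

For example, with A = [[1, 2], [3, 4]] and B = [[5, 6], [7, 8]] stored column by column, ref_gemm with alpha = 2, beta = 3, and C filled with ones yields C = [[41, 47], [89, 103]].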

gemm supports the following precisions:

Ts                    Ta                     Tb                     Tc
sycl::half            sycl::half             sycl::half             sycl::half
float                 sycl::half             sycl::half             float
float                 oneapi::mkl::bfloat16  oneapi::mkl::bfloat16  float
float                 float                  float                  float
double                double                 double                 double
std::complex<float>   std::complex<float>    std::complex<float>    std::complex<float>
std::complex<double>  std::complex<double>   std::complex<double>   std::complex<double>

gemm (Buffer Version)

Syntax

namespace oneapi::mkl::blas::column_major {
    void gemm(sycl::queue &queue,
              oneapi::mkl::transpose transa,
              oneapi::mkl::transpose transb,
              std::int64_t m,
              std::int64_t n,
              std::int64_t k,
              Ts alpha,
              sycl::buffer<Ta,1> &a,
              std::int64_t lda,
              sycl::buffer<Tb,1> &b,
              std::int64_t ldb,
              Ts beta,
              sycl::buffer<Tc,1> &c,
              std::int64_t ldc)
}
namespace oneapi::mkl::blas::row_major {
    void gemm(sycl::queue &queue,
              oneapi::mkl::transpose transa,
              oneapi::mkl::transpose transb,
              std::int64_t m,
              std::int64_t n,
              std::int64_t k,
              Ts alpha,
              sycl::buffer<Ta,1> &a,
              std::int64_t lda,
              sycl::buffer<Tb,1> &b,
              std::int64_t ldb,
              Ts beta,
              sycl::buffer<Tc,1> &c,
              std::int64_t ldc)
}
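The two namespaces differ only in how the arrays are interpreted. A common way to relate them, sketched below, is the identity C^T = B^T * A^T: a row-major m x n array occupies the same memory as a column-major n x m array holding its transpose, so a row-major product can be obtained from a column-major routine by swapping the operands and swapping m with n. The function names are hypothetical, transposition flags are left out for brevity, and nothing here is a statement about how oneMKL implements row_major internally.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Column-major reference without transposition: C <- alpha*A*B + beta*C.
void col_major_gemm(std::int64_t m, std::int64_t n, std::int64_t k, double alpha,
                    const std::vector<double> &a, std::int64_t lda,
                    const std::vector<double> &b, std::int64_t ldb,
                    double beta, std::vector<double> &c, std::int64_t ldc) {
    assert(lda >= m && ldb >= k && ldc >= m);
    for (std::int64_t j = 0; j < n; ++j) {
        for (std::int64_t i = 0; i < m; ++i) {
            double acc = 0.0;
            for (std::int64_t p = 0; p < k; ++p) {
                acc += a[i + p * lda] * b[p + j * ldb];
            }
            c[i + j * ldc] = alpha * acc + beta * c[i + j * ldc];
        }
    }
}

// Row-major product expressed through the column-major routine.
// The row-major arrays a (m x k), b (k x n), c (m x n) are read as the
// column-major matrices A^T, B^T, C^T, and C^T = B^T * A^T.
void row_major_gemm(std::int64_t m, std::int64_t n, std::int64_t k, double alpha,
                    const std::vector<double> &a, std::int64_t lda,
                    const std::vector<double> &b, std::int64_t ldb,
                    double beta, std::vector<double> &c, std::int64_t ldc) {
    col_major_gemm(n, m, k, alpha, b, ldb, a, lda, beta, c, ldc);
}
```

Note how the leading-dimension requirements line up: the row-major constraints lda >= k, ldb >= n, and ldc >= n become exactly the column-major constraints of the swapped call.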

Input Parameters

queue

The queue where the routine should be executed.

transa

Specifies op(A), the transposition operation applied to matrix A. See Data Types for more details.

transb

Specifies op(B), the transposition operation applied to matrix B. See Data Types for more details.

m

Number of rows of matrix op(A) and matrix C. Must be at least zero.

n

Number of columns of matrix op(B) and matrix C. Must be at least zero.

k

Number of columns of matrix op(A) and rows of matrix op(B). Must be at least zero.

alpha

Scaling factor for matrix-matrix product.

a

Buffer holding input matrix A. See Matrix Storage for more details.

              transa = transpose::nontrans    transa = transpose::trans or
                                              transa = transpose::conjtrans

Column major  A is an m x k matrix.           A is a k x m matrix.
              Size of array a must be         Size of array a must be
              at least lda * k.               at least lda * m.

Row major     A is an m x k matrix.           A is a k x m matrix.
              Size of array a must be         Size of array a must be
              at least lda * m.               at least lda * k.

lda

Leading dimension of matrix A. Must be positive.

              transa = transpose::nontrans    transa = transpose::trans or
                                              transa = transpose::conjtrans

Column major  Must be at least m              Must be at least k

Row major     Must be at least k              Must be at least m

b

Buffer holding input matrix B. See Matrix Storage for more details.

              transb = transpose::nontrans    transb = transpose::trans or
                                              transb = transpose::conjtrans

Column major  B is a k x n matrix.            B is an n x k matrix.
              Size of array b must be         Size of array b must be
              at least ldb * n.               at least ldb * k.

Row major     B is a k x n matrix.            B is an n x k matrix.
              Size of array b must be         Size of array b must be
              at least ldb * k.               at least ldb * n.

ldb

Leading dimension of matrix B. Must be positive.

              transb = transpose::nontrans    transb = transpose::trans or
                                              transb = transpose::conjtrans

Column major  Must be at least k              Must be at least n

Row major     Must be at least n              Must be at least k

beta

Scaling factor for matrix C.

c

Buffer holding input/output matrix C. See Matrix Storage for more details.

Column major  C is an m x n matrix. Size of array c must be at least ldc * n.

Row major     C is an m x n matrix. Size of array c must be at least ldc * m.

ldc

Leading dimension of matrix C. Must be positive.

Column major  Must be at least m

Row major     Must be at least n
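The size and leading-dimension rules for a, b, and c all follow one pattern, which the sketch below captures. The names Layout, min_ld, and min_size are hypothetical helpers, not part of oneMKL. Here rows and cols describe the matrix as stored, so A is m x k when not transposed and k x m when transposed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout tag for this sketch only.
enum class Layout { ColMajor, RowMajor };

// Smallest valid leading dimension for a stored rows x cols matrix.
// Column major strides over columns, so ld >= rows; row major needs ld >= cols.
// Leading dimensions must be positive even when the matrix is empty.
std::int64_t min_ld(Layout layout, std::int64_t rows, std::int64_t cols) {
    const std::int64_t ld = (layout == Layout::ColMajor) ? rows : cols;
    return ld > 0 ? ld : 1;
}

// Smallest valid array size (in elements) for a stored rows x cols matrix
// with leading dimension ld: ld * cols (column major) or ld * rows (row major).
std::int64_t min_size(Layout layout, std::int64_t rows, std::int64_t cols, std::int64_t ld) {
    assert(ld >= min_ld(layout, rows, cols));
    return ld * ((layout == Layout::ColMajor) ? cols : rows);
}
```

For m = 3, n = 4, k = 5 and a non-transposed A in column-major layout, min_ld gives 3 and min_size gives lda * k = 15 elements; with a padded lda = 8 the array must hold at least 40 elements.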

Output Parameters

c

Output buffer, overwritten by alpha*op(A)*op(B) + beta*C.

Note

If beta = 0, matrix C does not need to be initialized before calling gemm.
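One way to read this note is that beta = 0 makes C write-only: its previous contents are not read, so NaN or garbage values cannot propagate into the result (a literal evaluation of 0 * NaN would give NaN). The sketch below illustrates that convention on a precomputed product P = op(A)*op(B). The helpers update_c and poisoned are hypothetical, and this is an assumption about the semantics rather than the library's actual code path.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Stand-in for uninitialized storage: every element is NaN.
std::vector<double> poisoned(std::size_t n) {
    return std::vector<double>(n, std::numeric_limits<double>::quiet_NaN());
}

// Combines a precomputed product P = op(A)*op(B) with C, element by element.
void update_c(double alpha, const std::vector<double> &p,
              double beta, std::vector<double> &c) {
    assert(p.size() == c.size());
    for (std::size_t i = 0; i < c.size(); ++i) {
        if (beta == 0.0) {
            // C is write-only here: its previous contents are never read.
            c[i] = alpha * p[i];
        } else {
            c[i] = alpha * p[i] + beta * c[i];
        }
    }
}
```

Calling update_c(2.0, p, 0.0, c) with c = poisoned(2) and p = {1, 2} leaves c = {2, 4}, even though every input element of c was NaN.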

Examples

An example of how to use the buffer version of gemm can be found in the oneMKL installation directory, under:

examples/dpcpp/blas/source/gemm.cpp

gemm (USM Version)

Syntax

namespace oneapi::mkl::blas::column_major {
    sycl::event gemm(sycl::queue &queue,
                     oneapi::mkl::transpose transa,
                     oneapi::mkl::transpose transb,
                     std::int64_t m,
                     std::int64_t n,
                     std::int64_t k,
                     Ts alpha,
                     const Ta *a,
                     std::int64_t lda,
                     const Tb *b,
                     std::int64_t ldb,
                     Ts beta,
                     Tc *c,
                     std::int64_t ldc,
                     const std::vector<sycl::event> &dependencies = {})
}
namespace oneapi::mkl::blas::row_major {
    sycl::event gemm(sycl::queue &queue,
                     oneapi::mkl::transpose transa,
                     oneapi::mkl::transpose transb,
                     std::int64_t m,
                     std::int64_t n,
                     std::int64_t k,
                     Ts alpha,
                     const Ta *a,
                     std::int64_t lda,
                     const Tb *b,
                     std::int64_t ldb,
                     Ts beta,
                     Tc *c,
                     std::int64_t ldc,
                     const std::vector<sycl::event> &dependencies = {})
}

Input Parameters

queue

The queue where the routine should be executed.

transa

Specifies op(A), the transposition operation applied to matrix A. See Data Types for more details.

transb

Specifies op(B), the transposition operation applied to matrix B. See Data Types for more details.

m

Number of rows of matrix op(A) and matrix C. Must be at least zero.

n

Number of columns of matrix op(B) and matrix C. Must be at least zero.

k

Number of columns of matrix op(A) and rows of matrix op(B). Must be at least zero.

alpha

Scaling factor for matrix-matrix product.

a

Pointer to input matrix A. See Matrix Storage for more details.

              A not transposed                A transposed

Column major  A is an m x k matrix.           A is a k x m matrix.
              Size of array a must be         Size of array a must be
              at least lda * k.               at least lda * m.

Row major     A is an m x k matrix.           A is a k x m matrix.
              Size of array a must be         Size of array a must be
              at least lda * m.               at least lda * k.

lda

Leading dimension of matrix A. Must be positive.

              A not transposed                A transposed

Column major  Must be at least m              Must be at least k

Row major     Must be at least k              Must be at least m

b

Pointer to input matrix B. See Matrix Storage for more details.

              B not transposed                B transposed

Column major  B is a k x n matrix.            B is an n x k matrix.
              Size of array b must be         Size of array b must be
              at least ldb * n.               at least ldb * k.

Row major     B is a k x n matrix.            B is an n x k matrix.
              Size of array b must be         Size of array b must be
              at least ldb * k.               at least ldb * n.

ldb

Leading dimension of matrix B. Must be positive.

              B not transposed                B transposed

Column major  Must be at least k              Must be at least n

Row major     Must be at least n              Must be at least k

beta

Scaling factor for matrix C.

c

Pointer to input/output matrix C. See Matrix Storage for more details.

Column major  C is an m x n matrix. Size of array c must be at least ldc * n.

Row major     C is an m x n matrix. Size of array c must be at least ldc * m.

ldc

Leading dimension of matrix C. Must be positive.

Column major  Must be at least m

Row major     Must be at least n

dependencies

List of events to wait for before starting computation, if any. If omitted, defaults to no dependencies.

Output Parameters

c

Pointer to the output matrix, overwritten by alpha*op(A)*op(B) + beta*C.

Note

If beta = 0, matrix C does not need to be initialized before calling gemm.

Return Values

Output event to wait on to ensure computation is complete.

Examples

An example of how to use the USM version of gemm can be found in the oneMKL installation directory, under:

examples/dpcpp/blas/source/gemm_usm.cpp