.. _onemkl_blas_syrk_batch:

syrk_batch
==========

Computes a group of ``syrk`` operations.

Description
***********

The ``syrk_batch`` routines are batched versions of :ref:`onemkl_blas_syrk`,
performing multiple ``syrk`` operations in a single call. Each ``syrk``
operation performs a symmetric rank-k update using a general matrix ``A``.

``syrk_batch`` supports the following precisions:

.. list-table::
   :header-rows: 1

   * - T
   * - ``float``
   * - ``double``
   * - ``std::complex<float>``
   * - ``std::complex<double>``

syrk_batch (Buffer Version)
---------------------------

The buffer version of ``syrk_batch`` supports only the strided API.

**Strided API**
---------------

The strided API operation is defined as:

.. code-block::

   for i = 0 … batch_size - 1
       A and C are matrices at offset i * stridea and i * stridec in a and c.
       C = alpha * op(A) * op(A)^T + beta * C
   end for

where:

- op(``X``) is one of op(``X``) = ``X``, or op(``X``) = ``X``\ :sup:`T`, or op(``X``) = ``X``\ :sup:`H`
- ``alpha`` and ``beta`` are scalars
- ``A`` is a general matrix and ``C`` is a symmetric matrix
- op(``A``) is ``n`` x ``k`` and ``C`` is ``n`` x ``n``

For the strided API, the ``a`` and ``c`` buffers contain all the input matrices.
The stride between matrices is given by the stride parameters. The total
number of matrices in the ``a`` and ``c`` buffers is given by the
``batch_size`` parameter.

Syntax
------

.. code-block:: cpp

   namespace oneapi::mkl::blas::column_major {
       void syrk_batch(sycl::queue &queue,
                       oneapi::mkl::uplo upper_lower,
                       oneapi::mkl::transpose trans,
                       std::int64_t n,
                       std::int64_t k,
                       T alpha,
                       sycl::buffer<T,1> &a,
                       std::int64_t lda,
                       std::int64_t stridea,
                       T beta,
                       sycl::buffer<T,1> &c,
                       std::int64_t ldc,
                       std::int64_t stridec,
                       std::int64_t batch_size)
   }

..
.. code-block:: cpp

   namespace oneapi::mkl::blas::row_major {
       void syrk_batch(sycl::queue &queue,
                       oneapi::mkl::uplo upper_lower,
                       oneapi::mkl::transpose trans,
                       std::int64_t n,
                       std::int64_t k,
                       T alpha,
                       sycl::buffer<T,1> &a,
                       std::int64_t lda,
                       std::int64_t stridea,
                       T beta,
                       sycl::buffer<T,1> &c,
                       std::int64_t ldc,
                       std::int64_t stridec,
                       std::int64_t batch_size)
   }

Input Parameters
----------------

queue
   The queue where the routine should be executed.

upper_lower
   Specifies whether matrices ``C`` are upper or lower triangular. See
   :ref:`data-types` for more details.

trans
   Specifies op(``A``), the transposition operation applied to matrices
   ``A``. Conjugation is never performed, even if ``trans`` =
   ``transpose::conjtrans``. See :ref:`data-types` for more details.

n
   Number of rows and columns of matrices ``C``. Must be at least zero.

k
   Number of columns of matrices op(``A``). Must be at least zero.

alpha
   Scaling factor for the rank-k update.

a
   Buffer holding input matrices ``A``. Size of the buffer must be at
   least ``stridea`` * ``batch_size``.

lda
   Leading dimension of matrices ``A``. Must be positive.

   .. list-table::
      :header-rows: 1

      * -
        - ``trans`` = ``transpose::nontrans``
        - ``trans`` = ``transpose::trans`` or ``trans`` = ``transpose::conjtrans``
      * - Column major
        - Must be at least ``n``
        - Must be at least ``k``
      * - Row major
        - Must be at least ``k``
        - Must be at least ``n``

stridea
   Stride between two consecutive ``A`` matrices.

   .. list-table::
      :header-rows: 1

      * -
        - ``trans`` = ``transpose::nontrans``
        - ``trans`` = ``transpose::trans`` or ``trans`` = ``transpose::conjtrans``
      * - Column major
        - Must be at least ``lda`` * ``k``
        - Must be at least ``lda`` * ``n``
      * - Row major
        - Must be at least ``lda`` * ``n``
        - Must be at least ``lda`` * ``k``

beta
   Scaling factor for matrices ``C``.

c
   Buffer holding input/output matrices ``C``. Size of the buffer must be
   at least ``stridec`` * ``batch_size``.

ldc
   Leading dimension of matrices ``C``. Must be positive and at least ``n``.
stridec
   Stride between two consecutive ``C`` matrices. Must be at least
   ``ldc`` * ``n``.

batch_size
   Specifies the number of rank-k update operations to perform.

Output Parameters
-----------------

c
   Output buffer overwritten by ``batch_size`` ``syrk`` operations of the
   form ``alpha`` * op(``A``) * op(``A``)\ :sup:`T` + ``beta`` * ``C``.

syrk_batch (USM Version)
------------------------

The USM version of ``syrk_batch`` supports the group API and the strided API.

**Group API**
-------------

The group API operation is defined as:

.. code-block::

   idx = 0
   for i = 0 … group_count - 1
       for j = 0 … group_size[i] - 1
           A and C are matrices in a[idx] and c[idx]
           C = alpha[i] * op(A) * op(A)^T + beta[i] * C
           idx = idx + 1
       end for
   end for

where:

- op(``X``) is one of op(``X``) = ``X``, or op(``X``) = ``X``\ :sup:`T`, or op(``X``) = ``X``\ :sup:`H`
- ``alpha`` and ``beta`` are scalars
- ``A`` is a general matrix and ``C`` is a symmetric matrix
- op(``A``) is ``n`` x ``k`` and ``C`` is ``n`` x ``n``

For the group API, the ``a`` and ``c`` arrays contain the pointers to all the
input matrices. The total number of matrices in ``a`` and ``c`` is given by:

.. math::

   total\_batch\_count = \sum_{i=0}^{group\_count-1}group\_size[i]

Syntax
------

.. code-block:: cpp

   namespace oneapi::mkl::blas::column_major {
       sycl::event syrk_batch(sycl::queue &queue,
                              oneapi::mkl::uplo *upper_lower,
                              oneapi::mkl::transpose *trans,
                              std::int64_t *n,
                              std::int64_t *k,
                              T *alpha,
                              const T **a,
                              std::int64_t *lda,
                              T *beta,
                              T **c,
                              std::int64_t *ldc,
                              std::int64_t group_count,
                              std::int64_t *group_size,
                              const std::vector<sycl::event> &dependencies = {})
   }

..
.. code-block:: cpp

   namespace oneapi::mkl::blas::row_major {
       sycl::event syrk_batch(sycl::queue &queue,
                              oneapi::mkl::uplo *upper_lower,
                              oneapi::mkl::transpose *trans,
                              std::int64_t *n,
                              std::int64_t *k,
                              T *alpha,
                              const T **a,
                              std::int64_t *lda,
                              T *beta,
                              T **c,
                              std::int64_t *ldc,
                              std::int64_t group_count,
                              std::int64_t *group_size,
                              const std::vector<sycl::event> &dependencies = {})
   }

Input Parameters
----------------

queue
   The queue where the routine should be executed.

upper_lower
   Array of ``group_count`` ``oneapi::mkl::uplo`` values.
   ``upper_lower[i]`` specifies whether matrices ``C`` are upper or lower
   triangular in group ``i``. See :ref:`data-types` for more details.

trans
   Array of ``group_count`` ``oneapi::mkl::transpose`` values. ``trans[i]``
   specifies op(``A``), the transposition operation applied to matrices
   ``A`` in group ``i``. See :ref:`data-types` for more details.

n
   Array of ``group_count`` integers. ``n[i]`` specifies the number of rows
   and columns of matrices ``C`` in group ``i``. All entries must be at
   least zero.

k
   Array of ``group_count`` integers. ``k[i]`` specifies the number of
   columns of matrices op(``A``) in group ``i``. All entries must be at
   least zero.

alpha
   Array of ``group_count`` scalar elements. ``alpha[i]`` specifies the
   scaling factor for every rank-k update in group ``i``.

a
   Array of ``total_batch_count`` pointers to input matrices ``A``. See
   :ref:`matrix-storage` for more details.

   .. list-table::
      :header-rows: 1

      * -
        - ``trans`` = ``transpose::nontrans``
        - ``trans`` = ``transpose::trans`` or ``trans`` = ``transpose::conjtrans``
      * - Column major
        - Size of array ``A[i]`` must be at least ``lda[i]`` * ``k[i]``
        - Size of array ``A[i]`` must be at least ``lda[i]`` * ``n[i]``
      * - Row major
        - Size of array ``A[i]`` must be at least ``lda[i]`` * ``n[i]``
        - Size of array ``A[i]`` must be at least ``lda[i]`` * ``k[i]``

lda
   Array of ``group_count`` integers. ``lda[i]`` specifies the leading
   dimension of matrices ``A`` in group ``i``. Must be positive.

   ..
   .. list-table::
      :header-rows: 1

      * -
        - ``trans`` = ``transpose::nontrans``
        - ``trans`` = ``transpose::trans`` or ``trans`` = ``transpose::conjtrans``
      * - Column major
        - Must be at least ``n[i]``.
        - Must be at least ``k[i]``.
      * - Row major
        - Must be at least ``k[i]``.
        - Must be at least ``n[i]``.

beta
   Array of ``group_count`` scalar elements. ``beta[i]`` specifies the
   scaling factor for matrices ``C`` in group ``i``.

c
   Array of ``total_batch_count`` pointers to input/output matrices ``C``.
   Size of array ``C[i]`` must be at least ``ldc[i]`` * ``n[i]``. See
   :ref:`matrix-storage` for more details.

ldc
   Array of ``group_count`` integers. ``ldc[i]`` specifies the leading
   dimension of matrices ``C`` in group ``i``. Must be positive and at
   least ``n[i]``.

group_count
   Number of groups. Must be at least zero.

group_size
   Array of ``group_count`` integers. ``group_size[i]`` specifies the
   number of ``syrk`` operations in group ``i``. Each element in
   ``group_size`` must be at least zero.

dependencies
   List of events to wait for before starting computation, if any. If
   omitted, defaults to no dependencies.

Output Parameters
-----------------

c
   Array of pointers to output matrices ``C`` overwritten by
   ``total_batch_count`` ``syrk`` operations of the form ``alpha`` *
   op(``A``) * op(``A``)\ :sup:`T` + ``beta`` * ``C``.

Return Values
-------------

Output event to wait on to ensure computation is complete.

**Strided API**
---------------

The strided API operation is defined as:

.. code-block::

   for i = 0 … batch_size - 1
       A and C are matrices at offset i * stridea and i * stridec in a and c.
       C = alpha * op(A) * op(A)^T + beta * C
   end for

where:

- op(``X``) is one of op(``X``) = ``X``, or op(``X``) = ``X``\ :sup:`T`, or op(``X``) = ``X``\ :sup:`H`
- ``alpha`` and ``beta`` are scalars
- ``A`` is a general matrix and ``C`` is a symmetric matrix
- op(``A``) is ``n`` x ``k`` and ``C`` is ``n`` x ``n``

For the strided API, the ``a`` and ``c`` arrays contain all the input matrices.
The stride between matrices is given by the stride parameters.
The total number of matrices in the ``a`` and ``c`` arrays is given by the
``batch_size`` parameter.

Syntax
------

.. code-block:: cpp

   namespace oneapi::mkl::blas::column_major {
       sycl::event syrk_batch(sycl::queue &queue,
                              oneapi::mkl::uplo upper_lower,
                              oneapi::mkl::transpose trans,
                              std::int64_t n,
                              std::int64_t k,
                              T alpha,
                              const T *a,
                              std::int64_t lda,
                              std::int64_t stridea,
                              T beta,
                              T *c,
                              std::int64_t ldc,
                              std::int64_t stridec,
                              std::int64_t batch_size,
                              const std::vector<sycl::event> &dependencies = {})
   }

.. code-block:: cpp

   namespace oneapi::mkl::blas::row_major {
       sycl::event syrk_batch(sycl::queue &queue,
                              oneapi::mkl::uplo upper_lower,
                              oneapi::mkl::transpose trans,
                              std::int64_t n,
                              std::int64_t k,
                              T alpha,
                              const T *a,
                              std::int64_t lda,
                              std::int64_t stridea,
                              T beta,
                              T *c,
                              std::int64_t ldc,
                              std::int64_t stridec,
                              std::int64_t batch_size,
                              const std::vector<sycl::event> &dependencies = {})
   }

Input Parameters
----------------

queue
   The queue where the routine should be executed.

upper_lower
   Specifies whether matrices ``C`` are upper or lower triangular. See
   :ref:`data-types` for more details.

trans
   Specifies op(``A``), the transposition operation applied to matrices
   ``A``. Conjugation is never performed, even if ``trans`` =
   ``transpose::conjtrans``. See :ref:`data-types` for more details.

n
   Number of rows and columns of matrices ``C``. Must be at least zero.

k
   Number of columns of matrices op(``A``). Must be at least zero.

alpha
   Scaling factor for the rank-k update.

a
   Pointer to input matrices ``A``. Size of the array must be at least
   ``stridea`` * ``batch_size``.

lda
   Leading dimension of matrices ``A``. Must be positive.

   .. list-table::
      :header-rows: 1

      * -
        - ``trans`` = ``transpose::nontrans``
        - ``trans`` = ``transpose::trans`` or ``trans`` = ``transpose::conjtrans``
      * - Column major
        - Must be at least ``n``
        - Must be at least ``k``
      * - Row major
        - Must be at least ``k``
        - Must be at least ``n``

stridea
   Stride between two consecutive ``A`` matrices.

   ..
   .. list-table::
      :header-rows: 1

      * -
        - ``trans`` = ``transpose::nontrans``
        - ``trans`` = ``transpose::trans`` or ``trans`` = ``transpose::conjtrans``
      * - Column major
        - Must be at least ``lda`` * ``k``
        - Must be at least ``lda`` * ``n``
      * - Row major
        - Must be at least ``lda`` * ``n``
        - Must be at least ``lda`` * ``k``

beta
   Scaling factor for matrices ``C``.

c
   Pointer to input/output matrices ``C``. Size of the array must be at
   least ``stridec`` * ``batch_size``.

ldc
   Leading dimension of matrices ``C``. Must be positive and at least ``n``.

stridec
   Stride between two consecutive ``C`` matrices. Must be at least
   ``ldc`` * ``n``.

batch_size
   Specifies the number of rank-k update operations to perform.

dependencies
   List of events to wait for before starting computation, if any. If
   omitted, defaults to no dependencies.

Output Parameters
-----------------

c
   Pointer to output matrices ``C`` overwritten by ``batch_size`` ``syrk``
   operations of the form ``alpha`` * op(``A``) * op(``A``)\ :sup:`T` +
   ``beta`` * ``C``.

Return Values
-------------

Output event to wait on to ensure computation is complete.
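
Example
-------

The strided operation defined on this page can be illustrated with a plain
host-side reference loop. The sketch below is not the oneMKL implementation
and does not call the library. It assumes column-major storage and a real
element type, and the names ``syrk_batch_ref``, ``Uplo``, and ``Trans`` are
hypothetical, introduced only for this illustration. It applies the size
constraints from the parameter tables as assertions and updates only the
triangle of each ``C`` selected by ``uplo``.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for oneapi::mkl::uplo and oneapi::mkl::transpose.
enum class Uplo { upper, lower };
enum class Trans { nontrans, trans };

// Reference (unoptimized) column-major strided syrk_batch on the host:
// for each b in [0, batch_size):
//     C_b = alpha * op(A_b) * op(A_b)^T + beta * C_b
// Only the triangle of C_b selected by uplo is read and written.
template <typename T>
void syrk_batch_ref(Uplo uplo, Trans trans, std::int64_t n, std::int64_t k,
                    T alpha, const T *a, std::int64_t lda, std::int64_t stridea,
                    T beta, T *c, std::int64_t ldc, std::int64_t stridec,
                    std::int64_t batch_size) {
    const bool nontrans = (trans == Trans::nontrans);

    // Constraints from the parameter tables (column-major rows).
    assert(n >= 0 && k >= 0 && batch_size >= 0);
    assert(lda >= 1 && lda >= (nontrans ? n : k));
    assert(ldc >= 1 && ldc >= n);
    assert(stridea >= lda * (nontrans ? k : n));
    assert(stridec >= ldc * n);

    for (std::int64_t b = 0; b < batch_size; ++b) {
        const T *A = a + b * stridea;  // b-th input matrix
        T *C = c + b * stridec;        // b-th input/output matrix

        for (std::int64_t j = 0; j < n; ++j) {
            // Rows of column j that belong to the selected triangle.
            const std::int64_t row_begin = (uplo == Uplo::upper) ? 0 : j;
            const std::int64_t row_end = (uplo == Uplo::upper) ? j + 1 : n;

            for (std::int64_t i = row_begin; i < row_end; ++i) {
                T sum = T(0);
                for (std::int64_t l = 0; l < k; ++l) {
                    // op(A)(i, l) is A(i, l) for nontrans and A(l, i) for trans.
                    const T a_il = nontrans ? A[i + l * lda] : A[l + i * lda];
                    const T a_jl = nontrans ? A[j + l * lda] : A[l + j * lda];
                    sum += a_il * a_jl;
                }
                C[i + j * ldc] = alpha * sum + beta * C[i + j * ldc];
            }
        }
    }
}
```

For instance, with ``n = 2``, ``k = 1``, ``batch_size = 2``, and two 2 x 1
input matrices stored back to back (``lda = 2``, ``stridea = 2``), a call such
as ``syrk_batch_ref<double>(Uplo::upper, Trans::nontrans, 2, 1, 1.0, a, 2, 2,
2.0, c, 2, 4, 2)`` updates the upper triangle of each 2 x 2 matrix in ``c``
and leaves the strictly lower entries untouched.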