
src/arraymancer/laser/primitives/matrix_multiplication/gemm_packing


Procs

proc pack_A_mc_kc[T; ukernel: static MicroKernel](
    packedA: ptr UncheckedArray[T]; mc, kc: int; A: MatrixView[T])

Packs the panel [kc, mc] into the buffer Ã (sized to roughly half the L2 cache). Pads if needed. Note that A has shape [M, K], so the panel is transposed while it is packed.

Concretely, the outer dimension of the packed matrices is k, so that C[i, j] = A[i, k] * B[k, j] does not require strided access.
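The layout described above can be illustrated with a minimal sketch. This is not Arraymancer's implementation: the proc name `packA`, the micro-panel height `mr`, and the use of a row-major `seq[float]` with leading dimension `lda` are all assumptions made for illustration. It packs an mc x kc block of A into panels of `mr` rows, with k as the outer dimension of each panel and zero padding for the incomplete last panel.

```nim
# Hypothetical sketch, not the library's code: `packA`, `mr` and the
# row-major seq layout are assumptions chosen for illustration only.
proc packA(a: seq[float], lda, mc, kc, mr: int): seq[float] =
  ## Packs the mc x kc block at the start of row-major `a` into
  ## micro-panels of `mr` rows. Inside a panel, k is the outer dimension.
  let numPanels = (mc + mr - 1) div mr          # ceil(mc / mr)
  # newSeq zero-initialises, so rows past `mc` are padded with zeros.
  result = newSeq[float](numPanels * kc * mr)
  for p in 0 ..< numPanels:
    let i0 = p * mr                             # first row of this panel
    for k in 0 ..< kc:
      for ii in 0 ..< mr:
        let i = i0 + ii
        if i < mc:
          result[(p * kc + k) * mr + ii] = a[i * lda + k]

# A is 3 x 2 (M = 3, K = 2), stored row-major:
#   1 2
#   3 4
#   5 6
let a = @[1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
echo packA(a, lda = 2, mc = 3, kc = 2, mr = 2)
# prints @[1.0, 3.0, 2.0, 4.0, 5.0, 0.0, 6.0, 0.0]
```

For each k, the `mr` values of A that a micro-kernel needs sit next to each other in the buffer, which is the "no strided access" property mentioned above.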

proc pack_B_kc_nc[T; ukernel: static MicroKernel](
    packedB: ptr UncheckedArray[T]; kc, nc: int; B: MatrixView[T])

Packs the panel [kc, nc] into the buffer B̃ (sized to roughly half the L1 cache). Pads if needed.

Concretely, the outer dimension of the packed matrices is k, so that C[i, j] = A[i, k] * B[k, j] does not require strided access.
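As a rough sketch of why the k-outer layout helps, the snippet below packs a kc x nc block of B into panels of `nr` columns and then consumes one packed panel of A and one of B in a toy micro-kernel. Every name here (`packB`, `microKernel`, `nr`, `mr`, `ldb`) is hypothetical and the row-major `seq[float]` storage is an assumption; the real procs work on `MatrixView[T]` and raw buffers.

```nim
# Hypothetical sketch, not the library's code: all names and the
# row-major seq layout are assumptions chosen for illustration only.
proc packB(b: seq[float], ldb, kc, nc, nr: int): seq[float] =
  ## Packs the kc x nc block at the start of row-major `b` into
  ## micro-panels of `nr` columns. Inside a panel, k is the outer dimension.
  let numPanels = (nc + nr - 1) div nr          # ceil(nc / nr)
  # newSeq zero-initialises, so columns past `nc` are padded with zeros.
  result = newSeq[float](numPanels * kc * nr)
  for p in 0 ..< numPanels:
    let j0 = p * nr                             # first column of this panel
    for k in 0 ..< kc:
      for jj in 0 ..< nr:
        let j = j0 + jj
        if j < nc:
          result[(p * kc + k) * nr + jj] = b[k * ldb + j]

proc microKernel(packedA, packedB: seq[float], kc, mr, nr: int): seq[float] =
  ## Computes one mr x nr tile of C (row-major) from a single packed panel
  ## of A and a single packed panel of B. Both are read with unit stride.
  result = newSeq[float](mr * nr)
  for k in 0 ..< kc:
    for i in 0 ..< mr:
      for j in 0 ..< nr:
        result[i * nr + j] += packedA[k * mr + i] * packedB[k * nr + j]

# B is 2 x 3 (K = 2, N = 3), stored row-major:
#   1 2 3
#   4 5 6
let b = @[1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
let packedB = packB(b, ldb = 3, kc = 2, nc = 3, nr = 2)
echo packedB
# prints @[1.0, 2.0, 4.0, 5.0, 3.0, 0.0, 6.0, 0.0]

# One packed panel of A holding rows [1 2] and [3 4], k outermost.
let panelA = @[1.0, 3.0, 2.0, 4.0]
echo microKernel(panelA, packedB[0 ..< 4], kc = 2, mr = 2, nr = 2)
# prints @[9.0, 12.0, 19.0, 26.0]
```

At each step k the kernel reads `mr` consecutive values of the A panel and `nr` consecutive values of the B panel, so the innermost loops never jump across rows of the original matrices.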
