Procs
proc pack_A_mc_kc[T; ukernel: static MicroKernel](packedA: ptr UncheckedArray[T]; mc, kc: int; A: MatrixView[T])
-
Packs the panel [kc, mc] into the buffer Ã (sized to about half the L2 cache), padding if needed. Note that A has shape [M, K], so it is transposed while packing.
Concretely, the outer dimension of the packed matrices is k, so that C[i, j] = A[i, k] * B[k, j] does not require strided access.
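The layout described above can be illustrated with a small standalone sketch. It is not the library's implementation: the proc name `packA`, the row-major `seq[float]` input with leading dimension `lda`, and the micro-panel height `MR = 4` are all assumptions made for illustration. It shows an mc x kc block being copied into micro-panels whose outer dimension is k, with missing rows padded with zeros.

```nim
# Hypothetical sketch, not the library code: MR, packA and the seq-based
# input are assumptions used only to illustrate the packed layout.
const MR = 4  # assumed number of rows handled by the micro-kernel

proc packA(a: seq[float]; lda, mc, kc: int): seq[float] =
  ## Packs the mc x kc block of row-major `a` (leading dimension `lda`)
  ## into micro-panels of MR rows, stored with k as the outer dimension.
  let panels = (mc + MR - 1) div MR          # round up to whole micro-panels
  result = newSeq[float](panels * MR * kc)   # zero-initialized: padding is 0.0
  var dst = 0
  for p in 0 ..< panels:
    for k in 0 ..< kc:                       # k is outermost inside a panel
      for r in 0 ..< MR:
        let i = p * MR + r
        if i < mc:                           # rows past mc keep the zero padding
          result[dst] = a[i * lda + k]
        inc dst

when isMainModule:
  # 3 x 2 block with rows (1, 2), (3, 4), (5, 6)
  let a = @[1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
  echo packA(a, lda = 2, mc = 3, kc = 2)
```

With this input the packed buffer holds the first column of the block, then one padding zero, then the second column and another padding zero, so a micro-kernel reading it walks memory contiguously for each k.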
proc pack_B_kc_nc[T; ukernel: static MicroKernel](packedB: ptr UncheckedArray[T]; kc, nc: int; B: MatrixView[T])
-
Packs the panel [kc, nc] into the buffer ~B (sized to about half the L1 cache), padding if needed.
Concretely, the outer dimension of the packed matrices is k, so that C[i, j] = A[i, k] * B[k, j] does not require strided access.
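For the B side, a comparable standalone sketch is given here. It is only an illustration under stated assumptions: the proc name `packB`, the row-major `seq[float]` input with leading dimension `ldb`, and the micro-panel width `NR = 2` are hypothetical and not taken from the library. Columns are grouped into micro-panels of NR columns, each stored with k as the outer dimension and zero-padded when nc is not a multiple of NR.

```nim
# Hypothetical sketch, not the library code: NR, packB and the seq-based
# input are assumptions used only to illustrate the packed layout.
const NR = 2  # assumed number of columns handled by the micro-kernel

proc packB(b: seq[float]; ldb, kc, nc: int): seq[float] =
  ## Packs the kc x nc block of row-major `b` (leading dimension `ldb`)
  ## into micro-panels of NR columns, stored with k as the outer dimension.
  let panels = (nc + NR - 1) div NR          # round up to whole micro-panels
  result = newSeq[float](panels * kc * NR)   # zero-initialized: padding is 0.0
  var dst = 0
  for p in 0 ..< panels:
    for k in 0 ..< kc:                       # k is outermost inside a panel
      for c in 0 ..< NR:
        let j = p * NR + c
        if j < nc:                           # columns past nc keep the zero padding
          result[dst] = b[k * ldb + j]
        inc dst

when isMainModule:
  # 2 x 3 block with rows (1, 2, 3), (4, 5, 6)
  let b = @[1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
  echo packB(b, ldb = 3, kc = 2, nc = 3)
```

Here the first micro-panel holds columns 0 and 1 for every k, and the second holds column 2 followed by a padding zero for every k.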