39 namespace threadblock {
    45   typename ThreadblockShape_,
    47   typename MmaSimtPolicy_,
    70       !(ThreadblockShape::kM % WarpShape::kM) &&
    71       !(ThreadblockShape::kN % WarpShape::kN), "Divisibility");
    75       ThreadblockShape::kM / WarpShape::kM,
    76       ThreadblockShape::kN / WarpShape::kN,
    82       WarpShape::kM / (MmaSimtPolicy::WarpShape::kRow * MmaSimtPolicy::LaneMmaShape::kM);
   100       MmaSimtPolicy::WarpShape::kRow, 
   105       MmaSimtPolicy::LaneMmaShape::kM, 
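The divisibility check and the two quotients in the fragments above can be tried in isolation. The following is a minimal sketch, not the CUTLASS implementation: `Shape2D`, `WarpCountSketch`, `kWarpsM`, `kWarpsN`, and the tile sizes are hypothetical names and values introduced only for illustration, and the check is assumed to cover both the M and N dimensions.

```cpp
// Hypothetical stand-in for a GEMM shape type; only kM and kN are modeled.
template <int M, int N>
struct Shape2D {
  static constexpr int kM = M;
  static constexpr int kN = N;
};

// Sketch of the divisibility check and the per-dimension warp counts.
// Assumption: the threadblock tile must be an exact multiple of the warp
// tile in both M and N.
template <typename ThreadblockShape, typename WarpShape>
struct WarpCountSketch {
  static_assert(
      !(ThreadblockShape::kM % WarpShape::kM) &&
      !(ThreadblockShape::kN % WarpShape::kN),
      "Divisibility");

  // Number of warp tiles along each dimension of the threadblock tile.
  static constexpr int kWarpsM = ThreadblockShape::kM / WarpShape::kM;
  static constexpr int kWarpsN = ThreadblockShape::kN / WarpShape::kN;
};

// Illustrative sizes only: a 128x128 threadblock tile split into 32x64 warp tiles.
using ExampleWarps = WarpCountSketch<Shape2D<128, 128>, Shape2D<32, 64>>;
```

With these example sizes, `ExampleWarps::kWarpsM` is 4 and `ExampleWarps::kWarpsN` is 2. A warp tile that does not evenly divide the threadblock tile fails to compile with the "Divisibility" message.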
 static int const kM
Definition: include/cutlass/gemm/gemm.h:58
Definition: output_tile_thread_map.h:228
Definition: aligned_buffer.h:35
ThreadblockShape_ ThreadblockShape
Definition: default_thread_map_simt.h:54
MmaSimtPolicy_ MmaSimtPolicy
Definition: default_thread_map_simt.h:56
Tuple defining point in output tile. 
Definition: output_tile_thread_map.h:57
static int const kThreads
Number of participating threads. 
Definition: default_thread_map_simt.h:85
Epilogue for threadblock scoped GEMMs using Tensor Ops. 
Defines common types used for all GEMM-like operators. 
static int const kCount
Definition: include/cutlass/gemm/gemm.h:67
Defines the optimal thread map for SIMT accumulator layouts. 
Definition: default_thread_map_simt.h:52
Defines the size of an element in bits. 
Definition: numeric_types.h:42
Element_ Element
Definition: default_thread_map_simt.h:58
static int const kElementsPerAccess
Definition: default_thread_map_simt.h:59
static int const kIterations
Number of iterations. 
Definition: default_thread_map_simt.h:88
static int const kWarpSize
Definition: default_thread_map_simt.h:67
static int const kPartitionsK
Definition: default_thread_map_simt.h:57
Shape of a matrix multiply-add operation. 
Definition: include/cutlass/gemm/gemm.h:57
Definition: default_thread_map_simt.h:65
WarpShape_ WarpShape
Definition: default_thread_map_simt.h:55
static int const kGroupCount
Computes the number of thread-level matrix multiplies needed to span a warp. 
Definition: default_thread_map_simt.h:81
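The expression on line 82 of the listing, which appears to be the initializer of `kGroupCount`, can be worked through with concrete numbers. This is a sketch under stated assumptions: `LaneArrangement`, `LaneShape`, `SimtPolicySketch`, and `GroupCountSketch` are hypothetical stand-ins that model only the members the expression reads, and the sizes are chosen for illustration.

```cpp
// Hypothetical stand-in for the arrangement of lanes within a warp.
template <int Row, int Column>
struct LaneArrangement {
  static constexpr int kRow = Row;
  static constexpr int kColumn = Column;
};

// Hypothetical stand-in for the tile computed by a single lane.
template <int M, int N>
struct LaneShape {
  static constexpr int kM = M;
  static constexpr int kN = N;
};

// Hypothetical policy exposing the two nested types the expression uses.
template <typename WarpShape_, typename LaneMmaShape_>
struct SimtPolicySketch {
  using WarpShape = WarpShape_;
  using LaneMmaShape = LaneMmaShape_;
};

// Mirrors: WarpShape::kM / (Policy::WarpShape::kRow * Policy::LaneMmaShape::kM)
// WarpM is the M extent of the warp-level tile.
template <int WarpM, typename Policy>
struct GroupCountSketch {
  static constexpr int kGroupCount =
      WarpM / (Policy::WarpShape::kRow * Policy::LaneMmaShape::kM);
};

// Illustrative values: 4 rows of lanes, each lane covering 4 rows,
// so one pass covers 16 rows of a 32-row warp tile.
using ExamplePolicy = SimtPolicySketch<LaneArrangement<4, 8>, LaneShape<4, 4>>;
using ExampleGroups = GroupCountSketch<32, ExamplePolicy>;
```

Here one pass of the lanes covers 4 * 4 = 16 rows, so a 32-row warp tile needs `ExampleGroups::kGroupCount` = 2 groups.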