TV Layout Formula — Interactive Visualizer
mma.sync.m16n8k16 — TV_layout_C: shape ((4,8),(2,2)) : stride ((16,1),(8,64))
col = tid % 8
row = (tid / 8) * 2 + (vid % 2) + (vid / 2) * 8
Thread ID (tid): 0, T_inner (tid%8) = 0, T_outer (tid/8) = 0
Value Index (vid): 0, V_inner (vid%2) = 0, V_outer (vid/2) = 0
Offset Computation
(tid%8)*1 + (tid/8)*16 + (vid%2)*8 + (vid/2)*64 = 0 + 0 + 0 + 0 = offset 0
offset 0 → col = 0, row = 0 → C[0, 0]
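The offset arithmetic above can be reproduced in a few lines of Python. This is a minimal sketch that follows the strides and formulas exactly as stated on this page. The function names `tv_offset`, `offset_to_coord`, and `closed_form` are illustrative, and decoding the offset as a row-major index with 8 columns (`row = offset // 8`, `col = offset % 8`) is an assumption inferred from the col/row formulas, not something the page states directly.

```python
def tv_offset(tid, vid):
    """Linear offset for a (thread, value) pair, using the strides shown above."""
    t_inner, t_outer = tid % 8, tid // 8   # thread mode: strides 1 and 16
    v_inner, v_outer = vid % 2, vid // 2   # value mode: strides 8 and 64
    return t_inner * 1 + t_outer * 16 + v_inner * 8 + v_outer * 64


def offset_to_coord(offset):
    """Assumed decoding: 16x8 matrix stored row-major, 8 elements per row."""
    return offset // 8, offset % 8          # (row, col)


def closed_form(tid, vid):
    """The col/row formulas from the top of the page, returned as (row, col)."""
    row = (tid // 8) * 2 + (vid % 2) + (vid // 2) * 8
    col = tid % 8
    return row, col


pairs = [(t, v) for t in range(32) for v in range(4)]

# The stride-based offset and the closed-form formulas agree everywhere.
assert all(offset_to_coord(tv_offset(t, v)) == closed_form(t, v) for t, v in pairs)

# 32 threads x 4 values cover each of the 128 elements exactly once.
assert sorted(tv_offset(t, v) for t, v in pairs) == list(range(128))

print(tv_offset(9, 3), offset_to_coord(tv_offset(9, 3)))  # 89 (11, 1)
```

For tid = 9 and vid = 3 the four terms are 1 + 16 + 8 + 64 = 89, which decodes to row 11, col 1 under the row-major assumption.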
Legend: Current (tid, vid) | Same thread (all 4 values) | Same value index (all 32 threads)
16×8 Output Matrix C
Thread 0's Registers
v0: C[0, 0]
v1: C[1, 0]
v2: C[8, 0]
v3: C[9, 0]
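The register listing for any thread, and the reverse lookup the matrix view performs, can be sketched the same way. The helpers below (`coord`, `thread_registers`, `owner`) are hypothetical names, and the mapping is taken from this page's col/row formulas as written rather than from any library API.

```python
def coord(tid, vid):
    """(row, col) of the element held in register `vid` of thread `tid`."""
    row = (tid // 8) * 2 + (vid % 2) + (vid // 2) * 8
    col = tid % 8
    return row, col


def thread_registers(tid):
    """All four registers of one thread, keyed as v0..v3."""
    return {f"v{vid}": coord(tid, vid) for vid in range(4)}


def owner(row, col):
    """Inverse mapping: which (tid, vid) holds C[row, col]."""
    v_outer, r = divmod(row, 8)       # rows 8..15 belong to v2 and v3
    t_outer, v_inner = divmod(r, 2)   # row pairs alternate between v_inner 0 and 1
    return t_outer * 8 + col, v_outer * 2 + v_inner


for name, (r, c) in thread_registers(0).items():
    print(f"{name}: C[{r}, {c}]")

# Round trip: every (tid, vid) maps to a cell whose owner is that same pair.
assert all(owner(*coord(t, v)) == (t, v) for t in range(32) for v in range(4))
```

Calling `thread_registers(0)` reproduces the four entries listed above, and `owner(row, col)` answers the question the highlighted grid answers visually.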