Commit 09cca6b

jayfoad authored and tru committed
[AMDGPU] Remove one case of vmcnt loop header flushing for GFX12 (#105550)
When a loop contains a VMEM load whose result is only used outside the loop, do not bother to flush vmcnt in the loop head on GFX12. A wait for vmcnt will be required inside the loop anyway, because VMEM instructions can write their VGPR results out of order.

(cherry picked from commit fa2dccb)
1 parent 441fb41 commit 09cca6b
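
The effect of the one-line change can be sketched in isolation. The following is a minimal, self-contained model of only the two return statements touched by this commit, not the real pass: `SubtargetModel`, `LoopSummary`, `shouldFlushVmCntTail` and `checkExamples` are hypothetical names introduced for illustration, and the loop scan that computes the three loop flags in `SIInsertWaitcnts::shouldFlushVmCnt` is not shown in the diff and is not modeled here. Treating a GFX12-like target as one where `hasVscnt()` is true and `hasVmemWriteVgprInOrder()` is false is an assumption based on the commit message.

```cpp
#include <cassert>

// Hypothetical stand-in for the two subtarget queries used in the diff.
struct SubtargetModel {
  bool HasVscnt;             // models ST->hasVscnt()
  bool VmemWriteVgprInOrder; // models ST->hasVmemWriteVgprInOrder()
};

// Hypothetical summary of the flags the real function computes by scanning
// the loop; the flag names are copied from the diff.
struct LoopSummary {
  bool HasVMemLoad;
  bool HasVMemStore;
  bool UsesVgprLoadedOutside;
};

// Models only the tail of shouldFlushVmCnt after this commit.
inline bool shouldFlushVmCntTail(const SubtargetModel &ST,
                                 const LoopSummary &L) {
  if (!ST.HasVscnt && L.HasVMemStore && !L.HasVMemLoad &&
      L.UsesVgprLoadedOutside)
    return true;
  // New in this commit: flushing in the loop head is only considered
  // worthwhile when VMEM results are written to VGPRs in order.
  return L.HasVMemLoad && L.UsesVgprLoadedOutside && ST.VmemWriteVgprInOrder;
}

// Same loop flags, two assumed targets: only the in-order one flushes.
inline void checkExamples() {
  const LoopSummary Loop{/*HasVMemLoad=*/true, /*HasVMemStore=*/false,
                         /*UsesVgprLoadedOutside=*/true};
  const SubtargetModel InOrder{/*HasVscnt=*/true,
                               /*VmemWriteVgprInOrder=*/true};
  const SubtargetModel OutOfOrder{/*HasVscnt=*/true,
                                  /*VmemWriteVgprInOrder=*/false};
  assert(shouldFlushVmCntTail(InOrder, Loop));
  assert(!shouldFlushVmCntTail(OutOfOrder, Loop));
}
```

Calling `checkExamples()` from a `main` exercises both cases. With the out-of-order model the function returns false, which corresponds to the test expectations below changing from `GFX12: S_WAIT_LOADCNT 0` to `GFX12-NOT: S_WAIT_LOADCNT 0` in `bb.0`.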

2 files changed: +6 -6 lines changed


llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp

+1 -1

@@ -2390,7 +2390,7 @@ bool SIInsertWaitcnts::shouldFlushVmCnt(MachineLoop *ML,
   }
   if (!ST->hasVscnt() && HasVMemStore && !HasVMemLoad && UsesVgprLoadedOutside)
     return true;
-  return HasVMemLoad && UsesVgprLoadedOutside;
+  return HasVMemLoad && UsesVgprLoadedOutside && ST->hasVmemWriteVgprInOrder();
 }
 
 bool SIInsertWaitcnts::runOnMachineFunction(MachineFunction &MF) {

llvm/test/CodeGen/AMDGPU/waitcnt-vmcnt-loop.mir

+5 -5

@@ -295,7 +295,7 @@ body: |
 # GFX12-LABEL: waitcnt_vm_loop2
 # GFX12-LABEL: bb.0:
 # GFX12: BUFFER_LOAD_FORMAT_X_IDXEN
-# GFX12: S_WAIT_LOADCNT 0
+# GFX12-NOT: S_WAIT_LOADCNT 0
 # GFX12-LABEL: bb.1:
 # GFX12: S_WAIT_LOADCNT 0
 # GFX12-LABEL: bb.2:
@@ -342,7 +342,7 @@ body: |
 # GFX12-LABEL: waitcnt_vm_loop2_store
 # GFX12-LABEL: bb.0:
 # GFX12: BUFFER_LOAD_FORMAT_X_IDXEN
-# GFX12: S_WAIT_LOADCNT 0
+# GFX12-NOT: S_WAIT_LOADCNT 0
 # GFX12-LABEL: bb.1:
 # GFX12: S_WAIT_LOADCNT 0
 # GFX12-LABEL: bb.2:
@@ -499,9 +499,9 @@ body: |
 # GFX12-LABEL: waitcnt_vm_loop2_reginterval
 # GFX12-LABEL: bb.0:
 # GFX12: GLOBAL_LOAD_DWORDX4
-# GFX12: S_WAIT_LOADCNT 0
-# GFX12-LABEL: bb.1:
 # GFX12-NOT: S_WAIT_LOADCNT 0
+# GFX12-LABEL: bb.1:
+# GFX12: S_WAIT_LOADCNT 0
 # GFX12-LABEL: bb.2:
 name: waitcnt_vm_loop2_reginterval
 body: |
@@ -600,7 +600,7 @@ body: |
 # GFX12-LABEL: bb.0:
 # GFX12: BUFFER_LOAD_FORMAT_X_IDXEN
 # GFX12: BUFFER_LOAD_FORMAT_X_IDXEN
-# GFX12: S_WAIT_LOADCNT 0
+# GFX12-NOT: S_WAIT_LOADCNT 0
 # GFX12-LABEL: bb.1:
 # GFX12: S_WAIT_LOADCNT 0
 # GFX12-LABEL: bb.2:
