Skip to content

Conversation

@hvdijk
Copy link
Contributor

@hvdijk hvdijk commented Dec 6, 2025

Previously, nop was treated as just an alias for hint #0. The consequence of that was that all the general rules for hint instructions applied to nop too, in particular that during binary analysis, they were assumed to have unknown effects. This commit adds AArch64::NOP as a standalone instruction with no side effects.

The scheduling update in A55-load-store-alias.s is probably not entirely accurate, but should be more accurate than the previous result.

@llvmbot
Copy link
Member

llvmbot commented Dec 6, 2025

@llvm/pr-subscribers-backend-aarch64

Author: Harald van Dijk (hvdijk)

Changes

Previously, nop was treated as just an alias for hint #0. The consequence of that was that all the general rules for hint instructions applied to nop too, in particular that during binary analysis, they were assumed to have unknown effects. This commit adds AArch64::NOP as a standalone instruction with no side effects.

The scheduling update in A55-load-store-alias.s is probably not entirely accurate, but should be more accurate than the previous result.


Full diff: https://github.com/llvm/llvm-project/pull/170968.diff

7 Files Affected:

  • (modified) bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp (+1-2)
  • (modified) llvm/lib/Target/AArch64/AArch64A53Fix835769.cpp (+2-2)
  • (modified) llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp (+5-5)
  • (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.cpp (+2-2)
  • (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.td (+5-1)
  • (modified) llvm/test/tools/llvm-mca/AArch64/Cortex/A55-load-store-alias.s (+24-24)
  • (modified) llvm/test/tools/llvm-mca/AArch64/Cortex/A55-load-store-noalias.s (+1-1)
diff --git a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
index af87d5c12b5ce..e8cecbe60bd69 100644
--- a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
+++ b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
@@ -1862,8 +1862,7 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
   }
 
   bool isNoop(const MCInst &Inst) const override {
-    return Inst.getOpcode() == AArch64::HINT &&
-           Inst.getOperand(0).getImm() == 0;
+    return Inst.getOpcode() == AArch64::NOP;
   }
 
   void createNoop(MCInst &Inst) const override {
diff --git a/llvm/lib/Target/AArch64/AArch64A53Fix835769.cpp b/llvm/lib/Target/AArch64/AArch64A53Fix835769.cpp
index a51f63073403b..7b8909b7a2f5d 100644
--- a/llvm/lib/Target/AArch64/AArch64A53Fix835769.cpp
+++ b/llvm/lib/Target/AArch64/AArch64A53Fix835769.cpp
@@ -178,11 +178,11 @@ static void insertNopBeforeInstruction(MachineBasicBlock &MBB, MachineInstr* MI,
     MachineInstr *I = getLastNonPseudo(MBB, TII);
     assert(I && "Expected instruction");
     DebugLoc DL = I->getDebugLoc();
-    BuildMI(I->getParent(), DL, TII->get(AArch64::HINT)).addImm(0);
+    BuildMI(I->getParent(), DL, TII->get(AArch64::NOP));
   }
   else {
     DebugLoc DL = MI->getDebugLoc();
-    BuildMI(MBB, MI, DL, TII->get(AArch64::HINT)).addImm(0);
+    BuildMI(MBB, MI, DL, TII->get(AArch64::NOP));
   }
 
   ++NumNopsAdded;
diff --git a/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp b/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
index 8267414e78955..574316160fbcb 100644
--- a/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
+++ b/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
@@ -472,7 +472,7 @@ void AArch64AsmPrinter::emitSled(const MachineInstr &MI, SledKind Kind) {
   EmitToStreamer(*OutStreamer, MCInstBuilder(AArch64::B).addImm(8));
 
   for (int8_t I = 0; I < NoopsInSledCount; I++)
-    EmitToStreamer(*OutStreamer, MCInstBuilder(AArch64::HINT).addImm(0));
+    EmitToStreamer(*OutStreamer, MCInstBuilder(AArch64::NOP));
 
   OutStreamer->emitLabel(Target);
   recordSled(CurSled, MI, Kind, 2);
@@ -1697,7 +1697,7 @@ void AArch64AsmPrinter::LowerSTACKMAP(MCStreamer &OutStreamer, StackMaps &SM,
 
   // Emit nops.
   for (unsigned i = 0; i < NumNOPBytes; i += 4)
-    EmitToStreamer(OutStreamer, MCInstBuilder(AArch64::HINT).addImm(0));
+    EmitToStreamer(OutStreamer, MCInstBuilder(AArch64::NOP));
 }
 
 // Lower a patchpoint of the form:
@@ -1731,7 +1731,7 @@ void AArch64AsmPrinter::LowerPATCHPOINT(MCStreamer &OutStreamer, StackMaps &SM,
   assert((NumBytes - EncodedBytes) % 4 == 0 &&
          "Invalid number of NOP bytes requested!");
   for (unsigned i = EncodedBytes; i < NumBytes; i += 4)
-    EmitToStreamer(OutStreamer, MCInstBuilder(AArch64::HINT).addImm(0));
+    EmitToStreamer(OutStreamer, MCInstBuilder(AArch64::NOP));
 }
 
 void AArch64AsmPrinter::LowerSTATEPOINT(MCStreamer &OutStreamer, StackMaps &SM,
@@ -1740,7 +1740,7 @@ void AArch64AsmPrinter::LowerSTATEPOINT(MCStreamer &OutStreamer, StackMaps &SM,
   if (unsigned PatchBytes = SOpers.getNumPatchBytes()) {
     assert(PatchBytes % 4 == 0 && "Invalid number of NOP bytes requested!");
     for (unsigned i = 0; i < PatchBytes; i += 4)
-      EmitToStreamer(OutStreamer, MCInstBuilder(AArch64::HINT).addImm(0));
+      EmitToStreamer(OutStreamer, MCInstBuilder(AArch64::NOP));
   } else {
     // Lower call target and choose correct opcode
     const MachineOperand &CallTarget = SOpers.getCallTarget();
@@ -2122,7 +2122,7 @@ bool AArch64AsmPrinter::emitDeactivationSymbolRelocation(Value *DS) {
 
   if (isa<GlobalAlias>(DS)) {
     // Just emit the nop directly.
-    EmitToStreamer(MCInstBuilder(AArch64::HINT).addImm(0));
+    EmitToStreamer(MCInstBuilder(AArch64::NOP));
     return true;
   }
   MCSymbol *Dot = OutContext.createTempSymbol();
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
index 904577b8233d5..d02b1a07b581e 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
@@ -6901,11 +6901,11 @@ bool llvm::rewriteAArch64FrameIndex(MachineInstr &MI, unsigned FrameRegIdx,
 void AArch64InstrInfo::insertNoop(MachineBasicBlock &MBB,
                                   MachineBasicBlock::iterator MI) const {
   DebugLoc DL;
-  BuildMI(MBB, MI, DL, get(AArch64::HINT)).addImm(0);
+  BuildMI(MBB, MI, DL, get(AArch64::NOP));
 }
 
 MCInst AArch64InstrInfo::getNop() const {
-  return MCInstBuilder(AArch64::HINT).addImm(0);
+  return MCInstBuilder(AArch64::NOP);
 }
 
 // AArch64 supports MachineCombiner.
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.td b/llvm/lib/Target/AArch64/AArch64InstrInfo.td
index 64017d7cafca3..0cbe867497625 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.td
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.td
@@ -1544,7 +1544,6 @@ let hasSideEffects = 1, isCodeGenOnly = 1, isTerminator = 1, isBarrier = 1 in {
 //===----------------------------------------------------------------------===//
 
 def HINT : HintI<"hint">;
-def : InstAlias<"nop",  (HINT 0b000)>;
 def : InstAlias<"yield",(HINT 0b001)>;
 def : InstAlias<"wfe",  (HINT 0b010)>;
 def : InstAlias<"wfi",  (HINT 0b011)>;
@@ -1554,6 +1553,11 @@ def : InstAlias<"dgh",  (HINT 0b110)>;
 def : InstAlias<"esb",  (HINT 0b10000)>, Requires<[HasRAS]>;
 def : InstAlias<"csdb", (HINT 20)>;
 
+let CRm = 0b0000, hasSideEffects = 0 in
+def NOP : SystemNoOperands<0b000, "hint\t#0">;
+
+def : InstAlias<"nop", (NOP)>;
+
 let Predicates = [HasPCDPHINT] in {
     def STSHH: STSHHI;
 }
diff --git a/llvm/test/tools/llvm-mca/AArch64/Cortex/A55-load-store-alias.s b/llvm/test/tools/llvm-mca/AArch64/Cortex/A55-load-store-alias.s
index 3906975b41f6b..b0cfa5ff90cf1 100644
--- a/llvm/test/tools/llvm-mca/AArch64/Cortex/A55-load-store-alias.s
+++ b/llvm/test/tools/llvm-mca/AArch64/Cortex/A55-load-store-alias.s
@@ -12,12 +12,12 @@ ldr x3, [x10]
 
 # CHECK:      Iterations:        3
 # CHECK-NEXT: Instructions:      18
-# CHECK-NEXT: Total Cycles:      31
+# CHECK-NEXT: Total Cycles:      25
 # CHECK-NEXT: Total uOps:        18
 
 # CHECK:      Dispatch Width:    2
-# CHECK-NEXT: uOps Per Cycle:    0.58
-# CHECK-NEXT: IPC:               0.58
+# CHECK-NEXT: uOps Per Cycle:    0.72
+# CHECK-NEXT: IPC:               0.72
 # CHECK-NEXT: Block RThroughput: 3.0
 
 # CHECK:      Instruction Info:
@@ -32,7 +32,7 @@ ldr x3, [x10]
 # CHECK-NEXT:  1      1     1.00           *            str	x1, [x10]
 # CHECK-NEXT:  1      1     1.00           *            str	x1, [x10]
 # CHECK-NEXT:  1      3     1.00    *                   ldr	x2, [x10]
-# CHECK-NEXT:  1      1     1.00    *      *      U     nop
+# CHECK-NEXT:  1      1     1.00                        nop
 # CHECK-NEXT:  1      3     1.00    *                   ldr	x2, [x10]
 # CHECK-NEXT:  1      3     1.00    *                   ldr	x3, [x10]
 
@@ -64,27 +64,27 @@ ldr x3, [x10]
 # CHECK-NEXT:  -      -      -      -      -      -      -      -      -     1.00    -      -     ldr	x3, [x10]
 
 # CHECK:      Timeline view:
-# CHECK-NEXT:                     0123456789          0
-# CHECK-NEXT: Index     0123456789          0123456789
+# CHECK-NEXT:                     0123456789
+# CHECK-NEXT: Index     0123456789          01234
 
-# CHECK:      [0,0]     DE   .    .    .    .    .    .   str	x1, [x10]
-# CHECK-NEXT: [0,1]     .DE  .    .    .    .    .    .   str	x1, [x10]
-# CHECK-NEXT: [0,2]     . DeeE    .    .    .    .    .   ldr	x2, [x10]
-# CHECK-NEXT: [0,3]     .    DE   .    .    .    .    .   nop
-# CHECK-NEXT: [0,4]     .    .DeeE.    .    .    .    .   ldr	x2, [x10]
-# CHECK-NEXT: [0,5]     .    . DeeE    .    .    .    .   ldr	x3, [x10]
-# CHECK-NEXT: [1,0]     .    .    DE   .    .    .    .   str	x1, [x10]
-# CHECK-NEXT: [1,1]     .    .    .DE  .    .    .    .   str	x1, [x10]
-# CHECK-NEXT: [1,2]     .    .    . DeeE    .    .    .   ldr	x2, [x10]
-# CHECK-NEXT: [1,3]     .    .    .    DE   .    .    .   nop
-# CHECK-NEXT: [1,4]     .    .    .    .DeeE.    .    .   ldr	x2, [x10]
-# CHECK-NEXT: [1,5]     .    .    .    . DeeE    .    .   ldr	x3, [x10]
-# CHECK-NEXT: [2,0]     .    .    .    .    DE   .    .   str	x1, [x10]
-# CHECK-NEXT: [2,1]     .    .    .    .    .DE  .    .   str	x1, [x10]
-# CHECK-NEXT: [2,2]     .    .    .    .    . DeeE    .   ldr	x2, [x10]
-# CHECK-NEXT: [2,3]     .    .    .    .    .    DE   .   nop
-# CHECK-NEXT: [2,4]     .    .    .    .    .    .DeeE.   ldr	x2, [x10]
-# CHECK-NEXT: [2,5]     .    .    .    .    .    . DeeE   ldr	x3, [x10]
+# CHECK:      [0,0]     DE   .    .    .    .   .   str	x1, [x10]
+# CHECK-NEXT: [0,1]     .DE  .    .    .    .   .   str	x1, [x10]
+# CHECK-NEXT: [0,2]     . DeeE    .    .    .   .   ldr	x2, [x10]
+# CHECK-NEXT: [0,3]     .   DE    .    .    .   .   nop
+# CHECK-NEXT: [0,4]     .   DeeE  .    .    .   .   ldr	x2, [x10]
+# CHECK-NEXT: [0,5]     .    DeeE .    .    .   .   ldr	x3, [x10]
+# CHECK-NEXT: [1,0]     .    .  DE.    .    .   .   str	x1, [x10]
+# CHECK-NEXT: [1,1]     .    .   DE    .    .   .   str	x1, [x10]
+# CHECK-NEXT: [1,2]     .    .    DeeE .    .   .   ldr	x2, [x10]
+# CHECK-NEXT: [1,3]     .    .    . DE .    .   .   nop
+# CHECK-NEXT: [1,4]     .    .    . DeeE    .   .   ldr	x2, [x10]
+# CHECK-NEXT: [1,5]     .    .    .  DeeE   .   .   ldr	x3, [x10]
+# CHECK-NEXT: [2,0]     .    .    .    .DE  .   .   str	x1, [x10]
+# CHECK-NEXT: [2,1]     .    .    .    . DE .   .   str	x1, [x10]
+# CHECK-NEXT: [2,2]     .    .    .    .  DeeE  .   ldr	x2, [x10]
+# CHECK-NEXT: [2,3]     .    .    .    .    DE  .   nop
+# CHECK-NEXT: [2,4]     .    .    .    .    DeeE.   ldr	x2, [x10]
+# CHECK-NEXT: [2,5]     .    .    .    .    .DeeE   ldr	x3, [x10]
 
 # CHECK:      Average Wait times (based on the timeline view):
 # CHECK-NEXT: [0]: Executions
diff --git a/llvm/test/tools/llvm-mca/AArch64/Cortex/A55-load-store-noalias.s b/llvm/test/tools/llvm-mca/AArch64/Cortex/A55-load-store-noalias.s
index dc681f4ce9479..c9704c58b627f 100644
--- a/llvm/test/tools/llvm-mca/AArch64/Cortex/A55-load-store-noalias.s
+++ b/llvm/test/tools/llvm-mca/AArch64/Cortex/A55-load-store-noalias.s
@@ -30,7 +30,7 @@ ldr x3, [x10]
 # CHECK-NEXT:  1      1     1.00           *            str	x1, [x10]
 # CHECK-NEXT:  1      1     1.00           *            str	x1, [x10]
 # CHECK-NEXT:  1      3     1.00    *                   ldr	x2, [x10]
-# CHECK-NEXT:  1      1     1.00    *      *      U     nop
+# CHECK-NEXT:  1      1     1.00                        nop
 # CHECK-NEXT:  1      3     1.00    *                   ldr	x2, [x10]
 # CHECK-NEXT:  1      3     1.00    *                   ldr	x3, [x10]
 

@llvmbot
Copy link
Member

llvmbot commented Dec 6, 2025

@llvm/pr-subscribers-bolt

Author: Harald van Dijk (hvdijk)

Changes

Previously, nop was treated as just an alias for hint #0. The consequence of that was that all the general rules for hint instructions applied to nop too, in particular that during binary analysis, they were assumed to have unknown effects. This commit adds AArch64::NOP as a standalone instruction with no side effects.

The scheduling update in A55-load-store-alias.s is probably not entirely accurate, but should be more accurate than the previous result.


Full diff: https://github.com/llvm/llvm-project/pull/170968.diff

7 Files Affected:

  • (modified) bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp (+1-2)
  • (modified) llvm/lib/Target/AArch64/AArch64A53Fix835769.cpp (+2-2)
  • (modified) llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp (+5-5)
  • (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.cpp (+2-2)
  • (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.td (+5-1)
  • (modified) llvm/test/tools/llvm-mca/AArch64/Cortex/A55-load-store-alias.s (+24-24)
  • (modified) llvm/test/tools/llvm-mca/AArch64/Cortex/A55-load-store-noalias.s (+1-1)
diff --git a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
index af87d5c12b5ce..e8cecbe60bd69 100644
--- a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
+++ b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
@@ -1862,8 +1862,7 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
   }
 
   bool isNoop(const MCInst &Inst) const override {
-    return Inst.getOpcode() == AArch64::HINT &&
-           Inst.getOperand(0).getImm() == 0;
+    return Inst.getOpcode() == AArch64::NOP;
   }
 
   void createNoop(MCInst &Inst) const override {
diff --git a/llvm/lib/Target/AArch64/AArch64A53Fix835769.cpp b/llvm/lib/Target/AArch64/AArch64A53Fix835769.cpp
index a51f63073403b..7b8909b7a2f5d 100644
--- a/llvm/lib/Target/AArch64/AArch64A53Fix835769.cpp
+++ b/llvm/lib/Target/AArch64/AArch64A53Fix835769.cpp
@@ -178,11 +178,11 @@ static void insertNopBeforeInstruction(MachineBasicBlock &MBB, MachineInstr* MI,
     MachineInstr *I = getLastNonPseudo(MBB, TII);
     assert(I && "Expected instruction");
     DebugLoc DL = I->getDebugLoc();
-    BuildMI(I->getParent(), DL, TII->get(AArch64::HINT)).addImm(0);
+    BuildMI(I->getParent(), DL, TII->get(AArch64::NOP));
   }
   else {
     DebugLoc DL = MI->getDebugLoc();
-    BuildMI(MBB, MI, DL, TII->get(AArch64::HINT)).addImm(0);
+    BuildMI(MBB, MI, DL, TII->get(AArch64::NOP));
   }
 
   ++NumNopsAdded;
diff --git a/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp b/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
index 8267414e78955..574316160fbcb 100644
--- a/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
+++ b/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
@@ -472,7 +472,7 @@ void AArch64AsmPrinter::emitSled(const MachineInstr &MI, SledKind Kind) {
   EmitToStreamer(*OutStreamer, MCInstBuilder(AArch64::B).addImm(8));
 
   for (int8_t I = 0; I < NoopsInSledCount; I++)
-    EmitToStreamer(*OutStreamer, MCInstBuilder(AArch64::HINT).addImm(0));
+    EmitToStreamer(*OutStreamer, MCInstBuilder(AArch64::NOP));
 
   OutStreamer->emitLabel(Target);
   recordSled(CurSled, MI, Kind, 2);
@@ -1697,7 +1697,7 @@ void AArch64AsmPrinter::LowerSTACKMAP(MCStreamer &OutStreamer, StackMaps &SM,
 
   // Emit nops.
   for (unsigned i = 0; i < NumNOPBytes; i += 4)
-    EmitToStreamer(OutStreamer, MCInstBuilder(AArch64::HINT).addImm(0));
+    EmitToStreamer(OutStreamer, MCInstBuilder(AArch64::NOP));
 }
 
 // Lower a patchpoint of the form:
@@ -1731,7 +1731,7 @@ void AArch64AsmPrinter::LowerPATCHPOINT(MCStreamer &OutStreamer, StackMaps &SM,
   assert((NumBytes - EncodedBytes) % 4 == 0 &&
          "Invalid number of NOP bytes requested!");
   for (unsigned i = EncodedBytes; i < NumBytes; i += 4)
-    EmitToStreamer(OutStreamer, MCInstBuilder(AArch64::HINT).addImm(0));
+    EmitToStreamer(OutStreamer, MCInstBuilder(AArch64::NOP));
 }
 
 void AArch64AsmPrinter::LowerSTATEPOINT(MCStreamer &OutStreamer, StackMaps &SM,
@@ -1740,7 +1740,7 @@ void AArch64AsmPrinter::LowerSTATEPOINT(MCStreamer &OutStreamer, StackMaps &SM,
   if (unsigned PatchBytes = SOpers.getNumPatchBytes()) {
     assert(PatchBytes % 4 == 0 && "Invalid number of NOP bytes requested!");
     for (unsigned i = 0; i < PatchBytes; i += 4)
-      EmitToStreamer(OutStreamer, MCInstBuilder(AArch64::HINT).addImm(0));
+      EmitToStreamer(OutStreamer, MCInstBuilder(AArch64::NOP));
   } else {
     // Lower call target and choose correct opcode
     const MachineOperand &CallTarget = SOpers.getCallTarget();
@@ -2122,7 +2122,7 @@ bool AArch64AsmPrinter::emitDeactivationSymbolRelocation(Value *DS) {
 
   if (isa<GlobalAlias>(DS)) {
     // Just emit the nop directly.
-    EmitToStreamer(MCInstBuilder(AArch64::HINT).addImm(0));
+    EmitToStreamer(MCInstBuilder(AArch64::NOP));
     return true;
   }
   MCSymbol *Dot = OutContext.createTempSymbol();
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
index 904577b8233d5..d02b1a07b581e 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
@@ -6901,11 +6901,11 @@ bool llvm::rewriteAArch64FrameIndex(MachineInstr &MI, unsigned FrameRegIdx,
 void AArch64InstrInfo::insertNoop(MachineBasicBlock &MBB,
                                   MachineBasicBlock::iterator MI) const {
   DebugLoc DL;
-  BuildMI(MBB, MI, DL, get(AArch64::HINT)).addImm(0);
+  BuildMI(MBB, MI, DL, get(AArch64::NOP));
 }
 
 MCInst AArch64InstrInfo::getNop() const {
-  return MCInstBuilder(AArch64::HINT).addImm(0);
+  return MCInstBuilder(AArch64::NOP);
 }
 
 // AArch64 supports MachineCombiner.
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.td b/llvm/lib/Target/AArch64/AArch64InstrInfo.td
index 64017d7cafca3..0cbe867497625 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.td
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.td
@@ -1544,7 +1544,6 @@ let hasSideEffects = 1, isCodeGenOnly = 1, isTerminator = 1, isBarrier = 1 in {
 //===----------------------------------------------------------------------===//
 
 def HINT : HintI<"hint">;
-def : InstAlias<"nop",  (HINT 0b000)>;
 def : InstAlias<"yield",(HINT 0b001)>;
 def : InstAlias<"wfe",  (HINT 0b010)>;
 def : InstAlias<"wfi",  (HINT 0b011)>;
@@ -1554,6 +1553,11 @@ def : InstAlias<"dgh",  (HINT 0b110)>;
 def : InstAlias<"esb",  (HINT 0b10000)>, Requires<[HasRAS]>;
 def : InstAlias<"csdb", (HINT 20)>;
 
+let CRm = 0b0000, hasSideEffects = 0 in
+def NOP : SystemNoOperands<0b000, "hint\t#0">;
+
+def : InstAlias<"nop", (NOP)>;
+
 let Predicates = [HasPCDPHINT] in {
     def STSHH: STSHHI;
 }
diff --git a/llvm/test/tools/llvm-mca/AArch64/Cortex/A55-load-store-alias.s b/llvm/test/tools/llvm-mca/AArch64/Cortex/A55-load-store-alias.s
index 3906975b41f6b..b0cfa5ff90cf1 100644
--- a/llvm/test/tools/llvm-mca/AArch64/Cortex/A55-load-store-alias.s
+++ b/llvm/test/tools/llvm-mca/AArch64/Cortex/A55-load-store-alias.s
@@ -12,12 +12,12 @@ ldr x3, [x10]
 
 # CHECK:      Iterations:        3
 # CHECK-NEXT: Instructions:      18
-# CHECK-NEXT: Total Cycles:      31
+# CHECK-NEXT: Total Cycles:      25
 # CHECK-NEXT: Total uOps:        18
 
 # CHECK:      Dispatch Width:    2
-# CHECK-NEXT: uOps Per Cycle:    0.58
-# CHECK-NEXT: IPC:               0.58
+# CHECK-NEXT: uOps Per Cycle:    0.72
+# CHECK-NEXT: IPC:               0.72
 # CHECK-NEXT: Block RThroughput: 3.0
 
 # CHECK:      Instruction Info:
@@ -32,7 +32,7 @@ ldr x3, [x10]
 # CHECK-NEXT:  1      1     1.00           *            str	x1, [x10]
 # CHECK-NEXT:  1      1     1.00           *            str	x1, [x10]
 # CHECK-NEXT:  1      3     1.00    *                   ldr	x2, [x10]
-# CHECK-NEXT:  1      1     1.00    *      *      U     nop
+# CHECK-NEXT:  1      1     1.00                        nop
 # CHECK-NEXT:  1      3     1.00    *                   ldr	x2, [x10]
 # CHECK-NEXT:  1      3     1.00    *                   ldr	x3, [x10]
 
@@ -64,27 +64,27 @@ ldr x3, [x10]
 # CHECK-NEXT:  -      -      -      -      -      -      -      -      -     1.00    -      -     ldr	x3, [x10]
 
 # CHECK:      Timeline view:
-# CHECK-NEXT:                     0123456789          0
-# CHECK-NEXT: Index     0123456789          0123456789
+# CHECK-NEXT:                     0123456789
+# CHECK-NEXT: Index     0123456789          01234
 
-# CHECK:      [0,0]     DE   .    .    .    .    .    .   str	x1, [x10]
-# CHECK-NEXT: [0,1]     .DE  .    .    .    .    .    .   str	x1, [x10]
-# CHECK-NEXT: [0,2]     . DeeE    .    .    .    .    .   ldr	x2, [x10]
-# CHECK-NEXT: [0,3]     .    DE   .    .    .    .    .   nop
-# CHECK-NEXT: [0,4]     .    .DeeE.    .    .    .    .   ldr	x2, [x10]
-# CHECK-NEXT: [0,5]     .    . DeeE    .    .    .    .   ldr	x3, [x10]
-# CHECK-NEXT: [1,0]     .    .    DE   .    .    .    .   str	x1, [x10]
-# CHECK-NEXT: [1,1]     .    .    .DE  .    .    .    .   str	x1, [x10]
-# CHECK-NEXT: [1,2]     .    .    . DeeE    .    .    .   ldr	x2, [x10]
-# CHECK-NEXT: [1,3]     .    .    .    DE   .    .    .   nop
-# CHECK-NEXT: [1,4]     .    .    .    .DeeE.    .    .   ldr	x2, [x10]
-# CHECK-NEXT: [1,5]     .    .    .    . DeeE    .    .   ldr	x3, [x10]
-# CHECK-NEXT: [2,0]     .    .    .    .    DE   .    .   str	x1, [x10]
-# CHECK-NEXT: [2,1]     .    .    .    .    .DE  .    .   str	x1, [x10]
-# CHECK-NEXT: [2,2]     .    .    .    .    . DeeE    .   ldr	x2, [x10]
-# CHECK-NEXT: [2,3]     .    .    .    .    .    DE   .   nop
-# CHECK-NEXT: [2,4]     .    .    .    .    .    .DeeE.   ldr	x2, [x10]
-# CHECK-NEXT: [2,5]     .    .    .    .    .    . DeeE   ldr	x3, [x10]
+# CHECK:      [0,0]     DE   .    .    .    .   .   str	x1, [x10]
+# CHECK-NEXT: [0,1]     .DE  .    .    .    .   .   str	x1, [x10]
+# CHECK-NEXT: [0,2]     . DeeE    .    .    .   .   ldr	x2, [x10]
+# CHECK-NEXT: [0,3]     .   DE    .    .    .   .   nop
+# CHECK-NEXT: [0,4]     .   DeeE  .    .    .   .   ldr	x2, [x10]
+# CHECK-NEXT: [0,5]     .    DeeE .    .    .   .   ldr	x3, [x10]
+# CHECK-NEXT: [1,0]     .    .  DE.    .    .   .   str	x1, [x10]
+# CHECK-NEXT: [1,1]     .    .   DE    .    .   .   str	x1, [x10]
+# CHECK-NEXT: [1,2]     .    .    DeeE .    .   .   ldr	x2, [x10]
+# CHECK-NEXT: [1,3]     .    .    . DE .    .   .   nop
+# CHECK-NEXT: [1,4]     .    .    . DeeE    .   .   ldr	x2, [x10]
+# CHECK-NEXT: [1,5]     .    .    .  DeeE   .   .   ldr	x3, [x10]
+# CHECK-NEXT: [2,0]     .    .    .    .DE  .   .   str	x1, [x10]
+# CHECK-NEXT: [2,1]     .    .    .    . DE .   .   str	x1, [x10]
+# CHECK-NEXT: [2,2]     .    .    .    .  DeeE  .   ldr	x2, [x10]
+# CHECK-NEXT: [2,3]     .    .    .    .    DE  .   nop
+# CHECK-NEXT: [2,4]     .    .    .    .    DeeE.   ldr	x2, [x10]
+# CHECK-NEXT: [2,5]     .    .    .    .    .DeeE   ldr	x3, [x10]
 
 # CHECK:      Average Wait times (based on the timeline view):
 # CHECK-NEXT: [0]: Executions
diff --git a/llvm/test/tools/llvm-mca/AArch64/Cortex/A55-load-store-noalias.s b/llvm/test/tools/llvm-mca/AArch64/Cortex/A55-load-store-noalias.s
index dc681f4ce9479..c9704c58b627f 100644
--- a/llvm/test/tools/llvm-mca/AArch64/Cortex/A55-load-store-noalias.s
+++ b/llvm/test/tools/llvm-mca/AArch64/Cortex/A55-load-store-noalias.s
@@ -30,7 +30,7 @@ ldr x3, [x10]
 # CHECK-NEXT:  1      1     1.00           *            str	x1, [x10]
 # CHECK-NEXT:  1      1     1.00           *            str	x1, [x10]
 # CHECK-NEXT:  1      3     1.00    *                   ldr	x2, [x10]
-# CHECK-NEXT:  1      1     1.00    *      *      U     nop
+# CHECK-NEXT:  1      1     1.00                        nop
 # CHECK-NEXT:  1      3     1.00    *                   ldr	x2, [x10]
 # CHECK-NEXT:  1      3     1.00    *                   ldr	x3, [x10]
 

@hvdijk
Copy link
Contributor Author

hvdijk commented Dec 6, 2025

Context: This is part of a larger effort to avoid instructions being unnecessarily marked as having unmodelled side effects. For LLVM, it is largely harmless, it just results in reduced optimisations. For BOLT though, it complicates binary analysis.

@github-actions
Copy link

github-actions bot commented Dec 6, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

Previously, nop was treated as just an alias for hint #0. The
consequence of that was that all the general rules for hint instructions
applied to nop too, in particular that during binary analysis, they were
assumed to have unknown effects. This commit adds AArch64::NOP as a
standalone instruction with no side effects.

The scheduling update in A55-load-store-alias.s is probably not entirely
accurate, but should be more accurate than the previous result.
Copy link
Collaborator

@davemgreen davemgreen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi - This sounds fine, but where do you expect them to be used?

@hvdijk
Copy link
Contributor Author

hvdijk commented Dec 8, 2025

Hi - This sounds fine, but where do you expect them to be used?

There's a fair number of places that should bail out on instructions with unmodelled side effects, but currently don't. For example, we should have something like

--- a/bolt/lib/Core/MCPlusBuilder.cpp
+++ b/bolt/lib/Core/MCPlusBuilder.cpp
@@ -450,6 +450,10 @@ void MCPlusBuilder::getClobberedRegs(const MCInst &Inst,
     return;
 
   const MCInstrDesc &InstInfo = Info->get(Inst.getOpcode());
+  if (InstInfo.hasUnmodeledSideEffects()) {
+    Regs.set();
+    return;
+  }
 
   for (MCPhysReg ImplicitDef : InstInfo.implicit_defs())
     Regs |= getAliases(ImplicitDef, /*OnlySmaller=*/false);

This is what would benefit from this PR: this PR is enough to make InstInfo.hasUnmodeledSideEffects() return false for AArch64's nop.

There are a number of other instructions with the same problem, but I am splitting up the PRs for easier review. Once they're all in, at least for the instructions where we know it is a problem, I can submit the changes to BOLT to check hasUnmodeledSideEffects().

Copy link
Collaborator

@davemgreen davemgreen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK - for bolt. Sounds good.

We usually don't generate these for codegen (usually we use alignment directives instead). If we were adding them, and the instruction had no side-effects and no inputs/outputs, what is to stop it from being removed? It looks like it is only the late positions in the pipelines of the current uses that is preventing it?

@hvdijk
Copy link
Contributor Author

hvdijk commented Dec 8, 2025

We usually don't generate these for codegen (usually we use alignment directives instead).

Usually, but there are exceptions, such as AArch64A53Fix835769.cpp (updated by this PR) where nop gets inserted between instruction sequences that happen to trigger a CPU bug, when the workaround is enabled.

If we were adding them, and the instruction had no side-effects and no inputs/outputs, what is to stop it from being removed? It looks like it is only the late positions in the pipelines of the current uses that is preventing it?

Yes, and I think that is correct. If you're still early enough in the pipeline that you're running passes like DeadMachineInstructionElim that remove instructions that do nothing, you'd want this instruction that does nothing to be removed. The X86 nop instructions are already marked hasSideEffects = 0 too, as is the RISC-V nop which is defined as an alias for a form of addi, which also is marked hasSideEffects = 0.

Copy link
Contributor

@maksfb maksfb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Look good from BOLT perspective. Thanks!

Copy link
Collaborator

@davemgreen davemgreen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, It was 835769 that I was thinking of, where they are required for correctness.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants