
Conversation

@niooss-ledger (Contributor)

Hello,

eBPF ISA v3 introduced atomic instructions: https://www.kernel.org/doc/html/v6.0/bpf/instruction-set.html#atomic-operations. These instructions are encoded using BPF_ATOMIC | BPF_W | BPF_STX and BPF_ATOMIC | BPF_DW | BPF_STX for 32-bit and 64-bit operations, with:

BPF_ATOMIC = 0xc0
BPF_DW = 0x18
BPF_W = 0
BPF_STX = 0x03

While Ghidra's semantic section is constructed correctly (atomic add uses an addition; atomic or uses an or; ...), the disassembly always displays STXXADDW and STXXADDDW. These mnemonics come from the deprecated name BPF_XADD = BPF_ATOMIC | BPF_ADD = 0xc0.

This Pull Request replaces the confusing mnemonics with the ones used by binutils and documented in
https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gas/doc/c-bpf.texi;h=003cb92a457985038a9abc1ffbf347f636eb0586;hb=2bc7af1ff7732451b6a7b09462a815c3284f9613#l745

While testing this, I found a bug in the semantics of the atomic compare-and-exchange instruction, described in the next section.

Testing

I wrote a test C program, atomic.c, using GCC's atomic built-ins. This program contains many small functions, such as:

void atomic_exchange64(volatile unsigned long *ptr, unsigned long *val, unsigned long *ret) {
    __atomic_exchange(ptr, val, ret, __ATOMIC_RELAXED);
}
unsigned long atomic_fetch_add64(volatile unsigned long *addr, unsigned long value) {
    return __atomic_fetch_add(addr, value, __ATOMIC_RELAXED);
}

I compiled this program in a Debian 13 container with different options:

sudo apt-get install clang gcc-bpf llvm
clang -Wall -Wextra -O2 -target bpf -mcpu=v1 -c atomic.c -o compiled/atomic.deb13-clangO2-v1.ebpf
clang -Wall -Wextra -O2 -target bpf -mcpu=v4 -c atomic.c -o compiled/atomic.deb13-clangO2-v4.ebpf
bpf-gcc -Wall -Wextra -O2 -mcpu=v1 -c atomic.c -o compiled/atomic.deb13-gccO2-v1.ebpf
bpf-gcc -Wall -Wextra -O2 -mcpu=v4 -c atomic.c -o compiled/atomic.deb13-gccO2-v4.ebpf

I then disassembled the programs with LLVM's llvm-objdump and binutils' bpf-objdump:

llvm-objdump -rd compiled/atomic.deb13-clangO2-v1.ebpf > atomic.deb13-clangO2-v1.txt
llvm-objdump -rd compiled/atomic.deb13-clangO2-v4.ebpf > atomic.deb13-clangO2-v4.txt
bpf-objdump -rd compiled/atomic.deb13-gccO2-v1.ebpf > atomic.deb13-gccO2-v1.txt
bpf-objdump -rd compiled/atomic.deb13-gccO2-v4.ebpf > atomic.deb13-gccO2-v4.txt

While llvm-objdump produces pseudo-code instructions, such as:

0000000000000000 <atomic_exchange64>:
       0:	79 22 00 00 00 00 00 00	r2 = *(u64 *)(r2 + 0x0)
       1:	db 21 00 00 e1 00 00 00	r2 = xchg_64(r1 + 0x0, r2)
       2:	7b 23 00 00 00 00 00 00	*(u64 *)(r3 + 0x0) = r2
       3:	95 00 00 00 00 00 00 00	exit

... bpf-objdump produces conventional assembly mnemonics:

00000000000000b0 <atomic_exchange64>:
  b0:    79 20 00 00 00 00 00 00 	ldxdw %r0,[%r2+0]
  b8:    db 01 00 00 e1 00 00 00 	axchg [%r1+0],%r0
  c0:    7b 03 00 00 00 00 00 00 	stxdw [%r3+0],%r0
  c8:    95 00 00 00 00 00 00 00 	exit

I analyzed the compiled eBPF programs with Ghidra and confirmed that, with this Pull Request, the disassembly listing contains similar atomic mnemonics: AXCHG, AFADD, AFOR...

While testing atomic compare-and-exchange, Ghidra showed unexpected decompiled code. This function:

bool atomic_compare_exchange_n64(volatile unsigned long *ptr, unsigned long *expected, unsigned long desired) {
    return __atomic_compare_exchange_n(ptr, expected, desired, 1, __ATOMIC_RELAXED, __ATOMIC_RELAXED);
}

... got decompiled to:

bool atomic_compare_exchange_n64(longlong *param_1,longlong *param_2,longlong param_3)
{
  longlong lVar1;
  longlong lVar2;
  
  lVar1 = *param_2;
  lVar2 = lVar1;
  if (lVar1 != *param_1) {
    lVar2 = *param_1;
  }
  *param_1 = param_3;
  if (lVar2 != lVar1) {
    *param_2 = lVar2;
  }
  return lVar2 == lVar1;
}

The fact that *param_1 = param_3; is always executed is a bug. When the content at the pointer does not match the expected value, __atomic_compare_exchange_n does not modify it (it only writes the actual content back to expected, which is what the second if statement handles). This Pull Request fixes this bug by adding goto inst_next; where appropriate.

With this, the decompilation becomes:

bool atomic_compare_exchange_n64(longlong *param_1,longlong *param_2,longlong param_3)
{
  longlong lVar1;
  longlong lVar2;
  
  lVar1 = *param_2;
  lVar2 = *param_1;
  if (lVar1 == lVar2) {
    *param_1 = param_3;
    lVar2 = lVar1;
  }
  if (lVar2 != lVar1) {
    *param_2 = lVar2;
  }
  return lVar2 == lVar1;
}

There is a stray statement lVar2 = lVar1;, but at least the code is now semantically correct.

Attachments

atomic.c

atomic_example.tar.gz

@GhidorahRex GhidorahRex self-assigned this Dec 2, 2025
@GhidorahRex GhidorahRex added Type: Bug Something isn't working Status: Triage Information is being gathered Feature: Processor/eBPF labels Dec 2, 2025
@niooss-ledger niooss-ledger force-pushed the ebpf-fix-disassembly-atomic-instructions branch from 9972ffc to 4b2aa74 Compare December 2, 2025 21:04
@niooss-ledger (Contributor, Author)

niooss-ledger commented Dec 2, 2025

I found a second bug in the semantics of the 32-bit CMPXCHG instruction: R0 was only updated when R0:4 != tmp, whereas it should always be updated (with R0 = zext(tmp)). This is what the eBPF documentation in the Linux kernel states and what the implementation in linux:kernel/bpf/core.c actually does.

So I modified the 2nd commit and force-pushed. With it, the decompilation looks fine:

bool atomic_compare_exchange64(longlong *param_1,longlong *param_2,longlong *param_3)
{
  longlong lVar1;
  longlong lVar2;
  
  lVar2 = *param_2;
  lVar1 = *param_1;
  if (lVar2 == lVar1) {
    *param_1 = *param_3;
  }
  if (lVar1 != lVar2) {
    *param_2 = lVar1;
  }
  return lVar1 == lVar2;
}

And the 32-bit version also decompiles correctly. Source code of `atomic_compare_exchange_n32`:
bool atomic_compare_exchange_n32(
    volatile unsigned int *ptr, unsigned int *expected, unsigned int desired
) {
    return __atomic_compare_exchange_n(ptr, expected, desired, 1, __ATOMIC_RELAXED, __ATOMIC_RELAXED);
}

@GhidorahRex (Collaborator)

I think there was some confusion there, since if something is equal, why does it need to be assigned? At least in the 32-bit case it makes sense to still need the fix, since it zero-extends the 32-bit value. For the 64-bit value I think it's unnecessary, but that's exactly what the manual specifies: "In either case, the value that was at dst_reg + off before the operation is zero-extended and loaded back to R0." So we should handle it that way.

@niooss-ledger (Contributor, Author)

@GhidorahRex I agree it is unnecessary for the 64-bit compare-and-exchange instruction. I am nonetheless observing a weird behavior in Ghidra's decompiler: when the p-code statements associated with compare-and-exchange always assign to R0, the decompiler does not show the stray assignment; when the p-code conditionally assigns to R0, the decompiler shows it even though it is not needed. Could this be another bug in the decompiler?

To reproduce it, I used a simpler C program atomic-cas.c:

#include <stdbool.h>
bool atomic_compare_exchange_n64(
    volatile unsigned long *ptr, unsigned long *expected, unsigned long desired
) {
    return __atomic_compare_exchange_n(ptr, expected, desired, 1, __ATOMIC_RELAXED, __ATOMIC_RELAXED);
}

I compiled it with clang version 19.1.7 (3+b1) on Debian 13:

clang -Wall -Wextra -O2 -target bpf -mcpu=v4 -c atomic-cas.c 

Its disassembly (using package binutils-bpf) is:

$ bpf-objdump -rd atomic-cas.o 
atomic-cas.o:     file format elf64-bpfle
Disassembly of section .text:

0000000000000000 <atomic_compare_exchange_n64>:
   0:	79 24 00 00 00 00 00 00 	ldxdw %r4,[%r2+0]
   8:	bf 40 00 00 00 00 00 00 	mov %r0,%r4
  10:	db 31 00 00 f1 00 00 00 	acmp [%r1+0],%r3
  18:	b4 01 00 00 01 00 00 00 	mov32 %r1,1
  20:	1d 40 01 00 00 00 00 00 	jeq %r0,%r4,1
  28:	b4 01 00 00 00 00 00 00 	mov32 %r1,0
  30:	bc 13 00 00 00 00 00 00 	mov32 %r3,%r1
  38:	54 03 00 00 01 00 00 00 	and32 %r3,1
  40:	56 03 01 00 00 00 00 00 	jne32 %r3,0,1
  48:	7b 02 00 00 00 00 00 00 	stxdw [%r2+0],%r0
  50:	bc 10 00 00 00 00 00 00 	mov32 %r0,%r1
  58:	95 00 00 00 00 00 00 00 	exit

I am using Ghidra's branch master as of today (commit a8a07b148e464f93606c3177c3d65437034196cf).

Scenario 1: the p-code always assigns register R0

In this case (which is what this Pull Request currently implements), Ghidra/Processors/eBPF/data/languages/eBPF.sinc contains:

:ACMP [dst + off], src  is imm=0xf1 & off & src & dst & op_ld_st_mode=0x6 & op_ld_st_size=0x3 & op_insn_class=0x3 {
    local tmp:8 = *:8 (dst + off);
    if (R0 != tmp) goto <notEqual>;
    *:8 (dst + off) = src; 
<notEqual>
    R0 = tmp;
}

Loading atomic-cas.o shows a function that seems semantically correct, without any stray assignment statements in the decompiled code:

bool atomic_compare_exchange_n64(longlong *param_1,longlong *param_2,longlong param_3)
{
  longlong lVar1;
  longlong lVar2;
  
  lVar2 = *param_2;
  lVar1 = *param_1;
  if (lVar2 == lVar1) {
    *param_1 = param_3;
  }
  if (lVar1 != lVar2) {
    *param_2 = lVar1;
  }
  return lVar1 == lVar2;
}

Scenario 2: the p-code conditionally assigns register R0

A goto inst_next; is inserted to make the assignment conditional:

:ACMP [dst + off], src  is imm=0xf1 & off & src & dst & op_ld_st_mode=0x6 & op_ld_st_size=0x3 & op_insn_class=0x3 {
    local tmp:8 = *:8 (dst + off);
    if (R0 != tmp) goto <notEqual>;
    *:8 (dst + off) = src;
    goto inst_next;
<notEqual>
    R0 = tmp;
}

Rebuilding Ghidra and loading atomic-cas.o again shows an unexpected statement lVar1 = lVar2; in the first if block:

bool atomic_compare_exchange_n64(longlong *param_1,longlong *param_2,longlong param_3)
{
  longlong lVar1;
  longlong lVar2;
  
  lVar2 = *param_2;
  lVar1 = *param_1;
  if (lVar2 == lVar1) {
    *param_1 = param_3;
    lVar1 = lVar2;
  }
  if (lVar1 != lVar2) {
    *param_2 = lVar1;
  }
  return lVar1 == lVar2;
}

Therefore I am wondering: why does the decompiler display lVar1 = lVar2; in the second case? It seems like something failed to optimize the generated code:

  if (lVar2 == lVar1) {
    *param_1 = param_3;
    lVar1 = lVar2;
  }

@GhidorahRex (Collaborator)

That definitely looks like a decompiler bug. I was able to reproduce it in this narrow case, but I'm concerned it could be a bigger issue. Could you submit that as a separate issue and include all the information from your comment?

The Linux kernel's documentation states in
https://www.kernel.org/doc/html/v6.0/bpf/instruction-set.html#atomic-operations

> The BPF_CMPXCHG operation atomically compares the value addressed by
> dst_reg + off with R0. If they match, the value addressed by
> dst_reg + off is replaced with src_reg. In either case, the value that
> was at dst_reg + off before the operation is zero-extended and loaded
> back to R0.

If the values don't match, *(dst_reg + off) is not supposed to be
modified.

Moreover, register R0 is always modified and the 32-bit instruction
truncates its value (with a zero-extension). This is also clear in the
implementation of BPF_CMPXCHG in
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/bpf/core.c?h=v6.18#n2186

    case BPF_CMPXCHG:
            if (BPF_SIZE(insn->code) == BPF_W)
                    BPF_R0 = (u32) atomic_cmpxchg(
                            (atomic_t *)(unsigned long) (DST + insn->off),
                            (u32) BPF_R0, (u32) SRC);
            else if (BPF_SIZE(insn->code) == BPF_DW)
                    BPF_R0 = (u64) atomic64_cmpxchg(
                            (atomic64_t *)(unsigned long) (DST + insn->off),
                            (u64) BPF_R0, (u64) SRC);

Fix the semantics of the compare-and-exchange instruction accordingly.
@niooss-ledger niooss-ledger force-pushed the ebpf-fix-disassembly-atomic-instructions branch from 4b2aa74 to f4503ab Compare December 3, 2025 14:22
@niooss-ledger (Contributor, Author)

@GhidorahRex Thanks for your quick replies! I submitted #8725. By the way, I also saw that this bug occurs in a slightly different form on x86_64.

Regarding this PR, I changed the fix to make the 64-bit compare-and-exchange instruction not update R0 when it is not needed.
