
Conversation

@niooss-ledger (Contributor)

Hello,

eBPF ISA v3 introduced atomic instructions: https://www.kernel.org/doc/html/v6.0/bpf/instruction-set.html#atomic-operations. These instructions are encoded using BPF_ATOMIC | BPF_W | BPF_STX and BPF_ATOMIC | BPF_DW | BPF_STX for 32-bit and 64-bit operations, with:

BPF_ATOMIC = 0xc0
BPF_DW = 0x18
BPF_W = 0
BPF_STX = 0x03

While Ghidra's semantic section is constructed correctly (atomic add uses an addition; atomic or uses an or; ...), the disassembly always displays STXXADDW and STXXADDDW. These mnemonics come from the deprecated name BPF_XADD = BPF_ATOMIC | BPF_ADD = 0xc0.

This Pull Request replaces the confusing mnemonics with the ones used by binutils and documented in
https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gas/doc/c-bpf.texi;h=003cb92a457985038a9abc1ffbf347f636eb0586;hb=2bc7af1ff7732451b6a7b09462a815c3284f9613#l745

While testing this, I found a bug in the semantics of the atomic compare-and-exchange instruction, described in the next section.

Testing

I wrote a test C program, atomic.c, using GCC's atomic built-ins. This program contains many small functions, such as:

void atomic_exchange64(volatile unsigned long *ptr, unsigned long *val, unsigned long *ret) {
    __atomic_exchange(ptr, val, ret, __ATOMIC_RELAXED);
}
unsigned long atomic_fetch_add64(volatile unsigned long *addr, unsigned long value) {
    return __atomic_fetch_add(addr, value, __ATOMIC_RELAXED);
}

I compiled this program in a Debian 13 container with different options:

sudo apt-get install clang gcc-bpf llvm
clang -Wall -Wextra -O2 -target bpf -mcpu=v1 -c atomic.c -o compiled/atomic.deb13-clangO2-v1.ebpf
clang -Wall -Wextra -O2 -target bpf -mcpu=v4 -c atomic.c -o compiled/atomic.deb13-clangO2-v4.ebpf
bpf-gcc -Wall -Wextra -O2 -mcpu=v1 -c atomic.c -o compiled/atomic.deb13-gccO2-v1.ebpf
bpf-gcc -Wall -Wextra -O2 -mcpu=v4 -c atomic.c -o compiled/atomic.deb13-gccO2-v4.ebpf

I then disassembled the programs with LLVM's llvm-objdump and binutils' bpf-objdump:

llvm-objdump -rd compiled/atomic.deb13-clangO2-v1.ebpf > atomic.deb13-clangO2-v1.txt
llvm-objdump -rd compiled/atomic.deb13-clangO2-v4.ebpf > atomic.deb13-clangO2-v4.txt
bpf-objdump -rd compiled/atomic.deb13-gccO2-v1.ebpf > atomic.deb13-gccO2-v1.txt
bpf-objdump -rd compiled/atomic.deb13-gccO2-v4.ebpf > atomic.deb13-gccO2-v4.txt

While llvm-objdump produces pseudo-code instructions, such as:

0000000000000000 <atomic_exchange64>:
       0:	79 22 00 00 00 00 00 00	r2 = *(u64 *)(r2 + 0x0)
       1:	db 21 00 00 e1 00 00 00	r2 = xchg_64(r1 + 0x0, r2)
       2:	7b 23 00 00 00 00 00 00	*(u64 *)(r3 + 0x0) = r2
       3:	95 00 00 00 00 00 00 00	exit

... bpf-objdump produces conventional assembly mnemonics:

00000000000000b0 <atomic_exchange64>:
  b0:    79 20 00 00 00 00 00 00 	ldxdw %r0,[%r2+0]
  b8:    db 01 00 00 e1 00 00 00 	axchg [%r1+0],%r0
  c0:    7b 03 00 00 00 00 00 00 	stxdw [%r3+0],%r0
  c8:    95 00 00 00 00 00 00 00 	exit

I analyzed the compiled eBPF programs with Ghidra and confirmed that, with this Pull Request, the disassembly listing contains similar atomic mnemonics: AXCHG, AFADD, AFOR...

While testing atomic compare-and-exchange, Ghidra showed unexpected decompiled code. This function:

bool atomic_compare_exchange_n64(volatile unsigned long *ptr, unsigned long *expected, unsigned long desired) {
    return __atomic_compare_exchange_n(ptr, expected, desired, 1, __ATOMIC_RELAXED, __ATOMIC_RELAXED);
}

... got decompiled to:

bool atomic_compare_exchange_n64(longlong *param_1,longlong *param_2,longlong param_3)
{
  longlong lVar1;
  longlong lVar2;
  
  lVar1 = *param_2;
  lVar2 = lVar1;
  if (lVar1 != *param_1) {
    lVar2 = *param_1;
  }
  *param_1 = param_3;
  if (lVar2 != lVar1) {
    *param_2 = lVar2;
  }
  return lVar2 == lVar1;
}

The fact that *param_1 = param_3; is always executed is a bug. When the content at the pointer does not match the expected value, __atomic_compare_exchange_n does not modify it (it only writes the actual content back to expected, which is what the second if statement handles). This Pull Request fixes this bug by adding goto inst_next; where appropriate.

With this, the decompilation becomes:

bool atomic_compare_exchange_n64(longlong *param_1,longlong *param_2,longlong param_3)
{
  longlong lVar1;
  longlong lVar2;
  
  lVar1 = *param_2;
  lVar2 = *param_1;
  if (lVar1 == lVar2) {
    *param_1 = param_3;
    lVar2 = lVar1;
  }
  if (lVar2 != lVar1) {
    *param_2 = lVar2;
  }
  return lVar2 == lVar1;
}

There is a stray statement lVar2 = lVar1;, but at least the code is now semantically correct.

Attachments

atomic.c

atomic_example.tar.gz

@GhidorahRex GhidorahRex self-assigned this Dec 2, 2025
@GhidorahRex GhidorahRex added Type: Bug Something isn't working Status: Triage Information is being gathered Feature: Processor/eBPF labels Dec 2, 2025
@niooss-ledger niooss-ledger force-pushed the ebpf-fix-disassembly-atomic-instructions branch from 9972ffc to 4b2aa74 Compare December 2, 2025 21:04
@niooss-ledger (Contributor, Author)

niooss-ledger commented Dec 2, 2025

I found a second bug in the semantics of the 32-bit CMPXCHG instruction: R0 was only updated when R0:4 != tmp, whereas it should always be updated (with R0 = zext(tmp)). This is what the eBPF documentation in the Linux kernel states and what the implementation in linux:kernel/bpf/core.c actually does.

So I modified the 2nd commit and force-pushed. With it, the decompilation looks fine:

bool atomic_compare_exchange64(longlong *param_1,longlong *param_2,longlong *param_3)
{
  longlong lVar1;
  longlong lVar2;
  
  lVar2 = *param_2;
  lVar1 = *param_1;
  if (lVar2 == lVar1) {
    *param_1 = *param_3;
  }
  if (lVar1 != lVar2) {
    *param_2 = lVar1;
  }
  return lVar1 == lVar2;
}

And the 32-bit version also decompiles correctly. Source code of `atomic_compare_exchange_n32`:
bool atomic_compare_exchange_n32(
    volatile unsigned int *ptr, unsigned int *expected, unsigned int desired
) {
    return __atomic_compare_exchange_n(ptr, expected, desired, 1, __ATOMIC_RELAXED, __ATOMIC_RELAXED);
}

@GhidorahRex (Collaborator)

I think there was some confusion there, since if something is equal, why does it need to be assigned? At least in the 32-bit case it makes sense to still need the fix, since it zero-extends the 32-bit value. For the 64-bit value I think it's unnecessary, but that's exactly what the manual specifies: "In either case, the value that was at dst_reg + off before the operation is zero-extended and loaded back to R0." So we should handle it that way.

@niooss-ledger (Contributor, Author)

@GhidorahRex I agree it is unnecessary for the 64-bit compare-and-exchange instruction. I am nonetheless observing a weird behavior in Ghidra's decompiler: when the p-code statements associated with compare-and-exchange always assign to R0, the decompiler does not show the stray assignment; when the p-code conditionally assigns to R0, the decompiler shows it even though it is not needed. Could this be another bug in the decompiler?

To reproduce it, I used a simpler C program atomic-cas.c:

#include <stdbool.h>
bool atomic_compare_exchange_n64(
    volatile unsigned long *ptr, unsigned long *expected, unsigned long desired
) {
    return __atomic_compare_exchange_n(ptr, expected, desired, 1, __ATOMIC_RELAXED, __ATOMIC_RELAXED);
}

I compiled it with clang version 19.1.7 (3+b1) on Debian 13:

clang -Wall -Wextra -O2 -target bpf -mcpu=v4 -c atomic-cas.c 

Its disassembly (using package binutils-bpf) is:

$ bpf-objdump -rd atomic-cas.o 
atomic-cas.o:     file format elf64-bpfle
Disassembly of section .text:

0000000000000000 <atomic_compare_exchange_n64>:
   0:	79 24 00 00 00 00 00 00 	ldxdw %r4,[%r2+0]
   8:	bf 40 00 00 00 00 00 00 	mov %r0,%r4
  10:	db 31 00 00 f1 00 00 00 	acmp [%r1+0],%r3
  18:	b4 01 00 00 01 00 00 00 	mov32 %r1,1
  20:	1d 40 01 00 00 00 00 00 	jeq %r0,%r4,1
  28:	b4 01 00 00 00 00 00 00 	mov32 %r1,0
  30:	bc 13 00 00 00 00 00 00 	mov32 %r3,%r1
  38:	54 03 00 00 01 00 00 00 	and32 %r3,1
  40:	56 03 01 00 00 00 00 00 	jne32 %r3,0,1
  48:	7b 02 00 00 00 00 00 00 	stxdw [%r2+0],%r0
  50:	bc 10 00 00 00 00 00 00 	mov32 %r0,%r1
  58:	95 00 00 00 00 00 00 00 	exit

I am using Ghidra's branch master as of today (commit a8a07b148e464f93606c3177c3d65437034196cf).

Scenario 1: the p-code always assigns register R0

In this case (which is what this Pull Request currently implements), Ghidra/Processors/eBPF/data/languages/eBPF.sinc contains:

:ACMP [dst + off], src  is imm=0xf1 & off & src & dst & op_ld_st_mode=0x6 & op_ld_st_size=0x3 & op_insn_class=0x3 {
    local tmp:8 = *:8 (dst + off);
    if (R0 != tmp) goto <notEqual>;
    *:8 (dst + off) = src; 
<notEqual>
    R0 = tmp;
}

Loading atomic-cas.o shows a function that seems semantically correct, without any stray assignment statements in the decompiled code:

bool atomic_compare_exchange_n64(longlong *param_1,longlong *param_2,longlong param_3)
{
  longlong lVar1;
  longlong lVar2;
  
  lVar2 = *param_2;
  lVar1 = *param_1;
  if (lVar2 == lVar1) {
    *param_1 = param_3;
  }
  if (lVar1 != lVar2) {
    *param_2 = lVar1;
  }
  return lVar1 == lVar2;
}

Scenario 2: the p-code conditionally assigns register R0

A goto inst_next; is inserted to make the assignment conditional:

:ACMP [dst + off], src  is imm=0xf1 & off & src & dst & op_ld_st_mode=0x6 & op_ld_st_size=0x3 & op_insn_class=0x3 {
    local tmp:8 = *:8 (dst + off);
    if (R0 != tmp) goto <notEqual>;
    *:8 (dst + off) = src;
    goto inst_next;
<notEqual>
    R0 = tmp;
}

Rebuilding Ghidra and loading atomic-cas.o again shows an unexpected statement lVar1 = lVar2; in the first if block:

bool atomic_compare_exchange_n64(longlong *param_1,longlong *param_2,longlong param_3)
{
  longlong lVar1;
  longlong lVar2;
  
  lVar2 = *param_2;
  lVar1 = *param_1;
  if (lVar2 == lVar1) {
    *param_1 = param_3;
    lVar1 = lVar2;
  }
  if (lVar1 != lVar2) {
    *param_2 = lVar1;
  }
  return lVar1 == lVar2;
}

Therefore I am wondering: why does the decompiler display lVar1 = lVar2; in the second case? It seems like something failed to optimize the generated code:

  if (lVar2 == lVar1) {
    *param_1 = param_3;
    lVar1 = lVar2;
  }

@GhidorahRex (Collaborator)

That definitely looks like a decompiler bug. I was able to reproduce it in this narrow case, but I'm concerned it could be a bigger issue. Could you submit that as a separate issue and include all the information from your comment?

The Linux kernel's documentation states in
https://www.kernel.org/doc/html/v6.0/bpf/instruction-set.html#atomic-operations

> The BPF_CMPXCHG operation atomically compares the value addressed by
> dst_reg + off with R0. If they match, the value addressed by
> dst_reg + off is replaced with src_reg. In either case, the value that
> was at dst_reg + off before the operation is zero-extended and loaded
> back to R0.

If the values don't match, *(dst_reg + off) is not supposed to be
modified.

Moreover, register R0 is always modified and the 32-bit instruction
truncates its value (with a zero-extension). This is also clear in the
implementation of BPF_CMPXCHG in
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/bpf/core.c?h=v6.18#n2186

    case BPF_CMPXCHG:
            if (BPF_SIZE(insn->code) == BPF_W)
                    BPF_R0 = (u32) atomic_cmpxchg(
                            (atomic_t *)(unsigned long) (DST + insn->off),
                            (u32) BPF_R0, (u32) SRC);
            else if (BPF_SIZE(insn->code) == BPF_DW)
                    BPF_R0 = (u64) atomic64_cmpxchg(
                            (atomic64_t *)(unsigned long) (DST + insn->off),
                            (u64) BPF_R0, (u64) SRC);

Fix the semantics of the compare-and-exchange instruction accordingly.
@niooss-ledger niooss-ledger force-pushed the ebpf-fix-disassembly-atomic-instructions branch from 4b2aa74 to f4503ab Compare December 3, 2025 14:22
@niooss-ledger (Contributor, Author)

@GhidorahRex Thanks for your quick replies! I submitted #8725. By the way, I also saw that this bug occurs in a slightly different form on x86_64.

Regarding this PR, I changed the fix to make the 64-bit compare-and-exchange instruction not update R0 when it is not needed.
