-
Notifications
You must be signed in to change notification settings - Fork 6.9k
Fix disassembly of eBPF atomic instructions and semantics of compare-and-exchange #8721
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
eBPF ISA v3 introduced atomic instructions: https://www.kernel.org/doc/html/v6.0/bpf/instruction-set.html#atomic-operations These instructions are encoded using BPF_ATOMIC | BPF_W | BPF_STX and BPF_ATOMIC | BPF_DW | BPF_STX for 32-bit and 64-bit operations, with: BPF_ATOMIC = 0xc0 BPF_DW = 0x18 BPF_W = 0 BPF_STX = 0x03 While Ghidra's semantic section is constructed correctly (atomic add uses an addition ; atomic or uses or ; ...), the disassembly always displays STXXADDW and STXXADDDW. These mnemonics come from the deprecated name BPF_XADD = BPF_ATOMIC | BPF_ADD = 0xc0. Replace the confusing mnemonics with the ones used by binutils and documented in https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gas/doc/c-bpf.texi;h=003cb92a457985038a9abc1ffbf347f636eb0586;hb=2bc7af1ff7732451b6a7b09462a815c3284f9613#l745
9972ffc to
4b2aa74
Compare
|
I found a 2nd bug in the semantics of the 32-bit CMPXCHG instruction: R0 was only updated when So I modified the 2nd commit and force-pushed. With it, the decompilation looks fine: bool atomic_compare_exchange64(longlong *param_1,longlong *param_2,longlong *param_3)
{
longlong lVar1;
longlong lVar2;
lVar2 = *param_2;
lVar1 = *param_1;
if (lVar2 == lVar1) {
*param_1 = *param_3;
}
if (lVar1 != lVar2) {
*param_2 = lVar1;
}
return lVar1 == lVar2;
}
And the 32-bit version too:
Source code of `atomic_compare_exchange_n32`bool atomic_compare_exchange_n32(
volatile unsigned int *ptr, unsigned int *expected, unsigned int desired
) {
return __atomic_compare_exchange_n(ptr, expected, desired, 1, __ATOMIC_RELAXED, __ATOMIC_RELAXED);
} |
|
I think there was some confusion there since if something is equal why does it need to be assigned? At least in the 32-bit case it makes sense to still need the fix since it's zero-extending the 32-bit value. For the 64-bit value I think it's unnecessary but that's exactly what the manual specifies - "In either case, the value that was at dst_reg + off before the operation is zero-extended and loaded back to R0." So we should handle it that way. |
|
@GhidorahRex I agree it is unnecessary for the 64-bit compare-and-exchange instruction. I am nonetheless observing a weird behavior in Ghidra's decompiler: when the p-code statements associated with compare-and-exchange always assigns to To reproduce it, I used a simpler C program #include <stdbool.h>
bool atomic_compare_exchange_n64(
volatile unsigned long *ptr, unsigned long *expected, unsigned long desired
) {
return __atomic_compare_exchange_n(ptr, expected, desired, 1, __ATOMIC_RELAXED, __ATOMIC_RELAXED);
}I compiled it with clang version 19.1.7 (3+b1) on Debian 13: clang -Wall -Wextra -O2 -target bpf -mcpu=v4 -c atomic-cas.c Its disassembly (using package $ bpf-objdump -rd atomic-cas.o
atomic-cas.o: file format elf64-bpfle
Disassembly of section .text:
0000000000000000 <atomic_compare_exchange_n64>:
0: 79 24 00 00 00 00 00 00 ldxdw %r4,[%r2+0]
8: bf 40 00 00 00 00 00 00 mov %r0,%r4
10: db 31 00 00 f1 00 00 00 acmp [%r1+0],%r3
18: b4 01 00 00 01 00 00 00 mov32 %r1,1
20: 1d 40 01 00 00 00 00 00 jeq %r0,%r4,1
28: b4 01 00 00 00 00 00 00 mov32 %r1,0
30: bc 13 00 00 00 00 00 00 mov32 %r3,%r1
38: 54 03 00 00 01 00 00 00 and32 %r3,1
40: 56 03 01 00 00 00 00 00 jne32 %r3,0,1
48: 7b 02 00 00 00 00 00 00 stxdw [%r2+0],%r0
50: bc 10 00 00 00 00 00 00 mov32 %r0,%r1
58: 95 00 00 00 00 00 00 00 exitI am using Ghidra's branch 1. Scenario 1: the p-code always assigns register
|
|
That definitely looks like a decompiler bug. I was able to reproduce it in this narrow case, but I'm concerned it could be a bigger issue. Could you submit that as a separate issue and include all the information from your comment? |
Linux kernel's documentation tells in https://www.kernel.org/doc/html/v6.0/bpf/instruction-set.html#atomic-operations > The BPF_CMPXCHG operation atomically compares the value addressed by > dst_reg + off with R0. If they match, the value addressed by > dst_reg + off is replaced with src_reg. In either case, the value that > was at dst_reg + off before the operation is zero-extended and loaded > back to R0. If the values don't match, *(dst_reg + off) is not supposed to be modified. Moreover, register R0 is always modified and the 32-bit instruction truncates its value (with a zero-extension). This is also clear in the implementation of BPF_CMPXCHG in https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/bpf/core.c?h=v6.18#n2186 case BPF_CMPXCHG: if (BPF_SIZE(insn->code) == BPF_W) BPF_R0 = (u32) atomic_cmpxchg( (atomic_t *)(unsigned long) (DST + insn->off), (u32) BPF_R0, (u32) SRC); else if (BPF_SIZE(insn->code) == BPF_DW) BPF_R0 = (u64) atomic64_cmpxchg( (atomic64_t *)(unsigned long) (DST + insn->off), (u64) BPF_R0, (u64) SRC); Fix the semantic of the compare-and-exchange instruction accordingly.
4b2aa74 to
f4503ab
Compare
|
@GhidorahRex Thanks for your quick replies! I submitted #8725 . By the way, I also saw this bug also occurs in a slightly different form in x86_64. Regarding this PR, I changed the fix to make the 64-bit compare-and-exchange instruction not update |




Hello,
eBPF ISA v3 introduced atomic instructions: https://www.kernel.org/doc/html/v6.0/bpf/instruction-set.html#atomic-operations . These instructions are encoded using
BPF_ATOMIC | BPF_W | BPF_STXandBPF_ATOMIC | BPF_DW | BPF_STXfor 32-bit and 64-bit operations, with:While Ghidra's semantic section is constructed correctly (atomic add uses an addition ; atomic or uses or ; ...), the disassembly always displays
STXXADDWandSTXXADDDW. These mnemonics come from the deprecated nameBPF_XADD = BPF_ATOMIC | BPF_ADD = 0xc0.This Pull Request replaces the confusing mnemonics with the ones used by binutils and documented in
https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gas/doc/c-bpf.texi;h=003cb92a457985038a9abc1ffbf347f636eb0586;hb=2bc7af1ff7732451b6a7b09462a815c3284f9613#l745
While testing this, I found a bug in the semantics of atomic compare-and-exchange instruction, described in the next section.
Testing
I wrote a test C program, atomic.c using gcc's atomic built-ins. This program contains many small functions such as:
I compiled this program in a Debian 13 container with different options:
I then disassembled the programs with LLVM's
llvm-objdumpand binutils'bpf-objdump:While
llvm-objdumpproduces pseudo-code instructions, such as:...
bpf-objdumpproduces usual ASM mnemonics:I analyzed the compiled eBPF programs with Ghidra and confirmed the disassembly listing contains (with this Pull Request) similar atomic mnemonics:
AXCHG,AFADD,AFOR...While testing atomic compare-and-exchange, Ghidra showed an unexpected decompiled code. This function
... got decompiled to:
The fact that
*param_1 = param_3;is always executed is a bug. When the contents of the pointer does not match the expected value,__atomic_compare_exchange_ndoes not modify this content (and only reads the actual content toexpected, which is what the secondifstatement is about). This Pull Request fixes this bug by addinggoto inst_next;where appropriate.With this, the decompilation becomes:
There is a stray statement
lVar2 = lVar1;but at least the code is semantically correct.Attachements
atomic.c
atomic_example.tar.gz