@Fix-Point Fix-Point commented Dec 19, 2025

Summary

This PR introduces an efficient Seqlock (sequential lock) mechanism suitable for concurrent scenarios with frequent reads and rare writes. Seqlock enables lock-free reading while ensuring data consistency through sequence counting.

This is part I of #17556.

Core Features

1. Lock-Free Reading

  • Readers access shared data without acquiring locks, eliminating lock contention
  • Sequence number tracking detects data modifications during read operations
  • Retry mechanism guarantees read consistency

2. Write Protection

  • Writers utilize atomic operations and interrupt protection for exclusive access
  • Sequence number parity indicates write state (even: readable, odd: writing)
  • SMP environments employ memory barriers for operation ordering

Main Interfaces

Initialization

void seqlock_init(seqcount_t *s);

Read Operations

uint32_t read_seqbegin(const seqcount_t *s);
uint32_t read_seqretry(const seqcount_t *s, uint32_t start);

Reader usage pattern:

uint32_t seq;
do {
    seq = read_seqbegin(&seqlock);
    // Read shared data
} while (read_seqretry(&seqlock, seq));

Write Operations

irqstate_t write_seqlock_irqsave(seqcount_t *s);
void write_sequnlock_irqrestore(seqcount_t *s, irqstate_t flags);

Writer usage pattern:

irqstate_t flags = write_seqlock_irqsave(&seqlock);
// Modify shared data
write_sequnlock_irqrestore(&seqlock, flags);
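The two usage patterns above can be put together in a small, portable sketch. This is not the NuttX implementation: every `demo_*` name is hypothetical, the counter uses plain C11 sequentially consistent atomics, and interrupt masking is left out because it cannot be expressed in portable C. A single writer is assumed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for seqcount_t; NOT the NuttX implementation. */

typedef struct
{
  _Atomic uint32_t sequence;
} demo_seqcount_t;

/* Shared data guarded by the counter. Invariant: y == 2 * x. */

typedef struct
{
  demo_seqcount_t  lock;
  _Atomic uint32_t x;
  _Atomic uint32_t y;
} demo_pair_t;

static void demo_pair_init(demo_pair_t *p)
{
  atomic_init(&p->lock.sequence, 0);
  atomic_init(&p->x, 0);
  atomic_init(&p->y, 0);
}

/* Wait until no write is in progress (even counter), then return it. */

static uint32_t demo_read_begin(demo_seqcount_t *s)
{
  uint32_t seq;

  do
    {
      seq = atomic_load(&s->sequence);
    }
  while (seq & 1u);

  return seq;
}

/* Non-zero if the counter moved since demo_read_begin(): retry the read. */

static int demo_read_retry(demo_seqcount_t *s, uint32_t start)
{
  return atomic_load(&s->sequence) != start;
}

/* Writer side (single writer assumed; no interrupt masking here). */

static void demo_pair_set(demo_pair_t *p, uint32_t x)
{
  atomic_fetch_add(&p->lock.sequence, 1);  /* Counter becomes odd */
  atomic_store(&p->x, x);
  atomic_store(&p->y, 2u * x);
  atomic_fetch_add(&p->lock.sequence, 1);  /* Counter becomes even */
}

/* Reader side: same do/while shape as the reader usage pattern. */

static void demo_pair_get(demo_pair_t *p, uint32_t *x, uint32_t *y)
{
  uint32_t seq;

  do
    {
      seq = demo_read_begin(&p->lock);
      *x  = atomic_load(&p->x);
      *y  = atomic_load(&p->y);
    }
  while (demo_read_retry(&p->lock, seq));
}
```

A reader that took its snapshot before a write will see `demo_read_retry()` return non-zero afterwards, which is what sends it around the loop again.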

Technical Details

1. Sequence Number Mechanism

  • Sequence counter initializes to 0 (even)
  • Write start increments by 1 (becomes odd)
  • Write completion increments by 1 (returns to even)
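As a small illustration of the parity rule, the helpers below classify a raw counter value. The names are hypothetical and are not part of the interface proposed in this PR; they only restate the three bullet points as code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: a write is in progress while the counter is odd. */

static bool demo_seq_is_writing(uint32_t sequence)
{
  return (sequence & 1u) != 0;
}

/* Hypothetical helper: each completed write advances the counter by 2,
 * so (ignoring wraparound) half the counter is the number of writes.
 */

static uint32_t demo_seq_completed_writes(uint32_t sequence)
{
  return sequence / 2u;
}
```

Because the counter is an unsigned 32-bit value, it wraps from 0xFFFFFFFF (odd) to 0 (even), so parity stays meaningful across overflow.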

2. Memory Barriers

  • SMP systems utilize appropriate read/write memory barriers
  • SMP_WMB(): Write memory barrier
  • SMP_RMB(): Read memory barrier
  • Ensures operation ordering and memory visibility
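To show where such barriers typically sit, here is a sketch written with portable C11 fences instead of the SMP_WMB()/SMP_RMB() macros. It is an assumption about placement, not a copy of seqlock.h: the `demo_*` names are hypothetical, a single writer is assumed, and the release/acquire fences only play a role analogous to the write and read barriers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical 64-bit value stored as two halves, guarded by a counter. */

typedef struct
{
  _Atomic uint32_t sequence;
  _Atomic uint32_t value_lo;
  _Atomic uint32_t value_hi;
} demo_seq64_t;

static void demo_seq64_write(demo_seq64_t *s, uint64_t v)
{
  uint32_t seq = atomic_load_explicit(&s->sequence, memory_order_relaxed);

  /* Mark the write as started (odd), then order it before the data. */

  atomic_store_explicit(&s->sequence, seq + 1u, memory_order_relaxed);
  atomic_thread_fence(memory_order_release);  /* Write-barrier role */

  atomic_store_explicit(&s->value_lo, (uint32_t)v, memory_order_relaxed);
  atomic_store_explicit(&s->value_hi, (uint32_t)(v >> 32),
                        memory_order_relaxed);

  /* Publish the data: the release store orders it before the even count. */

  atomic_store_explicit(&s->sequence, seq + 2u, memory_order_release);
}

static uint64_t demo_seq64_read(demo_seq64_t *s)
{
  uint32_t begin;
  uint32_t end;
  uint32_t lo;
  uint32_t hi;

  do
    {
      begin = atomic_load_explicit(&s->sequence, memory_order_acquire);
      lo    = atomic_load_explicit(&s->value_lo, memory_order_relaxed);
      hi    = atomic_load_explicit(&s->value_hi, memory_order_relaxed);

      /* Keep the data loads ahead of the second counter load. */

      atomic_thread_fence(memory_order_acquire);  /* Read-barrier role */
      end = atomic_load_explicit(&s->sequence, memory_order_relaxed);
    }
  while ((begin & 1u) != 0 || begin != end);

  return ((uint64_t)hi << 32) | lo;
}
```

The data fields are relaxed atomics so that a reader racing with the writer stays well defined in C11; a torn snapshot is then discarded by the counter comparison.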

3. Atomic Operations

  • SMP environments employ atomic read/write and CAS operations
  • Prevents data races and maintains consistency

4. Interrupt Protection

  • Write operations disable interrupts
  • Prevents interrupt handlers from interfering with critical sections

Applicable Scenarios

  • Read operations significantly outnumber write operations
  • Readers can tolerate temporary data inconsistency
  • High-performance read operations are required

Performance Advantages

  • Read operations completely lock-free with minimal overhead
  • Writers never wait for readers; a reader that overlaps a write retries instead of blocking
  • Significant performance improvement for read-intensive applications

This implementation accounts for differences between SMP and uniprocessor environments, ensuring correct operation across various configurations.

Impact

Seqlock is not used anywhere in the tree yet, so this PR has no impact on existing systems.

Testing

Functional Correctness and Performance Evaluation

The test case, spinlock_test, starts a given number of threads, each of which increments a global counter protected by one of the locks under test. After a fixed number of iterations it checks that the final counter value is correct and prints the throughput. For seqcount, the test exercises the writer path (write_seqlock_irqsave/write_sequnlock_irqrestore).

The spinlock_test will be pushed to nuttx-apps after #17556 is merged.

We ran the test case on intel64:nsh_pci_smp (Intel Core i7 12700/KVM) with the following command:

sudo qemu-system-x86_64 -enable-kvm -cpu host,+invtsc,+vmware-cpuid-freq,kvmclock=off -smp 4 -m 2G -kernel nuttx -nographic -serial mon:stdio

The measured throughput of each lock is:

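| Test Mode    | Threads | Spinlock Throughput (op/s) | Rspinlock Throughput (op/s) | Seqcount Throughput (op/s) |
|--------------|---------|----------------------------|-----------------------------|----------------------------|
| non-SMP mode | 1       | 84,791,631                 | 33,708,814                  | 134,760,642                |
| SMP mode     | 1       | 92,997,311                 | 34,277,351                  | 63,980,964                 |
| SMP mode     | 2       | 11,695,600                 | 4,821,645                   | 10,487,860                 |
| SMP mode     | 3       | 3,906,668                  | 2,173,656                   | 4,515,274                  |
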
/****************************************************************************
 * apps/testing/ostest/spinlock.c
 *
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.  The
 * ASF licenses this file to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance with the
 * License.  You may obtain a copy of the License at
 *
 *   http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
 * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.  See the
 * License for the specific language governing permissions and limitations
 * under the License.
 *
 ****************************************************************************/

/****************************************************************************
 * Included Files
 ****************************************************************************/

#include <nuttx/config.h>
#include <stdio.h>
#include <pthread.h>
#include <sys/time.h>
#include <stdint.h>
#include <unistd.h>
#include <assert.h>
#include <inttypes.h>
#include <nuttx/atomic.h>
#include <nuttx/spinlock.h>
#include <nuttx/seqlock.h>

/****************************************************************************
 * Preprocessor Definitions
 ****************************************************************************/

#define MAX_THREAD_NUM (CONFIG_SMP_NCPUS)
#define LOOP_TIMES     (CONFIG_TEST_LOOP_SCALE * 100000)

aligned_data(64) struct spinlock_pub_args_s
{
  FAR void         *lock;
  volatile uint32_t counter;
  atomic_t          barrier;
  uint32_t          thread_num;
};

struct spinlock_thread_args_s
{
  uint64_t delta;
  FAR struct spinlock_pub_args_s *pub;
};

/****************************************************************************
 * Private Functions
 ****************************************************************************/

/* Helper functions for timespec calculating */

static inline uint64_t calc_diff(FAR struct timespec *start,
                                    FAR struct timespec *end)
{
  uint64_t diff_sec = end->tv_sec - start->tv_sec;
  long diff_nsec = end->tv_nsec - start->tv_nsec;
  if (diff_nsec < 0)
    {
      diff_sec -= 1;
      diff_nsec += 1000000000L;
    }

  return diff_sec * 1000000000ULL + diff_nsec;
}

#define LOCK_TEST_FUNC(lock_type, lock_func, unlock_func) \
FAR static void * lock_type##_test_thread(FAR void *arg) \
{ \
  struct spinlock_thread_args_s *param = \
                                (struct spinlock_thread_args_s *)arg; \
  irqstate_t flags; \
  struct timespec start; \
  struct timespec end; \
  uint32_t i; \
  FAR struct spinlock_pub_args_s *pub = param->pub; \
  FAR lock_type *l = (FAR lock_type *)pub->lock; \
  atomic_fetch_add(&pub->barrier, 1); \
  while (atomic_read(&pub->barrier) != pub->thread_num) \
    { \
      sched_yield(); \
    } \
  clock_gettime(CLOCK_REALTIME, &start); \
  for (i = 0; i < LOOP_TIMES; i++) \
    { \
      flags = lock_func(l); \
      pub->counter++; \
      unlock_func(l, flags); \
    } \
  clock_gettime(CLOCK_REALTIME, &end); \
  param->delta = calc_diff(&start, &end); \
  return NULL; \
}

LOCK_TEST_FUNC(spinlock_t, spin_lock_irqsave, spin_unlock_irqrestore)
LOCK_TEST_FUNC(rspinlock_t, rspin_lock_irqsave, rspin_unlock_irqrestore)
LOCK_TEST_FUNC(seqcount_t, write_seqlock_irqsave, write_sequnlock_irqrestore)

static inline
void run_test_thread(void *lock, FAR void *(*thread_func)(FAR void *arg),
                     uint32_t thread_num, const char *lock_type)
{
  pthread_t tid[MAX_THREAD_NUM];
  struct spinlock_pub_args_s pub;
  pthread_attr_t attr;
  struct sched_param sparam;
  struct spinlock_thread_args_s param[MAX_THREAD_NUM];
  struct timespec stime;
  struct timespec etime;
  cpu_set_t cpu_set = 1u;
  int status;
  int i;
  uint64_t total_ns = 0u;

  /* Initialize the public parameters. */

  printf("Test type: %s\n", lock_type);

  pub.lock       = lock;
  pub.counter    = 0u;
  pub.thread_num = thread_num;
  atomic_set_release(&pub.barrier, 0u);

  /* Set affinity to CPU0 */

#ifdef CONFIG_SMP
  cpu_set = 1u;
  if (OK != sched_setaffinity(getpid(), sizeof(cpu_set_t), &cpu_set))
    {
      printf("spinlock_test: ERROR: sched_setaffinity failed\n");
      ASSERT(false);
    }
#else
  UNUSED(cpu_set);
#endif

  /* Boost to maximum priority for test threads. */

  status = pthread_attr_init(&attr);
  if (status != 0)
    {
      printf("spinlock_test: ERROR: "
             "pthread_attr_init failed, status=%d\n",  status);
      ASSERT(false);
    }

  sparam.sched_priority = SCHED_PRIORITY_MAX;
  status = pthread_attr_setschedparam(&attr, &sparam);
  if (status != OK)
    {
      printf("spinlock_test: ERROR: "
             "pthread_attr_setschedparam failed, status=%d\n",  status);
      ASSERT(false);
    }

  clock_gettime(CLOCK_REALTIME, &stime);

  /* Create new test threads. */

  for (i = 0; i < thread_num; i++)
    {
      param[i].pub = &pub;
      param[i].delta = 0;

      /* Set affinity */

#ifdef CONFIG_SMP
      cpu_set = 1u << ((i + 1) % CONFIG_SMP_NCPUS);

      status = pthread_attr_setaffinity_np(&attr, sizeof(cpu_set_t),
                                           &cpu_set);
      if (status != OK)
        {
          printf("spinlock_test: ERROR: "
                 "pthread_attr_setaffinity_np failed, status=%d\n",  status);
          ASSERT(false);
        }
#endif

      status = pthread_create(&tid[i], &attr, thread_func, &param[i]);
      if (status != 0)
        {
          printf("spinlock_test: ERROR: "
                 "pthread_create failed, status=%d\n",  status);
          ASSERT(false);
        }
    }

  for (i = 0; i < thread_num; i++)
    {
      status = pthread_join(tid[i], NULL);
      if (status != 0)
        {
          printf("spinlock_test: ERROR: "
                 "pthread_join failed, status=%d\n",  status);
          ASSERT(false);
        }
    }

  /* Calculate the average throughput. */

  clock_gettime(CLOCK_REALTIME, &etime);
  for (i = 0; i < thread_num; i++)
    {
      total_ns += param[i].delta;
    }

  printf("%s: Test Results:\n", lock_type);
  printf("%s: Final counter: %" PRIu32 "\n", lock_type, pub.counter);
  assert(pub.counter == thread_num * LOOP_TIMES);
  printf("%s: Average throughput : %" PRIu64 " op/s\n", lock_type,
         (uint64_t)NSEC_PER_SEC * LOOP_TIMES * thread_num / total_ns);
}

/****************************************************************************
 * Public Functions
 ****************************************************************************/

/****************************************************************************
 * Name: spinlock_test
 ****************************************************************************/

static void spinlock_test_thread_num(unsigned thread_num)
{
  aligned_data(64) union
    {
      spinlock_t  spinlock;
      rspinlock_t rspinlock;
      seqcount_t  seqcount;
    } lock;

  printf("Start Lock test:\n");
  printf("Thread num: %u, Loop times: %d\n\n", thread_num, LOOP_TIMES);

  spin_lock_init(&lock.spinlock);
  run_test_thread(&lock, spinlock_t_test_thread, thread_num, "spinlock");

  rspin_lock_init(&lock.rspinlock);
  run_test_thread(&lock, rspinlock_t_test_thread, thread_num, "rspinlock");

  seqlock_init(&lock.seqcount);
  run_test_thread(&lock, seqcount_t_test_thread, thread_num, "seqcount");
}

void spinlock_test(void)
{
  unsigned tnr;

  for (tnr = 1; tnr < MAX_THREAD_NUM; tnr++)
    {
      spinlock_test_thread_num(tnr);
    }
}
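
The timing arithmetic in the test can be checked in isolation. The sketch below mirrors calc_diff and the throughput expression printed by run_test_thread; the `demo_*` names are hypothetical and the constant 1000000000 stands in for NSEC_PER_SEC.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Elapsed nanoseconds between two timestamps, borrowing one second when
 * the nanosecond field of 'end' is smaller than that of 'start'.
 */

static uint64_t demo_elapsed_ns(const struct timespec *start,
                                const struct timespec *end)
{
  int64_t sec  = (int64_t)end->tv_sec - (int64_t)start->tv_sec;
  int64_t nsec = (int64_t)end->tv_nsec - (int64_t)start->tv_nsec;

  if (nsec < 0)
    {
      sec  -= 1;
      nsec += 1000000000;
    }

  return (uint64_t)sec * 1000000000ULL + (uint64_t)nsec;
}

/* Same shape as the expression in run_test_thread: total operations
 * divided by the sum of the per-thread elapsed times.
 */

static uint64_t demo_throughput(uint64_t loops, uint64_t threads,
                                uint64_t total_ns)
{
  return 1000000000ULL * loops * threads / total_ns;
}
```

For example, two threads that each complete 100,000 iterations in 10 ms give total_ns = 20,000,000 and a result of 10,000,000 op/s. Since total_ns sums the per-thread times, the figure reads as an average per-thread throughput rather than an aggregate one.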

hujun260 and others added 12 commits December 18, 2025 19:38
This commit added linux-style UP_RMB() and UP_WMB().

Signed-off-by: hujun5 <hujun5@xiaomi.com>
03:17:57  /home/work/ssd1/workspace/Vela-Multi-Boards-dev-system-Build@2/nuttx/include/nuttx/seqlock.h: In function 'read_seqbegin':
03:17:57  /home/work/ssd1/workspace/Vela-Multi-Boards-dev-system-Build@2/nuttx/include/nuttx/seqlock.h:107:3: error: 'asm' undeclared (first use in this function)
03:17:57     SMP_RMB();
03:17:57     ^
03:17:57  /home/work/ssd1/workspace/Vela-Multi-Boards-dev-system-Build@2/nuttx/include/nuttx/seqlock.h:107:3: note: each undeclared identifier is reported only once for each function it appears in
03:17:57  /home/work/ssd1/workspace/Vela-Multi-Boards-dev-system-Build@2/nuttx/include/nuttx/seqlock.h:107:3: error: expected ';' before 'volatile'
03:17:57     SMP_RMB();
03:17:57     ^

Signed-off-by: wangzhi16 <wangzhi16@xiaomi.com>
This commit added seqlock.h.

Signed-off-by: hujun5 <hujun5@xiaomi.com>
This commit removed unnecessary memory barriers of the seqlock
implementation, since they may block the CPU pipeline and lead to
performance degradation.

Signed-off-by: ouyangxiangzhen <ouyangxiangzhen@xiaomi.com>
This commit added predict_xxx to increase performance.

Signed-off-by: hujun5 <hujun5@xiaomi.com>
read_seq Loop 20,000,000 times
before
233333376
after
183333375

Signed-off-by: hujun5 <hujun5@xiaomi.com>
This commit added SMP_WMB to seqlock.

Signed-off-by: hujun5 <hujun5@xiaomi.com>
This commit provided a better implementation of the seqlock, which
ensures functional correctness and provides better performance.

Signed-off-by: ouyangxiangzhen <ouyangxiangzhen@xiaomi.com>
This commit improved the seqlock performance on non-SMP and SMP
platforms by 31.7% on average (83Mops -> 106Mops, tested on qemu-intel64/KVM).

Signed-off-by: ouyangxiangzhen <ouyangxiangzhen@xiaomi.com>
This commit implemented seqlock for non-SMP platforms, which achieves
1.62x performance improvement.

Signed-off-by: ouyangxiangzhen <ouyangxiangzhen@xiaomi.com>
The optimization reduces one judgment in read operations, improving write performance by 3% and read performance by 10%.

Signed-off-by: hujun5 <hujun5@xiaomi.com>
remove warning about less headers.

Signed-off-by: zhangyu117 <zhangyu117@xiaomi.com>
@github-actions github-actions bot added the Arch: arm, Arch: arm64, Arch: ceva, Arch: risc-v, Arch: tricore, Arch: x86_64, Area: OS Components, and Size: M labels Dec 19, 2025
@Fix-Point Fix-Point changed the title from "include/nuttx: Introduce seqlock, a sequential based read-write lock." to "include/nuttx: Introduce seqlock, a sequential count based read-write lock." Dec 19, 2025
Contributor

@acassis acassis left a comment

@Fix-Point thank you very much for this detailed summary and testing. However, a new feature requires new documentation; otherwise we end up with more hidden features.

@Fix-Point
Contributor Author

Thank you for the reminder. I will add the document in the next few days.

@github-actions github-actions bot added the Area: Documentation and Size: L labels Dec 22, 2025
@Fix-Point Fix-Point force-pushed the hrtimer_p1 branch 2 times, most recently from a7ff8d4 to 950cd1c Compare December 22, 2025 02:30
@Fix-Point
Contributor Author

The documentation has been added. Please check if it is sufficiently detailed. If anything needs to be added, please point it out.

@anchao
Contributor

anchao commented Dec 22, 2025

This commit has too many patches. Could it be consolidated into several focused commits based on feature sets, such as arch, seqlock, and doc? This is just a suggestion; you can keep it if you wish.

@Fix-Point
Contributor Author

Fix-Point commented Dec 22, 2025

I support your point. If all these commits were completed by me alone, I would definitely consolidate them into a few commits (as I've done in past PRs).

However, this PR also includes commits from some of my colleagues, such as @hujun260 (who proposed the first seqcount implementation in this PR), @wangzhi16, and @zhangyuduck. Consolidating their commits would be disrespectful to their work.

So I'm sorry that I cannot make this change.

@anchao
Contributor

anchao commented Dec 22, 2025

Okay, let's respect the work achievements of every individual.

hujun260 and others added 4 commits December 22, 2025 17:10
Move the header files to decouple compilation dependencies.

Signed-off-by: hujun5 <hujun5@xiaomi.com>
Since the adafruit-kb2040:smp do not support the `atomic_init`, we have
to remove the function.

Signed-off-by: ouyangxiangzhen <ouyangxiangzhen@xiaomi.com>
This commit fixed struct name and constant.

Signed-off-by: ouyangxiangzhen <ouyangxiangzhen@xiaomi.com>
This commit added documentation for seqcount.

Signed-off-by: ouyangxiangzhen <ouyangxiangzhen@xiaomi.com>
Contributor

@anchao anchao left a comment

LGTM. Let's merge it — some of the disputed points do not constitute mandatory changes to the project.

@xiaoxiang781216
Contributor

@acassis please review the documentation

@acassis acassis merged commit b5eae7d into apache:master Dec 22, 2025
41 checks passed