Skip to content

Conversation

@Fix-Point
Copy link
Contributor

@Fix-Point Fix-Point commented Dec 18, 2025

Summary

High-resolution Timer (HRTimer) is a timer abstraction capable of achieving nanosecond-level timing precision, primarily used in scenarios requiring high-precision clock events. With the advancement of integrated circuit technology, modern high-precision timer hardware (such as the typical x86 HPET) can already meet sub-nanosecond timing requirements and offer femtosecond-level jitter control.

Although the current timer abstraction in the NuttX kernel already supports nanosecond-level timing, its software timer abstraction, wdog, and the timer timeout interrupt handling process remain at microsecond-level (tick) precision, which falls short of high-precision timing demands. Therefore, it is necessary to implement a new timer abstraction—HRTimer, to address the precision limitations of wdog. HRTimer primarily provides the following functional interfaces:

  • Set a timer in nanoseconds: Configure a software timer to trigger at a specified nanosecond time.
  • Cancel a timer: Cancel the software timer.
  • Handle timer timeout: Execute timeout processing after the timer event is triggered.

Design

The new NuttX HRTimer is designed to address the issues of insufficient precision and excessively long blocking times in the current NuttX wdog. It draws on the strengths of the Linux HRTimer design while improving upon its weaknesses. The HRTimer design is divided into two parts: the HRTimer Queue and the HRTimer. The HRTimer Queue is a reusable component that allows users to freely customize their own HRTimer interface by pairing it with a private timer driver, without needing to modify the kernel code.

API Design

The HRTimer Queue is a zero-performance-overhead, composable, and customizable abstraction that provides only asynchronous-style interfaces:

  • hrtimer_queue_start(queue, timer): Asynchronously sends an HRTimer to HRTimer queue.
  • hrtimer_queue_async_cancel(queue, timer): Asynchronously cancels an HRTimer and returns the current reference count of the timer.
  • hrtimer_queue_wait(queue, timer): Waits for the release of all references to the HRTimer to obtain ownership of the HRTimer data structure.

All other user interfaces can be composed based on these three interfaces.

On top of the HRTimer Queue, users only need to implement the following interfaces to customize their own HRTimer implementation:

  • hrtimer_expiry(current): Handles timer expiration, typically called within the execution path of the corresponding timer hardware interrupt handler.
  • hrtimer_reprogram(queue, next_expired): Sets the next timer event.
  • hrtimer_current(): Gets the current time to set relative timers.

After implementing the above three interfaces, users can use the HRTIMER_QUEUE_GENERATE template macro to combine and generate their own hrtimer implementation, which mainly includes the following interfaces:

  • hrtimer_restart(timer, func, arg, time, mode): Restarts a timer that has been asynchronously canceled (its callback function might still be executing). This interface is designed to explicitly remind users to be aware of concurrency issues, as concurrency problems are prone to occur in actual programming and are very difficult to locate. Providing such an interface facilitates quick identification of concurrency issues.
  • hrtimer_start(timer, func, arg, time, mode): Starts a stopped timer. The mode parameter indicates whether it is a relative or absolute timer.
  • hrtimer_async_cancel(timer): Asynchronously cancels a timer. Note that the semantics of this interface are completely different from Linux's try_to_cancel. It ensures that the timer can definitely be canceled successfully, but may need to wait for its callback function to finish execution.
  • hrtimer_cancel(timer): Synchronously cancels a timer. If the timer's callback function is still executing, this function will spin-wait until the callback completes. It ensures that the user can always obtain ownership of the timer.

The design characteristics of HRTimer are as follows:

  • Strict and Simplified HRTimer State Machine: In the old wdog design, wdog could be reset in any state, which introduced unnecessary complexity to certain function implementations. For example, wd_start had to account for the possibility of restarting. In the new HRTimer design, an HRTimer that has already been started and not canceled cannot be started again.

  • Abstracted Sorting Queue: Since no single design can be optimal for all application scenarios, HRTimer abstracts interfaces for inserting and deleting nodes in the sorting queue. This allows for different data structure implementations to be configured for different application scenarios, as shown in Table 1.

    Table 1: Comparison of Several Sorting Queue Implementations

    Sorting Queue Implementation Insert Delete Delete Head Determinism Suitable Scenarios
    Doubly Linked List O(n) O(1) O(1) Moderate Embedded / Soft Real-time Systems
    Red-Black Tree O(logn) O(logn) O(logn) Slightly Poor General Purpose
  • Callback Execution Without Lock Held: HRTimer implements callback execution without lock held, ensuring that the system's blocking time is not limited by the user's callback function. However, this introduces additional states and waits, where waiting for reference release is primarily implemented using hazard pointers. This will be explained in detail in the subsequent state transition diagram.

  • Clear HRTimer Object Ownership Transfer Path: In the wdog implementation, the wdog callback function could restart the current timer directly without regard to ownership, potentially causing concurrency issues. In the new implementation, the HRTimer callback function cannot restart itself. Instead, inspired by Linux's design, the callback function returns whether a restart is needed. If a restart is required, the thread executing the callback function re-enqueues it; otherwise, the thread releases ownership. This change ensures a clear ownership transfer path for the HRTimer object.

  • Non-blocking Timer Restart: To address the issue in Linux where restarting a timer must wait for an already-started callback function to finish, which reduces the real-time performance, the new HRTimer implements a non-blocking timer restart mechanism. This mechanism reuses the last bit of the hazard pointer to mark whether the thread executing the callback has lost write ownership of the HRTimer object. After hrtimer_async_cancel is called, other threads executing callbacks will lose write ownership of the HRTimer (though their callback functions may still be executing). This means the HRTimer can be restarted and repurposed for other callbacks without waiting for the callback function to complete. However, note that the callback function might still be executing, requiring users to consider this concurrency and implement proper synchronization mechanisms within their callback functions. To explicitly remind users of this concurrency, an HRTimer whose callback function has not yet completed execution must be restarted using hrtimer_restart. This function relaxes the state checks on the HRTimer, allowing a timer with the callback running to be started.

  • Deterministic Timer Cancellation: To address the starvation issue present in Linux's timer cancellation, the new HRTimer implementation sets a cancellation state via hrtimer_async_cancel. This cancellation state has a unique and deterministic state transition, eliminating starvation. Memory reclamation is performed through hazard pointer checking loops. Hazard pointer checking ensures that all threads finish executing the callback function and release read ownership (reference release) of the specified HRTimer object.

The valid state transitions of an HRTimer object are shown in Figure 2. States are represented using a simplified notation of State|Ownership, such as HRTIMER_PENDING|shared. The meanings of the simplified ownership markers are as follows:

Ownership Markers

  • |private indicates that the resource is exclusively owned by a specific thread t. Only the owning thread t can read from or write to this resource.
  • |shared indicates that the resource is globally shared and can be read by any thread. However, only the thread t that holds the global lock l (t = Owned(l)) can obtain write ownership of this resource.
  • |half_shared indicates that the resource may be accessed by multiple threads, but only the thread that called async_cancel holds write ownership of this resource. Modifications to it by threads executing callback functions are prevented.

The resource ownership here uses a simplified notation. In actual static analysis or formal verification processes, more complex abstractions such as resource algebra might be employed.

All state transitions not described in the diagram must return failure. For example, a timer in the HRTIMER_PENDING state cannot be started (start) again. Note that there is one exception: a thread that is already in the HRTIMER_CANCELED state can legally call hrtimer_async_cancel again, and the state remains unchanged.

To avoid the overhead caused by threads waiting for callback functions to finish executing, HRTimer adds a restart interface. Under normal circumstances, the start interface cannot start a timer that is already in the canceled state. Only when the user uses this restart interface can a timer whose callback function has not yet completed be started normally. Using this interface serves to explicitly remind users to pay attention to concurrency within their callback functions. Furthermore, when concurrency issues arise with HRTimer, it helps in pinpointing the source of the problem—issues can only originate from callback functions where restart was used to restart the timer.

%%{
  init: {
    'theme': 'base',
    'themeVariables': {
      'primaryColor': '#FFFFFF',
      'primaryTextColor' : '#000000',
      'mermiad-container': "#FFFFFF",
      'primaryBorderColor': '#000000',
      'lineColor': '#000000',
      'secondaryColor': '#FFFFFF',
      'tertiaryColor': '#000000'
    },
    'sequence': { 'mirrorActors': false }
  }
}%%

stateDiagram-v2
    HRTIMER_COMPLETED|private --> HRTIMER_PENDING|shared : hrtimer_start
    HRTIMER_PENDING|shared --> HRTIMER_COMPLETED|private : hrtimer callback return 0 in hrtimer_expiry
    HRTIMER_PENDING|shared --> HRTIMER_PENDING|shared : hrtimer callback return non-zero in hrtimer_expiry
    HRTIMER_PENDING|shared --> HRTIMER_CANCELED|half_shared : hrtimer_async_cancel
    HRTIMER_CANCELED|half_shared --> HRTIMER_CANCELED|private : hrtimer_cancel wait all cores release the references to the timer.
    HRTIMER_CANCELED|half_shared --> HRTIMER_PENDING|shared : hrtimer_restart
    HRTIMER_CANCELED|private --> HRTIMER_COMPLETED|private : Complete the cancel
Loading

Figure 2 HRTimer State Transition Diagram

Performance Evaluation

We conducted 1 million interface calls on the intel64:nsh (Intel 12700) platform and measured their average execution CPU cycles, as shown in the Figure 3 below. It can be observed that the overhead for starting and asynchronously canceling timers is significantly reduced compared to wdog. Additionally, after enabling hrtimer, wdog processing is treated as an hrtimer timer, which lowers the overhead of the wdog interface.
hrtimer tsv

Figure 3 HRtimer API Latency Test

Plan

The merge plan for this PR is as follows:

  1. Introduce definitions of rmb/wmb memory barriers (done by @hujun260 )
  2. Introduce seqlock (done by @hujun260 / @Fix-Point )
  3. Simplify the timer expiration processing flow in preparation for introducing HRtimer.
  4. Modify the semantics of scheduling functions to allow immediate timer event triggering.
  5. Fix buggy wdog to avoid attributing faults to the introduction of hrtimer.
  6. Introduce hrtimer_queue.
  7. Introduce hrtimer.
  8. [WIP] Introdude hrtimer_test.
  9. [WIP] Add HRtimer documents.
  10. [WIP] Enhance the oneshot arch_alarm to support non-tick arguments mode.

Impact

HRTimer currently is disabled by default, so it has no effect on system.

Testing

Tested on intel64:nsh, rv-virt:smp, qemu-armv8a:smp, ostest passed. The hrtimer parallel stress test ran for over 72 hours without errors. The parallel stress test cases is showed in Appendix.

Discussion

This hrtimer is a completely different design from #17517. I believe that the hrtimer in this PR is better than the implementation in #17517 in terms of code reusability, customizability, scalability, performance, memory overhead, reliability, integrity, and MIRSA-C compatibility.

Appendix

The hrtimer parallel stress test cases:

/****************************************************************************
 * apps/testing/ostest/hrtimer.c
 *
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.  The
 * ASF licenses this file to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance with the
 * License.  You may obtain a copy of the License at
 *
 *   http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
 * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.  See the
 * License for the specific language governing permissions and limitations
 * under the License.
 *
 ****************************************************************************/

/****************************************************************************
 * Included Files
 ****************************************************************************/

#include <nuttx/config.h>
#include <nuttx/arch.h>
#include <nuttx/sched.h>
#include <nuttx/spinlock.h>

#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <syslog.h>
#include <unistd.h>

#include <nuttx/hrtimer.h>

/****************************************************************************
 * Pre-processor Definitions
 ****************************************************************************/

#define HRTIMER_TEST_RAND_ITER     (1024 * 2)
#define HRTIMER_TEST_THREAD_NR     (CONFIG_SMP_NCPUS * 8)
#define HRTIMER_TEST_TOLERENT_NS   (10 * NSEC_PER_TICK)
#define HRTIMER_TEST_CRITICAL_SECTION  1024

#define hrtest_printf(s, ...)      printf("[%d] " s, this_cpu(), __VA_ARGS__)

#define hrtest_delay(delay_ns)     usleep(delay_ns / 1000 + 1)

#define hrtimer_start(timer, cb, arg, delay_ns) hrtimer_start(timer, cb, arg, delay_ns, HRTIMER_MODE_REL)
#define hrtimer_restart(timer, cb, arg, delay_ns) hrtimer_restart(timer, cb, arg, delay_ns, HRTIMER_MODE_REL)

/****************************************************************************
 * Private Type
 ****************************************************************************/

typedef struct hrtimer_tparam_s
{
  FAR hrtimer_t    *timer;
  FAR spinlock_t   *lock;
  uint64_t          interval;
  volatile uint64_t callback_cnt;
  volatile uint64_t triggered_ns;
  volatile uint8_t  current_cpu;
  volatile uint8_t  state;
} hrtimer_tparam_t;

/****************************************************************************
 * Private Functions
 ****************************************************************************/

static uint64_t hrtimer_test_callback(void *arg, uint64_t expired_ns)
{
  FAR hrtimer_tparam_t *hrtimer_tparam = (FAR hrtimer_tparam_t *)arg;

  /* Record the system tick at which the callback was triggered */

  // clock_systime_nsec(&hrtimer_tparam->triggered_ns);
  hrtimer_tparam->triggered_ns = clock_systime_nsec();

  /* Increment the callback count */

  hrtimer_tparam->callback_cnt   += 1;

  return 0;
}

static void hrtimer_test_checkdelay(int64_t diff, uint64_t delay_ns)
{
  /* Ensure the watchdog trigger time is not earlier than expected. */

  ASSERT(diff - delay_ns >= 0);

  /* If the timer latency exceeds the tolerance, print a warning. */

  if (diff - delay_ns > HRTIMER_TEST_TOLERENT_NS)
    {
      hrtest_printf("WARNING: hrtimer latency ns %" PRId64
                    "(> %u may indicate timing error)\n",
                    diff - delay_ns,
                    (unsigned)HRTIMER_TEST_TOLERENT_NS);
    }
}

static void hrtimer_test_once(FAR hrtimer_t *timer,
                              FAR hrtimer_tparam_t *param,
                              uint64_t delay_ns)
{
  uint64_t   cnt;
  int64_t    diff;
  uint64_t   timerset_ns;
  irqstate_t flags;

  hrtest_printf("hrtimer_test_once %" PRIu64 " ns\n", delay_ns);

  /* Save the current callback count. */

  cnt = param->callback_cnt;

  /* Enter a critical section to prevent interruptions. */

  flags = up_irq_save();
  sched_lock();

  /* Record the current system tick before setting the watchdog. */

  // clock_systime_nsec(&timerset_ns);
  timerset_ns = clock_systime_nsec();

  ASSERT(hrtimer_start(timer, hrtimer_test_callback,
                       param, delay_ns) == OK);

  up_irq_restore(flags);
  sched_unlock();

  /* Wait until the callback is triggered exactly once. */

  while (cnt + 1 != param->callback_cnt)
    {
      hrtest_delay(delay_ns);
    }

  /* Check if the delay is within the acceptable tolerance. */

  diff = param->triggered_ns - timerset_ns;

  hrtimer_test_checkdelay(diff, delay_ns);

  hrtimer_cancel(timer);
}

static void hrtimer_test_rand(FAR hrtimer_t *timer,
                              FAR hrtimer_tparam_t *param, uint64_t rand_ns)
{
  uint64_t   cnt;
  unsigned   idx;
  uint64_t   delay_ns;
  uint64_t   timer_setns;
  int64_t    diff;
  irqstate_t flags = 0;

  hrtest_printf("hrtimer_test_rand %" PRIu64 " ns\n", rand_ns);

  /* Perform multiple iterations with random delays. */

  for (idx = 0; idx < HRTIMER_TEST_RAND_ITER; idx++)
    {
      /* Generate a random delay within the specified range. */

      delay_ns = rand() % rand_ns;

      DEBUGASSERT(timer->func == NULL);

      /* Enter critical section if the callback count is odd. */

      cnt = param->callback_cnt;

      if (cnt % 2u)
        {
          flags = up_irq_save();
        }

      timer_setns = clock_systime_nsec();
      ASSERT(hrtimer_start(timer, hrtimer_test_callback, param,
                           delay_ns) == 0);
      if (cnt % 2u)
        {
          up_irq_restore(flags);
        }

      /* Decide to wait for the callback or cancel the watchdog. */

      if (delay_ns % 2u)
        {
          /* Wait for the callback. */

          while (cnt + 1u != param->callback_cnt)
            {
              hrtest_delay(delay_ns);
            }

          /* Check the delay if the callback count is odd. */

          if (cnt % 2u)
            {
              diff = (sclock_t)(param->triggered_ns - timer_setns);
              hrtimer_test_checkdelay(diff, delay_ns);
            }
        }

      hrtimer_cancel(timer);
      DEBUGASSERT(timer->func == NULL);
    }

  hrtimer_cancel(timer);
}

static uint64_t hrtimer_test_rand_cancel_callback(void *arg, uint64_t expired_ns)
{
  FAR hrtimer_tparam_t *tparam = (FAR hrtimer_tparam_t *)arg;
  FAR spinlock_t *lock = tparam->lock;
  uint64_t   delay_ns = 0;
  irqstate_t flags = spin_lock_irqsave(lock);

  /* Random sleep */

  delay_ns = tparam->triggered_ns % tparam->interval;

  /* Check if the version is same. */

  if (expired_ns == tparam->timer->expired)
    {
      tparam->triggered_ns = clock_systime_nsec();

      /* Increment the callback count */

      tparam->callback_cnt++;
    }

  spin_unlock_irqrestore(lock, flags);

  up_ndelay(delay_ns);

  return 0;
}
 
static void hrtimer_test_rand_cancel(FAR hrtimer_t *timer,
                                     FAR hrtimer_tparam_t *param,
                                     uint64_t rand_ns)
{
  uint64_t   cnt;
  unsigned   idx;
  uint64_t   delay_ns;
  irqstate_t flags;
  spinlock_t rand_cancel_lock = SP_UNLOCKED;

  hrtest_printf("hrtimer_test_rand cancel %" PRIu64 " ns\n", rand_ns);

  param->timer    = timer;
  param->interval = rand_ns;
  param->lock     = &rand_cancel_lock;

  /* Perform multiple iterations with random delays. */

  for (idx = 0; idx < HRTIMER_TEST_RAND_ITER; idx++)
    {
      /* Generate a random delay within the specified range. */

      delay_ns = rand() % rand_ns;

      flags = spin_lock_irqsave(&rand_cancel_lock);

      cnt = param->callback_cnt;
      ASSERT(hrtimer_restart(timer, hrtimer_test_rand_cancel_callback, param,
                             delay_ns) == 0);

      spin_unlock_irqrestore(&rand_cancel_lock, flags);

      /* Decide to wait for the callback or cancel the watchdog. */

      if (delay_ns % 2u)
        {
          /* Wait for the callback finished. */

          while (param->callback_cnt != cnt + 1u)
            {
              hrtest_delay(delay_ns);
            }

          while (HRTIMER_ISPENDING(timer))
            {
              hrtest_delay(0);
            }
        }
      else
        {
          hrtimer_async_cancel(timer);
        }
    }

  hrtimer_cancel(timer);
}

static uint64_t hrtimer_test_callback_period(void *arg, uint64_t expired_ns)
{
  FAR hrtimer_tparam_t *tparam   = (FAR hrtimer_tparam_t *)arg;
  sclock_t              interval = tparam->interval;

  tparam->callback_cnt++;
  tparam->triggered_ns = clock_systime_nsec();

  return interval;
}

static void hrtimer_test_period(FAR hrtimer_t *timer,
                                FAR hrtimer_tparam_t *param,
                                uint64_t delay_ns,
                                unsigned int times)
{
  uint64_t cnt;
  clock_t  timer_setns;
  hrtest_printf("hrtimer_test_period %" PRIu64 " ns\n", delay_ns);

  cnt = param->callback_cnt;

  param->interval = delay_ns;

  ASSERT(param->interval > 0);

  // clock_systime_nsec(&timer_setns);
  timer_setns = clock_systime_nsec();

  ASSERT(hrtimer_start(timer, hrtimer_test_callback_period,
                       param, param->interval) == OK);

  hrtest_delay(times * delay_ns);

  hrtimer_cancel(timer);

  DEBUGASSERT(timer->func == NULL);

  hrtest_printf("periodical hrtimer triggered %" PRIu64 " times, "
                 "elapsed nsec %" PRIu64 "\n", param->callback_cnt - cnt,
                 param->triggered_ns - timer_setns);

  if (param->callback_cnt - cnt < times)
    {
      hrtest_printf("WARNING: periodical hrtimer"
                    "triggered times < %u\n", times);
    }
}

#ifdef CONFIG_SMP
// static spinlock_t g_hrtimer_test_spinlock = SP_UNLOCKED;
static uint64_t hrtimer_test_callback_crita(void *arg, uint64_t expired_ns)
{
  FAR hrtimer_tparam_t *hrtimer_tparam = (FAR hrtimer_tparam_t *)arg;

  /* change status */

  if (hrtimer_tparam->current_cpu != this_cpu())
    {
      hrtimer_tparam->current_cpu = this_cpu();
      hrtimer_tparam->callback_cnt++;
    }

  /* check whether parameter be changed by another critical section */

  ASSERT(hrtimer_tparam->state == 0);
  hrtimer_tparam->state = !hrtimer_tparam->state;

  return 0;
}

static uint64_t hrtimer_test_callback_critb(void *arg, uint64_t expired_ns)
{
  FAR hrtimer_tparam_t *hrtimer_tparam = (FAR hrtimer_tparam_t *)arg;

  /* change status */

  if (hrtimer_tparam->current_cpu != this_cpu())
    {
      hrtimer_tparam->current_cpu = this_cpu();
      hrtimer_tparam->callback_cnt++;
    }

  /* check whether parameter be changed by another critical section */

  ASSERT(hrtimer_tparam->state == 1);
  hrtimer_tparam->state = !hrtimer_tparam->state;

  return 0;
}

static void hrtimer_test_critical_section(FAR hrtimer_t *timer,
                                          FAR hrtimer_tparam_t *param)
{
  int cnt = 0;

  DEBUGASSERT(!HRTIMER_ISPENDING(timer));

  while (cnt < HRTIMER_TEST_CRITICAL_SECTION)
    {
      /* set param statue and start wdog */

      param->state = 0;
      param->current_cpu = this_cpu();
      hrtimer_start(timer, hrtimer_test_callback_crita, param, 0);

      /* set param statue and start wdog */

      hrtimer_cancel(timer);
      param->state = 1;
      param->current_cpu = this_cpu();
      hrtimer_start(timer, hrtimer_test_callback_critb, param, 0);

      if (++cnt % 100 == 0)
        {
          printf("hrtimer critical section test %d times.\n", cnt);
        }

      hrtimer_cancel(timer);
    }

  hrtimer_cancel(timer);
}
#endif

static void hrtimer_test_run(FAR hrtimer_tparam_t *param)
{
  uint64_t             cnt;
  uint64_t             rest;
  hrtimer_t     test_hrtimer =
    {
      0
    };

  param->timer = &test_hrtimer;

  /* Wrong arguments of the hrtimer_start */

  ASSERT(hrtimer_start(&test_hrtimer, NULL, NULL, 0) != OK);
  ASSERT(hrtimer_start(&test_hrtimer, NULL, NULL, -1) != OK);

  /* Delay = 0 */

  hrtimer_test_once(&test_hrtimer, param, 0);

  /* Delay > 0, small */

  hrtimer_test_once(&test_hrtimer, param, 1);
  hrtimer_test_once(&test_hrtimer, param, 10);
  hrtimer_test_once(&test_hrtimer, param, 100);
  hrtimer_test_once(&test_hrtimer, param, 1000);
  hrtimer_test_once(&test_hrtimer, param, 10000);

  /* Delay > 0, middle 100us */

  hrtimer_test_once(&test_hrtimer, param, 100000);
  hrtimer_test_once(&test_hrtimer, param, 1000000);
  hrtimer_test_once(&test_hrtimer, param, 10000000);

#ifdef CONFIG_SMP

  /* Test wdog critical section */

  hrtimer_test_critical_section(&test_hrtimer, param);

#endif

  /* Delay > 0, maximum */

  cnt = param->callback_cnt;

  /* Maximum */

  ASSERT(hrtimer_start(&test_hrtimer, hrtimer_test_callback,
                       param, UINT64_MAX) == OK);

  /* Sleep for 1s */

  hrtest_delay(USEC_PER_SEC / 100);

  /* Ensure watchdog not alarmed */

  ASSERT(cnt == param->callback_cnt);

  rest = hrtimer_gettime(&test_hrtimer);

  ASSERT(rest < UINT64_MAX);

  ASSERT(hrtimer_cancel(&test_hrtimer) == OK);

  hrtest_printf("hrtimer_start with maximum delay, cancel OK, rest %" PRIu64 "\n",
                rest);

  /* period wdog delay from 1000us to 10000us */

  hrtimer_test_period(&test_hrtimer, param, 1000000, 128);

  /* Random delay ~12us */

  hrtimer_test_rand(&test_hrtimer, param, 12345);

  hrtimer_test_rand_cancel(&test_hrtimer, param, 67890);
}

/* Multi threaded */

static FAR void *hrtimer_test_thread(FAR void *param)
{
  hrtimer_test_run(param);
  return NULL;
}

/****************************************************************************
 * Public Functions
 ****************************************************************************/

void hrtimer_test(void)
{
  unsigned int   thread_id;
  pthread_attr_t attr;
  pthread_t      pthreads[HRTIMER_TEST_THREAD_NR];
  hrtimer_tparam_t params[HRTIMER_TEST_THREAD_NR] =
    {
      0
    };

  printf("hrtimer_test start...\n");

  ASSERT(pthread_attr_init(&attr) == 0);

  /* Create wdog test thread */

  for (thread_id = 0; thread_id < HRTIMER_TEST_THREAD_NR; thread_id++)
    {
      ASSERT(pthread_create(&pthreads[thread_id], &attr,
                            hrtimer_test_thread, &params[thread_id]) == 0);
    }

  for (thread_id = 0; thread_id < HRTIMER_TEST_THREAD_NR; thread_id++)
    {
      pthread_join(pthreads[thread_id], NULL);
    }

  ASSERT(pthread_attr_destroy(&attr) == 0);

  printf("hrtimer_test end...\n");
}

@github-actions github-actions bot added Area: Documentation Improvements or additions to documentation Arch: arm Issues related to ARM (32-bit) architecture Arch: arm64 Issues related to ARM64 (64-bit) architecture Arch: ceva Issues related to CEVA architecture Arch: risc-v Issues related to the RISC-V (32-bit or 64-bit) architecture Arch: tricore Issues related to the TriCore architecture from Infineon Arch: x86_64 Issues related to the x86_64 architecture Area: Drivers Drivers issues Area: OS Components OS Components issues Size: XL The size of the change in this PR is very large. Consider breaking down the PR into smaller pieces. labels Dec 18, 2025
@wangchdo
Copy link
Contributor

You don’t need to make such extensive changes to the kernel just to add hrtimer support, and when you claim that the tests have passed, you should also include the corresponding test cases,test logs

flags = spin_lock_irqsave(&g_wdspinlock);
if (wdog != NULL)
{
sched_note_wdog(NOTE_WDOG_CANCEL, (FAR void *)wdog->func,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR contains many of redundant modifications unrelated to hrtimers. In particular, the optimizations for the clock subsystem need to be separated and merged independently. From the perspective of hrtimer implementation details, I do not observe any significant differences; instead, your commits include extensive code for ticks/counter conversion.
I recommend merging #17517 first, as this implementation serves as an optional hrtimer configuration for the system and does not alter the underlying clock system implementation. If your implementation is deemed superior, optimizations can be developed based on #17517

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the changes prepared for hrtimer are general and can simplify kernel implementation and reduce kernel latency even without the introduction of hrtimer. I am going to split them into several parts.

I do not support the merging of #17517. I believe the design of that HRTimer is completely flawed and lacks corresponding performance and parallel correctness tests.

This HRTimer design is entirely different from #17517, and there is no basis for the claim that it is an improvement upon it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if you have taken note of the implementation of hrtimers in HaloOS. In particular, the core logic of the hrtimer_queue component is actually similar to that in HaloOS in terms of both naming and implementation:

https://gitee.com/haloos/vcos_kernel_nuttx/blob/master/sched/hrtimer/hrtimer_queue.h

This naming convention is not commonly adopted—for instance, the Linux kernel uses a different naming approach for equivalent functionality:

https://github.com/torvalds/linux/blob/master/include/linux/hrtimer.h

Additionally, I noticed your first commit was on December 1st, while @wangchdo commit was earlier, right?

image

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if you have taken note of the implementation of hrtimers in HaloOS. In particular, the core logic of the hrtimer_queue component is actually similar to that in HaloOS in terms of both naming and implementation:

https://gitee.com/haloos/vcos_kernel_nuttx/blob/master/sched/hrtimer/hrtimer_queue.h

This naming convention is not commonly adopted—for instance, the Linux kernel uses a different naming approach for equivalent functionality:

https://github.com/torvalds/linux/blob/master/include/linux/hrtimer.h

Additionally, I noticed your first commit was on December 1st, while @wangchdo commit was earlier, right?

image

The HRTimer Queue is a zero-performance-overhead, composable, and customizable abstraction that provides only asynchronous-style interfaces:

  • hrtimer_queue_start(queue, timer): Asynchronously sends an HRTimer to HRTimer queue.
  • hrtimer_queue_async_cancel(queue, timer): Asynchronously cancels an HRTimer and returns the current reference count of the timer.
  • hrtimer_queue_wait(queue, timer): Waits for the release of all references to the HRTimer to obtain ownership of the HRTimer data structure.

https://gitee.com/haloos/vcos_kernel_nuttx/blob/master/sched/hrtimer/hrtimer_queue.h This appears to be a implementation of a literal queue based on a heap.

I don't think these two hrtimer_queue have anything in common except for their names.

@Fix-Point
Copy link
Contributor Author

You don’t need to make such extensive changes to the kernel just to add hrtimer support, and when you claim that the tests have passed, you should also include the corresponding test cases,test logs

Parallel test cases added.

@wangchdo
Copy link
Contributor

By the way, I think your implementation largely builds on the ideas and implementation from the PR I submitted two months ago (such as the RT-tree, the queue abstraction, and the overall motivation):
#17065

In addition, the state machine design is very similar to the one in the PR I submitted four days ago:
#17517

@Fix-Point
Copy link
Contributor Author

Fix-Point commented Dec 18, 2025

By the way, I think your implementation largely builds on the ideas and implementation from the PR I submitted two months ago (such as the RT-tree, the queue abstraction, and the overall motivation): #17065

In addition, the state machine design is very similar to the one in the PR I submitted four days ago: #17517.

  1. As I mentioned before, HRTimer design in this PR was inspired by the Linux HRTimer, where the Linux HRTimer used RB-Tree, is that your idea? Please prove it.

  2. Where is your queue-abstraction? I hadn't even looked at your code until a few days ago. I started HRTimer design 2 months ago and I was busy fixing the ClockDevice driver issue, so I didn't have time to complete my implementation.

  3. Is the state-machine really same? Please prove it with diagrams.

This commit removed nxsched_alarm_tick_expiration to simplify the timer
expiration.

Signed-off-by: ouyangxiangzhen <ouyangxiangzhen@xiaomi.com>
This commit removed nxsched_alarm_expiration to simplify the timer
expiration.

Signed-off-by: ouyangxiangzhen <ouyangxiangzhen@xiaomi.com>
This commit moved the g_wdtimernested to sched_timerexpiration, since
wdog and hrtimer can share it.

Signed-off-by: ouyangxiangzhen <ouyangxiangzhen@xiaomi.com>
@wangchdo
Copy link
Contributor

@Fix-Point @anchao @wangchdo There’s no need for further argue. Please provide your test cases and performance data. Let’s compare the robustness and performance of each solution directly. Whichever performs better should be accepted.

Do you think this is a game? Are robustness and performance really the only criteria? Or do you seriously believe my implementation cannot be optimized further?

@Fix-Point
Copy link
Contributor Author

@Fix-Point @anchao @wangchdo There’s no need for further argue. Please provide your test cases and performance data. Let’s compare the robustness and performance of each solution directly. Whichever performs better should be accepted.

Do you think this is a game? Are robustness and performance really the only criteria? Or do you seriously believe my implementation cannot be optimized further?

Talk is cheap, show me the code/data.

The current wdog implementation is buggy. This patch primarily makes the following changes:

1. Revert the spinlock to critical_section.

    If the wdog use the fine-grained spin-lock, and allow the callback
    execution without the lock held, there will be incorrect synchronization.

    E.g.
    The first `nxsem_timeout` callback function caused the second semaphore wait to fail.

    Core 0 [nxsem_clockwait]             |  Core 1
    enter_critical_section()             |  ...
    wd_start(nxsem_timeout)              |  ...
    nxsem_wait(sem)                      |  wd_expiration() --> nxsem_timeout
    wd_cancel(&rtcb->waitdog)           |  enter_critical_section()
    leave_critical_section()             |  Waiting...
    ....nxsem_clockwait                 |  Waiting...
    enter_critical_section()             |  Waiting...
    wd_start(nxsem_timeout)              |  Waiting...
    nxsem_wait(sem)                      |  Core 1 enter the critical section
                                         |  nxsem_wait_irq(wtcb, ETIMEDOUT) incorrectly wake-up the rtcb.

2. Simplify the expiration.

Signed-off-by: ouyangxiangzhen <ouyangxiangzhen@xiaomi.com>
This commit removed incorrect hrtimer implementation. This
implementation can not work well for SMP systems.

Signed-off-by: ouyangxiangzhen <ouyangxiangzhen@xiaomi.com>
VELAPLATFO-78974

This commit introduced hrtimer_queue, a resuable component to generate
user-defined hrtimer implementation.

Signed-off-by: ouyangxiangzhen <ouyangxiangzhen@xiaomi.com>
This commit introduced the high-resolution timer abstraction. The hrtimer design features including:
Use a strict state machine: an active timer can not be directly restarted, simplifying the implementation.
Abstract the sorted queue for flexibility, allowing different data structures for various use cases.
Execute callbacks with interrupts enabled, using hazard pointers to manage references.
Clear ownership transfer: callbacks return the next expiration time for periodic timers, and the thread executing the callback is responsible for restarting or releasing the timer.
Non-blocking restart: allow restarting a timer even if its callback is still running, requiring proper synchronization in the callback function.
Starvation-free cancellation: use hazard pointers to avoid starvation and ensure safe memory reclamation.

Signed-off-by: ouyangxiangzhen <ouyangxiangzhen@xiaomi.com>
This commit supported wdog/scheduler hrtimer with
tickless enabled.

Signed-off-by: ouyangxiangzhen <ouyangxiangzhen@xiaomi.com>
@acassis
Copy link
Contributor

acassis commented Dec 18, 2025

Guys, instead of compete to see who implemented first or which implementation has best performance, it is better to work together to get best part of each implementation and contribute a better solution to NuttX.

This features is very important to everyone using NuttX (not only LiAuto or Xiaomi).

@Fix-Point and @anchao please align between you both how to integrate it in a way that the best interest is the project quality, not individual ego.

@wangchdo
Copy link
Contributor

wangchdo commented Dec 18, 2025

@Fix-Point @anchao @wangchdo There’s no need for further argue. Please provide your test cases and performance data. Let’s compare the robustness and performance of each solution directly. Whichever performs better should be accepted.

Do you think this is a game? Are robustness and performance really the only criteria? Or do you seriously believe my implementation cannot be optimized further?

Talk is cheap, show me the code/data.

I aready submitted the test code in apps ostest two days ago, don't let me tell you how to check PR, the test checks the precison exactly as 1 nsec, not like your test code allowing margin.

@Fix-Point
Copy link
Contributor Author

Fix-Point commented Dec 18, 2025

Guys, instead of compete to see who implemented first or which implementation has best performance, it is better to work together to get best part of each implementation and contribute a better solution to NuttX.

This features is very important to everyone using NuttX (not only LiAuto or Xiaomi).

@Fix-Point and @anchao please align between you both how to integrate it in a way that the best interest is the project quality, not individual ego.

I agree.

The main problem is that NuttX should consider all application scenarios, including IoT devices, automotive, multi-core embedded devices, and etc. But, @wangchdo only considered his tc397 board in his design, adding so many ugly workarounds to NuttX kernel just to fix the Tricore timer issues. I believe such code and implementation should be removed later.

I believe the community should encourage better implementations, rather than more meaningless, performance-inefficient, and product-specific implementations.

For the HRtimer, I believe my implementation is better in performance, composability and general. So I insist this hrtimer is a better implementation, not just "a optimization for their hrtimer". I suggest they added optimization on this implementation, that's my opinion.

By the way, It is so funny that every time @wangchdo claims all I have done looks like his work, although these are completely different things.

@xiaoxiang781216 @Fix-Point This looks similar to the hrtimer driver I planned to add in my PR #17065, which was intended to provide access to the hardware timer count.

However, @xiaoxiang781216 mentioned in that PR that adding a new timer driver is not allowed in NuttX, so I removed the implementation, and reimplemented the sched/hrtimer based on the exsiting timer driver.

Your PR to introcude ClockDevice makes me feel confused about the criteria for what kind of functionality is acceptable to add to NuttX...

@xiaoxiang781216
Copy link
Contributor

@Fix-Point could you separate the non hrtimer patch to the new pr?

@Fix-Point
Copy link
Contributor Author

@Fix-Point could you separate the non hrtimer patch to the new pr?

I am splitting these patches into several parts.

@wangchdo
Copy link
Contributor

wangchdo commented Dec 18, 2025

Guys, instead of compete to see who implemented first or which implementation has best performance, it is better to work together to get best part of each implementation and contribute a better solution to NuttX.

This features is very important to everyone using NuttX (not only LiAuto or Xiaomi).

@Fix-Point and @anchao please align between you both how to integrate it in a way that the best interest is the project quality, not individual ego.

I agree.

The main problem is that NuttX should consider all application scenarios, including IoT devices, automotive, multi-core embedded devices, and etc. But, @wangchdo only considered his tc397 board in his design, adding so many ugly workarounds to NuttX kernel just to fix the Tricore timer issues. I believe such code and implementation should be removed later.

I believe the community should encourage better implementations, rather than more meaningless, performance-inefficient, and product-specific implementations.

For the HRtimer, I believe my implementation is better in performance, composability and general. So I insist this hrtimer is a better implementation, not just "a optimization for their hrtimer". I suggest they added optimization on this implementation, that's my opinion.

By the way, It is so funny that every time @wangchdo claims all I have done looks like his work, although these are completely different things.

@xiaoxiang781216 @Fix-Point This looks similar to the hrtimer driver I planned to add in my PR #17065, which was intended to provide access to the hardware timer count.

However, @xiaoxiang781216 mentioned in that PR that adding a new timer driver is not allowed in NuttX, so I removed the implementation, and reimplemented the sched/hrtimer based on the exsiting timer driver.

Your PR to introcude ClockDevice makes me feel confused about the criteria for what kind of functionality is acceptable to add to NuttX...

I don't think it is polite to attack other personally, i am just a personal contributor, not a team working on nuttx, and contribute PRs using my personal time, and they got merged only when are approved by committers or PMC.

If you think they are ugly, inefficient or meaningless,you are welcome to provide comments

I have merged more than 110 commits (timer or tricore arch related are only a small part of them) without receiving any of your comments

@Fix-Point Fix-Point mentioned this pull request Dec 19, 2025
1 task
@Fix-Point
Copy link
Contributor Author

Fix-Point commented Dec 19, 2025

Guys, instead of compete to see who implemented first or which implementation has best performance, it is better to work together to get best part of each implementation and contribute a better solution to NuttX.

This features is very important to everyone using NuttX (not only LiAuto or Xiaomi).

@Fix-Point and @anchao please align between you both how to integrate it in a way that the best interest is the project quality, not individual ego.

I agree.
The main problem is that NuttX should consider all application scenarios, including IoT devices, automotive, multi-core embedded devices, and etc. But, @wangchdo only considered his tc397 board in his design, adding so many ugly workarounds to NuttX kernel just to fix the Tricore timer issues. I believe such code and implementation should be removed later.
I believe the community should encourage better implementations, rather than more meaningless, performance-inefficient, and product-specific implementations.
For the HRtimer, I believe my implementation is better in performance, composability and general. So I insist this hrtimer is a better implementation, not just "a optimization for their hrtimer". I suggest they added optimization on this implementation, that's my opinion.
By the way, It is so funny that every time @wangchdo claims all I have done looks like his work, although these are completely different things.

@xiaoxiang781216 @Fix-Point This looks similar to the hrtimer driver I planned to add in my PR #17065, which was intended to provide access to the hardware timer count.

However, @xiaoxiang781216 mentioned in that PR that adding a new timer driver is not allowed in NuttX, so I removed the implementation, and reimplemented the sched/hrtimer based on the exsiting timer driver.

Your PR to introcude ClockDevice makes me feel confused about the criteria for what kind of functionality is acceptable to add to NuttX...

I don't think it is polite to attack other personally, i am just a personal contributor, not a team working on nuttx, and contribute PRs using my personal time, and they got merged only when are approved by committers or PMC.

If you think they are ugly, inefficient or meaningless,you are welcome to provide comments

I have merged more than 110 commits (timer or tricore arch related are only a small part of them) without receiving any of your comments

I have no intention of arguing with anyone. My only hope is for NuttX to become faster and more reliable—nothing more. I respect and understand all community developers because it is their efforts that drive the progress of NuttX.

My concern is that you have repeatedly accused me, without any evidence, of making my implementation similar to yours and implied that I used your ideas. This is deeply disrespectful to me. Therefore, I would like to clarify a few facts to show that I did not use your ideas at all, hoping to clear up any misunderstandings.

  1. Timeline:
    At the Apache NuttX International Workshop on October 16, 2025, I delivered a presentation titled "CLOCKDEVICE: New Timer Driver Abstraction for NuttX." In the architecture diagram on page 9, I had already included hrtimer in the plan. The PPT was completed and reviewed as early as September.
    By mid-September, the concurrent state machine for HRtimer had already been designed, though the interface details were not yet finalized.
    In October, after referencing the Linux kernel's HRTimer, I drafted a design document, refined the HRTimer design, and discussed it internally with colleagues. From September to November, I was fully occupied with developing, validating, and upstreaming the CLOCKDEVICE driver to the community, leaving me no time to complete the full implementation of hrtimer, which remained in an early prototype stage. Collegues informed me that someone in the community was working on hrtimer, but due to my busy schedule, I never looked into it—this was my mistake.
    On December 1st, I provided a relatively complete implementation of hrtimer, which underwent multiple rounds of internal review and improvements. On December 15th, during a WeChat conversation, I shared my design with you and pointed out issues in your design. To my confusion, you reacted with anger, accusing me of copying your ideas. Prior to this, I had never communicated with you or seen your implementation. Accusing someone of copying ideas is a serious allegation, and it deeply hurt my self-respect. It is you who suggested that I submit my code to the community for evaluation. I agreed with this suggestion and submitted the PR on December 18th.

  2. I only wish to engage in technical discussions. Your hrtimer implementation has the following issues, which cannot be fixed with minor modifications:

  • Completely unusable for users: This is a fundamental flaw in the state machine design and cannot be resolved with minor changes. [BUG] Unusable HRTimer #17567
  • Untested for SMP: Its reliability on SMP platforms is questionable.
  • Lacks detailed design documentation and clear state transition diagrams: ISO 26262 functional safety standards require a structured diagram approach, and state machine designs must be provided for modules.
  • No performance test data: You have not provided any data demonstrating the impact on system execution time.

I believe it is entirely reasonable to replace your implementation with a better one (or refactor it, if that term makes you feel more comfortable).

I do not understand why @anchao has been siding with you while ignoring the facts, which has left me disappointed with the community.

@wangchdo
Copy link
Contributor

wangchdo commented Dec 19, 2025

Guys, instead of compete to see who implemented first or which implementation has best performance, it is better to work together to get best part of each implementation and contribute a better solution to NuttX.

This features is very important to everyone using NuttX (not only LiAuto or Xiaomi).

@Fix-Point and @anchao please align between you both how to integrate it in a way that the best interest is the project quality, not individual ego.

I agree.
The main problem is that NuttX should consider all application scenarios, including IoT devices, automotive, multi-core embedded devices, and etc. But, @wangchdo only considered his tc397 board in his design, adding so many ugly workarounds to NuttX kernel just to fix the Tricore timer issues. I believe such code and implementation should be removed later.
I believe the community should encourage better implementations, rather than more meaningless, performance-inefficient, and product-specific implementations.
For the HRtimer, I believe my implementation is better in performance, composability and general. So I insist this hrtimer is a better implementation, not just "a optimization for their hrtimer". I suggest they added optimization on this implementation, that's my opinion.
By the way, It is so funny that every time @wangchdo claims all I have done looks like his work, although these are completely different things.

@xiaoxiang781216 @Fix-Point This looks similar to the hrtimer driver I planned to add in my PR #17065, which was intended to provide access to the hardware timer count.

However, @xiaoxiang781216 mentioned in that PR that adding a new timer driver is not allowed in NuttX, so I removed the implementation, and reimplemented the sched/hrtimer based on the exsiting timer driver.

Your PR to introcude ClockDevice makes me feel confused about the criteria for what kind of functionality is acceptable to add to NuttX...

I don't think it is polite to attack other personally, i am just a personal contributor, not a team working on nuttx, and contribute PRs using my personal time, and they got merged only when are approved by committers or PMC.
If you think they are ugly, inefficient or meaningless,you are welcome to provide comments
I have merged more than 110 commits (timer or tricore arch related are only a small part of them) without receiving any of your comments

I have no intention of arguing with anyone. My only hope is for NuttX to become faster and more reliable—nothing more. I respect and understand all community developers because it is their efforts that drive the progress of NuttX.

My concern is that you have repeatedly accused me, without any evidence, of making my implementation similar to yours and implied that I used your ideas. This is deeply disrespectful to me. Therefore, I would like to clarify a few facts to show that I did not use your ideas at all, hoping to clear up any misunderstandings.

  1. Timeline:
    At the Apache NuttX International Workshop on October 16, 2025, I delivered a presentation titled "CLOCKDEVICE: New Timer Driver Abstraction for NuttX." In the architecture diagram on page 9, I had already included hrtimer in the plan. The PPT was completed and reviewed as early as September.
    By mid-September, the concurrent state machine for HRtimer had already been designed, though the interface details were not yet finalized.
    In October, after referencing the Linux kernel's HRTimer, I drafted a design document, refined the HRTimer design, and discussed it internally with colleagues. From September to November, I was fully occupied with developing, validating, and upstreaming the CLOCKDEVICE driver to the community, leaving me no time to complete the full implementation of hrtimer, which remained in an early prototype stage. Collegues informed me that someone in the community was working on hrtimer, but due to my busy schedule, I never looked into it—this was my mistake.
    On December 1st, I provided a relatively complete implementation of hrtimer, which underwent multiple rounds of internal review and improvements. On December 15th, during a WeChat conversation, I shared my design with you and pointed out issues in your design. To my confusion, you reacted with anger, accusing me of copying your ideas. Prior to this, I had never communicated with you or seen your implementation. Accusing someone of copying ideas is a serious allegation, and it deeply hurt my self-respect. It is you who suggested that I submit my code to the community for evaluation. I agreed with this suggestion and submitted the PR on December 18th.
  2. I only wish to engage in technical discussions. Your hrtimer implementation has the following issues, which cannot be fixed with minor modifications:
  • Completely unusable for users: This is a fundamental flaw in the state machine design and cannot be resolved with minor changes. [BUG] Unusable HRTimer #17567
  • Untested for SMP: Its reliability on SMP platforms is questionable.
  • Lacks detailed design documentation and clear state transition diagrams: ISO 26262 functional safety standards require a structured diagram approach, and state machine designs must be provided for modules.
  • No performance test data: You have not provided any data demonstrating the impact on system execution time.

I believe it is entirely reasonable to replace your implementation with a better one (or refactor it, if that term makes you feel more comfortable).

I do not understand why @anchao has been siding with you while ignoring the facts, which has left me disappointed with the community.

You should use hrtimer_cancel_sync api if you want to make sure hrtimer_cancel is successful, please check the documentation:

.. c:function:: int hrtimer_cancel_sync(FAR hrtimer_t *hrtimer)

  Cancel a high-resolution timer and wait synchronously until the timer
  becomes inactive.

  This function first calls hrtimer_cancel() to request cancellation of
  the timer.  If the timer callback is currently executing, this function
  will wait until the callback has completed and the timer state has
  transitioned to HRTIMER_STATE_INACTIVE.

  This function may sleep and must not be called from interrupt context.

  :param hrtimer: Timer instance to cancel

  :return: ``OK`` on success; negated errno on failure.

  **POSIX Compatibility:** This is a NON-POSIX interface.

@Fix-Point
Copy link
Contributor Author

Guys, instead of compete to see who implemented first or which implementation has best performance, it is better to work together to get best part of each implementation and contribute a better solution to NuttX.

This features is very important to everyone using NuttX (not only LiAuto or Xiaomi).

@Fix-Point and @anchao please align between you both how to integrate it in a way that the best interest is the project quality, not individual ego.

I agree.
The main problem is that NuttX should consider all application scenarios, including IoT devices, automotive, multi-core embedded devices, and etc. But, @wangchdo only considered his tc397 board in his design, adding so many ugly workarounds to NuttX kernel just to fix the Tricore timer issues. I believe such code and implementation should be removed later.
I believe the community should encourage better implementations, rather than more meaningless, performance-inefficient, and product-specific implementations.
For the HRtimer, I believe my implementation is better in performance, composability and general. So I insist this hrtimer is a better implementation, not just "a optimization for their hrtimer". I suggest they added optimization on this implementation, that's my opinion.
By the way, It is so funny that every time @wangchdo claims all I have done looks like his work, although these are completely different things.

@xiaoxiang781216 @Fix-Point This looks similar to the hrtimer driver I planned to add in my PR #17065, which was intended to provide access to the hardware timer count.

However, @xiaoxiang781216 mentioned in that PR that adding a new timer driver is not allowed in NuttX, so I removed the implementation, and reimplemented the sched/hrtimer based on the exsiting timer driver.

Your PR to introcude ClockDevice makes me feel confused about the criteria for what kind of functionality is acceptable to add to NuttX...

I don't think it is polite to attack other personally, i am just a personal contributor, not a team working on nuttx, and contribute PRs using my personal time, and they got merged only when are approved by committers or PMC.
If you think they are ugly, inefficient or meaningless,you are welcome to provide comments
I have merged more than 110 commits (timer or tricore arch related are only a small part of them) without receiving any of your comments

I have no intention of arguing with anyone. My only hope is for NuttX to become faster and more reliable—nothing more. I respect and understand all community developers because it is their efforts that drive the progress of NuttX.
My concern is that you have repeatedly accused me, without any evidence, of making my implementation similar to yours and implied that I used your ideas. This is deeply disrespectful to me. Therefore, I would like to clarify a few facts to show that I did not use your ideas at all, hoping to clear up any misunderstandings.

  1. Timeline:
    At the Apache NuttX International Workshop on October 16, 2025, I delivered a presentation titled "CLOCKDEVICE: New Timer Driver Abstraction for NuttX." In the architecture diagram on page 9, I had already included hrtimer in the plan. The PPT was completed and reviewed as early as September.
    By mid-September, the concurrent state machine for HRtimer had already been designed, though the interface details were not yet finalized.
    In October, after referencing the Linux kernel's HRTimer, I drafted a design document, refined the HRTimer design, and discussed it internally with colleagues. From September to November, I was fully occupied with developing, validating, and upstreaming the CLOCKDEVICE driver to the community, leaving me no time to complete the full implementation of hrtimer, which remained in an early prototype stage. Collegues informed me that someone in the community was working on hrtimer, but due to my busy schedule, I never looked into it—this was my mistake.
    On December 1st, I provided a relatively complete implementation of hrtimer, which underwent multiple rounds of internal review and improvements. On December 15th, during a WeChat conversation, I shared my design with you and pointed out issues in your design. To my confusion, you reacted with anger, accusing me of copying your ideas. Prior to this, I had never communicated with you or seen your implementation. Accusing someone of copying ideas is a serious allegation, and it deeply hurt my self-respect. It is you who suggested that I submit my code to the community for evaluation. I agreed with this suggestion and submitted the PR on December 18th.
  2. I only wish to engage in technical discussions. Your hrtimer implementation has the following issues, which cannot be fixed with minor modifications:
  • Completely unusable for users: This is a fundamental flaw in the state machine design and cannot be resolved with minor changes. [BUG] Unusable HRTimer #17567
  • Untested for SMP: Its reliability on SMP platforms is questionable.
  • Lacks detailed design documentation and clear state transition diagrams: ISO 26262 functional safety standards require a structured diagram approach, and state machine designs must be provided for modules.
  • No performance test data: You have not provided any data demonstrating the impact on system execution time.

I believe it is entirely reasonable to replace your implementation with a better one (or refactor it, if that term makes you feel more comfortable).
I do not understand why @anchao has been siding with you while ignoring the facts, which has left me disappointed with the community.

You should use hrtimer_cancel_sync api if you want to make sure hrtimer_cancel is successful, please check the documentation:

.. c:function:: int hrtimer_cancel_sync(FAR hrtimer_t *hrtimer)

  Cancel a high-resolution timer and wait synchronously until the timer
  becomes inactive.

  This function first calls hrtimer_cancel() to request cancellation of
  the timer.  If the timer callback is currently executing, this function
  will wait until the callback has completed and the timer state has
  transitioned to HRTIMER_STATE_INACTIVE.

  This function may sleep and must not be called from interrupt context.

  :param hrtimer: Timer instance to cancel

  :return: ``OK`` on success; negated errno on failure.

  **POSIX Compatibility:** This is a NON-POSIX interface.

Please see this #17567 (comment). The question is we can not sleep or spin in the critical section. We must allow the timer restarting without waiting it finished.

@wangchdo
Copy link
Contributor

Guys, instead of compete to see who implemented first or which implementation has best performance, it is better to work together to get best part of each implementation and contribute a better solution to NuttX.

This features is very important to everyone using NuttX (not only LiAuto or Xiaomi).

@Fix-Point and @anchao please align between you both how to integrate it in a way that the best interest is the project quality, not individual ego.

I agree.
The main problem is that NuttX should consider all application scenarios, including IoT devices, automotive, multi-core embedded devices, and etc. But, @wangchdo only considered his tc397 board in his design, adding so many ugly workarounds to NuttX kernel just to fix the Tricore timer issues. I believe such code and implementation should be removed later.
I believe the community should encourage better implementations, rather than more meaningless, performance-inefficient, and product-specific implementations.
For the HRtimer, I believe my implementation is better in performance, composability and general. So I insist this hrtimer is a better implementation, not just "a optimization for their hrtimer". I suggest they added optimization on this implementation, that's my opinion.
By the way, It is so funny that every time @wangchdo claims all I have done looks like his work, although these are completely different things.

@xiaoxiang781216 @Fix-Point This looks similar to the hrtimer driver I planned to add in my PR #17065, which was intended to provide access to the hardware timer count.

However, @xiaoxiang781216 mentioned in that PR that adding a new timer driver is not allowed in NuttX, so I removed the implementation, and reimplemented the sched/hrtimer based on the exsiting timer driver.

Your PR to introcude ClockDevice makes me feel confused about the criteria for what kind of functionality is acceptable to add to NuttX...

I don't think it is polite to attack other personally, i am just a personal contributor, not a team working on nuttx, and contribute PRs using my personal time, and they got merged only when are approved by committers or PMC.
If you think they are ugly, inefficient or meaningless,you are welcome to provide comments
I have merged more than 110 commits (timer or tricore arch related are only a small part of them) without receiving any of your comments

I have no intention of arguing with anyone. My only hope is for NuttX to become faster and more reliable—nothing more. I respect and understand all community developers because it is their efforts that drive the progress of NuttX.
My concern is that you have repeatedly accused me, without any evidence, of making my implementation similar to yours and implied that I used your ideas. This is deeply disrespectful to me. Therefore, I would like to clarify a few facts to show that I did not use your ideas at all, hoping to clear up any misunderstandings.

  1. Timeline:
    At the Apache NuttX International Workshop on October 16, 2025, I delivered a presentation titled "CLOCKDEVICE: New Timer Driver Abstraction for NuttX." In the architecture diagram on page 9, I had already included hrtimer in the plan. The PPT was completed and reviewed as early as September.
    By mid-September, the concurrent state machine for HRtimer had already been designed, though the interface details were not yet finalized.
    In October, after referencing the Linux kernel's HRTimer, I drafted a design document, refined the HRTimer design, and discussed it internally with colleagues. From September to November, I was fully occupied with developing, validating, and upstreaming the CLOCKDEVICE driver to the community, leaving me no time to complete the full implementation of hrtimer, which remained in an early prototype stage. Collegues informed me that someone in the community was working on hrtimer, but due to my busy schedule, I never looked into it—this was my mistake.
    On December 1st, I provided a relatively complete implementation of hrtimer, which underwent multiple rounds of internal review and improvements. On December 15th, during a WeChat conversation, I shared my design with you and pointed out issues in your design. To my confusion, you reacted with anger, accusing me of copying your ideas. Prior to this, I had never communicated with you or seen your implementation. Accusing someone of copying ideas is a serious allegation, and it deeply hurt my self-respect. It is you who suggested that I submit my code to the community for evaluation. I agreed with this suggestion and submitted the PR on December 18th.
  2. I only wish to engage in technical discussions. Your hrtimer implementation has the following issues, which cannot be fixed with minor modifications:
  • Completely unusable for users: This is a fundamental flaw in the state machine design and cannot be resolved with minor changes. [BUG] Unusable HRTimer #17567
  • Untested for SMP: Its reliability on SMP platforms is questionable.
  • Lacks detailed design documentation and clear state transition diagrams: ISO 26262 functional safety standards require a structured diagram approach, and state machine designs must be provided for modules.
  • No performance test data: You have not provided any data demonstrating the impact on system execution time.

I believe it is entirely reasonable to replace your implementation with a better one (or refactor it, if that term makes you feel more comfortable).
I do not understand why @anchao has been siding with you while ignoring the facts, which has left me disappointed with the community.

You should use hrtimer_cancel_sync api if you want to make sure hrtimer_cancel is successful, please check the documentation:

.. c:function:: int hrtimer_cancel_sync(FAR hrtimer_t *hrtimer)

  Cancel a high-resolution timer and wait synchronously until the timer
  becomes inactive.

  This function first calls hrtimer_cancel() to request cancellation of
  the timer.  If the timer callback is currently executing, this function
  will wait until the callback has completed and the timer state has
  transitioned to HRTIMER_STATE_INACTIVE.

  This function may sleep and must not be called from interrupt context.

  :param hrtimer: Timer instance to cancel

  :return: ``OK`` on success; negated errno on failure.

  **POSIX Compatibility:** This is a NON-POSIX interface.

Please see this #17567 (comment). The question is we can not sleep or spin in the critical section. We must allow the timer restarting without waiting it finished.

Can you check this bug-fix PR: #17570, looks like a simple change can fix the issue

@Fix-Point
Copy link
Contributor Author

Fix-Point commented Dec 19, 2025

Guys, instead of compete to see who implemented first or which implementation has best performance, it is better to work together to get best part of each implementation and contribute a better solution to NuttX.

This features is very important to everyone using NuttX (not only LiAuto or Xiaomi).

@Fix-Point and @anchao please align between you both how to integrate it in a way that the best interest is the project quality, not individual ego.

I agree.
The main problem is that NuttX should consider all application scenarios, including IoT devices, automotive, multi-core embedded devices, and etc. But, @wangchdo only considered his tc397 board in his design, adding so many ugly workarounds to NuttX kernel just to fix the Tricore timer issues. I believe such code and implementation should be removed later.
I believe the community should encourage better implementations, rather than more meaningless, performance-inefficient, and product-specific implementations.
For the HRtimer, I believe my implementation is better in performance, composability and general. So I insist this hrtimer is a better implementation, not just "a optimization for their hrtimer". I suggest they added optimization on this implementation, that's my opinion.
By the way, It is so funny that every time @wangchdo claims all I have done looks like his work, although these are completely different things.

@xiaoxiang781216 @Fix-Point This looks similar to the hrtimer driver I planned to add in my PR #17065, which was intended to provide access to the hardware timer count.

However, @xiaoxiang781216 mentioned in that PR that adding a new timer driver is not allowed in NuttX, so I removed the implementation, and reimplemented the sched/hrtimer based on the exsiting timer driver.

Your PR to introcude ClockDevice makes me feel confused about the criteria for what kind of functionality is acceptable to add to NuttX...

I don't think it is polite to attack other personally, i am just a personal contributor, not a team working on nuttx, and contribute PRs using my personal time, and they got merged only when are approved by committers or PMC.
If you think they are ugly, inefficient or meaningless,you are welcome to provide comments
I have merged more than 110 commits (timer or tricore arch related are only a small part of them) without receiving any of your comments

I have no intention of arguing with anyone. My only hope is for NuttX to become faster and more reliable—nothing more. I respect and understand all community developers because it is their efforts that drive the progress of NuttX.
My concern is that you have repeatedly accused me, without any evidence, of making my implementation similar to yours and implied that I used your ideas. This is deeply disrespectful to me. Therefore, I would like to clarify a few facts to show that I did not use your ideas at all, hoping to clear up any misunderstandings.

  1. Timeline:
    At the Apache NuttX International Workshop on October 16, 2025, I delivered a presentation titled "CLOCKDEVICE: New Timer Driver Abstraction for NuttX." In the architecture diagram on page 9, I had already included hrtimer in the plan. The PPT was completed and reviewed as early as September.
    By mid-September, the concurrent state machine for HRtimer had already been designed, though the interface details were not yet finalized.
    In October, after referencing the Linux kernel's HRTimer, I drafted a design document, refined the HRTimer design, and discussed it internally with colleagues. From September to November, I was fully occupied with developing, validating, and upstreaming the CLOCKDEVICE driver to the community, leaving me no time to complete the full implementation of hrtimer, which remained in an early prototype stage. Collegues informed me that someone in the community was working on hrtimer, but due to my busy schedule, I never looked into it—this was my mistake.
    On December 1st, I provided a relatively complete implementation of hrtimer, which underwent multiple rounds of internal review and improvements. On December 15th, during a WeChat conversation, I shared my design with you and pointed out issues in your design. To my confusion, you reacted with anger, accusing me of copying your ideas. Prior to this, I had never communicated with you or seen your implementation. Accusing someone of copying ideas is a serious allegation, and it deeply hurt my self-respect. It is you who suggested that I submit my code to the community for evaluation. I agreed with this suggestion and submitted the PR on December 18th.
  2. I only wish to engage in technical discussions. Your hrtimer implementation has the following issues, which cannot be fixed with minor modifications:
  • Completely unusable for users: This is a fundamental flaw in the state machine design and cannot be resolved with minor changes. [BUG] Unusable HRTimer #17567
  • Untested for SMP: Its reliability on SMP platforms is questionable.
  • Lacks detailed design documentation and clear state transition diagrams: ISO 26262 functional safety standards require a structured diagram approach, and state machine designs must be provided for modules.
  • No performance test data: You have not provided any data demonstrating the impact on system execution time.

I believe it is entirely reasonable to replace your implementation with a better one (or refactor it, if that term makes you feel more comfortable).
I do not understand why @anchao has been siding with you while ignoring the facts, which has left me disappointed with the community.

You should use hrtimer_cancel_sync api if you want to make sure hrtimer_cancel is successful, please check the documentation:

.. c:function:: int hrtimer_cancel_sync(FAR hrtimer_t *hrtimer)

  Cancel a high-resolution timer and wait synchronously until the timer
  becomes inactive.

  This function first calls hrtimer_cancel() to request cancellation of
  the timer.  If the timer callback is currently executing, this function
  will wait until the callback has completed and the timer state has
  transitioned to HRTIMER_STATE_INACTIVE.

  This function may sleep and must not be called from interrupt context.

  :param hrtimer: Timer instance to cancel

  :return: ``OK`` on success; negated errno on failure.

  **POSIX Compatibility:** This is a NON-POSIX interface.

Please see this #17567 (comment). The question is we can not sleep or spin in the critical section. We must allow the timer restarting without waiting it finished.

Can you check this bug-fix PR: #17570, looks like a simple change can fix the issue

Please pay more respect to parallel programming. It does not look like a simple change can fix this at all. See #17570 (comment).

I think you should seriously consider refactoring hrtimer you merged based on this implementation instead of wasting both of our time on a flawed design.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Arch: arm Issues related to ARM (32-bit) architecture Arch: arm64 Issues related to ARM64 (64-bit) architecture Arch: ceva Issues related to CEVA architecture Arch: risc-v Issues related to the RISC-V (32-bit or 64-bit) architecture Arch: tricore Issues related to the TriCore architecture from Infineon Arch: x86_64 Issues related to the x86_64 architecture Area: Documentation Improvements or additions to documentation Area: Drivers Drivers issues Area: OS Components OS Components issues Size: XL The size of the change in this PR is very large. Consider breaking down the PR into smaller pieces.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants