fixed terminating stale threads on trap/proc_exit #1929

wenyongh merged 1 commit into bytecodealliance:main from
Conversation
```c
/**
 * create a copy of wait_map with no lock
 * this allows invoking methods in the callback
 * which require acquiring lock on the original map
```
I don't see any lock in the notify_stale_threads callback. Should we rather take the shared_memory_list_lock mutex at the beginning of the function and release it at the end? Otherwise the map could be modified by another thread, corrupting the traversal.
the wasm_runtime_atomic_notify step in the callback calls acquire_wait_info, which tries to acquire shared_memory_list_lock, so I guess wrapping the function with that lock would cause issues?
HashMap doesn't duplicate the key/value pairs, so wait_map_without_lock will point to the same elements as wait_map
I think you at least want to lock while you copy hashmaps. Also, what happens if after making a copy a new waiter is added to the map?
@loganek While copying the original map, the original map is already locked as it is traversed, and every insert into the copy locks the copy map, so I think the copy should be consistent. If you meant wrapping the copy with shared_memory_list_lock -> addressed that.

What could be concerning here is that during the traversal of the copy, the original map could be updated. If I hold a lock on the original map during the traversal, execution gets stuck, because the callback steps in notify_stale_threads try to take that same lock on the original map (this is exactly why I had to create a copy). This might leave some newly added entries in the original map unhandled at that point in time, but they would eventually be handled, right?
either way I'm still not sure it would be safe since you're not cloning the map but copying the pointers iiuc
closing and reopening to reinitiate the build
```c
{
    AtomicWaitAddressArgs *data = (AtomicWaitAddressArgs *)user_data;
    if (data->len > 0) {
        if (!(data->addr = wasm_runtime_realloc(data->addr,
```
When is this memory freed? And why do nothing except LOG_ERROR when malloc/realloc fails?
```c
memset(args, 0, sizeof(*args));
os_mutex_lock(&shared_memory_list_lock);
// create list of addresses
bh_hash_map_traverse(wait_map, create_list_of_waiter_addresses, args);
```
How about traversing two times: the first time to get the total element count of wait_map, use it to allocate memory for AtomicWaitAddressArgs *args with size == offsetof(AtomicWaitAddressArgs, addr) + sizeof(void *) * total_elem_count, and then traverse the second time without allocating memory?

```c
static void
xxx_callback(void *key, void *value, void *p_total_elem_count)
{
    *(uint32 *)p_total_elem_count = *(uint32 *)p_total_elem_count + 1;
}

os_mutex_lock(&shared_memory_list_lock);
total_elem_count = 0;
bh_hash_map_traverse(wait_map, xxx_callback, (void *)&total_elem_count);
/* allocate memory */
bh_hash_map_traverse(wait_map, ..., args); /* set each data->addr[i] */
os_mutex_unlock(...);
```
Just wondering, is there any specific advantage to traversing two times? Is it to ensure the final traversal of data->addr is safe, i.e. we don't go out of bounds?
The purpose of that is to avoid multiple reallocations - knowing the size in advance will let you make only one allocation.
Does this PR work in AOT mode? Also, can you try it with the sample thread_termination.c after replacing [...] with a wait operation? And try with [...], just to make sure that the base cases are covered.
This is to terminate suspended threads in case an atomic wait occurs with a huge or indefinite (-1) timeout, followed by a proc exit or trap.