[SYCLomatic-test] Block store test #725
[SYCLomatic-test] Block store test #725abhilash1910 wants to merge 15 commits intooneapi-src:SYCLomaticfrom
Conversation
Co-authored-by: Dan Hoeflinger <109972525+danhoeflinger@users.noreply.github.com>
| thread_data[i] = dacc_read[global_index * 4 + i]; | ||
| } | ||
|
|
||
| auto *d_r = |
There was a problem hiding this comment.
Could you clarify why we are storing to buffer here and then storing the thread_data to buffer_out? To me, it looks like we are not verifying the group_store operation since buffer_out is being passed to the verification routine which is just the value of thread_data .
| item.get_local_id(2); // Each thread_data has 4 elements | ||
| #pragma unroll | ||
| for (int i = 0; i < 4; ++i) { | ||
| dacc_write[global_index * 4 + i] = thread_data[i]; |
There was a problem hiding this comment.
Similar question to above about the purpose of this and how we are verifying the store routines at all.
There was a problem hiding this comment.
Since the recent changes, we are now doing group_store(tmp).store(item, d_w, thread_data); where d_w is referring to buffer_out. This part seems correct, but my comment about these lines still apply.
It seems like the contents of buffer_out after the group_store operation are being overwritten by the writes in this loop. Can we just remove this entire loop? I am not sure why it is here unless I misunderstand.
There was a problem hiding this comment.
Yes modified to prevent the overwrite, could you please take a look . Thanks
There was a problem hiding this comment.
Thanks, this looks to resolve the issue
| int expected[512]; | ||
| int num_threads = 128; | ||
| int items_per_thread = 4; | ||
| uint32_t sg_sz_val = *sg_sz; | ||
| for (int i = 0; i < num_threads; ++i) { | ||
| for (int j = 0; j < items_per_thread; ++j) { | ||
| expected[items_per_thread * i + j] = | ||
| (i / sg_sz_val) * sg_sz_val * items_per_thread + sg_sz_val * j + | ||
| i % sg_sz_val; | ||
| } | ||
| } |
There was a problem hiding this comment.
It looks like we are setting up expected and then never using it. Am I missing something?
| } | ||
|
|
||
| for (int i = 0; i < 512; ++i) { | ||
| if (ptr[i] != i) { |
There was a problem hiding this comment.
It looks like the actual check is just against the index i.
That also seems to be the source data for the input buffer, does this pass?
There was a problem hiding this comment.
I believe the test is passing since we are overwriting the results of the subgroup store operation in test_store_subgroup_striped_standalone
| sg_sz_acc[0] = item.get_sub_group().get_local_linear_range(); | ||
| } | ||
| dpct::group::store_subgroup_striped<4, int>(item, d_w, thread_data); | ||
| // reapply global mapping |
There was a problem hiding this comment.
We are overwriting the result of the sub-group store here from lines 173 to 179. If we remove this the test fails.
There was a problem hiding this comment.
Interesting... seems like this should not be here.
Co-authored-by: Dan Hoeflinger <109972525+danhoeflinger@users.noreply.github.com>
Co-authored-by: Dan Hoeflinger <109972525+danhoeflinger@users.noreply.github.com>
Co-authored-by: Dan Hoeflinger <109972525+danhoeflinger@users.noreply.github.com>
In line with block store PR (oneapi-src/SYCLomatic#1819), this tests only store api.
WIP. Linked to composite PR : #680
cc @danhoeflinger @mmichel11 @yihanwg