-
Notifications
You must be signed in to change notification settings - Fork 1k
Implementing NSE in cube #7543
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Implementing NSE in cube #7543
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #7543 +/- ##
==========================================
- Coverage 98.97% 98.96% -0.01%
==========================================
Files 87 87
Lines 16733 16744 +11
==========================================
+ Hits 16561 16571 +10
- Misses 172 173 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
| usesSD = any(all.vars(jj) == ".SD") | ||
| if (usesSD) { | ||
| if (missing(.SDcols)) { | ||
| .SDcols = names(x)[vapply(x, is.numeric, logical(1L))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do you know SDcols needs to be numeric, for example there are aggregate functions that works on character.
|
I think the proper way to address this is to isolate SDcols logic from [ into helper and reuse this helper here. |
|
That’s a fair point. The numeric-only approach was a defensive measure against a specific aggregation failure I encountered. I agree that reimplementing this logic in cube() isn’t ideal and I’ll work towards isolating and reusing the existing .SDcols handling from [.data.table instead |
|
Could you please share the test failures for tests 1378.2, 1378.3 together with your sessionInfo() output? This could warrant opening a separate issue.
Edit: thanks, see #7550.
|
|
Sure. Errors: |
|
Run |
|
Sure, here's the output: |
|
Sorry, I should have been more complete, |
|
Oh my bad, I'll share that too. I'm new to contributing so I'm still learning the ropes. I appreciate the help! :) |
Pertaining to issue #7354
I noticed that cube() can lead to errors when .SDcols is supplied as an evaluative or callable expression (for example - helpers like patterns() or measure()), since it may be forwarded unevaluated and used before being resolved.
To address this, I updated cube() to handle .SDcols using non-standard evaluation in the following way:
the argument is captured with substitute(), if it is callable then it is resolved via eval_with_cols() prior to being passed to groupingsets().
While testing this change, I encountered cases where applying numeric aggregations through j resulted in errors when non-numeric columns were present in .SD. To address this, j is captured as an expression to check if it references .SD, if it does then .SD is restricted to numeric values thereby avoiding unintended application of numeric aggregations to non-numeric data.
During development, I’m seeing two existing test failures (1378.2, 1378.3). I haven’t added new unit tests yet. I’m happy to add targeted tests if there are specific cases that need to be covered, and would appreciate guidance on whether additional test coverage is expected here. I’d also welcome any help on how best to approach the currently failing tests.