Description
Summary
Following the discussion in #296, we recognise that while allow_extra_dates = TRUE provides flexibility for target data validation by excluding the date column from strict value checking, some hubs may want optional validation of date values. This issue proposes adding two optional checks that can be deployed via validations.yml for target data validation.
Proposed Optional Checks
1. opt_check_target_date_range
Validates that date values in the target data fall within an acceptable range.
Use case: Ensure dates are within a reasonable historical/future window (e.g., not before a hub's start date, not unreasonably far in the future).
Proposed parameters:
- target_tbl: tibble of target data being validated (from caller env)
- file_path: path to file being validated (from caller env)
- hub_path: path to hub (from caller env)
- date_col: name of the date column (from caller env or config)
- min_date: minimum allowed date (optional; an absolute date such as "2020-01-01" or a relative expression)
- max_date: maximum allowed date (optional; an absolute date or a relative expression)
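The core comparison behind such a check could look like the sketch below. It is base R only, and `find_dates_out_of_range()` is a hypothetical helper name, not an existing hubValidations function. An actual `opt_check_target_date_range` would presumably wrap the result in the package's check conditions rather than return dates directly.

```r
# Hypothetical helper (not part of hubValidations): returns the unique date
# values that fall outside [min_date, max_date]. Either bound may be NULL.
find_dates_out_of_range <- function(target_tbl, date_col,
                                    min_date = NULL, max_date = NULL) {
  dates <- as.Date(target_tbl[[date_col]])
  out_of_range <- rep(FALSE, length(dates))
  if (!is.null(min_date)) {
    out_of_range <- out_of_range | dates < as.Date(min_date)
  }
  if (!is.null(max_date)) {
    out_of_range <- out_of_range | dates > as.Date(max_date)
  }
  # Missing dates cannot be compared; this sketch does not flag them.
  out_of_range[is.na(out_of_range)] <- FALSE
  unique(dates[out_of_range])
}

# Illustrative data; the column name is an assumption.
target_tbl <- data.frame(
  target_end_date = as.Date(c("2019-12-28", "2020-01-04", "2021-06-12")),
  observation = c(10, 12, 9)
)
invalid <- find_dates_out_of_range(
  target_tbl, "target_end_date",
  min_date = "2020-01-01", max_date = as.Date("2021-01-01")
)
print(invalid)
```

Accepting either a string or a `Date` for the bounds means the same function works with a literal `min_date` and with a `max_date` that was computed from an expression before the call.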
Example validations.yml deployment:
default:
  validate_target_data:
    date_range:
      fn: "opt_check_target_date_range"
      pkg: "hubValidations"
      args:
        min_date: "2020-01-01"
        max_date: !expr Sys.Date() + 365
2. opt_check_target_date_dow
Validates that date values fall on expected day(s) of the week.
Use case: Ensure dates align with expected reporting cadence (e.g., epiweek ending dates should be Saturdays, or observations should be on specific weekdays).
Proposed parameters:
- target_tbl: tibble of target data being validated (from caller env)
- file_path: path to file being validated (from caller env)
- hub_path: path to hub (from caller env)
- date_col: name of the date column (from caller env or config)
- expected_dow: integer vector of expected day(s) of week (1 = Sunday, 7 = Saturday, following the lubridate convention; character values such as "Saturday" could also be supported)
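A minimal sketch of the day-of-week comparison, using base R only. `find_dates_off_dow()` is a hypothetical helper name. Base R's `as.POSIXlt()$wday` counts from 0 (Sunday) to 6 (Saturday), so adding 1 gives the 1 = Sunday, 7 = Saturday numbering described above.

```r
# Hypothetical helper (not part of hubValidations): returns the unique date
# values whose day of week is not in expected_dow (1 = Sunday, 7 = Saturday).
find_dates_off_dow <- function(target_tbl, date_col, expected_dow) {
  dates <- as.Date(target_tbl[[date_col]])
  # as.POSIXlt()$wday is 0-based with Sunday = 0; shift to 1-based.
  dow <- as.POSIXlt(dates)$wday + 1L
  # Missing dates have no weekday; this sketch does not flag them.
  off <- !is.na(dow) & !(dow %in% expected_dow)
  unique(dates[off])
}

# Illustrative data; the column name is an assumption.
target_tbl <- data.frame(
  target_end_date = as.Date(c("2020-01-04", "2020-01-11", "2020-01-15")),
  observation = c(10, 12, 9)
)
off_dates <- find_dates_off_dow(target_tbl, "target_end_date", expected_dow = 7)
print(off_dates)
```

Because `expected_dow` is matched with `%in%`, a vector such as `c(1, 7)` would accept both Sundays and Saturdays without further changes.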
Example validations.yml deployment:
default:
  validate_target_data:
    date_dow:
      fn: "opt_check_target_date_dow"
      pkg: "hubValidations"
      args:
        expected_dow: 7 # Saturday (epiweek ending day)
Implementation Notes
- Both checks should follow the existing opt_check_* pattern (see opt_check_tbl_col_timediff.R for reference)
- Should return appropriate check_success, check_failure, or check_info conditions
- Should handle edge cases gracefully (e.g., missing date column → check_info skip message)
- Variables available in the validate_target_data() caller environment (per #297, Document target data custom validation and add target data validation vignettes): target_tbl, target_tbl_chr, target_type, config_target_data, date_col, allow_extra_dates, file_path, hub_path
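The three outcomes listed above could be organised as in the following sketch. It is base R only: the returned lists merely stand in for check_success, check_failure, and check_info conditions, and `run_date_check()` and `not_saturday()` are hypothetical names used for illustration.

```r
# Hypothetical wrapper (not part of hubValidations). `find_invalid` is any
# function that takes a Date vector and returns the offending dates.
run_date_check <- function(target_tbl, date_col, find_invalid) {
  # Edge case: no usable date column, so skip with an informational result.
  if (is.null(date_col) || !date_col %in% names(target_tbl)) {
    return(list(
      status = "info",
      message = "No date column found; check skipped."
    ))
  }
  invalid <- find_invalid(as.Date(target_tbl[[date_col]]))
  if (length(invalid) == 0L) {
    list(status = "success", message = "All date values are valid.")
  } else {
    list(
      status = "failure",
      message = paste0(
        "Invalid date values: ",
        paste(format(invalid), collapse = ", "), "."
      )
    )
  }
}

# Illustrative rule: flag any date that is not a Saturday (wday 6 in base R).
not_saturday <- function(dates) dates[as.POSIXlt(dates)$wday != 6]

tbl <- data.frame(date = as.Date(c("2020-01-04", "2020-01-15")))
res_fail <- run_date_check(tbl, "date", not_saturday)
res_skip <- run_date_check(tbl, "missing_col", not_saturday)
res_null <- run_date_check(tbl, NULL, not_saturday)
print(res_fail$message)
```

Checking for the column before touching the data keeps the skip path free of errors, which matters when a hub's target data has no date column at all.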
Documentation
- Update the "deploying custom functions" vignette with target data examples (after #297, Document target data custom validation and add target data validation vignettes, is merged)
- Add entries to inst/check_table.csv
Related
- #296 - Add relaxed date validation for time-series target data (allow_extra_dates)
- #297 - Document target data custom validation and add target data validation vignettes
- schemas#137 - Add allow_extra_dates option to target-data.json schema (config-level support for allow_extra_dates)