
Add optional date validation checks for target data #300

@annakrystalli

Summary

Following the discussion in #296: allow_extra_dates = TRUE makes target data validation more flexible by excluding the date column from strict value checking, but some hubs may still want to validate the date values themselves. This issue proposes two optional checks that hubs can deploy via validations.yml for target data validation.

Proposed Optional Checks

1. opt_check_target_date_range

Validates that date values in the target data fall within an acceptable range.

Use case: Ensure dates are within a reasonable historical/future window (e.g., not before a hub's start date, not unreasonably far in the future).

Proposed parameters:

  • target_tbl: tibble of target data being validated (from caller env)
  • file_path: path to file being validated (from caller env)
  • hub_path: path to hub (from caller env)
  • date_col: name of the date column (from caller env or config)
  • min_date: minimum allowed date (optional, could be an absolute date like "2020-01-01" or a relative expression)
  • max_date: maximum allowed date (optional, could be absolute date or relative expression)

Example validations.yml deployment:

default:
  validate_target_data:
    date_range:
      fn: "opt_check_target_date_range"
      pkg: "hubValidations"
      args:
        min_date: "2020-01-01"
        max_date: !expr Sys.Date() + 365
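
The core comparison behind this check could look something like the base R sketch below. It is only an illustration of the range logic: find_out_of_range_dates() is a hypothetical helper name, not an existing hubValidations function, and a real check would wrap the result in the appropriate check condition rather than return dates directly.

```r
# Hypothetical helper (not part of hubValidations): returns the unique dates
# in `dates` that fall outside [min_date, max_date]. Either bound may be NULL,
# in which case that side of the range is left unchecked.
find_out_of_range_dates <- function(dates, min_date = NULL, max_date = NULL) {
  dates <- as.Date(dates)
  out_of_range <- rep(FALSE, length(dates))
  if (!is.null(min_date)) {
    out_of_range <- out_of_range | dates < as.Date(min_date)
  }
  if (!is.null(max_date)) {
    out_of_range <- out_of_range | dates > as.Date(max_date)
  }
  # which() drops NA comparisons, so missing dates are not flagged here
  unique(dates[which(out_of_range)])
}

target_dates <- as.Date(c("2019-12-28", "2020-01-04", "2023-06-10"))
find_out_of_range_dates(
  target_dates,
  min_date = "2020-01-01",
  max_date = as.Date("2024-01-01")
)
```

With these example dates, only 2019-12-28 falls before min_date and is returned. An empty result would correspond to a passing check.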

2. opt_check_target_date_dow

Validates that date values fall on expected day(s) of the week.

Use case: Ensure dates align with expected reporting cadence (e.g., epiweek ending dates should be Saturdays, or observations should be on specific weekdays).

Proposed parameters:

  • target_tbl: tibble of target data being validated (from caller env)
  • file_path: path to file being validated (from caller env)
  • hub_path: path to hub (from caller env)
  • date_col: name of the date column (from caller env or config)
  • expected_dow: integer vector of expected day(s) of the week (1 = Sunday through 7 = Saturday, following the lubridate::wday() default; could also accept day names like "Saturday")

Example validations.yml deployment:

default:
  validate_target_data:
    date_dow:
      fn: "opt_check_target_date_dow"
      pkg: "hubValidations"
      args:
        expected_dow: 7  # Saturday (epiweek ending day)
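
The day-of-week comparison could be sketched in base R as follows. This is an illustration under assumptions: find_unexpected_dow_dates() is a hypothetical helper name, and the sketch uses as.POSIXlt() rather than lubridate so that it runs without extra packages, shifting the result to the 1 = Sunday through 7 = Saturday numbering proposed above.

```r
# Hypothetical helper (not part of hubValidations): returns the unique dates
# whose day of the week is not in `expected_dow` (1 = Sunday, 7 = Saturday).
find_unexpected_dow_dates <- function(dates, expected_dow) {
  dates <- as.Date(dates)
  # as.POSIXlt()$wday runs from 0 (Sunday) to 6 (Saturday); shift to 1-7
  dow <- as.POSIXlt(dates)$wday + 1L
  # Missing dates are not flagged here; they would need their own handling
  unexpected <- !is.na(dow) & !(dow %in% expected_dow)
  unique(dates[unexpected])
}

# 2024-01-06 and 2024-01-13 are Saturdays; 2024-01-07 is a Sunday
target_dates <- as.Date(c("2024-01-06", "2024-01-07", "2024-01-13"))
find_unexpected_dow_dates(target_dates, expected_dow = 7)
```

Here only 2024-01-07 is returned, since it is the one date that is not a Saturday.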

Implementation Notes

  • Both checks should follow the existing opt_check_* pattern (see opt_check_tbl_col_timediff.R for reference)
  • Should return appropriate check_success, check_failure, or check_info conditions
  • Should handle edge cases gracefully (e.g., missing date column → check_info skip message)
  • Variables available in the validate_target_data() caller environment (per #297, "Document target data custom validation and add target data validation vignettes"):
    • target_tbl, target_tbl_chr, target_type, config_target_data, date_col, allow_extra_dates, file_path, hub_path
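
The missing-date-column edge case mentioned above could be handled by a small pre-check that runs before any values are inspected. The sketch below is plain base R and purely illustrative: date_check_precondition() and its return structure are hypothetical, and in an actual implementation the skip branch would presumably be turned into a check_info condition instead of a list.

```r
# Hypothetical pre-check (not hubValidations code): decide whether a date
# check can run at all, given the target table and the date column name.
date_check_precondition <- function(target_tbl, date_col) {
  if (is.null(date_col) || !(date_col %in% names(target_tbl))) {
    # In a real check this branch would become a check_info skip message
    return(list(run = FALSE, msg = "No date column found. Check skipped."))
  }
  list(run = TRUE, msg = NULL)
}

target_tbl <- data.frame(
  location = c("US", "US"),
  target_end_date = as.Date(c("2024-01-06", "2024-01-13"))
)
date_check_precondition(target_tbl, date_col = "target_end_date")$run
date_check_precondition(target_tbl, date_col = NULL)$run
```

The first call reports that the check can run, and the second reports a skip because no date column name is available.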

Documentation

Related
