Description
It would be nice to have the Diebold-Mariano (DM) test work on 1D arrays of score differentials, e.g.
[0.1, 0.3, -0.2, 1.3, -0.9, 0.4]
This is a common format for the data, because DM test inputs are usually derived from time series (i.e. 1D arrays). If you have a 1D array, the current implementation, scores.stats.statistical_tests.diebold_mariano, requires you to
1. convert it to an xarray object, and create a label for the time dimension
2. then add an extra dimension, the time series dimension (effectively providing a label for the single time series)
3. then add the h coordinate, defined along the dimension added in step 2.
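To make the wrangling concrete, here is a rough sketch of those three steps using only numpy and xarray. The dimension names `time` and `series`, the coordinate name `h`, the label `ts0`, and the value `h = 1` are all placeholders I chose for illustration. The final call is left as a comment because it is my assumption of how the positional `ts_dim` and `h_coord` arguments would be passed.

```python
import numpy as np
import xarray as xr

diffs = [0.1, 0.3, -0.2, 1.3, -0.9, 0.4]
h = 1  # assumed lead time, for illustration only

# Step 1: wrap the values in a DataArray and label the time dimension.
da = xr.DataArray(
    np.asarray(diffs, dtype=float),
    dims=["time"],
    coords={"time": np.arange(len(diffs))},
)

# Step 2: add a length-1 dimension that labels the single time series.
da = da.expand_dims({"series": ["ts0"]})

# Step 3: attach the h value as a coordinate on that new dimension.
da = da.assign_coords(h=("series", [h]))

# Assumed call, based on the ts_dim and h_coord arguments named in this issue:
# diebold_mariano(da, "series", "h")
```

The result has dimensions `("series", "time")` with a single series, which is a lot of scaffolding for six numbers.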
This is a lot of data wrangling for no benefit other than getting a format for which scores will do the calculation.
The 1D implementation would look something like this:
diebold_mariano(timeseries, h, *, method='HG', confidence_level=0.95, statistic_distribution='normal')
where
- timeseries could be a list, 1D numpy array or 1D xr.DataArray
- h is a positive integer for h-step-ahead forecasts
- other arguments as per https://scores.readthedocs.io/en/stable/api.html#scores.stats.statistical_tests.diebold_mariano
Ideally there would be only one function for this, and it could accept either multi-dimensional arrays or 1D arrays. But in the latter case the arguments ts_dim, h_coord are replaced with h, so backwards compatibility with the existing implementation could be tricky.
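One possible way to handle both cases in a single entry point is to normalise the input first and then hand it to the existing code path. The sketch below is only an idea, not a proposal for the final API: `normalise_dm_input` is a hypothetical helper, and `_series` and `_h` are placeholder names I made up for the internally added dimension and coordinate.

```python
import numpy as np
import xarray as xr


def normalise_dm_input(timeseries, ts_dim=None, h_coord=None, *, h=None):
    """Hypothetical helper: return (da, ts_dim, h_coord) in the existing 2D layout."""
    if isinstance(timeseries, xr.DataArray):
        da = timeseries
    else:
        # Lists and numpy arrays are wrapped without any labels.
        da = xr.DataArray(np.asarray(timeseries, dtype=float))

    if da.ndim == 1:
        # 1D case: h replaces ts_dim and h_coord.
        if ts_dim is not None or h_coord is not None:
            raise ValueError("ts_dim and h_coord do not apply to 1D input; pass h")
        if isinstance(h, bool) or not isinstance(h, (int, np.integer)) or h < 1:
            raise ValueError("h must be a positive integer for 1D input")
        # Add the single-series dimension and its h coordinate internally.
        da = da.expand_dims({"_series": [0]})
        da = da.assign_coords(_h=("_series", [int(h)]))
        return da, "_series", "_h"

    # Multi-dimensional case: keep the existing calling convention unchanged.
    if h is not None:
        raise ValueError("h only applies to 1D input; use ts_dim and h_coord")
    if ts_dim is None or h_coord is None:
        raise ValueError("ts_dim and h_coord are required for multi-dimensional input")
    return da, ts_dim, h_coord
```

Because `h` is keyword-only here, existing positional calls with `ts_dim` and `h_coord` would keep working, while 1D users would write something like `normalise_dm_input([0.1, 0.3, -0.2], h=2)`. Whether mixing the two conventions in one signature is acceptable is exactly the part I'm unsure about.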
I'd welcome any comments or feedback.