OpenSTEF Meta V0.1 #771
Open
Lars800 wants to merge 79 commits into research/v4.1.0 from research/HybridForecaster2.0
f69b8d2
Added Lightgbm, LightGBM Linear Trees and Hybrid Stacking Forecasters
Lars800 6fcd632
Fixed small issues
Lars800 7523987
Ruff compliance
Lars800 c680aa1
fixed quality checks
Lars800 9c1e3d3
Fixed last issues, Signed-off-by: Lars van Someren <lars.vansomeren@s…
Lars800 4394895
fixed comments
Lars800 5745212
Refactor LightGBM to LGBM
Lars800 a2538b6
Update LGBM and LGBMLinear defaults, fixed comments
Lars800 1788759
Merge branch 'release/v4.0.0' into Openstef4.0.0_Additional_forecasters
Lars800 3d54604
Fixed comments
Lars800 34fc3e5
Added SkopsModelSerializer
Lars800 bad4c44
Fixed issues
Lars800 99c9bc5
Gitignore optimization and dev sandbox
Lars800 4027de7
Added MultiQuantileAdapter Class
Lars800 064a92d
small fix
Lars800 ed83b3a
Hybrid V2
Lars800 bfa2e2f
Small fix
Lars800 ce2172a
Squashed commit of the following:
Lars800 8be453a
set silence
Lars800 8c5743b
Merge branch 'release/v4.0.0' into research/v4.1.0_Additional_Forecas…
Lars800 ea90239
small fix
Lars800 93baa03
Fix final learner
Lars800 4f8ea8f
fixed lgbm efficiency
Lars800 b4bdbdc
updated lgbm linear params
Lars800 ea1f5f7
Fixed type and quality issues
Lars800 22688e0
First Version Sample Weighting Approach
Lars800 9b971d3
MetaForecasterClass
Lars800 5a54c4f
Merge remote-tracking branch 'origin/research/v4.1.0' into research/H…
Lars800 72b1ca7
fix merge issue
Lars800 553e2fd
Fixed type Issues
Lars800 f873f89
Introduced openstef_metalearning
Lars800 3338be1
ResidualForecaster + refactoring
Lars800 1141898
Testing and fixes on Learned Weights Forecaster
Lars800 976a2fc
FinalLearner PreProcessor
Lars800 82795d9
Fixed benchmark references
Lars800 140fe26
Added additional Feature logic to StackingForecaster
Lars800 c053ea5
added example to openstef Meta
Lars800 1d5d97d
RulesForecaster with dummy features
Lars800 100494c
Updated feature specification
Lars800 c484a26
Merge branch 'research/BusinessRulesForecaster' into research/HybridF…
Lars800 d8d10a1
entered flagger feature in new architecture
797eee7
Fix sample weights
Lars800 a300c27
Merge remote-tracking branch 'origin/research/Hybrid_Business' into r…
Lars800 88c6865
Fixes
Lars800 c6749b3
PR compliant
Lars800 4b3aff8
Ensemble Forecast Dataset
Lars800 719ea5c
Make PR compliant
Lars800 e3a587c
fixed toml
Lars800 308d7c8
Really fixed the TOML
Lars800 460548b
Renamed FinalLearner to Forecast Combiner. Eliminated redundant classes
Lars800 b2fca54
fixed small issues
Lars800 ddef9f3
Major Refactor, Working Version
Lars800 9c6de7d
Fixed tests
Lars800 e18ce5a
Prepared TODOs for Florian
Lars800 ece5d18
Small fix
Lars800 c33ce93
Made PR Compliant
Lars800 eb775e4
BugFix
Lars800 e212448
fixes
Lars800 b44fd92
bug fixes
Lars800 51579d0
added learned weights contributions
2899baf
Added Feature Contributions
Lars800 6f88d72
Bugfixes
Lars800 20edf2d
fixes
Lars800 354b6f2
Squashed commit of the following:
Lars800 e6bc447
Fixes
Lars800 bedf6af
fixed tests
Lars800 c9f135f
small fix
Lars800 b10d02c
Merge remote-tracking branch 'origin/research/ExplainableModelContrib…
Lars800 845e384
Stacking Bugfix
Lars800 780e012
Added hard Forecast Selection
Lars800 682ae2f
Improved data handling in EnsembleForecasting model, correct data spl…
Lars800 619c271
Migrated Flagger and Selector to OpenSTEF Models transforms
Lars800 3b6587a
Fixed restore target Forecast Combiner
Lars800 ede0908
Streamline logging statements, Fix quality
Lars800 ab13581
Resolved comments, fixed bug
Lars800 b5a3737
Moved example
Lars800 c650bb8
Merge remote-tracking branch 'origin/research/v4.1.0' into research/H…
Lars800 297f186
Integrated changes to beam structure
Lars800 0ac62c8
make PR compliant
Lars800
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,127 @@ | ||
| """Liander 2024 Benchmark Example. | ||
| | ||
| ==================================== | ||
| | ||
| This example demonstrates how to set up and run the Liander 2024 STEF benchmark using OpenSTEF BEAM. | ||
| The benchmark evaluates the ensemble forecaster configured below on the dataset from HuggingFace. | ||
| """ | ||
| | ||
| # SPDX-FileCopyrightText: 2025 Contributors to the OpenSTEF project <short.term.energy.forecasts@alliander.com> | ||
| # | ||
| # SPDX-License-Identifier: MPL-2.0 | ||
| | ||
| import os | ||
| import time | ||
| | ||
| os.environ["OMP_NUM_THREADS"] = "1"  # Set OMP_NUM_THREADS to 1 to avoid issues with parallel execution and xgboost | ||
| os.environ["OPENBLAS_NUM_THREADS"] = "1" | ||
| os.environ["MKL_NUM_THREADS"] = "1" | ||
| | ||
| import logging | ||
| import multiprocessing | ||
| from datetime import timedelta | ||
| from pathlib import Path | ||
| | ||
| from openstef_beam.backtesting.backtest_forecaster import BacktestForecasterConfig | ||
| from openstef_beam.benchmarking.baselines import ( | ||
|     create_openstef4_preset_backtest_forecaster, | ||
| ) | ||
| from openstef_beam.benchmarking.benchmarks.liander2024 import Liander2024Category, create_liander2024_benchmark_runner | ||
| from openstef_beam.benchmarking.callbacks.strict_execution_callback import StrictExecutionCallback | ||
| from openstef_beam.benchmarking.storage.local_storage import LocalBenchmarkStorage | ||
| from openstef_core.types import LeadTime, Q | ||
| from openstef_meta.presets import ( | ||
|     EnsembleWorkflowConfig, | ||
| ) | ||
| from openstef_models.integrations.mlflow.mlflow_storage import MLFlowStorage | ||
| | ||
| logging.basicConfig(level=logging.INFO, format="[%(asctime)s][%(levelname)s] %(message)s") | ||
| | ||
| OUTPUT_PATH = Path("./benchmark_results") | ||
| | ||
| N_PROCESSES = 1 if True else multiprocessing.cpu_count()  # Number of parallel processes to use for the benchmark | ||
| | ||
| ensemble_type = "learned_weights"  # "stacking", "learned_weights" or "rules" | ||
| base_models = ["lgbm", "gblinear"]  # combination of "lgbm", "gblinear", "xgboost" and "lgbm_linear" | ||
| combiner_model = ( | ||
|     "lgbm"  # "lgbm", "xgboost", "rf" or "logistic" for learned weights combiner, gblinear for stacking combiner | ||
| ) | ||
| | ||
| model = "Ensemble_" + "_".join(base_models) + "_" + ensemble_type + "_" + combiner_model | ||
| | ||
| # Model configuration | ||
| FORECAST_HORIZONS = [LeadTime.from_string("PT36H")]  # Forecast horizon(s) | ||
| PREDICTION_QUANTILES = [ | ||
|     Q(0.05), | ||
|     Q(0.1), | ||
|     Q(0.3), | ||
|     Q(0.5), | ||
|     Q(0.7), | ||
|     Q(0.9), | ||
|     Q(0.95), | ||
| ]  # Quantiles for probabilistic forecasts | ||
| | ||
| BENCHMARK_FILTER: list[Liander2024Category] | None = None | ||
| | ||
| USE_MLFLOW_STORAGE = False | ||
| | ||
| if USE_MLFLOW_STORAGE: | ||
|     storage = MLFlowStorage( | ||
|         tracking_uri=str(OUTPUT_PATH / "mlflow_artifacts"), | ||
|         local_artifacts_path=OUTPUT_PATH / "mlflow_tracking_artifacts", | ||
|     ) | ||
| else: | ||
|     storage = None | ||
| | ||
| workflow_config = EnsembleWorkflowConfig( | ||
|     model_id="common_model_", | ||
|     ensemble_type=ensemble_type, | ||
|     base_models=base_models,  # type: ignore | ||
|     combiner_model=combiner_model, | ||
|     horizons=FORECAST_HORIZONS, | ||
|     quantiles=PREDICTION_QUANTILES, | ||
|     model_reuse_enable=False, | ||
|     mlflow_storage=None, | ||
|     radiation_column="shortwave_radiation", | ||
|     rolling_aggregate_features=["mean", "median", "max", "min"], | ||
|     wind_speed_column="wind_speed_80m", | ||
|     pressure_column="surface_pressure", | ||
|     temperature_column="temperature_2m", | ||
|     relative_humidity_column="relative_humidity_2m", | ||
|     energy_price_column="EPEX_NL", | ||
|     forecast_combiner_sample_weight_exponent=0, | ||
|     forecaster_sample_weight_exponent={"gblinear": 1, "lgbm": 0, "xgboost": 0, "lgbm_linear": 0}, | ||
| ) | ||
| | ||
| | ||
| # Create the backtest configuration | ||
| backtest_config = BacktestForecasterConfig( | ||
|     requires_training=True, | ||
|     predict_length=timedelta(days=7), | ||
|     predict_min_length=timedelta(minutes=15), | ||
|     predict_context_length=timedelta(days=14),  # Context needed for lag features | ||
|     predict_context_min_coverage=0.5, | ||
|     training_context_length=timedelta(days=90),  # Three months of training data | ||
|     training_context_min_coverage=0.5, | ||
|     predict_sample_interval=timedelta(minutes=15), | ||
| ) | ||
| | ||
| | ||
| if __name__ == "__main__": | ||
|     start_time = time.time() | ||
|     create_liander2024_benchmark_runner( | ||
|         storage=LocalBenchmarkStorage(base_path=OUTPUT_PATH / model), | ||
|         data_dir=Path("../data/liander2024-energy-forecasting-benchmark"), | ||
|         callbacks=[StrictExecutionCallback()], | ||
|     ).run( | ||
|         forecaster_factory=create_openstef4_preset_backtest_forecaster( | ||
|             workflow_config=workflow_config, | ||
|             cache_dir=OUTPUT_PATH / "cache", | ||
|         ), | ||
|         run_name=model, | ||
|         n_processes=N_PROCESSES, | ||
|         filter_args=BENCHMARK_FILTER, | ||
|     ) | ||
| | ||
|     end_time = time.time() | ||
|     print(f"Benchmark completed in {end_time - start_time:.2f} seconds.") | ||
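The inline comments in the script above list the allowed values for `ensemble_type`, `base_models` and `combiner_model`, and the run name is built by plain string concatenation. A minimal sketch of a validating helper could look like the following. `build_run_name` and the constant names are hypothetical and not part of `openstef_meta`; the option sets are copied from those comments rather than from the library, so they may be incomplete.

```python
# Option sets copied from the inline comments of the example script (assumption:
# the comments reflect what the library accepts).
ENSEMBLE_TYPES = {"stacking", "learned_weights", "rules"}
BASE_MODELS = {"lgbm", "gblinear", "xgboost", "lgbm_linear"}
# Combiner choices per ensemble type, as stated in the script's comment.
# The comment names no combiner for "rules", so that case is left unchecked.
COMBINERS = {
    "learned_weights": {"lgbm", "xgboost", "rf", "logistic"},
    "stacking": {"gblinear"},
}


def build_run_name(ensemble_type: str, base_models: list[str], combiner_model: str) -> str:
    """Hypothetical helper: validate the options, then build the run name."""
    if ensemble_type not in ENSEMBLE_TYPES:
        raise ValueError(f"Unknown ensemble type: {ensemble_type!r}")
    unknown = sorted(set(base_models) - BASE_MODELS)
    if not base_models or unknown:
        raise ValueError(f"Invalid base models: {base_models!r}")
    allowed = COMBINERS.get(ensemble_type)
    if allowed is not None and combiner_model not in allowed:
        raise ValueError(f"Combiner {combiner_model!r} not listed for {ensemble_type!r}")
    # Same concatenation as in the example script.
    return "Ensemble_" + "_".join(base_models) + "_" + ensemble_type + "_" + combiner_model


model = build_run_name("learned_weights", ["lgbm", "gblinear"], "lgbm")
```

Failing early on a mistyped option avoids starting a long benchmark run whose output directory and run name do not match any intended configuration.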
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,121 @@ | ||
| """Liander 2024 Benchmark Example. | ||
| | ||
| ==================================== | ||
| | ||
| This example demonstrates how to set up and run the Liander 2024 STEF benchmark using OpenSTEF BEAM. | ||
| The benchmark evaluates the model selected below on the dataset from HuggingFace. | ||
| """ | ||
| | ||
| # SPDX-FileCopyrightText: 2025 Contributors to the OpenSTEF project <short.term.energy.forecasts@alliander.com> | ||
| # | ||
| # SPDX-License-Identifier: MPL-2.0 | ||
| | ||
| import os | ||
| import time | ||
| | ||
| os.environ["OMP_NUM_THREADS"] = "1"  # Set OMP_NUM_THREADS to 1 to avoid issues with parallel execution and xgboost | ||
| os.environ["OPENBLAS_NUM_THREADS"] = "1" | ||
| os.environ["MKL_NUM_THREADS"] = "1" | ||
| | ||
| import logging | ||
| import multiprocessing | ||
| from datetime import timedelta | ||
| from pathlib import Path | ||
| | ||
| from openstef_beam.backtesting.backtest_forecaster import BacktestForecasterConfig | ||
| from openstef_beam.benchmarking.baselines import ( | ||
|     create_openstef4_preset_backtest_forecaster, | ||
| ) | ||
| from openstef_beam.benchmarking.benchmarks.liander2024 import Liander2024Category, create_liander2024_benchmark_runner | ||
| from openstef_beam.benchmarking.callbacks.strict_execution_callback import StrictExecutionCallback | ||
| from openstef_beam.benchmarking.storage.local_storage import LocalBenchmarkStorage | ||
| from openstef_core.types import LeadTime, Q | ||
| from openstef_models.integrations.mlflow.mlflow_storage import MLFlowStorage | ||
| from openstef_models.presets import ( | ||
|     ForecastingWorkflowConfig, | ||
| ) | ||
| | ||
| logging.basicConfig(level=logging.INFO, format="[%(asctime)s][%(levelname)s] %(message)s") | ||
| | ||
| logger = logging.getLogger(__name__) | ||
| | ||
| OUTPUT_PATH = Path("./benchmark_results") | ||
| | ||
| N_PROCESSES = multiprocessing.cpu_count()  # Number of parallel processes to use for the benchmark | ||
| | ||
| model = "residual"  # Can be "stacking", "learned_weights" or "residual" | ||
| | ||
| # Model configuration | ||
| FORECAST_HORIZONS = [LeadTime.from_string("PT36H")]  # Forecast horizon(s) | ||
| PREDICTION_QUANTILES = [ | ||
|     Q(0.05), | ||
|     Q(0.1), | ||
|     Q(0.3), | ||
|     Q(0.5), | ||
|     Q(0.7), | ||
|     Q(0.9), | ||
|     Q(0.95), | ||
| ]  # Quantiles for probabilistic forecasts | ||
| | ||
| BENCHMARK_FILTER: list[Liander2024Category] | None = None | ||
| | ||
| USE_MLFLOW_STORAGE = False | ||
| | ||
| if USE_MLFLOW_STORAGE: | ||
|     storage = MLFlowStorage( | ||
|         tracking_uri=str(OUTPUT_PATH / "mlflow_artifacts"), | ||
|         local_artifacts_path=OUTPUT_PATH / "mlflow_tracking_artifacts", | ||
|     ) | ||
| else: | ||
|     storage = None | ||
| | ||
| common_config = ForecastingWorkflowConfig( | ||
|     model_id="common_model_", | ||
|     model=model, | ||
|     horizons=FORECAST_HORIZONS, | ||
|     quantiles=PREDICTION_QUANTILES, | ||
|     model_reuse_enable=False, | ||
|     mlflow_storage=None, | ||
|     radiation_column="shortwave_radiation", | ||
|     rolling_aggregate_features=["mean", "median", "max", "min"], | ||
|     wind_speed_column="wind_speed_80m", | ||
|     pressure_column="surface_pressure", | ||
|     temperature_column="temperature_2m", | ||
|     relative_humidity_column="relative_humidity_2m", | ||
|     energy_price_column="EPEX_NL", | ||
| ) | ||
| | ||
| | ||
| # Create the backtest configuration | ||
| backtest_config = BacktestForecasterConfig( | ||
|     requires_training=True, | ||
|     predict_length=timedelta(days=7), | ||
|     predict_min_length=timedelta(minutes=15), | ||
|     predict_context_length=timedelta(days=14),  # Context needed for lag features | ||
|     predict_context_min_coverage=0.5, | ||
|     training_context_length=timedelta(days=90),  # Three months of training data | ||
|     training_context_min_coverage=0.5, | ||
|     predict_sample_interval=timedelta(minutes=15), | ||
| ) | ||
| | ||
| | ||
| if __name__ == "__main__": | ||
|     start_time = time.time() | ||
| | ||
|     # Run the benchmark for the selected model | ||
|     create_liander2024_benchmark_runner( | ||
|         storage=LocalBenchmarkStorage(base_path=OUTPUT_PATH / model), | ||
|         callbacks=[StrictExecutionCallback()], | ||
|     ).run( | ||
|         forecaster_factory=create_openstef4_preset_backtest_forecaster( | ||
|             workflow_config=common_config, | ||
|             cache_dir=OUTPUT_PATH / "cache", | ||
|         ), | ||
|         run_name=model, | ||
|         n_processes=N_PROCESSES, | ||
|         filter_args=BENCHMARK_FILTER, | ||
|     ) | ||
| | ||
|     end_time = time.time() | ||
|     msg = f"Benchmark completed in {end_time - start_time:.2f} seconds." | ||
|     logger.info(msg) | ||
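The script sets `OMP_NUM_THREADS`, `OPENBLAS_NUM_THREADS` and `MKL_NUM_THREADS` before the remaining imports and then picks a worker-process count. The ordering matters because native numeric libraries typically read these variables when they initialise their thread pools. A minimal sketch of that setup as two small helpers is shown below; `limit_native_threads` and `choose_n_processes` are hypothetical names, not part of OpenSTEF.

```python
import multiprocessing
import os

# The three variables set at the top of the example script.
THREAD_ENV_VARS = ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS")


def limit_native_threads(n_threads: int = 1) -> None:
    """Hypothetical helper: cap the native thread pools via environment variables.

    Call this before importing numpy, xgboost, lightgbm or similar libraries,
    since they usually read these variables only once at start-up.
    """
    for name in THREAD_ENV_VARS:
        os.environ[name] = str(n_threads)


def choose_n_processes(parallel: bool) -> int:
    """Hypothetical helper: one process for debugging, else one per CPU."""
    return multiprocessing.cpu_count() if parallel else 1


limit_native_threads(1)
N_PROCESSES = choose_n_processes(parallel=False)
```

Keeping each worker single-threaded while parallelising across processes is one common way to avoid oversubscribing the CPU; whether it is the best trade-off for a given machine is something to measure rather than assume.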