paradox compatibility, remove holdout rows

sebffischer · mb706 · web-flow · commit 7eccb17342ab · 2024-04-16T10:07:25.000+02:00
* check with new paradox

* new paradox syntax

* paradox::

* update vignette to not use holdout role

* add workflows

* fix workflow

* trigger actions

* dev cmd check with paradox master

* news

* Update vignettes/mcboost_basics_extensions.Rmd

* delete workflows

* paradox compatibility, vignette

* update maintainer

* release 0.4.3

---------

Co-authored-by: mb706 <mlr.developer@mb706.com>
diff --git a/.github/workflows/r-cmd-check-paradox.yml b/.github/workflows/r-cmd-check-paradox.yml
@@ -0,0 +1,44 @@
+# r cmd check workflow of the mlr3 ecosystem v0.1.0
+# https://github.com/mlr-org/actions
+on:
+  workflow_dispatch:
+  push:
+    branches:
+      - main
+  pull_request:
+    branches:
+      - main
+
+name: r-cmd-check-paradox
+
+jobs:
+  r-cmd-check:
+    runs-on: ${{ matrix.config.os }}
+
+    name: ${{ matrix.config.os }} (${{ matrix.config.r }})
+
+    env:
+      GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
+
+    strategy:
+      fail-fast: false
+      matrix:
+        config:
+          - {os: ubuntu-latest,   r: 'devel'}
+          - {os: ubuntu-latest,   r: 'release'}
+
+    steps:
+      - uses: actions/checkout@v3
+
+      - name: paradox
+        run: 'echo -e "Remotes:\n    mlr-org/paradox,\n    mlr-org/mlr3learners,\n    mlr-org/mlr3pipelines,\n    mlr-org/mlr3oml" >> DESCRIPTION'
+
+      - uses: r-lib/actions/setup-r@v2
+        with:
+          r-version: ${{ matrix.config.r }}
+
+      - uses: r-lib/actions/setup-r-dependencies@v2
+        with:
+          extra-packages: any::rcmdcheck
+          needs: check
+      - uses: r-lib/actions/check-r-package@v2
diff --git a/DESCRIPTION b/DESCRIPTION
@@ -1,11 +1,11 @@
 Package: mcboost
 Type: Package
 Title: Multi-Calibration Boosting
-Version: 0.4.2
+Version: 0.4.3
 Authors@R:
     c(person(given = "Florian",
              family = "Pfisterer",
-             role = c("cre", "aut"),
+             role = "aut",
              email = "pfistererf@googlemail.com",
              comment = c(ORCID = "0000-0001-8867-762X")),
       person(given = "Susanne",
@@ -18,18 +18,21 @@ Authors@R:
              role = "ctb",
              email = "c.kern@uni-mannheim.de",
              comment = c(ORCID = "0000-0001-7363-4299")),
-      person(given = "Carolin", 
+      person(given = "Carolin",
              family = "Becker",
              role = "ctb"),
       person(given = "Bernd",
              family = "Bischl",
              role = "ctb",
              email = "bernd_bischl@gmx.net",
-             comment = c(ORCID = "0000-0001-6002-6980"))
+             comment = c(ORCID = "0000-0001-6002-6980")),
+      person(given = "Sebastian",
+             family = "Fischer",
+             role = c("ctb", "cre"),
+             email = "sebf.fischer@gmail.com")
     )
-Maintainer: Florian Pfisterer <pfistererf@googlemail.com>
 Description: Implements 'Multi-Calibration Boosting' (2018) <https://proceedings.mlr.press/v80/hebert-johnson18a.html> and
-    'Multi-Accuracy Boosting' (2019) <arXiv:1805.12317> for the multi-calibration of a machine learning model's prediction.
+    'Multi-Accuracy Boosting' (2019) <doi:10.48550/arXiv.1805.12317> for the multi-calibration of a machine learning model's prediction.
     'MCBoost' updates predictions for sub-groups in an iterative fashion in order to mitigate biases like poor calibration or large accuracy differences across subgroups.
     Multi-Calibration works best in scenarios where the underlying data & labels are unbiased, but resulting models are.
     This is often the case, e.g. when an algorithm fits a majority population while ignoring or under-fitting minority populations.
@@ -66,9 +69,9 @@ Suggests:
     covr,
     testthat (>= 3.1.0)
 Roxygen: list(markdown = TRUE, r6 = TRUE)
-RoxygenNote: 7.2.1
+RoxygenNote: 7.3.1
 VignetteBuilder: knitr
-Collate: 
+Collate:
     'AuditorFitters.R'
     'MCBoost.R'
     'PipelineMCBoost.R'
diff --git a/NEWS.md b/NEWS.md
@@ -1,7 +1,10 @@
-# mcboost (development version)
+# mcboost 0.4.3
+
+* Compatibility with upcoming 'paradox' release.
+* Change the vignette to not use the holdout task.
 
 # mcboost 0.4.2
-* Removed new functionality for survival tasks added in `0.4.0`. 
+* Removed new functionality for survival tasks added in `0.4.0`.
   A dependency, `mlr3proba` was removed from CRAN for now.
   The functionality will be added back when `mlr3proba` is re-introduced to CRAN.
   Users who wish to use `mcboost` for `survival` are adviced to use version `0.4.1` usetogether with the GitHub version of `mlr3proba`.
diff --git a/R/PipeOpMCBoost.R b/R/PipeOpMCBoost.R
@@ -65,19 +65,19 @@ PipeOpMCBoost = R6Class("PipeOpMCBoost",
     #' @param param_vals [`list`] \cr
     #'   List of hyperparameters for the `PipeOp`.
     initialize = function(id = "mcboost", param_vals = list()) {
-      param_set = paradox::ParamSet$new(list(
-        paradox::ParamInt$new("max_iter", lower = 0L, upper = Inf, default = 5L, tags = "train"),
-        paradox::ParamDbl$new("alpha", lower = 0, upper = 1, default = 1e-4, tags = "train"),
-        paradox::ParamDbl$new("eta", lower = 0, upper = 1, default = 1, tags = "train"),
-        paradox::ParamLgl$new("partition", tags = "train", default = TRUE),
-        paradox::ParamInt$new("num_buckets", lower = 1, upper = Inf, default = 2L, tags = "train"),
-        paradox::ParamLgl$new("rebucket", default = FALSE, tags = "train"),
-        paradox::ParamLgl$new("multiplicative", default = TRUE, tags = "train"),
-        paradox::ParamUty$new("auditor_fitter", default = NULL, tags = "train"),
-        paradox::ParamUty$new("subpops", default = NULL, tags = "train"),
-        paradox::ParamUty$new("default_model_class", default = ConstantPredictor, tags = "train"),
-        paradox::ParamUty$new("init_predictor", default = NULL, tags = "train")
-      ))
+      param_set = paradox::ps(
+        max_iter = paradox::p_int(lower = 0L, upper = Inf, default = 5L, tags = "train"),
+        alpha = paradox::p_dbl(lower = 0, upper = 1, default = 1e-4, tags = "train"),
+        eta = paradox::p_dbl(lower = 0, upper = 1, default = 1, tags = "train"),
+        partition = paradox::p_lgl(tags = "train", default = TRUE),
+        num_buckets = paradox::p_int(lower = 1, upper = Inf, default = 2L, tags = "train"),
+        rebucket = paradox::p_lgl(default = FALSE, tags = "train"),
+        multiplicative = paradox::p_lgl(default = TRUE, tags = "train"),
+        auditor_fitter = paradox::p_uty(default = NULL, tags = "train"),
+        subpops = paradox::p_uty(default = NULL, tags = "train"),
+        default_model_class = paradox::p_uty(default = ConstantPredictor, tags = "train"),
+        init_predictor = paradox::p_uty(default = NULL, tags = "train")
+      )
       super$initialize(id,
         param_set = param_set, param_vals = param_vals, packages = character(0),
         input = data.table(name = c("data", "prediction"), train = c("TaskClassif", "TaskClassif"), predict = c("TaskClassif", "TaskClassif")),
diff --git a/cran-comments.md b/cran-comments.md
@@ -1,30 +1,6 @@
-## Reason for resubmission
-
-Removed dependency on package mlr3proba that was removed from CRAN.
-Apologies for not being able to upload a new version in time.
-
 ## R CMD check
 
-Results in one NOTE:
-
-  CRAN repository db overrides:
-    X-CRAN-Comment: Archived on 2022-05-16 as requires archived package 'mlr3proba'.
-
-  The dependency on 'mlr3proba' has been removed in the updated version.
-
-
-There is one NOTE that is only found on Windows (Server 2022, R-devel 64-bit):
-
-```
-* checking for detritus in the temp directory ... NOTE
-Found the following files/directories:
-  'lastMiKTeXException'
-```
-
-As noted in R-hub issue #503, this could be due to a bug/crash in MiKTeX and can likely be ignored.
-
-- WARNINGs or ERRORs
-
-## R-HUB
+0 errors | 0 warnings | 1 note
 
-All checks show "Status: success"
+New maintainer:
+  Sebastian Fischer <sebf.fischer@gmail.com>
diff --git a/man/mcboost-package.Rd b/man/mcboost-package.Rd
diff --git a/vignettes/mcboost_basics_extensions.Rmd b/vignettes/mcboost_basics_extensions.Rmd
@@ -517,10 +517,11 @@ summary(data$ViolentCrimesPerPop)
 ```
 
 We again split our task into **train** and **test**.
-We do this in `mlr3` by simply setting some (here 500) row roles to `"holdout"`.
+We do this in `mlr3` by creating a 2/3 - 1/3 split using `mlr3::partition()` and assigning the train ids to the row role `"use"`.
 
 ```{r}
-tsk$set_row_roles(sample(tsk$row_roles$use, 500), "holdout")
+split = partition(tsk)
+tsk$set_row_roles(split$train, "use")
 ```
 
 ### 6.1 Preprocessing
@@ -571,13 +572,13 @@ mc$multicalibrate(data, labels)
 
 ### 6.3 Evaluation on Test Data
 
-We first create the test task by setting the `holdout` rows to `use`, and then
+We first create the test task by assigning the test ids to the row role `"use"`, and then
 use our preprocessing `pipe's`  predict function to also impute missing values
 for the validation data. Then we again extract features `X` and target `y`.
 
 ```{r}
 test_task = tsk$clone()
-test_task$row_roles$use = test_task$row_roles$holdout
+test_task$row_roles$use = split$test
 test_task = pipe$predict(list(test_task))[[1]]
 test_data = test_task$data(cols = tsk$feature_names)
 test_labels = test_task$data(cols = tsk$target_names)[[1]]
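
Note on the vignette change: the train/test split that replaces the `"holdout"` row role can be sketched in isolation as follows. This is a minimal sketch under stated assumptions, not the vignette's code: the built-in `mtcars` task stands in for the vignette's data, and `task`, `train_task` and `test_task` are illustrative names. It assigns `row_roles$use` directly on clones so that each task contains only its own row ids.

```r
library(mlr3)

set.seed(42)

# Stand-in task (assumption): any mlr3 task works here.
task = tsk("mtcars")

# partition() returns a list of row ids; the default ratio is roughly 2/3 train.
split = partition(task)

# Restrict a clone to the training ids.
train_task = task$clone()
train_task$row_roles$use = split$train

# Restrict a second clone to the test ids.
test_task = task$clone()
test_task$row_roles$use = split$test

# Number of rows in each task.
print(c(train = train_task$nrow, test = test_task$nrow))
```

Working on clones keeps the original task untouched, so the same `split` can be reused later for evaluation.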