- Update artifact for GitHub Pages deployment workflow; triggers on push/PR to main
- Restore working .ipynb for Genz BCS example
- Update docs Makefile
docs/auto_examples/ex_genz_bcs.rst
Lines changed: 35 additions & 38 deletions
@@ -45,18 +45,15 @@ with parity plots and Root Mean Square Error (RMSE) values used to compare their

To follow along with the cross-validation algorithm for selecting the optimal eta, see section "Functions for cross-validation algorithm" in the second half of the notebook.
These methods have been implemented under the hood in PyTUQ. Refer to the example "Polynomial Chaos Expansion Construction" (``ex_pce.py``) for a demonstration of how to use these methods through a direct call to the PCE class.

-.. GENERATED FROM PYTHON SOURCE LINES 30-45
+.. GENERATED FROM PYTHON SOURCE LINES 30-42

.. code-block:: Python

-    import os
-    import sys
     import numpy as np
     import copy
     import math
-    import pytuq.utils.funcbank as fcb
     from matplotlib import pyplot as plt
     from sklearn.metrics import root_mean_squared_error
@@ -71,7 +68,7 @@ These methods have been implemented under-the-hood in PyTUQ. Refer to example "P

-.. GENERATED FROM PYTHON SOURCE LINES 46-54
+.. GENERATED FROM PYTHON SOURCE LINES 43-51

Constructing PC surrogate and generating data
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -82,7 +79,7 @@ along with training data and testing data with output noise. This data and the c
will be used to create the same PC surrogate fitted in all three examples: first with linear regression,
next using BCS with a given eta, and third using BCS with the optimal eta.
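For orientation, a rough NumPy sketch of this kind of data setup; the function, domain, and noise level below are illustrative stand-ins, not the values used in the example's own code block:

.. code-block:: Python

    # Illustrative stand-in for the elided data-generation step (names and
    # values are assumptions, not the example's actual settings).
    import numpy as np

    dim = 4                                     # stochastic dimensionality
    rng = np.random.default_rng(42)

    def genz_oscillatory(x, weights):
        # One member of the Genz family, used here only as a placeholder model.
        return np.cos(2.0 * np.pi + x @ weights)

    weights = rng.uniform(size=dim)
    x_train = rng.uniform(-1.0, 1.0, size=(70, dim))    # 70 training points
    x_test = rng.uniform(-1.0, 1.0, size=(200, dim))

    sigma = 0.02                                # output-noise standard deviation
    y_train = genz_oscillatory(x_train, weights) + sigma * rng.normal(size=70)
    y_test = genz_oscillatory(x_test, weights)           # testing data kept noise-free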
-.. GENERATED FROM PYTHON SOURCE LINES 56-61
+.. GENERATED FROM PYTHON SOURCE LINES 53-58

.. code-block:: Python
@@ -98,7 +95,7 @@ next using BCS with a given eta, and third using BCS with the most optimal eta.

-.. GENERATED FROM PYTHON SOURCE LINES 62-87
+.. GENERATED FROM PYTHON SOURCE LINES 59-84

.. code-block:: Python
@@ -134,13 +131,13 @@ next using BCS with a given eta, and third using BCS with the most optimal eta.

-.. GENERATED FROM PYTHON SOURCE LINES 88-91
+.. GENERATED FROM PYTHON SOURCE LINES 85-88

With a stochastic dimensionality of 4 (defined above) and a chosen polynomial order of 4, we construct the PC surrogate that
will be used in both builds. By calling the ``printInfo()`` method on the PCRV variable, you can print the PC surrogate's
full basis and current coefficients before BCS selects and retains the most significant PC terms to reduce the basis.

-.. GENERATED FROM PYTHON SOURCE LINES 91-104
+.. GENERATED FROM PYTHON SOURCE LINES 88-101

.. code-block:: Python
@@ -246,19 +243,19 @@ full basis and current coefficients, before BCS selects and retains the most sig

-.. GENERATED FROM PYTHON SOURCE LINES 105-108
+.. GENERATED FROM PYTHON SOURCE LINES 102-105
From the input parameters of our PC surrogate, we have 70 basis terms in our PCE. With 70 training points and no noise, having 70 basis terms would mean that we have a fully determined system, as the number of training points is the same as the number of basis terms. However, with the addition of noise in our training data, it becomes harder for the model to accurately fit all basis terms, leading to potential overfitting. This demonstrates the helpful role BCS might play as a choice for our regression build. As a sparse regression approach, BCS uses regularization to select only the most relevant basis terms, making it particularly effective in situations like this, where we do not have enough clear information to fit all basis terms without overfitting.
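Assuming the standard total-order truncation, this count follows directly from the dimension and order chosen above:

.. math::

   P = \binom{d + p}{p} = \binom{4 + 4}{4} = 70.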
In the next sections, we will explore the effects of overfitting in more detail.

-.. GENERATED FROM PYTHON SOURCE LINES 110-113
+.. GENERATED FROM PYTHON SOURCE LINES 107-110

Least Squares Regression
^^^^^^^^^^^^^^^^^^^^^^^^^
To start, we call the PCE class method ``build()`` with no arguments to use the default regression option of least squares. Then, through ``evaluate()``, we can generate model predictions for our training and testing data.
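As a minimal sketch (``build()`` and ``evaluate()`` are the method names quoted above, but the argument lists shown here are placeholders rather than confirmed signatures):

.. code-block:: Python

    # Sketch of the least-squares build described above; argument lists are
    # placeholders, not the library's exact signatures.
    pce_surr.build()                        # default regression: least squares
    y_train_pred = pce_surr.evaluate(x_train)
    y_test_pred = pce_surr.evaluate(x_test)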
-.. GENERATED FROM PYTHON SOURCE LINES 115-123
+.. GENERATED FROM PYTHON SOURCE LINES 112-120

.. code-block:: Python
@@ -283,7 +280,7 @@ To start, we call the PCE class method of ``build()`` with no arguments to use t

-.. GENERATED FROM PYTHON SOURCE LINES 124-137
+.. GENERATED FROM PYTHON SOURCE LINES 121-134

.. code-block:: Python
@@ -318,7 +315,7 @@ To start, we call the PCE class method of ``build()`` with no arguments to use t

-.. GENERATED FROM PYTHON SOURCE LINES 138-154
+.. GENERATED FROM PYTHON SOURCE LINES 135-151

.. code-block:: Python
@@ -356,7 +353,7 @@ To start, we call the PCE class method of ``build()`` with no arguments to use t

-.. GENERATED FROM PYTHON SOURCE LINES 155-163
+.. GENERATED FROM PYTHON SOURCE LINES 152-160

.. code-block:: Python
@@ -382,19 +379,19 @@ To start, we call the PCE class method of ``build()`` with no arguments to use t

-.. GENERATED FROM PYTHON SOURCE LINES 164-167
+.. GENERATED FROM PYTHON SOURCE LINES 161-164
The results above show us the limitations of using least squares regression to construct our surrogate. From the parity plots, we can see how the testing predictions from the LSQ regression are more spread out from the parity line, while the training predictions are extremely close to the line. Because LSQ fits all the basis terms to the training data, the model fits too closely to the noisy training dataset, and the true underlying pattern of the function is not effectively captured. Our RMSE values align with this as well: while the training RMSE is extremely low, the testing RMSE is significantly higher, as the model struggles to generalize to the unseen test data.
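For reference, the parity plots and RMSE comparison described here boil down to something like the following sketch, using the imports already present in the example; the variable names are assumptions:

.. code-block:: Python

    # Sketch of a parity plot and RMSE comparison (variable names assumed;
    # the example's own plotting code is not shown in this diff).
    from matplotlib import pyplot as plt
    from sklearn.metrics import root_mean_squared_error

    fig, ax = plt.subplots()
    ax.scatter(y_train, y_train_pred, label='Training')
    ax.scatter(y_test, y_test_pred, label='Testing')
    lims = [min(ax.get_xlim()[0], ax.get_ylim()[0]),
            max(ax.get_xlim()[1], ax.get_ylim()[1])]
    ax.plot(lims, lims, 'k--', label='Parity line')    # y = x reference
    ax.set_xlabel('Model output')
    ax.set_ylabel('Surrogate prediction')
    ax.legend()
    plt.show()

    print('Training RMSE:', root_mean_squared_error(y_train, y_train_pred))
    print('Testing RMSE:', root_mean_squared_error(y_test, y_test_pred))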
To improve our model's generalization, we can build our model with BCS instead. As a sparse regression method, BCS reduces the number of basis terms used to fit the data, reducing the risk of overfitting.
-.. GENERATED FROM PYTHON SOURCE LINES 169-172
+.. GENERATED FROM PYTHON SOURCE LINES 166-169

BCS with default settings (default eta)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
In this section, we use the same PC surrogate, ``pce_surr``, for the second build. With the flag ``regression='bcs'``, we choose the BCS method for the fitting. A user-defined eta of 1e-10 is also passed in.
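A minimal sketch of this build (``regression='bcs'`` is quoted from the text; the ``eta`` keyword name and the other arguments are assumptions, not a confirmed signature):

.. code-block:: Python

    # Sketch of the BCS build described above; the 'eta' keyword name is assumed.
    pce_surr.build(regression='bcs', eta=1e-10)
    y_train_pred_bcs = pce_surr.evaluate(x_train)
    y_test_pred_bcs = pce_surr.evaluate(x_test)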
-.. GENERATED FROM PYTHON SOURCE LINES 172-181
+.. GENERATED FROM PYTHON SOURCE LINES 169-178

.. code-block:: Python
@@ -441,11 +438,11 @@ In this section, we use the same PC surrogate, ``pce_surr``, for the second buil

-.. GENERATED FROM PYTHON SOURCE LINES 182-183
+.. GENERATED FROM PYTHON SOURCE LINES 179-180

After fitting, we evaluate the PCE using our training and testing data. To analyze the model's goodness of fit, we first plot the surrogate predictions against the training and testing data respectively.

-.. GENERATED FROM PYTHON SOURCE LINES 183-188
+.. GENERATED FROM PYTHON SOURCE LINES 180-185

.. code-block:: Python
@@ -461,7 +458,7 @@ After fitting, we evaluate the PCE using our training and testing data. To analy

-.. GENERATED FROM PYTHON SOURCE LINES 189-202
+.. GENERATED FROM PYTHON SOURCE LINES 186-199

.. code-block:: Python
@@ -496,7 +493,7 @@ After fitting, we evaluate the PCE using our training and testing data. To analy

-.. GENERATED FROM PYTHON SOURCE LINES 203-219
+.. GENERATED FROM PYTHON SOURCE LINES 200-216

.. code-block:: Python
@@ -534,7 +531,7 @@ After fitting, we evaluate the PCE using our training and testing data. To analy

-.. GENERATED FROM PYTHON SOURCE LINES 220-228
+.. GENERATED FROM PYTHON SOURCE LINES 217-225

.. code-block:: Python
@@ -560,13 +557,13 @@ After fitting, we evaluate the PCE using our training and testing data. To analy

-.. GENERATED FROM PYTHON SOURCE LINES 229-232
+.. GENERATED FROM PYTHON SOURCE LINES 226-229
From our parity plots, we can see how BCS already generalizes better to unseen data as compared to LSQ, with reduced error in our testing data predictions. In our RMSE calculations, notice how the training error is smaller than the testing error. Though the difference in value is small, it is still telling: our training data contains noise while our testing data does not, so we would expect the training error to be the larger of the two. That the testing error is nevertheless higher than the training error suggests that overfitting is still happening within our model.
In the next section, we explore how finding the optimal value of eta -- the stopping criterion for the BCS parameter gamma, determined through a Bayesian evidence maximization approach -- can impact model sparsity and accuracy to avoid overfitting.

-.. GENERATED FROM PYTHON SOURCE LINES 235-241
+.. GENERATED FROM PYTHON SOURCE LINES 232-238
BCS with optimal eta (found through cross-validation)
@@ -575,7 +572,7 @@ Before we build our PC surrogate again with the most optimal eta, we first expos

Functions for cross-validation algorithm
+++++++++++++++++++++++++++++++++++++++++
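Before reading the notebook's functions below, it may help to see the idea in miniature. The following is a rough, self-contained sketch of cross-validating over candidate etas, not the notebook's actual implementation; the ``fit_and_score`` callable stands in for "rebuild the PCE with BCS at this eta on the training folds and return the RMSE on the held-out fold":

.. code-block:: Python

    # Rough illustration of the cross-validation idea only -- not the notebook's
    # actual helper functions, which appear in the code blocks below.
    import numpy as np

    def cv_rmse(fit_and_score, x, y, eta, n_folds=5, seed=0):
        """Average held-out RMSE over the folds for a single eta value."""
        rng = np.random.default_rng(seed)
        folds = np.array_split(rng.permutation(len(x)), n_folds)
        errors = []
        for fold in folds:
            train = np.setdiff1d(np.arange(len(x)), fold)
            errors.append(fit_and_score(x[train], y[train], x[fold], y[fold], eta))
        return float(np.mean(errors))

    def optimize_eta_sketch(fit_and_score, x, y, etas):
        """Pick the eta with the smallest average validation RMSE."""
        scores = [cv_rmse(fit_and_score, x, y, eta) for eta in etas]
        return etas[int(np.argmin(scores))]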
-.. GENERATED FROM PYTHON SOURCE LINES 243-285
+.. GENERATED FROM PYTHON SOURCE LINES 240-282

.. code-block:: Python
@@ -628,7 +625,7 @@ Functions for cross-validation algorithm

-.. GENERATED FROM PYTHON SOURCE LINES 286-330
+.. GENERATED FROM PYTHON SOURCE LINES 283-327

.. code-block:: Python
@@ -683,7 +680,7 @@ Functions for cross-validation algorithm

-.. GENERATED FROM PYTHON SOURCE LINES 331-448
+.. GENERATED FROM PYTHON SOURCE LINES 328-445

.. code-block:: Python
@@ -811,15 +808,15 @@ Functions for cross-validation algorithm

-.. GENERATED FROM PYTHON SOURCE LINES 449-454
+.. GENERATED FROM PYTHON SOURCE LINES 446-451

BCS build with the optimal eta
+++++++++++++++++++++++++++++++++++++
Instead of using a default eta, here we call the cross-validation algorithm, ``optimize_eta()``, to choose the optimal eta from a range of etas given below.

- With the flag ``plot=True``, the CV algorithm produces a graph of the training and testing (validation) RMSE values for each eta. The eta with the smallest RMSE on the validation data is chosen as the optimal eta, as sketched below.
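A minimal sketch of this call, assuming ``optimize_eta()`` accepts the surrogate, the training data, the candidate etas, and the ``plot`` flag (the exact argument names are not shown in this diff):

.. code-block:: Python

    # Hypothetical call pattern for the cross-validation helper; argument names
    # and the returned value are assumptions.
    etas = np.logspace(-16, -2, 8)       # candidate range (illustrative)
    eta_opt = optimize_eta(pce_surr, x_train, y_train, etas, plot=True)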
-.. GENERATED FROM PYTHON SOURCE LINES 454-461
+.. GENERATED FROM PYTHON SOURCE LINES 451-458

.. code-block:: Python
@@ -1007,15 +1004,15 @@ Instead of using a default eta, here we call the cross-validation algorithm, ``o

-.. GENERATED FROM PYTHON SOURCE LINES 462-467
+.. GENERATED FROM PYTHON SOURCE LINES 459-464
From our eta plot above, we can see that our optimal eta falls at :math:`1\times10^{-4}`, where the validation error is the lowest. While this indicates that the model performs well at this eta value, we can still observe a tendency towards overfitting in the model. For larger eta values, the training and validation RMSE lines are close together, suggesting that the model is performing similarly on both seen and unseen datasets, as would be desired. However, as eta decreases, the training RMSE falls while the validation RMSE rises, highlighting a region where overfitting occurs.

This behavior is expected because smaller eta values retain more basis terms, increasing the model's degrees of freedom. While this added flexibility allows the model to fit the training data more closely, it also makes the model more prone to fitting noise rather than capturing the true underlying function. Selecting the optimal eta of :math:`1\times10^{-4}`, as compared to the earlier user-defined eta of :math:`1\times10^{-10}`, allows us to balance model complexity and generalization.

Now, with the optimal eta obtained, we can run the fitting again and produce parity plots for our predicted output.
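Under the same assumptions as the earlier BCS sketch (the ``eta`` keyword name is unconfirmed), the refit amounts to:

.. code-block:: Python

    # Rebuild with the cross-validated eta instead of the hand-picked 1e-10.
    pce_surr.build(regression='bcs', eta=eta_opt)
    y_train_pred_opt = pce_surr.evaluate(x_train)
    y_test_pred_opt = pce_surr.evaluate(x_test)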
-.. GENERATED FROM PYTHON SOURCE LINES 467-476
+.. GENERATED FROM PYTHON SOURCE LINES 464-473

.. code-block:: Python
@@ -1052,7 +1049,7 @@ Now, with the optimum eta obtained, we can run the fitting again and produce par

-.. GENERATED FROM PYTHON SOURCE LINES 477-482
+.. GENERATED FROM PYTHON SOURCE LINES 474-479

.. code-block:: Python
@@ -1068,7 +1065,7 @@ Now, with the optimum eta obtained, we can run the fitting again and produce par

-.. GENERATED FROM PYTHON SOURCE LINES 483-496
+.. GENERATED FROM PYTHON SOURCE LINES 480-493

.. code-block:: Python
@@ -1103,7 +1100,7 @@ Now, with the optimum eta obtained, we can run the fitting again and produce par

-.. GENERATED FROM PYTHON SOURCE LINES 497-510
+.. GENERATED FROM PYTHON SOURCE LINES 494-507

.. code-block:: Python
@@ -1138,7 +1135,7 @@ Now, with the optimum eta obtained, we can run the fitting again and produce par

-.. GENERATED FROM PYTHON SOURCE LINES 511-519
+.. GENERATED FROM PYTHON SOURCE LINES 508-516

.. code-block:: Python
@@ -1164,7 +1161,7 @@ Now, with the optimum eta obtained, we can run the fitting again and produce par

-.. GENERATED FROM PYTHON SOURCE LINES 520-523
+.. GENERATED FROM PYTHON SOURCE LINES 517-520
In these final RMSE calculations, we can see how our testing RMSE has decreased from 1.80e-02 to 1.21e-02 by building with the optimal eta. This indicates that our model has improved in generalization and is performing better on unseen data. Though our training error is now larger than our testing error, this can be attributed to the lack of noise in our testing data, while noise is present in our training data. While the optimal eta reduces overfitting and improves generalization, the noise in our training data still impacts the training error and remains an important consideration during our evaluation of the model performance.
@@ -1173,7 +1170,7 @@ While this demonstration calls the cross-validation algorithm as a function outs

.. rst-class:: sphx-glr-timing

-   **Total running time of the script:** (0 minutes 6.413 seconds)
+   **Total running time of the script:** (0 minutes 10.810 seconds)