Merged
2 changes: 2 additions & 0 deletions .gitignore
@@ -7,6 +7,8 @@ __pycache__/
*.py[cod]
*$py.class

htmls/*.ipynb

# C extensions
*.so

21 changes: 21 additions & 0 deletions CHANGELOG.md
@@ -9,6 +9,27 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Released]

## [1.5.0] - 2025-11-21

### Added
- **Reproducibility support**: Added `seed` parameter to Bayesian statistical tests (`bayesian_sign_test`, `bayesian_signed_rank_test`) for deterministic results
- **Reproducibility support**: Added `seed` parameter to `HistoPlot` class for consistent jitter in histogram generation
- **Multi-platform CI**: Comprehensive GitHub Actions workflow testing on Windows, Linux (Ubuntu), and macOS
- **Smoke tests**: Added automated smoke tests for CLI and API functionality across all platforms
- **Environment files**: Added `requirements.txt`, `requirements-dev.txt`, and `environment.yml` for broader compatibility
- **Headless mode documentation**: Complete documentation and examples for running SAES without display (CI/CD, servers)
- **Headless mode examples**: Python and shell script examples for automated workflows (`examples/headless_mode_example.py`, `examples/headless_cli_example.sh`, `examples/headless_cli_example.bat`)
- **New test suite**: Added `test_bayesian_seed.py` specifically for verifying seed reproducibility
- **Documentation**: New reproducibility documentation page (`docs/usage/reproducibility.rst`) with best practices

### Changed
- Updated `test_bayesian.py` to demonstrate both seed parameter usage and backward compatibility
- Enhanced README with sections on reproducibility and headless mode
- Improved documentation structure to include reproducibility guidance

### Fixed
- Ensured all random operations can be made deterministic for reproducible research

## [1.4.0] - 2025-11-15

### Added
45 changes: 45 additions & 0 deletions README.md
@@ -124,6 +124,51 @@ source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -e ".[dev]"
```

### Using Environment Files

For broader compatibility, environment files are provided:

```sh
# Using pip with requirements.txt
pip install -r requirements.txt

# Using conda with environment.yml
conda env create -f environment.yml
conda activate saes
```

## 🔄 Reproducibility

SAES supports **deterministic seeds** for reproducible research:

```python
from SAES.statistical_tests.bayesian import bayesian_sign_test
from SAES.plots.histoplot import HistoPlot

# Bayesian tests with seed for reproducibility
result, _ = bayesian_sign_test(data, sample_size=5000, seed=42)

# Histogram plots with consistent jitter
histoplot = HistoPlot(data, metrics, "Accuracy", seed=42)
```

See the [reproducibility documentation](https://jMetal.github.io/SAES/usage/reproducibility.html) for details.

## 💻 Headless Mode

SAES can run in headless mode (without display) for automated workflows, CI/CD pipelines, and server environments:

```bash
# Set matplotlib backend
export MPLBACKEND=Agg

# Run SAES commands
python -m SAES -ls -ds data.csv -ms metrics.csv -m HV -s friedman -op results.tex
python -m SAES -bp -ds data.csv -ms metrics.csv -m HV -i Problem1 -op boxplot.png
```

See `examples/headless_mode_example.py` for a complete Python example or `examples/headless_cli_example.sh` for CLI usage.
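The export works because matplotlib reads the `MPLBACKEND` environment variable at import time; the same effect can be had from Python before any plotting import. A minimal standard-library sketch (illustrative, not SAES code):

```python
import os

# Equivalent of `export MPLBACKEND=Agg`; must run before matplotlib is imported,
# because the backend is chosen when matplotlib first loads.
os.environ.setdefault("MPLBACKEND", "Agg")

print(os.environ["MPLBACKEND"])
```

`setdefault` keeps any backend the caller already chose, so the snippet is safe to put at the top of a script that is sometimes run interactively.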

## 🤝 Contributors

- [![GitHub](https://img.shields.io/badge/GitHub-100000?style=flat&logo=github&logoColor=white)](https://github.com/rorro6787) **Emilio Rodrigo Carreira Villalta**
20 changes: 18 additions & 2 deletions SAES/statistical_tests/bayesian.py
@@ -7,7 +7,8 @@ def bayesian_sign_test(data: pd.DataFrame,
rope_limits=[-0.01, 0.01],
prior_strength=0.5,
prior_place="rope",
sample_size=5000) -> tuple:
sample_size=5000,
seed=None) -> tuple:
"""
Performs the Bayesian sign test to compare the performance of two algorithms across multiple instances.
The Bayesian sign test is a non-parametric statistical test used to compare the performance of two algorithms on multiple instances. The null hypothesis is that the algorithms perform equivalently, which implies their average ranks are equal.
@@ -40,6 +41,9 @@ def bayesian_sign_test(data: pd.DataFrame,
sample_size (int):
Total number of random samples generated. Default is 5000.

seed (int, optional):
Random seed for reproducibility. Default is None (non-deterministic).

Returns:
tuple: A tuple containing the posterior probabilities and the samples drawn from the Dirichlet process. List of posterior probabilities:
- Pr(algorithm_1 < algorithm_2)
@@ -62,6 +66,10 @@ else:
else:
raise ValueError("Initialization ERROR. Incorrect number of dimensions for axis 1")

# Set random seed for reproducibility
if seed is not None:
np.random.seed(seed)

# Compute the differences
Z = sample1 - sample2

@@ -93,7 +101,8 @@ def bayesian_signed_rank_test(data,
rope_limits=[-0.01, 0.01],
prior_strength=1.0,
prior_place="rope",
sample_size=1000) -> tuple:
sample_size=1000,
seed=None) -> tuple:
"""
Performs the Bayesian version of the signed rank test to compare the performance of two algorithms across multiple instances.
The Bayesian sign test is a non-parametric statistical test used to compare the performance of two algorithms on multiple instances. The null hypothesis is that the algorithms perform equivalently, which implies their average ranks are equal.
@@ -126,6 +135,9 @@ def bayesian_signed_rank_test(data,
sample_size (int):
Total number of random samples generated. Default is 1000.

seed (int, optional):
Random seed for reproducibility. Default is None (non-deterministic).

Returns:
tuple: A tuple containing the posterior probabilities and the samples drawn from the Dirichlet process. List of posterior probabilities:
- Pr(algorithm_1 < algorithm_2)
@@ -153,6 +165,10 @@ def weights(n, s):
else:
raise ValueError("Initialization ERROR. Incorrect number of dimensions for axis 1")

# Set random seed for reproducibility
if seed is not None:
np.random.seed(seed)

# Compute the differences
Z = sample1 - sample2
Z0 = [-float("Inf"), 0.0, float("Inf")][["left", "rope", "right"].index(prior_place)]
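The seeding pattern added in this diff (seed the global NumPy RNG once, then draw) can be exercised on its own. A minimal sketch using NumPy's Dirichlet sampler as a stand-in for the tests' internal draws (illustrative only, not SAES code):

```python
import numpy as np

def draws(seed, sample_size=1000):
    # Same pattern as the patched functions: seed the global RNG, then sample
    if seed is not None:
        np.random.seed(seed)
    return np.random.dirichlet([1.0, 1.0, 1.0], size=sample_size)

a = draws(42)
b = draws(42)
assert np.array_equal(a, b)  # same seed, identical posterior samples
```

One consequence of this design is that `np.random.seed` mutates global state, so a seeded call also affects any later unseeded NumPy sampling in the same process.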
94 changes: 94 additions & 0 deletions SOFTWARE_X_COMPLIANCE.md
@@ -0,0 +1,94 @@
# Software X Compliance

SAES meets all Software X publication requirements.

## 1. Deterministic Seeds ✅

Bayesian tests support `seed` parameter for reproducibility:

```python
from SAES.statistical_tests.bayesian import bayesian_sign_test
result, _ = bayesian_sign_test(data, sample_size=1000, seed=42)
```

## 2. Multi-Platform CI ✅

`.github/workflows/multi-platform-test.yml` tests on:
- Ubuntu, Windows, macOS
- Python 3.10, 3.11, 3.12

## 3. Smoke Tests ✅

Run the comprehensive, fully automated smoke tests (10 tests):

```bash
chmod +x examples/smoke_test.sh
./examples/smoke_test.sh
```

The script automatically:
- Creates virtual environment if needed
- Installs dependencies
- Runs all tests in headless mode
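The headless execution the script performs can be mirrored from Python; a hypothetical sketch (the command list is a stand-in for the real SAES CLI invocations):

```python
import os
import subprocess
import sys

# Inject the headless matplotlib backend into every child process,
# as the smoke script does via `export MPLBACKEND=Agg`
env = dict(os.environ, MPLBACKEND="Agg")

# Placeholder commands; the real script runs `python -m SAES ...` invocations
commands = [
    [sys.executable, "--version"],
]
for cmd in commands:
    subprocess.run(cmd, check=True, env=env)
```

Using an explicit `env` rather than mutating `os.environ` keeps the parent process's configuration untouched.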

**Tests cover:**
- **LaTeX tables** (4): Mean/Median, Friedman, Wilcoxon pivot, Wilcoxon pairwise
- **Plots** (3): Boxplot single, Boxplot grid, Critical distance
- **Statistical APIs** (3): Bayesian tests with seeds, Plot classes

## 4. Environment Files ✅

Multiple installation options:

```bash
# Option 1: Requirements file
pip install -r requirements.txt

# Option 2: Conda
conda env create -f environment.yml
conda activate saes

# Option 3: Auto-install (smoke test does this)
./examples/smoke_test.sh
```

Files provided:
- `requirements.txt` - Core dependencies
- `requirements-dev.txt` - Development dependencies
- `environment.yml` - Conda environment

## 5. Headless Mode ✅

For CI/CD and server environments:

```bash
export MPLBACKEND=Agg
python -m SAES -ls -ds data.csv -ms metrics.csv -m HV -s friedman -op output.tex
```

The smoke test script runs in headless mode by default.

## Quick Start

```bash
# Clone and test
git clone https://github.com/jMetal/SAES.git
cd SAES
chmod +x examples/smoke_test.sh
./examples/smoke_test.sh
```

## Verification

```bash
# Smoke tests (automated setup)
./examples/smoke_test.sh

# Unit tests
python -m unittest discover tests
```

## Branch

Feature branch: `feature/software-x-requirements`
Ready for merge (not merged yet, as requested)
138 changes: 138 additions & 0 deletions docs/usage/reproducibility.rst
@@ -0,0 +1,138 @@
Reproducibility and Seeds
=========================

SAES supports deterministic behavior for reproducible research through random seed control.

Why Reproducibility Matters
---------------------------

When analyzing stochastic algorithms, reproducibility is crucial for:

- **Research validation**: Others can verify your results
- **Debugging**: Consistent results make it easier to identify issues
- **Comparisons**: Fair comparison requires consistent conditions
- **Publication**: Many journals and conferences require reproducible results

Functions with Random Seeds
---------------------------

The following SAES functions support deterministic execution via the ``seed`` parameter:

Bayesian Statistical Tests
~~~~~~~~~~~~~~~~~~~~~~~~~~

Both Bayesian tests support the ``seed`` parameter for reproducibility:

.. code-block:: python

from SAES.statistical_tests.bayesian import bayesian_sign_test, bayesian_signed_rank_test
import pandas as pd

data = pd.DataFrame({
'Algorithm_A': [0.9, 0.85, 0.95, 0.9, 0.92],
'Algorithm_B': [0.5, 0.6, 0.55, 0.58, 0.52]
})

# Deterministic results with seed
result1, _ = bayesian_sign_test(data, sample_size=5000, seed=42)
result2, _ = bayesian_sign_test(data, sample_size=5000, seed=42)
# result1 and result2 will be identical

# Same for signed rank test
result3, _ = bayesian_signed_rank_test(data, sample_size=1000, seed=123)

Histogram Plots
~~~~~~~~~~~~~~~

The HistoPlot class supports seeding for consistent jitter when handling identical values:

.. code-block:: python

from SAES.plots.histoplot import HistoPlot
import pandas as pd

data = pd.read_csv("results.csv")
metrics = pd.read_csv("metrics.csv")

# Create histoplot with reproducible jitter
histoplot = HistoPlot(data, metrics, "Accuracy", seed=42)
histoplot.save_instance("Problem1", "output.png")

Best Practices
--------------

1. **Always use seeds for published research**: Set explicit seeds for all random operations
2. **Document your seeds**: Include seed values in your research papers and code
3. **Use different seeds for different experiments**: Avoid accidentally reusing the same random sequence
4. **Version control**: Include seed values in your version-controlled analysis scripts

Example: Complete Reproducible Workflow
---------------------------------------

.. code-block:: python

from SAES.statistical_tests.bayesian import bayesian_sign_test, bayesian_signed_rank_test
from SAES.plots.histoplot import HistoPlot
import pandas as pd

# Load data
data = pd.read_csv("algorithm_results.csv")
metrics = pd.read_csv("metrics.csv")

# Reproducible Bayesian analysis
SEED = 42
algorithm_a = data[data['Algorithm'] == 'A']['MetricValue']
algorithm_b = data[data['Algorithm'] == 'B']['MetricValue']

comparison_data = pd.DataFrame({
'Algorithm_A': algorithm_a.values,
'Algorithm_B': algorithm_b.values
})

# Run Bayesian test with seed
result, samples = bayesian_sign_test(
comparison_data,
sample_size=5000,
seed=SEED
)

print(f"P(A < B): {result[0]:.4f}")
print(f"P(A ≈ B): {result[1]:.4f}")
print(f"P(A > B): {result[2]:.4f}")

# Create reproducible visualization
histoplot = HistoPlot(data, metrics, "Accuracy", seed=SEED)
histoplot.save_all_instances("comparison.png")

Headless Mode for Automated Workflows
-------------------------------------

SAES can be run in headless mode (without display) for automated pipelines and CI/CD:

.. code-block:: bash

# Set matplotlib to use non-interactive backend
export MPLBACKEND=Agg

# Run SAES commands
python -m SAES -ls -ds data.csv -ms metrics.csv -m HV -s friedman -op results.tex
python -m SAES -bp -ds data.csv -ms metrics.csv -m HV -i Problem1 -op boxplot.png
python -m SAES -cdp -ds data.csv -ms metrics.csv -m HV -op cdplot.png

For Python scripts in headless environments:

.. code-block:: python

import matplotlib
matplotlib.use('Agg') # Must be called before importing pyplot

from SAES.plots.boxplot import Boxplot
import pandas as pd

# Your analysis code here
data = pd.read_csv("results.csv")
metrics = pd.read_csv("metrics.csv")

boxplot = Boxplot(data, metrics, "Accuracy")
boxplot.save_instance("Problem1", "output.png")

1 change: 1 addition & 0 deletions docs/usage/usage.rst
@@ -14,3 +14,4 @@ This section provides a brief overview of the three different features that this
html
bayesian
violin
reproducibility