
Conversation

@timlichtenberg (Member) commented Jan 3, 2026

Ready for first review – this is going to take a few rounds 🫠

Description

This PR establishes a comprehensive, standardized testing infrastructure for the entire PROTEUS ecosystem. It implements significant CI/CD improvements and automated coverage ratcheting, and provides extensive documentation and tooling to support consistent testing practices across all modules.

Starts to address #507 and prepares a unified testing infrastructure for the PROTEUS ecosystem. The idea is that all modules adhere to the same (high) testing standards. Tests can be written with Copilot, but the instructions must be rigorous, and development and PR reviews must be guided. A human has to evaluate the end result at all times, ideally more than one person.

Key Changes

1. CI/CD Infrastructure

  • Restructured GitHub Actions Workflows (.github/workflows/ci_tests.yml)

    • Split matrix into separate test-linux and test-macos jobs for better control
    • Implemented hash-based caching for SOCRATES binaries and AGNI Julia depot (~5 min savings)
    • Added nightly full matrix testing (2 AM UTC) with macOS support
    • Python 3.13 is always tested on Ubuntu; macOS runs only in the nightly schedule for extended testing
    • Fixed NumPy 2.0 compatibility issues across PROTEUS, MORS, and aragog
  • Reusable Quality Gate Workflow (.github/workflows/proteus_test_quality_gate.yml)

    • Centralized testing workflow for entire PROTEUS ecosystem
    • Configurable coverage thresholds (recommend 30-80% for new modules); the core check is sketched below
    • Automatic Codecov integration and HTML artifact generation
    • Can be called from any submodule (CALLIOPE, JANUS, MORS, VULCAN, ZEPHYRUS, etc.)
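
For orientation, the core check that such a quality gate performs is sketched below in Python. This is only an illustration of the idea, not the contents of proteus_test_quality_gate.yml; the coverage.json file name and the 30% default threshold are assumptions taken from the recommendations above.

```python
"""Minimal sketch of a coverage quality gate check (illustrative only)."""
import json
import sys

# Assumed inputs: coverage.json produced by `coverage json`,
# and an optional threshold argument (default 30%, the suggested floor for new modules).
threshold = float(sys.argv[1]) if len(sys.argv) > 1 else 30.0

with open("coverage.json") as fh:
    total = json.load(fh)["totals"]["percent_covered"]

print(f"Total coverage: {total:.2f}% (threshold: {threshold:.2f}%)")
if total < threshold:
    print("Coverage below threshold - quality gate failed")
    sys.exit(1)
print("Quality gate passed")
```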

2. Test Infrastructure Documentation

  • Comprehensive Testing Guide (docs/test_infrastructure.md)
    • Complete architecture overview and configuration guidance
    • Ecosystem rollout plan with phased deployment
    • Developer workflow with common commands
    • Troubleshooting section for pytest, coverage, and CI issues
    • Best Practices including GitHub Copilot integration guidelines

3. GitHub Copilot Integration

  • Copilot Instructions (.github/workflows/copilot-instructions.md)
    • Ecosystem-wide coding guidelines for AI-assisted development
    • Complete module structure with GitHub repository links
    • Test infrastructure standards (structure, speed, coverage, markers)
    • Automatic coverage ratcheting guidance
    • Code quality standards (ruff, type hints, docstrings)
    • Installation and dependency management references

4. Automated Coverage Ratcheting

  • Threshold Auto-Update Mechanism (tools/update_coverage_threshold.py)

    • Automatically increases fail_under threshold when coverage improves
    • Never decreases - implements the "coverage ratcheting" pattern (sketched below)
    • Runs on main branch CI to lock in progress
    • See CALLIOPE for reference implementation (18% → auto-updating)
  • Documentation Updates

    • Updated all coverage goals to 80%+ ecosystem-wide standard
    • Promotes automatic ratcheting over manual threshold updates
    • Risk-based prioritization: High-risk 95%+, Medium-risk 80%+, Low-risk 60%+
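
The ratcheting logic, sketched below in Python as a simplified illustration. This is not the exact contents of tools/update_coverage_threshold.py; it assumes a coverage.json report from coverage.py and a fail_under key in pyproject.toml.

```python
"""Sketch of coverage ratcheting: raise fail_under when coverage improves, never lower it."""
import json
import re
from pathlib import Path

# Current total coverage, as reported by `coverage json`.
current = json.loads(Path("coverage.json").read_text())["totals"]["percent_covered"]

pyproject = Path("pyproject.toml")
text = pyproject.read_text()
match = re.search(r"^(\s*)fail_under\s*=\s*([\d.]+)", text, flags=re.MULTILINE)
threshold = float(match.group(2))

if current > threshold:
    # Ratchet upward: lock in the new, higher coverage as the minimum.
    text = re.sub(
        r"^(\s*)fail_under\s*=\s*[\d.]+",
        rf"\g<1>fail_under = {current:.2f}",
        text,
        count=1,
        flags=re.MULTILINE,
    )
    pyproject.write_text(text)
    print(f"[+] Raised fail_under from {threshold:.2f} to {current:.2f}")
else:
    # Never decrease: unchanged or lower coverage leaves the threshold alone.
    print(f"[=] fail_under stays at {threshold:.2f} (current coverage: {current:.2f}%)")
```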

5. Testing Tools

  • tools/validate_test_structure.sh - Verify tests mirror source structure (idea sketched below)
  • tools/restructure_tests.sh - Automatically reorganize tests
  • tools/coverage_analysis.sh - Module-level coverage reporting
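
The structure check, sketched below as a hedged Python rendering of the validation idea, assuming tests/ is expected to mirror the package directories under src/proteus/. The skipped directory names come from the validation script; everything else is illustrative, not the actual shell tool.

```python
"""Sketch: check that every package under src/proteus/ has a mirrored tests/<package>/ directory."""
from pathlib import Path

# Directory names that are intentionally not mirrored (taken from the validation script).
SKIP = {"__pycache__", "data", "helpers", "integration"}

src_packages = {p.name for p in Path("src/proteus").iterdir() if p.is_dir() and p.name not in SKIP}
test_packages = {p.name for p in Path("tests").iterdir() if p.is_dir() and p.name not in SKIP}

missing = sorted(src_packages - test_packages)
for name in missing:
    print(f"[!] src/proteus/{name}/ has no mirrored tests/{name}/ directory")

print("[+] Test structure mirrors source" if not missing else f"[!] {len(missing)} package(s) missing test directories")
```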

Current Status

PROTEUS:

  • Coverage: 69.23% (target: 80%+)
  • CI: ✅ Passing on tl/test_ecosystem_v1 (Run #20668634326)
  • Features: Hash-based caching, dynamic badges, comprehensive reporting

Related Work:

  • CALLIOPE: Draft PR on tl/test_ecosystem_calliope branch (to be submitted)
    • Implementing automatic coverage ratcheting (18% baseline)
    • Adding reusable quality gate workflow
    • Setting up test infrastructure mirroring

What's Next (Post-Merge)

  1. Rework CI runs for faster deployment. I am not yet happy with how long the tests take. I will work on a modular setup with fast checks on PRs and nightly science builds using automatically regenerated Docker images. This will take a while, so this PR is a snapshot on the way there.

  2. Implementation of proper documentation and usage instructions for all PROTEUS developers. The idea is to enforce much stricter testing routines: new code must immediately come with tests for that code. This requires that everyone knows what they have to do. The tests can be written by Copilot or similar, but they need to adhere to the ecosystem standards.

  3. Deploy to JANUS and MORS

    • Add reusable quality gate workflows
    • Implement automatic coverage ratcheting
    • Set realistic starting thresholds (20-30%)
  4. Bootstrap VULCAN, ZEPHYRUS, aragog

    • Set up CI/CD from scratch
    • Implement test structure validation
    • Start with 20% coverage baseline
  5. Implement in non-Python codes: AGNI, Obliqua

    • Set up CI/CD from scratch
    • Implement test structure validation
    • Start with 20% coverage baseline
  6. Ecosystem Monitoring

    • Track coverage trends across all modules
    • Monthly review of ratcheting progress
    • Quarterly goal adjustments

Validation of changes

  • ✅ CI passing on Ubuntu Python 3.12 (primary platform)
  • ✅ CI passing on macOS nightly tests
  • ✅ NumPy 2.0 compatibility verified across PROTEUS, MORS, aragog
  • ✅ SOCRATES/AGNI caching reduces CI time by ~5 minutes
  • ✅ All documentation reviewed and updated
  • ✅ Coverage ratcheting mechanism tested (on both the PROTEUS and CALLIOPE test branches)

Test Configuration:

  • macOS 15.2 (M3) with Python 3.13
  • Ubuntu 24.04 (CI) with Python 3.13
  • All tests pass with pytest --cov

Checklist

  • I have followed the contributing guidelines
  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • My changes generate no new warnings or errors
  • I have checked that the tests still pass on my computer
  • I have updated the docs, as appropriate (extensive documentation added)
  • I have added tests for these changes, as appropriate (CI/infrastructure changes)
  • I have checked that all dependencies have been updated, as required

Relevant people

@FormingWorlds/proteus-maintainer @FormingWorlds/proteus-developer
@nichollsh If you can have an initial look sometime soon that'd be good, we'll discuss on Monday.


Additional Context

This PR represents ~35 commits of iterative improvements to testing infrastructure, including:

  • CI workflow debugging and optimization
  • NumPy 2.0 compatibility fixes
  • Documentation enhancements
  • Copilot integration guidelines
  • Automatic coverage ratcheting implementation

The infrastructure is designed to be modular and reusable across the entire PROTEUS ecosystem, with CALLIOPE serving as the first external adoption case.

Most important documents: make sure to check these out:

The CI workflow uses 'coverage run -m pytest' to collect coverage data.
Having --cov options in pytest addopts creates a conflict that prevents
coverage measurement. Coverage reporting is still configured in
[tool.coverage.report] section and will now work properly with the
CI command.

Now that NumPy 2.0 fixes are merged to aragog main (PR #5),
remove the temporary test branch reference from CI workflow.

Related: FormingWorlds/aragog#5

Implements performance optimizations to reduce CI runtime:

1. Cache SOCRATES compiled binaries (~7-8 min savings)
   - Caches socrates/ directory with binaries
   - Key based on source file hashes for automatic invalidation
   - Restore-keys for partial cache hits

2. Cache AGNI Julia depot (~3-4 min savings)
   - Caches AGNI/ directory and ~/.julia/ packages
   - Key based on Julia source and manifest files
   - Restore-keys for partial cache hits

Expected improvement: 10-12 minutes saved per CI run (from ~26 to ~14-16 minutes)

These caches only rebuild when source files change, otherwise use
cached binaries/packages from previous runs.

Previous attempt failed because:
- Cache keys used hashFiles() on directories that didn't exist yet
- Result: empty cache keys like 'socrates-bins-Linux-'
- No cache was ever restored or saved properly

New approach:
- Use cache/restore before install-all to load previous build
- Use cache/save after tests to save new build
- Key based on run_id (unique) with restore-keys for prefix matching
- Only save cache if restore missed (avoid duplicate saves)

This allows second and subsequent runs to skip 10-12 minutes of compilation.

CRITICAL FIX: Cache keys now depend on source/dependency hashes instead
of run ID. This ensures:

1. Cache is automatically INVALIDATED when source code changes
2. Cache is automatically INVALIDATED when dependencies change
3. Tests always use current SOCRATES and AGNI versions
4. No stale cached code is used if upstream repos change

SOCRATES cache:
- Key: hash of *.f90, *.F90, *.c files, and build_code script
- Invalidates when any Fortran/C source changes

AGNI cache:
- Key: hash of Project.toml and Manifest.toml files
- Invalidates when Julia dependencies change

Benefits:
✓ 32% CI speedup (26m → 18m) when deps unchanged
✓ Automatic detection of upstream changes
✓ No stale cache issues
✓ Maintain testing integrity

Note: Pre-existing linting warnings about env.total are unrelated to
this change and do not affect workflow execution.
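
As an aside, to make the hash-based cache keys described above concrete outside of GitHub Actions' hashFiles() syntax, here is a hedged Python analogue: the key is derived from file contents, so it changes automatically whenever any hashed source or manifest file changes. The globs mirror the ones listed above; the key format itself is illustrative only.

```python
"""Sketch: derive cache keys from file contents, so keys change whenever sources change."""
import hashlib
import platform
from pathlib import Path

def content_hash(patterns: list[str]) -> str:
    """Hash the contents of all files matching the given globs (sorted for determinism)."""
    digest = hashlib.sha256()
    for pattern in patterns:
        for path in sorted(Path(".").glob(pattern)):
            if path.is_file():
                digest.update(path.read_bytes())
    return digest.hexdigest()[:16]

# SOCRATES: key depends on the Fortran/C sources and the build script.
socrates_key = f"socrates-bins-{platform.system()}-" + content_hash(
    ["socrates/**/*.f90", "socrates/**/*.F90", "socrates/**/*.c", "socrates/build_code"]
)

# AGNI: key depends on the Julia project/manifest files.
agni_key = f"agni-depot-{platform.system()}-" + content_hash(
    ["AGNI/Project.toml", "AGNI/Manifest.toml"]
)

print(socrates_key, agni_key)
```
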
Skip disk cleanup when available space is >80%, saving ~2m30s per run

Changes:
- Add 'Check available disk space' step that calculates usage percentage
- Modify 'Free Disk Space (Ubuntu)' condition to only run if usage >20%
- Threshold can be tuned; 80% is conservative to prevent full disk

Expected savings: 2m 30s per build (disk rarely critical)
Impact: ~15% CI runtime reduction on test branches
Risk: LOW - cleanup still triggers if disk space actually needed

Critical bug fix: hashFiles() was evaluating on non-existent directories

Root cause:
- Cache restore steps tried to hash 'socrates/**/*.f90' files
- But socrates/ directory didn't exist yet (cloned later in install-all)
- Result: Empty hash → cache key 'socrates-bins-Linux-' (missing hash)
- Cache always missed → SOCRATES recompiled every run (+11-15 min)

Solution:
- Clone SOCRATES and AGNI repos BEFORE cache restore steps
- Now hashFiles() can compute proper hashes
- Cache keys like 'socrates-bins-Linux-abc123def456' work correctly
- proteus install-all will use existing clones (no duplicate work)

Expected impact:
- Cache hits will now work properly
- Saves 11-15 minutes when SOCRATES source unchanged
- Saves 2-3 minutes when AGNI dependencies unchanged
- Reduces run from 48m to ~17-19m when caches hit

…nd 69% coverage threshold

Priority 1 improvements to testing_infrastructure.md:

Changes:
- Document PROTEUS Phase 1 completion (69.23% coverage achieved)
- Add comprehensive Phase 2 ecosystem integration guide
- Create 4-step quick start deployment for ecosystem modules
- Add advanced hash-based caching strategy documentation
- Update coverage threshold progression from 5% to 69% for PROTEUS
- Change reusable workflow default threshold from 5% to 30% (realistic for new modules)
- Add deployment checklist (~2 hours per module)
- Include performance expectations and troubleshooting for caching

Files modified:
- docs/testing_infrastructure.md: +317 lines (comprehensive ecosystem rollout guide)
- pyproject.toml: fail_under = 69 (enforces actual achieved coverage)
- .github/workflows/proteus_test_quality_gate.yml: improved default threshold and guidance

This enables ecosystem modules (CALLIOPE, JANUS, MORS, VULCAN, ZEPHYRUS) to deploy
quality gates with clear configuration examples, realistic thresholds, and validated
patterns from PROTEUS implementation.

- Add CALLIOPE as Phase 2 pilot reference implementation
- Document coverage ratcheting mechanism (auto-threshold updates)
- Establish ecosystem integration standards (Codecov, artifacts, test quality)
- Provide 4 direct reference links to CALLIOPE working examples
- Update Phase 2 quick start with CALLIOPE patterns
- Clarify rollout strategy for JANUS/MORS (Phase 2b/2c)

- Add tools/update_coverage_threshold.py for automatic threshold updates
- Implement coverage ratcheting step in CI (only increases, never decreases)
- Rename .github/workflows/ci.yml to ci_tests.yml for consistency with CALLIOPE
- Rename docs/testing_infrastructure.md to test_infrastructure.md (shorter, clearer)
- Update all references to renamed files in workflows and documentation
- Update pyproject.toml with ratcheting mechanism comments
- Add test_infrastructure.md to mkdocs.yml navigation

Coverage ratcheting ensures sustainable progress: threshold automatically
increases when coverage improves on main branch, preventing regression.
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 26 out of 51 changed files in this pull request and generated 5 comments.



…ML code block

- Convert multiline YAML block scalars to single-line format for cache keys (4 occurrences)
- Fix unclosed TOML code block in test_infrastructure.md
- Addresses review #579 (review)
@timlichtenberg timlichtenberg marked this pull request as ready for review January 4, 2026 11:37
@timlichtenberg timlichtenberg requested a review from a team as a code owner January 4, 2026 11:37
Copilot AI review requested due to automatic review settings January 4, 2026 11:37
@timlichtenberg timlichtenberg self-assigned this Jan 4, 2026
@timlichtenberg timlichtenberg added the Enhancement, Software, Priority 1: critical, Tests, Docs, and PROTEUS labels Jan 4, 2026
@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8b59e65fa3


Comment on lines +144 to +146
elif new_threshold == current_threshold:
    print(f"[=] Coverage threshold already at {current_threshold:.2f}% (no update needed)")
    return 1


P1: Ratcheting script aborts CI when coverage is unchanged

The new coverage ratcheting helper returns a non-zero status whenever coverage stays at the existing threshold (elif new_threshold == current_threshold: … return 1). In the Linux CI workflow the step at lines 321-328 runs python tools/update_coverage_threshold.py without continue-on-error, so a “no update needed” case (the common path on main) will cause the job to fail even though tests and coverage pass. The script should exit successfully when no update is required or the workflow step should tolerate that condition.
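
A possible shape of the fix, as a hypothetical sketch (the function name and surrounding structure are assumptions, not the script's actual code): treat an unchanged threshold as success so the CI step does not fail.

```python
def decide_ratchet(current_threshold: float, new_threshold: float) -> int:
    """Hypothetical rework of the decision logic: an unchanged threshold is not an error."""
    if new_threshold > current_threshold:
        print(f"[+] Raising coverage threshold to {new_threshold:.2f}%")
        return 0  # caller writes the new value to pyproject.toml and commits it
    if new_threshold == current_threshold:
        print(f"[=] Coverage threshold already at {current_threshold:.2f}% (no update needed)")
        return 0  # success: nothing to commit, the CI step should not fail
    print(f"[-] Coverage ({new_threshold:.2f}%) below threshold ({current_threshold:.2f}%); keeping it unchanged")
    return 0  # ratcheting never decreases the threshold
```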


Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 26 out of 51 changed files in this pull request and generated 6 comments.



Comment on lines +321 to +349
- name: Update coverage threshold
  if: ${{ github.ref == 'refs/heads/main' && env.PYTHON_VERSION == '3.12' && runner.os == 'Linux' && !failure() }}
  env:
    GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
  run: |
    # Automatically ratchet coverage threshold upward when tests pass on main
    python tools/update_coverage_threshold.py
    # Check if pyproject.toml was modified
    if git diff --quiet pyproject.toml; then
      echo "No coverage threshold update needed"
    else
      echo "Coverage threshold increased - committing update"
      git config user.name "github-actions[bot]"
      git config user.email "github-actions[bot]@users.noreply.github.com"
      git add pyproject.toml
      COVERAGE_PCT=$(python -c 'import json; print(json.load(open("coverage.json"))["totals"]["percent_covered"])')
      git commit -m "ci: Auto-update coverage threshold to ${COVERAGE_PCT}% [skip ci]"
      # Rebase on top of the latest main to avoid conflicts from concurrent pushes
      if ! git pull --rebase origin "${{ github.ref_name }}"; then
        echo "Rebase failed (likely due to concurrent updates). Aborting automatic coverage threshold push."
        git rebase --abort || true
        exit 0
      fi
      git push origin HEAD:${{ github.ref_name }} || {
        echo "Failed to push coverage threshold update. You may need to resolve conflicts or permissions issues."
        exit 1
      }
    fi
Copilot AI commented Jan 4, 2026

The git rebase and push logic (lines 340-348) in the coverage threshold update step could cause issues in concurrent scenarios. If multiple CI runs complete simultaneously on the main branch, they could conflict when trying to update the coverage threshold. While line 341 mentions this scenario, the error handling with exit 0 (silent success) might hide legitimate push failures. Consider adding more robust conflict resolution or using a PR-based approach for threshold updates to ensure changes are reviewed.

Comment on lines +645 to +662
- name: Clone SOCRATES
  run: git clone --depth 1 --branch v1.2.3 https://github.com/nichollsh/SOCRATES.git socrates
# Now cache restore can hash the source files
- name: Restore SOCRATES cache
  uses: actions/cache/restore@v4
  id: cache-socrates
  with:
    path: socrates/
    # Hash changes = cache miss = recompile (correct behavior)
    key: |
      socrates-${{ runner.os }}-${{ hashFiles(
        'socrates/**/*.f90',
        'socrates/**/*.c'
      ) }}
    restore-keys: |
      socrates-${{ runner.os }}-
# Build if cache missed
- name: Build SOCRATES (if needed)
  if: steps.cache-socrates.outputs.cache-hit != 'true'
  run: cd socrates && ./build_code
Copilot AI commented Jan 4, 2026

The hash-based caching example here recommends cloning nichollsh/SOCRATES from GitHub using git clone --depth 1 --branch v1.2.3 ... and then running ./build_code, but the dependency is pinned only to a mutable tag, not an immutable commit. If that repository or the v1.2.3 tag is ever compromised or moved, CI workflows that copy-paste this pattern will execute attacker-controlled code in the build environment with access to repository credentials. To reduce supply-chain risk, update this guidance to pin SOCRATES (and similar third-party tools) to specific, vetted commit SHAs or vendored snapshots rather than branches/tags, and encourage verifying integrity before execution.

Member:

Presumably we will always be pulling from the main branch anyway?

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings January 4, 2026 11:47
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 26 out of 51 changed files in this pull request and generated 6 comments.



Comment on lines +43 to +65
if [[ "$module" == "data" || "$module" == "helpers" || "$module" == "integration" || "$module" == *__pycache__* ]]; then
continue
fi

# Count test files
test_files=$(find "$test_dir" -name "test_*.py" 2>/dev/null | wc -l)

if [ "$test_files" -eq 0 ]; then
echo "[!] No test files in $test_dir"
else
echo "[+] $test_files test file(s) in $test_dir"
fi
done

echo ""
echo "Checking for __init__.py files..."
init_missing=0
for test_dir in tests/*/; do
module=$(basename "$test_dir")

# Skip special directories
if [[ "$module" == "data" || "$module" == "helpers" || "$module" == *__pycache__* ]]; then
continue
Copilot AI commented Jan 4, 2026

The validation script checks for __init__.py files in test directories but skips the "integration" directory on line 43. However, on line 64, it doesn't skip "integration" when checking for missing __init__.py files. This inconsistency could lead to false warnings if the integration directory exists but doesn't have test files. The skip list should be consistent across both checks.

Comment on lines +122 to +134
minversion = "8.1"
addopts = [
# Global coverage options removed from addopts.
# CI uses "coverage run -m pytest" with [tool.coverage.*] settings.
# For local development, use either:
# 1. "coverage run -m pytest" (matches CI behavior, compatible with coverage ratcheting)
# 2. "pytest --cov" (uses pytest-cov, convenient but slightly different from CI)
# Both approaches work; choose based on preference.
"--strict-markers",
"--strict-config",
"-ra",
"--showlocals",
]
Copilot AI commented Jan 4, 2026

The pyproject.toml comments mention that both coverage[toml] and pytest-cov work independently, and that CI uses "coverage run -m pytest" while developers can use "pytest --cov". However, the pytest.ini_options addopts no longer includes any coverage flags. This is good for flexibility, but the documentation should clarify that developers need to explicitly add --cov when running pytest locally, or use "coverage run -m pytest" to match CI. Consider adding this clarification to the comment block.

# See: tools/update_coverage_threshold.py and .github/workflows/ci_tests.yml.
# The ratcheting (only ever increasing or staying the same) is enforced in CI;
# do not manually decrease this value in pyproject.toml.
fail_under = 69
Copilot AI commented Jan 4, 2026

The coverage threshold value of 69 should have a decimal component (69.0 or 69.00) to match the precision = 2 setting and be consistent with how the auto-update script formats values with 2 decimal places using f"{new_threshold:.2f}". This ensures the format is consistent between manual and automated updates.

Suggested change:
- fail_under = 69
+ fail_under = 69.00

# test_cli.py and test_init.py stay at top level as they test root-level functionality

# Create __init__.py files in test directories for proper Python package structure
find tests -type d -name "[!_]*" -exec touch {}/__init__.py \;
Copilot AI commented Jan 4, 2026

The find command with -exec touch {}/__init__.py \; will create __init__.py files in ALL directories under tests/, including special directories like __pycache__, data, and helpers that were meant to be excluded. The command should filter these directories using -not -path or -prune options to avoid creating unnecessary files in excluded directories.
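
For illustration, a hedged Python sketch of the same step that honors the skip list (the directory names come from the validation script quoted earlier; this is not the actual fix in the script):

```python
"""Sketch: create __init__.py only in test directories outside the skip list."""
from pathlib import Path

SKIP = {"__pycache__", "data", "helpers"}  # names from the validation script's skip list

for directory in Path("tests").rglob("*"):
    if directory.is_dir() and not SKIP.intersection(directory.parts):
        (directory / "__init__.py").touch(exist_ok=True)
```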

Comment on lines +760 to +768
- name: Upload coverage to Codecov
  uses: codecov/codecov-action@v4
  if: matrix.python-version == '3.11'
  with:
    files: ./coverage.xml
    flags: unittests
    fail_ci_if_error: false
  env:
    CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
Copilot AI commented Jan 4, 2026

In this CI example, codecov/codecov-action@v4 is again used with CODECOV_TOKEN but pinned only to the mutable v4 tag, meaning a compromise or retagging of that action could let an attacker run arbitrary code in your pipeline and steal repository secrets. To harden the supply chain for this workflow, pin the Codecov action to a specific commit SHA (and update it intentionally over time) instead of relying on a floating version tag.

Member:

This does not seem particularly important.

Comment on lines +643 to +644
# Clone dependencies BEFORE cache restore (critical!)
# Pin to specific commit/tag for reproducibility and security
Copilot AI commented Jan 4, 2026

This GitHub Actions example clones SOCRATES from https://github.com/nichollsh/SOCRATES.git using a mutable tag (--branch v1.2.3), which creates a supply-chain risk: if that repository or tag is compromised or force-retagged, an attacker can run arbitrary code during ./build_code and access CI secrets or modify build artifacts. To mitigate, pin the dependency to an immutable commit SHA (and, if possible, verify signatures or checksums) and consider mirroring or vendoring the code under a trusted namespace you control.

@timlichtenberg (Member, Author):

@nichollsh We'll ignore this for now, but do you think we could move the socrates repository into FormingWorlds or the other new organisation that is coming up soon?

@nichollsh (Member) left a comment

These are excellent and much-needed changes. I have a few comments/suggestions.

```

## Uncovered Lines
<!-- From coverage report --show-missing -->
@nichollsh:

Might be useful to have the workflow run this command automatically.

name: Test Coverage Improvement
about: Track test coverage improvements for specific folders
title: 'Improve test coverage for [FOLDER]'
labels: 'testing, enhancement'
@nichollsh:

Change these labels to match our existing ones.

# Get Lovepy
- name: Get Lovepy
  run: |
    ./tools/get_lovepy.sh
@nichollsh:

Should soon change this to point to the new Obliqua repo. Could be done in this PR if the Love.jl repo is renamed soon.

maxColorRange: 90
valColorRange: ${{ steps.report-coverage.outputs.total }}

test-macos:
@nichollsh:

Writing the workflow this way leads to a lot of duplicated lines. Many of the steps below (e.g. pip install) are common between Ubuntu and MacOS. Could they be generalised and written only once?

E.g. have a single 'lane' of steps, which is skipped on macOS unless github.event_name == 'schedule'.

## Table of Contents

1. [Quick Start](#quick-start)
2. [Architecture Overview](#architecture-overview)
@nichollsh:

Not sure that all of these pages exist

**Usage:**

```bash
python tools/get_stellar_spectrum.py <star_name> [distance_au]
@nichollsh:

This will need to be updated when #575 is merged.

**Purpose:** Julia script for general post-processing of PROTEUS simulation outputs.

**What it does:**
- Reads HDF5 output files
@nichollsh:

NetCDF output files


### rheological.ipynb

**Purpose:** Jupyter notebook for analyzing and visualizing rheological properties computed during simulations.
@nichollsh:

"Jupyter notebook for testing parametrisation of rheological properties."

@@ -0,0 +1,85 @@
#!/bin/bash
# Script to restructure tests/ to mirror src/proteus/ directory structure
@nichollsh:

When would we run this script? Seems like it would only be needed once.

@@ -0,0 +1,172 @@
#!/usr/bin/env python3
"""Automatically update test coverage threshold based on current coverage.
@nichollsh:

Similarly to validate_test_structure.sh, should this be moved elsewhere? I understand that it is usually meant to be run automatically by the GH workflow rather than by a human.

@timlichtenberg timlichtenberg marked this pull request as draft January 10, 2026 08:44
@timlichtenberg (Member, Author) commented:

Working on these suggestions plus a few additions. I will need to test some new CI workflows directly on main and will commit them to this branch for that purpose. For that reason I have turned the PR back to draft until these workflows are running appropriately.


Labels

Docs, Enhancement, Priority 1: critical, PROTEUS, Software, Tests

Projects

Status: In Progress

3 participants