
Conversation

@timlichtenberg (Member) commented Jan 3, 2026

Ready for first review – this is going to take a few rounds 🫠

Description

This PR establishes a comprehensive, standardized testing infrastructure for the entire PROTEUS ecosystem. It implements significant CI/CD improvements and automated coverage ratcheting, and provides extensive documentation and tooling to support consistent testing practices across all modules.

Starts to address #507 and prepares a unified testing infrastructure for the PROTEUS ecosystem. The idea is that all modules adhere to the same (high) testing standards. Tests can be written with Copilot, but the instructions must be rigorous, and development and PR reviews must be guided. A human has to evaluate the end result at all times, ideally more than one person.

Key Changes

1. CI/CD Infrastructure

  • Restructured GitHub Actions Workflows (.github/workflows/ci_tests.yml)

    • Split matrix into separate test-linux and test-macos jobs for better control
    • Implemented hash-based caching for SOCRATES binaries and AGNI Julia depot (~5 min savings)
    • Added nightly full matrix testing (2 AM UTC) with macOS support
    • Python 3.13 is always tested on Ubuntu; macOS runs only in the nightly schedule for extended testing
    • Fixed NumPy 2.0 compatibility issues across PROTEUS, MORS, and aragog
  • Reusable Quality Gate Workflow (.github/workflows/proteus_test_quality_gate.yml)

    • Centralized testing workflow for entire PROTEUS ecosystem
    • Configurable coverage thresholds (recommend 30-80% for new modules); the core check is sketched below
    • Automatic Codecov integration and HTML artifact generation
    • Can be called from any submodule (CALLIOPE, JANUS, MORS, VULCAN, ZEPHYRUS, etc.)
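
For orientation, the core check that such a quality gate performs is sketched below in Python. This is only an illustration of the idea, not the contents of proteus_test_quality_gate.yml; the coverage.json file name and the 30% default threshold are assumptions taken from the recommendations above.

```python
"""Minimal sketch of a coverage quality gate check (illustrative only)."""
import json
import sys

# Assumed inputs: coverage.json produced by `coverage json`,
# and an optional threshold argument (default 30%, the suggested floor for new modules).
threshold = float(sys.argv[1]) if len(sys.argv) > 1 else 30.0

with open("coverage.json") as fh:
    total = json.load(fh)["totals"]["percent_covered"]

print(f"Total coverage: {total:.2f}% (threshold: {threshold:.2f}%)")
if total < threshold:
    print("Coverage below threshold - quality gate failed")
    sys.exit(1)
print("Quality gate passed")
```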

2. Test Infrastructure Documentation

  • Comprehensive Testing Guide (docs/test_infrastructure.md)
    • Complete architecture overview and configuration guidance
    • Ecosystem rollout plan with phased deployment
    • Developer workflow with common commands
    • Troubleshooting section for pytest, coverage, and CI issues
    • Best Practices including GitHub Copilot integration guidelines

3. GitHub Copilot Integration

  • Copilot Instructions (.github/workflows/copilot-instructions.md)
    • Ecosystem-wide coding guidelines for AI-assisted development
    • Complete module structure with GitHub repository links
    • Test infrastructure standards (structure, speed, coverage, markers)
    • Automatic coverage ratcheting guidance
    • Code quality standards (ruff, type hints, docstrings)
    • Installation and dependency management references

4. Automated Coverage Ratcheting

  • Threshold Auto-Update Mechanism (tools/update_coverage_threshold.py)

    • Automatically increases fail_under threshold when coverage improves
    • Never decreases - implements the "coverage ratcheting" pattern (sketched below)
    • Runs on main branch CI to lock in progress
    • See CALLIOPE for reference implementation (18% → auto-updating)
  • Documentation Updates

    • Updated all coverage goals to 80%+ ecosystem-wide standard
    • Promotes automatic ratcheting over manual threshold updates
    • Risk-based prioritization: High-risk 95%+, Medium-risk 80%+, Low-risk 60%+
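
The ratcheting logic, sketched below in Python as a simplified illustration. This is not the exact contents of tools/update_coverage_threshold.py; it assumes a coverage.json report from coverage.py and a fail_under key in pyproject.toml.

```python
"""Sketch of coverage ratcheting: raise fail_under when coverage improves, never lower it."""
import json
import re
from pathlib import Path

# Current total coverage, as reported by `coverage json`.
current = json.loads(Path("coverage.json").read_text())["totals"]["percent_covered"]

pyproject = Path("pyproject.toml")
text = pyproject.read_text()
match = re.search(r"^(\s*)fail_under\s*=\s*([\d.]+)", text, flags=re.MULTILINE)
threshold = float(match.group(2))

if current > threshold:
    # Ratchet upward: lock in the new, higher coverage as the minimum.
    text = re.sub(
        r"^(\s*)fail_under\s*=\s*[\d.]+",
        rf"\g<1>fail_under = {current:.2f}",
        text,
        count=1,
        flags=re.MULTILINE,
    )
    pyproject.write_text(text)
    print(f"[+] Raised fail_under from {threshold:.2f} to {current:.2f}")
else:
    # Never decrease: unchanged or lower coverage leaves the threshold alone.
    print(f"[=] fail_under stays at {threshold:.2f} (current coverage: {current:.2f}%)")
```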

5. Testing Tools

  • tools/validate_test_structure.sh - Verify tests mirror source structure (idea sketched below)
  • tools/restructure_tests.sh - Automatically reorganize tests
  • tools/coverage_analysis.sh - Module-level coverage reporting
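
The structure check, sketched below as a hedged Python rendering of the validation idea, assuming tests/ is expected to mirror the package directories under src/proteus/. The skipped directory names come from the validation script; everything else is illustrative, not the actual shell tool.

```python
"""Sketch: check that every package under src/proteus/ has a mirrored tests/<package>/ directory."""
from pathlib import Path

# Directory names that are intentionally not mirrored (taken from the validation script).
SKIP = {"__pycache__", "data", "helpers", "integration"}

src_packages = {p.name for p in Path("src/proteus").iterdir() if p.is_dir() and p.name not in SKIP}
test_packages = {p.name for p in Path("tests").iterdir() if p.is_dir() and p.name not in SKIP}

missing = sorted(src_packages - test_packages)
for name in missing:
    print(f"[!] src/proteus/{name}/ has no mirrored tests/{name}/ directory")

print("[+] Test structure mirrors source" if not missing else f"[!] {len(missing)} package(s) missing test directories")
```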

Current Status

PROTEUS:

  • Coverage: 69.23% (target: 80%+)
  • CI: ✅ Passing on tl/test_ecosystem_v1 (Run #20668634326)
  • Features: Hash-based caching, dynamic badges, comprehensive reporting

Related Work:

  • CALLIOPE: Draft PR on tl/test_ecosystem_calliope branch (to be submitted)
    • Implementing automatic coverage ratcheting (18% baseline)
    • Adding reusable quality gate workflow
    • Setting up test infrastructure mirroring

What's Next (Post-Merge)

  1. Rework CI runs for faster deployment. I am not yet happy with how long the tests take. I will work on a modular setup with fast checks on PRs and nightly science builds using automatically regenerated Docker images. This will take a while, so this PR is a snapshot on the way there.

  2. Implementation of proper documentation and usage instructions for all PROTEUS developers. The idea is to enforce much stricter testing routines: new code must immediately come with tests for that code. This requires that everyone knows what they have to do. The tests can be written by Copilot or similar, but they need to adhere to the ecosystem standards.

  3. Deploy to JANUS and MORS

    • Add reusable quality gate workflows
    • Implement automatic coverage ratcheting
    • Set realistic starting thresholds (20-30%)
  4. Bootstrap VULCAN, ZEPHYRUS, aragog

    • Set up CI/CD from scratch
    • Implement test structure validation
    • Start with 20% coverage baseline
  5. Implement in non-Python codes: AGNI, Obliqua

    • Set up CI/CD from scratch
    • Implement test structure validation
    • Start with 20% coverage baseline
  6. Ecosystem Monitoring

    • Track coverage trends across all modules
    • Monthly review of ratcheting progress
    • Quarterly goal adjustments

Validation of changes

  • ✅ CI passing on Ubuntu Python 3.12 (primary platform)
  • ✅ CI passing on macOS nightly tests
  • ✅ NumPy 2.0 compatibility verified across PROTEUS, MORS, aragog
  • ✅ SOCRATES/AGNI caching reduces CI time by ~5 minutes
  • ✅ All documentation reviewed and updated
  • ✅ Coverage ratcheting mechanism tested (on both the PROTEUS and CALLIOPE test branches)

Test Configuration:

  • macOS 15.2 (M3) with Python 3.13
  • Ubuntu 24.04 (CI) with Python 3.13
  • All tests pass with pytest --cov

Checklist

  • I have followed the contributing guidelines
  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • My changes generate no new warnings or errors
  • I have checked that the tests still pass on my computer
  • I have updated the docs, as appropriate (extensive documentation added)
  • I have added tests for these changes, as appropriate (CI/infrastructure changes)
  • I have checked that all dependencies have been updated, as required

Relevant people

@FormingWorlds/proteus-maintainer @FormingWorlds/proteus-developer
@nichollsh If you can have an initial look sometime soon that'd be good, we'll discuss on Monday.


Additional Context

This PR represents ~35 commits of iterative improvements to testing infrastructure, including:

  • CI workflow debugging and optimization
  • NumPy 2.0 compatibility fixes
  • Documentation enhancements
  • Copilot integration guidelines
  • Automatic coverage ratcheting implementation

The infrastructure is designed to be modular and reusable across the entire PROTEUS ecosystem, with CALLIOPE serving as the first external adoption case.

Most important documents: make sure to check these out:

The CI workflow uses 'coverage run -m pytest' to collect coverage data.
Having --cov options in pytest addopts creates a conflict that prevents
coverage measurement. Coverage reporting is still configured in
[tool.coverage.report] section and will now work properly with the
CI command.

Now that NumPy 2.0 fixes are merged to aragog main (PR #5),
remove the temporary test branch reference from CI workflow.

Related: FormingWorlds/aragog#5

Implements performance optimizations to reduce CI runtime:

1. Cache SOCRATES compiled binaries (~7-8 min savings)
   - Caches socrates/ directory with binaries
   - Key based on source file hashes for automatic invalidation
   - Restore-keys for partial cache hits

2. Cache AGNI Julia depot (~3-4 min savings)
   - Caches AGNI/ directory and ~/.julia/ packages
   - Key based on Julia source and manifest files
   - Restore-keys for partial cache hits

Expected improvement: 10-12 minutes saved per CI run (from ~26 to ~14-16 minutes)

These caches only rebuild when source files change, otherwise use
cached binaries/packages from previous runs.

Previous attempt failed because:
- Cache keys used hashFiles() on directories that didn't exist yet
- Result: empty cache keys like 'socrates-bins-Linux-'
- No cache was ever restored or saved properly

New approach:
- Use cache/restore before install-all to load previous build
- Use cache/save after tests to save new build
- Key based on run_id (unique) with restore-keys for prefix matching
- Only save cache if restore missed (avoid duplicate saves)

This allows second and subsequent runs to skip 10-12 minutes of compilation.

CRITICAL FIX: Cache keys now depend on source/dependency hashes instead
of run ID. This ensures:

1. Cache is automatically INVALIDATED when source code changes
2. Cache is automatically INVALIDATED when dependencies change
3. Tests always use current SOCRATES and AGNI versions
4. No stale cached code is used if upstream repos change

SOCRATES cache:
- Key: hash of *.f90, *.F90, *.c files, and build_code script
- Invalidates when any Fortran/C source changes

AGNI cache:
- Key: hash of Project.toml and Manifest.toml files
- Invalidates when Julia dependencies change

Benefits:
✓ 32% CI speedup (26m → 18m) when deps unchanged
✓ Automatic detection of upstream changes
✓ No stale cache issues
✓ Maintain testing integrity

Note: Pre-existing linting warnings about env.total are unrelated to
this change and do not affect workflow execution.
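
As an aside, to make the hash-based cache keys described above concrete outside of GitHub Actions' hashFiles() syntax, here is a hedged Python analogue: the key is derived from file contents, so it changes automatically whenever any hashed source or manifest file changes. The globs mirror the ones listed above; the key format itself is illustrative only.

```python
"""Sketch: derive cache keys from file contents, so keys change whenever sources change."""
import hashlib
import platform
from pathlib import Path

def content_hash(patterns: list[str]) -> str:
    """Hash the contents of all files matching the given globs (sorted for determinism)."""
    digest = hashlib.sha256()
    for pattern in patterns:
        for path in sorted(Path(".").glob(pattern)):
            if path.is_file():
                digest.update(path.read_bytes())
    return digest.hexdigest()[:16]

# SOCRATES: key depends on the Fortran/C sources and the build script.
socrates_key = f"socrates-bins-{platform.system()}-" + content_hash(
    ["socrates/**/*.f90", "socrates/**/*.F90", "socrates/**/*.c", "socrates/build_code"]
)

# AGNI: key depends on the Julia project/manifest files.
agni_key = f"agni-depot-{platform.system()}-" + content_hash(
    ["AGNI/Project.toml", "AGNI/Manifest.toml"]
)

print(socrates_key, agni_key)
```
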
Skip disk cleanup when available space is >80%, saving ~2m30s per run

Changes:
- Add 'Check available disk space' step that calculates usage percentage
- Modify 'Free Disk Space (Ubuntu)' condition to only run if usage >20%
- Threshold can be tuned; 80% is conservative to prevent full disk

Expected savings: 2m 30s per build (disk rarely critical)
Impact: ~15% CI runtime reduction on test branches
Risk: LOW - cleanup still triggers if disk space actually needed

Critical bug fix: hashFiles() was evaluating on non-existent directories

Root cause:
- Cache restore steps tried to hash 'socrates/**/*.f90' files
- But socrates/ directory didn't exist yet (cloned later in install-all)
- Result: Empty hash → cache key 'socrates-bins-Linux-' (missing hash)
- Cache always missed → SOCRATES recompiled every run (+11-15 min)

Solution:
- Clone SOCRATES and AGNI repos BEFORE cache restore steps
- Now hashFiles() can compute proper hashes
- Cache keys like 'socrates-bins-Linux-abc123def456' work correctly
- proteus install-all will use existing clones (no duplicate work)

Expected impact:
- Cache hits will now work properly
- Saves 11-15 minutes when SOCRATES source unchanged
- Saves 2-3 minutes when AGNI dependencies unchanged
- Reduces run from 48m to ~17-19m when caches hit

…nd 69% coverage threshold

Priority 1 improvements to testing_infrastructure.md:

Changes:
- Document PROTEUS Phase 1 completion (69.23% coverage achieved)
- Add comprehensive Phase 2 ecosystem integration guide
- Create 4-step quick start deployment for ecosystem modules
- Add advanced hash-based caching strategy documentation
- Update coverage threshold progression from 5% to 69% for PROTEUS
- Change reusable workflow default threshold from 5% to 30% (realistic for new modules)
- Add deployment checklist (~2 hours per module)
- Include performance expectations and troubleshooting for caching

Files modified:
- docs/testing_infrastructure.md: +317 lines (comprehensive ecosystem rollout guide)
- pyproject.toml: fail_under = 69 (enforces actual achieved coverage)
- .github/workflows/proteus_test_quality_gate.yml: improved default threshold and guidance

This enables ecosystem modules (CALLIOPE, JANUS, MORS, VULCAN, ZEPHYRUS) to deploy
quality gates with clear configuration examples, realistic thresholds, and validated
patterns from PROTEUS implementation.

- Add CALLIOPE as Phase 2 pilot reference implementation
- Document coverage ratcheting mechanism (auto-threshold updates)
- Establish ecosystem integration standards (Codecov, artifacts, test quality)
- Provide 4 direct reference links to CALLIOPE working examples
- Update Phase 2 quick start with CALLIOPE patterns
- Clarify rollout strategy for JANUS/MORS (Phase 2b/2c)

- Add tools/update_coverage_threshold.py for automatic threshold updates
- Implement coverage ratcheting step in CI (only increases, never decreases)
- Rename .github/workflows/ci.yml to ci_tests.yml for consistency with CALLIOPE
- Rename docs/testing_infrastructure.md to test_infrastructure.md (shorter, clearer)
- Update all references to renamed files in workflows and documentation
- Update pyproject.toml with ratcheting mechanism comments
- Add test_infrastructure.md to mkdocs.yml navigation

Coverage ratcheting ensures sustainable progress: threshold automatically
increases when coverage improves on main branch, preventing regression.
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 26 out of 51 changed files in this pull request and generated 5 comments.



…ML code block

- Convert multiline YAML block scalars to single-line format for cache keys (4 occurrences)
- Fix unclosed TOML code block in test_infrastructure.md
- Addresses review #579 (review)
@timlichtenberg timlichtenberg marked this pull request as ready for review January 4, 2026 11:37
@timlichtenberg timlichtenberg requested a review from a team as a code owner January 4, 2026 11:37
Copilot AI review requested due to automatic review settings January 4, 2026 11:37
@timlichtenberg timlichtenberg self-assigned this Jan 4, 2026
@timlichtenberg timlichtenberg added the Enhancement, Software, Priority 1: critical, Tests, Docs, and PROTEUS labels Jan 4, 2026
@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8b59e65fa3


Comment on lines +144 to +146
elif new_threshold == current_threshold:
    print(f"[=] Coverage threshold already at {current_threshold:.2f}% (no update needed)")
    return 1


P1: Ratcheting script aborts CI when coverage is unchanged

The new coverage ratcheting helper returns a non-zero status whenever coverage stays at the existing threshold (elif new_threshold == current_threshold: … return 1). In the Linux CI workflow the step at lines 321-328 runs python tools/update_coverage_threshold.py without continue-on-error, so a “no update needed” case (the common path on main) will cause the job to fail even though tests and coverage pass. The script should exit successfully when no update is required or the workflow step should tolerate that condition.
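
A possible shape of the fix, as a hypothetical sketch (the function name and surrounding structure are assumptions, not the script's actual code): treat an unchanged threshold as success so the CI step does not fail.

```python
def decide_ratchet(current_threshold: float, new_threshold: float) -> int:
    """Hypothetical rework of the decision logic: an unchanged threshold is not an error."""
    if new_threshold > current_threshold:
        print(f"[+] Raising coverage threshold to {new_threshold:.2f}%")
        return 0  # caller writes the new value to pyproject.toml and commits it
    if new_threshold == current_threshold:
        print(f"[=] Coverage threshold already at {current_threshold:.2f}% (no update needed)")
        return 0  # success: nothing to commit, the CI step should not fail
    print(f"[-] Coverage ({new_threshold:.2f}%) below threshold ({current_threshold:.2f}%); keeping it unchanged")
    return 0  # ratcheting never decreases the threshold
```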


Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 26 out of 51 changed files in this pull request and generated 6 comments.



Comment on lines +321 to +349
- name: Update coverage threshold
  if: ${{ github.ref == 'refs/heads/main' && env.PYTHON_VERSION == '3.12' && runner.os == 'Linux' && !failure() }}
  env:
    GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
  run: |
    # Automatically ratchet coverage threshold upward when tests pass on main
    python tools/update_coverage_threshold.py
    # Check if pyproject.toml was modified
    if git diff --quiet pyproject.toml; then
      echo "No coverage threshold update needed"
    else
      echo "Coverage threshold increased - committing update"
      git config user.name "github-actions[bot]"
      git config user.email "github-actions[bot]@users.noreply.github.com"
      git add pyproject.toml
      COVERAGE_PCT=$(python -c 'import json; print(json.load(open("coverage.json"))["totals"]["percent_covered"])')
      git commit -m "ci: Auto-update coverage threshold to ${COVERAGE_PCT}% [skip ci]"
      # Rebase on top of the latest main to avoid conflicts from concurrent pushes
      if ! git pull --rebase origin "${{ github.ref_name }}"; then
        echo "Rebase failed (likely due to concurrent updates). Aborting automatic coverage threshold push."
        git rebase --abort || true
        exit 0
      fi
      git push origin HEAD:${{ github.ref_name }} || {
        echo "Failed to push coverage threshold update. You may need to resolve conflicts or permissions issues."
        exit 1
      }
    fi
Copilot AI commented Jan 4, 2026

The git rebase and push logic (lines 340-348) in the coverage threshold update step could cause issues in concurrent scenarios. If multiple CI runs complete simultaneously on the main branch, they could conflict when trying to update the coverage threshold. While line 341 mentions this scenario, the error handling with exit 0 (silent success) might hide legitimate push failures. Consider adding more robust conflict resolution or using a PR-based approach for threshold updates to ensure changes are reviewed.

Comment on lines +645 to +662
- name: Clone SOCRATES
  run: git clone --depth 1 --branch v1.2.3 https://github.com/nichollsh/SOCRATES.git socrates
# Now cache restore can hash the source files
- name: Restore SOCRATES cache
  uses: actions/cache/restore@v4
  id: cache-socrates
  with:
    path: socrates/
    # Hash changes = cache miss = recompile (correct behavior)
    key: |
      socrates-${{ runner.os }}-${{ hashFiles(
        'socrates/**/*.f90',
        'socrates/**/*.c'
      ) }}
    restore-keys: |
      socrates-${{ runner.os }}-
# Build if cache missed
- name: Build SOCRATES (if needed)
  if: steps.cache-socrates.outputs.cache-hit != 'true'
  run: cd socrates && ./build_code
Copilot AI commented Jan 4, 2026

The hash-based caching example here recommends cloning nichollsh/SOCRATES from GitHub using git clone --depth 1 --branch v1.2.3 ... and then running ./build_code, but the dependency is pinned only to a mutable tag, not an immutable commit. If that repository or the v1.2.3 tag is ever compromised or moved, CI workflows that copy-paste this pattern will execute attacker-controlled code in the build environment with access to repository credentials. To reduce supply-chain risk, update this guidance to pin SOCRATES (and similar third-party tools) to specific, vetted commit SHAs or vendored snapshots rather than branches/tags, and encourage verifying integrity before execution.

Member:

Presumably we will always be pulling from the main branch anyway?

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings January 4, 2026 11:47
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 26 out of 51 changed files in this pull request and generated 6 comments.



Comment on lines +43 to +65
if [[ "$module" == "data" || "$module" == "helpers" || "$module" == "integration" || "$module" == *__pycache__* ]]; then
continue
fi

# Count test files
test_files=$(find "$test_dir" -name "test_*.py" 2>/dev/null | wc -l)

if [ "$test_files" -eq 0 ]; then
echo "[!] No test files in $test_dir"
else
echo "[+] $test_files test file(s) in $test_dir"
fi
done

echo ""
echo "Checking for __init__.py files..."
init_missing=0
for test_dir in tests/*/; do
module=$(basename "$test_dir")

# Skip special directories
if [[ "$module" == "data" || "$module" == "helpers" || "$module" == *__pycache__* ]]; then
continue
Copilot AI commented Jan 4, 2026

The validation script checks for __init__.py files in test directories but skips the "integration" directory on line 43. However, on line 64, it doesn't skip "integration" when checking for missing __init__.py files. This inconsistency could lead to false warnings if the integration directory exists but doesn't have test files. The skip list should be consistent across both checks.

Comment on lines +122 to +134
minversion = "8.1"
addopts = [
# Global coverage options removed from addopts.
# CI uses "coverage run -m pytest" with [tool.coverage.*] settings.
# For local development, use either:
# 1. "coverage run -m pytest" (matches CI behavior, compatible with coverage ratcheting)
# 2. "pytest --cov" (uses pytest-cov, convenient but slightly different from CI)
# Both approaches work; choose based on preference.
"--strict-markers",
"--strict-config",
"-ra",
"--showlocals",
]
Copilot AI commented Jan 4, 2026

The pyproject.toml comments mention that both coverage[toml] and pytest-cov work independently, and that CI uses "coverage run -m pytest" while developers can use "pytest --cov". However, the pytest.ini_options addopts no longer includes any coverage flags. This is good for flexibility, but the documentation should clarify that developers need to explicitly add --cov when running pytest locally, or use "coverage run -m pytest" to match CI. Consider adding this clarification to the comment block.

# See: tools/update_coverage_threshold.py and .github/workflows/ci_tests.yml.
# The ratcheting (only ever increasing or staying the same) is enforced in CI;
# do not manually decrease this value in pyproject.toml.
fail_under = 69
Copilot AI commented Jan 4, 2026

The coverage threshold value of 69 should have a decimal component (69.0 or 69.00) to match the precision = 2 setting and be consistent with how the auto-update script formats values with 2 decimal places using f"{new_threshold:.2f}". This ensures the format is consistent between manual and automated updates.

Suggested change:
- fail_under = 69
+ fail_under = 69.00

# test_cli.py and test_init.py stay at top level as they test root-level functionality

# Create __init__.py files in test directories for proper Python package structure
find tests -type d -name "[!_]*" -exec touch {}/__init__.py \;
Copilot AI commented Jan 4, 2026

The find command with -exec touch {}/__init__.py \; will create __init__.py files in ALL directories under tests/, including special directories like __pycache__, data, and helpers that were meant to be excluded. The command should filter these directories using -not -path or -prune options to avoid creating unnecessary files in excluded directories.
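
For illustration, a hedged Python sketch of the same step that honors the skip list (the directory names come from the validation script quoted earlier; this is not the actual fix in the script):

```python
"""Sketch: create __init__.py only in test directories outside the skip list."""
from pathlib import Path

SKIP = {"__pycache__", "data", "helpers"}  # names from the validation script's skip list

for directory in Path("tests").rglob("*"):
    if directory.is_dir() and not SKIP.intersection(directory.parts):
        (directory / "__init__.py").touch(exist_ok=True)
```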

Comment on lines +760 to +768
- name: Upload coverage to Codecov
  uses: codecov/codecov-action@v4
  if: matrix.python-version == '3.11'
  with:
    files: ./coverage.xml
    flags: unittests
    fail_ci_if_error: false
  env:
    CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
Copilot AI commented Jan 4, 2026

In this CI example, codecov/codecov-action@v4 is again used with CODECOV_TOKEN but pinned only to the mutable v4 tag, meaning a compromise or retagging of that action could let an attacker run arbitrary code in your pipeline and steal repository secrets. To harden the supply chain for this workflow, pin the Codecov action to a specific commit SHA (and update it intentionally over time) instead of relying on a floating version tag.

Member:

This does not seem particularly important.

Comment on lines +643 to +644
# Clone dependencies BEFORE cache restore (critical!)
# Pin to specific commit/tag for reproducibility and security
Copilot AI commented Jan 4, 2026

This GitHub Actions example clones SOCRATES from https://github.com/nichollsh/SOCRATES.git using a mutable tag (--branch v1.2.3), which creates a supply-chain risk: if that repository or tag is compromised or force-retagged, an attacker can run arbitrary code during ./build_code and access CI secrets or modify build artifacts. To mitigate, pin the dependency to an immutable commit SHA (and, if possible, verify signatures or checksums) and consider mirroring or vendoring the code under a trusted namespace you control.

@timlichtenberg (Member, Author):

@nichollsh We'll ignore this for now, but do you think we could move the socrates repository into FormingWorlds or the other new organisation that is coming up soon?

@nichollsh (Member) left a comment

These are excellent and much-needed changes. I have a few comments/suggestions.

```

## Uncovered Lines
<!-- From coverage report --show-missing -->
@nichollsh:

Might be useful to have the workflow run this command automatically.

name: Test Coverage Improvement
about: Track test coverage improvements for specific folders
title: 'Improve test coverage for [FOLDER]'
labels: 'testing, enhancement'
@nichollsh:

Change these labels to match our existing ones.

# Get Lovepy
- name: Get Lovepy
  run: |
    ./tools/get_lovepy.sh
@nichollsh:

Should soon change this to point to the new Obliqua repo. Could be done in this PR if the Love.jl repo is renamed soon.

maxColorRange: 90
valColorRange: ${{ steps.report-coverage.outputs.total }}

test-macos:
@nichollsh:

Writing the workflow this way leads to a lot of duplicated lines. Many of the steps below (e.g. pip install) are common between Ubuntu and MacOS. Could they be generalised and written only once?

E.g. have a single 'lane' of steps, which is skipped on macOS unless github.event_name == 'schedule'.

## Table of Contents

1. [Quick Start](#quick-start)
2. [Architecture Overview](#architecture-overview)
@nichollsh:

Not sure that all of these pages exist

**Usage:**

```bash
python tools/get_stellar_spectrum.py <star_name> [distance_au]
@nichollsh:

This will need to be updated when #575 is merged.

**Purpose:** Julia script for general post-processing of PROTEUS simulation outputs.

**What it does:**
- Reads HDF5 output files
@nichollsh:

NetCDF output files


### rheological.ipynb

**Purpose:** Jupyter notebook for analyzing and visualizing rheological properties computed during simulations.
@nichollsh:

"Jupyter notebook for testing parametrisation of rheological properties."

@@ -0,0 +1,85 @@
#!/bin/bash
# Script to restructure tests/ to mirror src/proteus/ directory structure
@nichollsh:

When would we run this script? Seems like it would only be needed once.

@@ -0,0 +1,172 @@
#!/usr/bin/env python3
"""Automatically update test coverage threshold based on current coverage.
@nichollsh:

Similarly to validate_test_structure.sh, should this be moved elsewhere? I understand that it is usually meant to be run automatically by the GH workflow rather than by a human.

@timlichtenberg timlichtenberg marked this pull request as draft January 10, 2026 08:44
@timlichtenberg (Member, Author) commented:

Working on these suggestions plus a few additions. I will need to test some new CI workflows directly on main and will commit them to this branch for that purpose. For that reason I have turned the PR back to draft until these workflows are running appropriately.


Labels

Docs, Enhancement, Priority 1: critical, PROTEUS, Software, Tests

Projects

Status: In Progress

3 participants