feat: Add section validation warnings for recommended manuscript sections#275
feat: Add section validation warnings for recommended manuscript sections#275
Conversation
…ions - New SectionValidator checks for standard sections required by most journals - Warns (doesn't error) if sections are missing: • Data Availability • Code Availability • Author Contributions • Acknowledgements • Funding • Competing Interests - Helps authors ensure manuscripts are complete before submission - Integrated into 'rxiv validate' command as first validator in chain - Flexible pattern matching supports multiple section naming variations
Code Review: Section Validation WarningsThank you for this contribution! This is a valuable feature that aligns well with journal requirements. Strengths
CRITICAL: Missing Test Coverage (Severity: High)The PR adds a new validator but includes no unit tests. According to CLAUDE.md, the project aims for 80% test coverage. Required: tests/unit/test_section_validator.py should include:
See tests/unit/test_doi_validator.py for validator test patterns. Enhancement: Consider Using ErrorCode Enum (Severity: Medium)Other validators use ErrorCode from core.error_codes. Your code uses the deprecated _create_error() method. Consider defining section-specific error codes for consistency (e.g., MISSING_SECTION). See figure_validator.py:77 and citation_validator.py:74 for examples. Pattern Matching: Edge Cases (Severity: Low)Question: Is it intentional to only match h2 headers and not h3? This might be worth documenting. Your regex handles multiple spaces and case variations correctly. Security Review: PASS
Before Merging (Required)
Overall Assessment: Conditional ApprovalThe implementation is solid and well-designed, but missing test coverage is a blocker. Add tests, then re-request review. Great work on the feature design! The validator pattern is well-executed and will be genuinely useful for authors. |
There was a problem hiding this comment.
Pull request overview
This PR adds a new SectionValidator that checks manuscripts for six standard sections commonly required by journals (Data Availability, Code Availability, Author Contributions, Acknowledgements, Funding, and Competing Interests). The validator issues non-blocking warnings when recommended sections are missing, helping authors ensure manuscript completeness before journal submission.
Key Changes:
- New section validation system with flexible pattern matching supporting multiple section name variations
- Integration as the first validator in the validation workflow ("Structure" validation)
- Smart handling of combined sections (e.g., "Data and Code Availability")
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
src/rxiv_maker/validators/section_validator.py |
New validator class that checks for six recommended manuscript sections using regex patterns and generates warnings for missing sections |
src/rxiv_maker/engines/operations/validate.py |
Integrates SectionValidator into the validation workflow as the first validator, with proper imports in both relative and absolute import blocks |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| "author contributions": { | ||
| "patterns": [ | ||
| r"##\s+Author\s+Contributions?", | ||
| r"##\s+Contributions?", |
There was a problem hiding this comment.
The pattern r"##\s+Contributions?" is too broad and could match any section titled "## Contributions" or "## Contribution" without the word "Author". Consider making this pattern more specific to avoid false positives, or remove it if only "Author Contributions" should be matched.
| r"##\s+Contributions?", |
| class SectionValidator(BaseValidator): | ||
| """Validator for checking recommended manuscript sections.""" | ||
|
|
||
| # Required sections that should generate warnings if missing | ||
| RECOMMENDED_SECTIONS = { | ||
| "data availability": { | ||
| "patterns": [ | ||
| r"##\s+Data\s+Availability", | ||
| r"##\s+Data\s+and\s+Code\s+Availability", | ||
| ], | ||
| "suggestion": "Add a '## Data Availability' section describing where your data can be accessed", | ||
| }, | ||
| "code availability": { | ||
| "patterns": [ | ||
| r"##\s+Code\s+Availability", | ||
| r"##\s+Data\s+and\s+Code\s+Availability", | ||
| r"##\s+Software\s+Availability", | ||
| ], | ||
| "suggestion": "Add a '## Code Availability' section with links to code repositories", | ||
| }, | ||
| "author contributions": { | ||
| "patterns": [ | ||
| r"##\s+Author\s+Contributions?", | ||
| r"##\s+Contributions?", | ||
| ], | ||
| "suggestion": "Add an '## Author Contributions' section describing each author's role", | ||
| }, | ||
| "acknowledgements": { | ||
| "patterns": [ | ||
| r"##\s+Acknowledgements?", | ||
| r"##\s+Acknowledgments?", | ||
| ], | ||
| "suggestion": "Add an '## Acknowledgements' section to thank contributors", | ||
| }, | ||
| "funding": { | ||
| "patterns": [ | ||
| r"##\s+Funding", | ||
| r"##\s+Financial\s+Support", | ||
| r"##\s+Grant\s+Information", | ||
| ], | ||
| "suggestion": "Add a '## Funding' section declaring funding sources or stating 'This research received no external funding.'", | ||
| }, | ||
| "competing interests": { | ||
| "patterns": [ | ||
| r"##\s+Competing\s+Interests?", | ||
| r"##\s+Conflicts?\s+of\s+Interest", | ||
| r"##\s+Disclosure", | ||
| ], | ||
| "suggestion": "Add a '## Competing Interests' section, even if just to state 'The authors declare no competing interests.'", | ||
| }, | ||
| } | ||
|
|
||
| def validate(self) -> ValidationResult: | ||
| """Check for recommended manuscript sections. | ||
|
|
||
| Returns: | ||
| ValidationResult with warnings for missing sections | ||
| """ | ||
| errors: list[ValidationError] = [] | ||
| metadata = { | ||
| "missing_sections": [], | ||
| "found_sections": [], | ||
| } | ||
|
|
||
| # Find manuscript file | ||
| try: | ||
| manuscript_file = find_manuscript_md(self.manuscript_path) | ||
| except (FileNotFoundError, ValueError) as e: | ||
| return ValidationResult( | ||
| validator_name=self.name, | ||
| errors=[ | ||
| self._create_error( | ||
| level=ValidationLevel.ERROR, | ||
| message=f"Could not find manuscript file: {e}", | ||
| ) | ||
| ], | ||
| metadata=metadata, | ||
| ) | ||
|
|
||
| # Read manuscript content | ||
| content = self._read_file_safely(manuscript_file) | ||
| if content is None: | ||
| return ValidationResult( | ||
| validator_name=self.name, | ||
| errors=[ | ||
| self._create_error( | ||
| level=ValidationLevel.ERROR, | ||
| message=f"Could not read manuscript file: {manuscript_file}", | ||
| ) | ||
| ], | ||
| metadata=metadata, | ||
| ) | ||
|
|
||
| # Check for each recommended section | ||
| for section_name, section_info in self.RECOMMENDED_SECTIONS.items(): | ||
| found = False | ||
| for pattern in section_info["patterns"]: | ||
| if re.search(pattern, content, re.IGNORECASE): | ||
| found = True | ||
| metadata["found_sections"].append(section_name) | ||
| break | ||
|
|
||
| if not found: | ||
| metadata["missing_sections"].append(section_name) | ||
| errors.append( | ||
| self._create_error( | ||
| level=ValidationLevel.WARNING, | ||
| message=f"Recommended section '{section_name.title()}' not found", | ||
| file_path=manuscript_file, | ||
| suggestion=section_info["suggestion"], | ||
| ) | ||
| ) | ||
|
|
||
| return ValidationResult( | ||
| validator_name=self.name, | ||
| errors=errors, | ||
| metadata=metadata, | ||
| ) |
There was a problem hiding this comment.
The new SectionValidator lacks test coverage. The repository has comprehensive test coverage for other validators (see tests/unit/test_validators.py), and this new validator should have corresponding tests to verify pattern matching, warning generation, and edge cases like missing files or combined sections.
|
|
||
| # Find manuscript file | ||
| try: | ||
| manuscript_file = find_manuscript_md(self.manuscript_path) |
There was a problem hiding this comment.
The Path object returned by find_manuscript_md should be converted to a string before passing to file_path parameter. The ValidationError.file_path is typed as str | None, but find_manuscript_md returns a Path object. For consistency with other validators and proper type handling, convert the Path to string using str().
| self._create_error( | ||
| level=ValidationLevel.WARNING, | ||
| message=f"Recommended section '{section_name.title()}' not found", | ||
| file_path=manuscript_file, |
There was a problem hiding this comment.
The Path object returned by find_manuscript_md should be converted to a string before passing to file_path parameter. The ValidationError.file_path is typed as str | None, and should match the expected type. Convert using str(manuscript_file).
| file_path=manuscript_file, | |
| file_path=str(manuscript_file), |
| class SectionValidator(BaseValidator): | ||
| """Validator for checking recommended manuscript sections.""" | ||
|
|
||
| # Required sections that should generate warnings if missing |
There was a problem hiding this comment.
Comment states "Required sections" but the constant is named "RECOMMENDED_SECTIONS". The comment should say "Recommended sections" to match the variable name and the actual behavior (warnings, not errors).
| # Required sections that should generate warnings if missing | |
| # Recommended sections that should generate warnings if missing |
Code Review: Section Validation WarningsThank you for this contribution! This is a well-motivated feature that adds helpful guidance for authors. Here's my detailed review: ✅ Strengths1. Good Design Decisions
2. Code Quality
3. Alignment with Project Standards
|
Code Review - PR #275: Section Validation WarningsThank you for this feature addition! This is a valuable enhancement that helps authors ensure manuscript completeness. Here's my detailed review: ✅ Strengths
🔍 Code Quality ObservationsPositive:
Minor Suggestions:
🧪 Test CoverageCritical Issue: No unit tests were added for this new validator. According to CLAUDE.md, the project aims for >80% test coverage. I recommend adding tests covering:
Suggested test structure (following @pytest.mark.validation
class TestSectionValidator(unittest.TestCase):
def setUp(self):
self.temp_dir = tempfile.mkdtemp()
self.manuscript_dir = os.path.join(self.temp_dir, "MANUSCRIPT")
os.makedirs(self.manuscript_dir)
def test_all_sections_present(self):
# Create manuscript with all sections
# Assert no warnings
def test_missing_data_availability(self):
# Create manuscript without data availability
# Assert warning generated with correct message
def test_alternative_section_names(self):
# Test "Acknowledgments" vs "Acknowledgements"
# Test "Conflicts of Interest" vs "Competing Interests"
def test_combined_sections(self):
# Test "Data and Code Availability" covers both🔒 Security ConsiderationsNo security concerns identified:
⚡ Performance Considerations
📋 Additional Recommendations
🎯 VerdictApprove with minor recommendations:
📝 Testing CommandsBefore merging, please run: # After adding tests:
nox -s test-unit # Run unit tests
nox -s test-fast # Quick validation
nox -s lint # Check code style
# Manual testing:
rxiv validate MANUSCRIPT/ # Test on actual manuscriptsGreat work overall! This will be a helpful feature for rxiv-maker users. 🚀 |
Code Review: Section ValidatorThank you for this contribution! This is a well-designed feature that adds helpful validation for recommended manuscript sections. Here's my detailed review: ✅ Strengths
🔍 Issues and ConcernsCRITICAL: Missing Test CoverageThe PR adds 139 lines of new code but includes zero tests. According to CLAUDE.md:
Required tests (should add
Example test structure: import unittest
import tempfile
import os
from pathlib import Path
class TestSectionValidator(unittest.TestCase):
def setUp(self):
self.temp_dir = tempfile.mkdtemp()
def test_all_sections_present(self):
# Create manuscript with all sections
content = """
# Title
## Data Availability
## Code Availability
## Author Contributions
## Acknowledgements
## Funding
## Competing Interests
"""
manuscript_path = Path(self.temp_dir)
(manuscript_path / "01_MAIN.md").write_text(content)
validator = SectionValidator(str(manuscript_path))
result = validator.validate()
self.assertEqual(len(result.errors), 0)
self.assertEqual(len(result.metadata["found_sections"]), 6)Code Quality Issues
Security Considerations✅ No major security issues, but worth noting:
Integration Concerns
Documentation Gaps
📋 RecommendationsBefore merging:
Nice-to-have improvements:
🎯 SummaryOverall: Good implementation with critical testing gap The code quality is solid and follows project conventions well. The feature is valuable for users. However, the complete lack of tests is a blocker for merging according to project standards. Recommendation: Request changes for test coverage before approval. Files to check:
Let me know if you'd like help writing the tests! |
Summary
This PR adds a new that checks manuscripts for standard sections commonly required by journals and issues warnings (not errors) when sections are missing.
Changes
New Validator
Integration
Copyright: (C) 1999 ImageMagick Studio LLC
License: https://imagemagick.org/script/license.php
Features: Cipher DPC HDRI Modules OpenMP
Delegates (built-in): bzlib cairo fontconfig freetype heic jng jp2 jpeg jxl lcms lqr ltdl lzma openexr png raw rsvg tiff uhdr webp xml zip zlib zstd
Compiler: clang (17.0.0)
Usage: import [options ...] [ file ]
Image Settings:
-adjoin join images into a single multi-image file
-border include window border in the output image
-channel type apply option to select image channels
-colorspace type alternate image colorspace
-comment string annotate image with comment
-compress type type of pixel compression when writing the image
-define format:option
define one or more image format options
-density geometry horizontal and vertical density of the image
-depth value image depth
-descend obtain image by descending window hierarchy
-display server X server to contact
-dispose method layer disposal method
-dither method apply error diffusion to image
-delay value display the next image after pausing
-encipher filename convert plain pixels to cipher pixels
-endian type endianness (MSB or LSB) of the image
-encoding type text encoding type
-filter type use this filter when resizing an image
-format "string" output formatted image characteristics
-frame include window manager frame
-gravity direction which direction to gravitate towards
-identify identify the format and characteristics of the image
-interlace type None, Line, Plane, or Partition
-interpolate method pixel color interpolation method
-label string assign a label to an image
-limit type value Area, Disk, Map, or Memory resource limit
-monitor monitor progress
-page geometry size and location of an image canvas
-pause seconds seconds delay between snapshots
-pointsize value font point size
-quality value JPEG/MIFF/PNG compression level
-quiet suppress all warning messages
-regard-warnings pay attention to warning messages
-repage geometry size and location of an image canvas
-respect-parentheses settings remain in effect until parenthesis boundary
-sampling-factor geometry
horizontal and vertical sampling factor
-scene value image scene number
-screen select image from root window
-seed value seed a new sequence of pseudo-random numbers
-set property value set an image property
-silent operate silently, i.e. don't ring any bells
-snaps value number of screen snapshots
-support factor resize support: > 1.0 is blurry, < 1.0 is sharp
-synchronize synchronize image to storage device
-taint declare the image as modified
-transparent-color color
transparent color
-treedepth value color tree depth
-verbose print detailed information about the image
-virtual-pixel method
Constant, Edge, Mirror, or Tile
-window id select window with this id or name
root selects whole screen
Image Operators:
-annotate geometry text
annotate the image with text
-colors value preferred number of colors in the image
-crop geometry preferred size and location of the cropped image
-encipher filename convert plain pixels to cipher pixels
-extent geometry set the image size
-geometry geometry preferred size or location of the image
-help print program options
-monochrome transform image to black and white
-negate replace every pixel with its complementary color
-quantize colorspace reduce colors in this colorspace
-resize geometry resize the image
-rotate degrees apply Paeth rotation to the image
-strip strip image of all profiles and comments
-thumbnail geometry create a thumbnail of the image
-transparent color make this color transparent within the image
-trim trim image edges
-type type image type
Miscellaneous Options:
-debug events display copious debugging information
-help print program options
-list type print a list of supported option arguments
-log format format of debugging information
-version print version information
By default, 'file' is written in the MIFF image format. To
specify a particular image format, precede the filename with an image
format name and a colon (i.e. ps:image) or specify the image type as
the filename suffix (i.e. image.ps). Specify 'file' as '-' for
standard input or output.
Behavior
Testing
Tested on both manuscripts:
Example output:
Benefits
Related