feat: Add comprehensive Python testing infrastructure with Poetry#151

Open
llbbl wants to merge 1 commit into buckyroberts:master from UnitSeeker:add-testing-infrastructure

@llbbl llbbl commented Jun 23, 2025

Add Python Testing Infrastructure

Summary

This PR sets up a comprehensive testing infrastructure for the Website Crawler project using Poetry for dependency management and pytest as the testing framework. The setup provides a solid foundation for adding unit and integration tests to improve code quality and maintainability.

Changes Made

Package Management

  • Added Poetry configuration (pyproject.toml) as the package manager
  • Configured project metadata and Python version requirements (>=3.8)
  • Set up development dependencies group for testing tools

Testing Framework

  • pytest - Main testing framework with extensive configuration
  • pytest-cov - Coverage reporting with 80% threshold requirement
  • pytest-mock - Mocking utilities for isolated unit tests

Testing Configuration

  • Configured pytest settings in pyproject.toml:
    • Test discovery patterns for test_*.py and *_test.py files
    • Coverage reporting (terminal, HTML, and XML formats)
    • 80% coverage threshold with branch coverage
    • Custom test markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.slow
    • Strict configuration and marker validation
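To make the effect of enabling branch coverage on the 80% threshold concrete, here is a rough sketch of the arithmetic. It assumes the reported total counts statements and branches together, which is my understanding of how coverage.py computes it when branch measurement is on. The function names and the sample numbers are hypothetical, not taken from this PR.

```python
def combined_coverage(statements, executed, branches, executed_branches):
    """Return percent covered, counting statements and branches together.

    Assumption: the total is (executed statements + executed branches)
    divided by (all statements + all branches).
    """
    total = statements + branches
    if total == 0:
        return 100.0
    return 100.0 * (executed + executed_branches) / total


def meets_threshold(percent, threshold=80.0):
    """Return True when the measured percentage reaches the threshold."""
    return percent >= threshold


# Hypothetical module: 90 of 100 statements run, but only 10 of 30 branches.
line_only = combined_coverage(100, 90, 0, 0)
with_branches = combined_coverage(100, 90, 30, 10)

print(f"line only:     {line_only:.1f}% -> pass: {meets_threshold(line_only)}")
print(f"with branches: {with_branches:.1f}% -> pass: {meets_threshold(with_branches)}")
```

Under these assumed numbers, a module that looks comfortable on line coverage alone falls below the threshold once untaken branches are counted, so tests likely need to exercise both sides of conditionals to pass.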

Directory Structure

tests/
├── __init__.py
├── conftest.py          # Shared fixtures and test configuration
├── test_setup_validation.py  # Validation tests for the infrastructure
├── unit/
│   └── __init__.py
└── integration/
    └── __init__.py

Shared Fixtures (conftest.py)

  • temp_dir - Creates temporary directories for test isolation
  • mock_config - Provides mock configuration dictionaries
  • mock_queue - Mock Queue objects for threading tests
  • sample_html - HTML content for link parsing tests
  • test_files - Creates test files (queue.txt, crawled.txt)
  • mock_spider_class - Mock Spider class for crawler tests
  • mock_url_response - Mock HTTP responses
  • capture_logs - Log capture utility
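As an illustration of what a few of these fixtures might wrap, the sketch below builds the underlying objects with the standard library only. The helper names, the sample HTML, and the `read`/`getheader` response interface are assumptions, not the code in this PR. In conftest.py each helper would presumably be exposed through `@pytest.fixture`; that wiring is left out so the sketch runs without pytest.

```python
import tempfile
from pathlib import Path
from queue import Queue
from unittest.mock import MagicMock

# Hypothetical sample page for link parsing tests.
SAMPLE_HTML = """<html><body>
<a href="/about">About</a>
<a href="https://example.com/contact">Contact</a>
</body></html>"""


def make_test_files(directory):
    """Create empty queue.txt and crawled.txt files inside `directory`."""
    paths = {}
    for name in ("queue.txt", "crawled.txt"):
        path = Path(directory) / name
        path.write_text("")
        paths[name] = path
    return paths


def make_mock_queue(items):
    """Return a mock with the Queue interface whose get() yields `items` in order."""
    mock_queue = MagicMock(spec=Queue)
    mock_queue.get.side_effect = list(items)
    return mock_queue


def make_mock_url_response(body, content_type="text/html"):
    """Return a mock HTTP response (assumed interface: read() and getheader())."""
    response = MagicMock()
    response.read.return_value = body.encode("utf-8")
    response.getheader.return_value = content_type
    return response


# A temporary directory keeps each test's files isolated and is removed on exit.
with tempfile.TemporaryDirectory() as temp_dir:
    make_test_files(temp_dir)
    created = sorted(p.name for p in Path(temp_dir).iterdir())
```

Using `MagicMock(spec=Queue)` means a typo such as `mock_queue.gett()` raises `AttributeError` instead of silently succeeding, which is one reason to prefer a spec'd mock for threading tests.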

Development Workflow

  • Updated .gitignore with comprehensive patterns for:
    • Testing artifacts (.pytest_cache/, coverage.xml, htmlcov/)
    • Python build artifacts and virtual environments
    • IDE files and OS-specific files
    • Claude-specific directories (.claude/*)

Validation

  • Created test_setup_validation.py to verify:
    • All dependencies are properly installed
    • Project structure is correctly set up
    • Test markers work as expected
    • Fixtures are callable and functional
    • Coverage configuration is properly set

How to Use

Install Dependencies

# Install Poetry (if not already installed)
curl -sSL https://install.python-poetry.org | python3 -

# Install project dependencies
poetry install

Run Tests

# Run all tests
poetry run pytest

# Run with coverage report
poetry run pytest --cov

# Run specific test markers
poetry run pytest -m unit
poetry run pytest -m integration
poetry run pytest -m "not slow"

# Run tests in a specific directory
poetry run pytest tests/unit/

Coverage Reports

  • Terminal: Displayed after each test run
  • HTML: Generated in htmlcov/ directory
  • XML: Generated as coverage.xml for CI integration

Notes

  • The project currently has no external dependencies beyond the Python standard library
  • Coverage threshold is set to 80% to encourage comprehensive testing
  • The validation tests pass successfully, confirming the infrastructure is working
  • Poetry lock file (poetry.lock) is created and should be committed to ensure reproducible builds
  • This PR adds no unit tests for the existing codebase; it only sets up the infrastructure

Next Steps

With this testing infrastructure in place, developers can now:

  1. Write unit tests for individual modules (spider.py, domain.py, etc.)
  2. Add integration tests for the complete crawling workflow
  3. Implement mocking for external dependencies (file I/O, threading)
  4. Set up CI/CD pipelines using the coverage reports
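As a sketch of step 3, a unit test can replace file I/O with a mock so that nothing touches the disk. The `read_links` helper below is hypothetical and stands in for whatever file-reading function the project exposes. The example uses the standard library's `unittest.mock`; with pytest-mock, the `mocker` fixture offers the same patching through `mocker.patch`.

```python
from unittest.mock import mock_open, patch


def read_links(file_name):
    """Hypothetical helper: read a text file into a set of non-empty lines."""
    with open(file_name, "rt") as f:
        return {line.strip() for line in f.read().splitlines() if line.strip()}


def test_read_links_deduplicates():
    fake_contents = (
        "https://example.com/\n"
        "https://example.com/about\n"
        "https://example.com/\n"
    )
    # Patch the built-in open() so the helper reads the fake contents instead.
    with patch("builtins.open", mock_open(read_data=fake_contents)) as mocked:
        result = read_links("queue.txt")

    mocked.assert_called_once_with("queue.txt", "rt")
    assert result == {"https://example.com/", "https://example.com/about"}
```

Once real tests exist, a test like this could carry the `@pytest.mark.unit` marker so it is selected by `poetry run pytest -m unit`.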

- Set up Poetry as package manager with pyproject.toml configuration
- Add testing dependencies: pytest, pytest-cov, pytest-mock
- Configure pytest with coverage reporting (80% threshold)
- Create test directory structure (unit/integration)
- Add shared fixtures in conftest.py for common test scenarios
- Update .gitignore with testing and development patterns
- Add validation tests to verify infrastructure setup