This repository contains the research framework and validation tools for studying how API documentation quality affects Large Language Model (LLM) code generation success. The research identifies a "documentation sweet spot" phenomenon: moderate-quality documentation yields better LLM performance than comprehensive documentation.
**Counter-intuitive discovery:** LLMs generate more functional code when given moderate-quality API documentation (3.0-4.0/5.0) than when given excellent documentation (5.0/5.0).
- 58% better performance with average-quality documentation than with excellent documentation in controlled experiments
- Consistent pattern across OpenAI GPT-4, Claude Sonnet, and Gemini Pro
- An "over-engineering effect" in which comprehensive documentation leads to complex but failure-prone implementations
- **Stripe API** (Excellent Documentation)
  - Complex payment processing API
  - Comprehensive interactive documentation
  - Extensive code examples and error handling
- **GitHub API** (Good Documentation)
  - GraphQL API for repository management
  - Well-structured schema documentation
  - Good examples with some gaps
- **OpenWeatherMap API** (Average Documentation)
  - Weather data API with API key authentication
  - Decent endpoint documentation
  - Limited advanced usage patterns
- **JSONPlaceholder API** (Basic Documentation)
  - Simple REST API for testing
  - Minimal documentation
  - Basic endpoint descriptions only
- **Cat Facts API** (Poor Documentation)
  - Very simple GET requests
  - Minimal documentation
  - No examples or error handling
- Completeness (25%): Endpoint coverage, parameter documentation, response schemas
- Clarity (20%): Language clarity, organization, terminology consistency
- Examples (20%): Code examples, language support, real-world use cases
- Error Handling (15%): Error codes, troubleshooting guides
- Authentication (10%): Auth instructions, security practices
- Code Quality (10%): Best practices, production-ready examples
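A minimal sketch of how these weighted criteria could combine into a single 0-5 quality score is shown below. The function name and structure are illustrative; the repository's actual scoring logic lives in `evaluation/documentation_quality_metrics.py` and may differ.

```python
# Illustrative only: weights mirror the criteria listed above.
WEIGHTS = {
    "completeness": 0.25,
    "clarity": 0.20,
    "examples": 0.20,
    "error_handling": 0.15,
    "authentication": 0.10,
    "code_quality": 0.10,
}


def overall_quality(scores: dict) -> float:
    """Combine per-criterion scores (each on a 0-5 scale) into a weighted 0-5 total."""
    return sum(weight * scores.get(criterion, 0.0)
               for criterion, weight in WEIGHTS.items())


# Example: strong on completeness, weak on examples.
print(overall_quality({
    "completeness": 4.5, "clarity": 4.0, "examples": 2.0,
    "error_handling": 3.0, "authentication": 4.0, "code_quality": 3.5,
}))  # ≈ 3.53 on the 0-5 scale
```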
- Authentication Setup: Implement API authentication
- Data Retrieval: Basic and parameterized GET requests
- CRUD Operations: Create, update, delete resources
- Error Handling: Rate limiting, validation errors
- Edge Cases: Large datasets, network failures
- Integration: Multi-step workflows
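The sketch below shows one way such a task could be declared. It is a hypothetical illustration; the actual definitions live in `tasks/standardized_tasks.py` and may use a different structure.

```python
# Hypothetical task declaration; field names are illustrative.
from dataclasses import dataclass, field


@dataclass
class Task:
    task_id: str
    category: str              # e.g. "authentication", "data_retrieval", "crud"
    description: str
    success_criteria: list[str] = field(default_factory=list)


AUTH_SETUP = Task(
    task_id="auth_setup",
    category="authentication",
    description="Authenticate against the API using the documented mechanism.",
    success_criteria=[
        "request includes valid credentials",
        "handles 401 responses gracefully",
    ],
)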
```
api-doc-quality-tests/
├── apis/                      # API selection and analysis
│   └── api_selection_analysis.md
├── controls/                  # Experiment execution
│   └── experiment_execution.py
├── documentation/             # Documentation variants
│   └── variants/
├── evaluation/                # Analysis frameworks
│   ├── documentation_quality_metrics.py
│   ├── llm_testing_infrastructure.py
│   ├── code_validation_system.py
│   ├── data_analysis_framework.py
│   └── results_analysis.py
├── tasks/                     # Standardized task definitions
│   └── standardized_tasks.py
└── README.md
```
Install the Python dependencies:

```bash
pip install pandas numpy matplotlib seaborn scipy scikit-learn
pip install anthropic openai google-generativeai  # For LLM APIs
```

Create a `.env` file with your API keys:

```
ANTHROPIC_API_KEY=your_claude_api_key
OPENAI_API_KEY=your_openai_api_key
GOOGLE_API_KEY=your_gemini_api_key
```
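The experiment scripts are expected to read these keys from the environment. A minimal, hypothetical loading pattern is shown below; it uses `python-dotenv`, which is not in the dependency list above and is only one way to do this (the repository's own loading code may differ).

```python
# Hypothetical sketch of picking up the keys from .env; not the repository's code.
import os

from dotenv import load_dotenv  # pip install python-dotenv

load_dotenv()  # reads .env from the current working directory

anthropic_key = os.environ["ANTHROPIC_API_KEY"]
openai_key = os.environ["OPENAI_API_KEY"]
google_key = os.environ["GOOGLE_API_KEY"]
```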
- Execute the complete experiment:

  ```bash
  cd controls
  python experiment_execution.py
  ```

- Analyze results:

  ```bash
  cd evaluation
  python results_analysis.py --results-dir ../experiment_results
  ```

- View results:
  - Check `experiment_results/insights_report.md` for detailed findings
  - View visualizations in `experiment_results/visualizations/`
  - Access raw data in JSON format for further analysis
- Correlation coefficients between documentation metrics and code quality
- Statistical significance testing (p-values)
- Provider-specific performance comparisons
- API-specific success rates
- Correlation heatmaps
- Scatter plots showing documentation vs code quality relationships
- Box plots of success rates by documentation quality quartiles
- Provider performance comparisons
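As an illustration of the heatmap step, the sketch below builds the plot with seaborn. The column names and numeric values are placeholders for readability, not results from the study; the framework's real plotting code is in `evaluation/data_analysis_framework.py` and `evaluation/results_analysis.py`.

```python
# Illustrative correlation heatmap; placeholder values, not experimental data.
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

results = pd.DataFrame({
    "doc_quality_score": [4.8, 4.1, 3.4, 2.2, 1.5],
    "code_quality_score": [3.1, 3.9, 4.2, 2.8, 2.0],
    "success_rate": [0.55, 0.72, 0.81, 0.50, 0.35],
})

sns.heatmap(results.corr(), annot=True, cmap="coolwarm", vmin=-1, vmax=1)
plt.title("Documentation metrics vs. code generation outcomes")
plt.tight_layout()
plt.savefig("correlation_heatmap.png")
```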
- Specific documentation features that most impact LLM performance
- Recommendations for documentation improvement priorities
- Provider-specific strengths and weaknesses
- Evidence-based best practices for API documentation
- Automated scoring based on predefined criteria
- Weighted metrics reflecting real-world importance
- Consistent evaluation across all APIs
- Identical prompts across all LLM providers
- Standardized task requirements
- Controlled testing environment
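The snippet below illustrates the "identical prompt" idea: the same template is rendered for every provider so that only the documentation variant changes. The wording is hypothetical; the study's actual prompts live in `evaluation/llm_testing_infrastructure.py`.

```python
# Hypothetical prompt template; only the documentation and task vary per run.
PROMPT_TEMPLATE = """You are given the following API documentation:

{documentation}

Task: {task_description}

Write complete, runnable Python code that accomplishes the task.
Return only the code."""


def build_prompt(documentation: str, task_description: str) -> str:
    """Render the same prompt for every provider so only the inputs differ."""
    return PROMPT_TEMPLATE.format(
        documentation=documentation,
        task_description=task_description,
    )
```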
- Syntax Validation: Python syntax correctness
- Functionality: Meeting task requirements
- Best Practices: Coding standards compliance
- Security: Secure coding practices
- Completeness: Implementation thoroughness
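A minimal sketch of the syntax-validation step using only the standard library is shown below; the repository's full checks are in `evaluation/code_validation_system.py`.

```python
# Syntax check: does the generated source parse as Python?
import ast


def is_valid_python(source: str) -> bool:
    """Return True if the generated code parses as Python, False otherwise."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


print(is_valid_python("def f(x):\n    return x + 1"))  # True
print(is_valid_python("def f(x) return x"))            # False
```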
- Pearson correlation analysis
- Significance testing at p < 0.05
- Provider and API-specific breakdowns
- Regression analysis for predictive insights
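The correlation and significance test can be expressed with `scipy.stats.pearsonr`, as sketched below with placeholder values (not the study's data); the real analysis lives in `evaluation/data_analysis_framework.py`.

```python
# Pearson correlation with significance test; placeholder inputs.
from scipy.stats import pearsonr

doc_quality = [4.8, 4.1, 3.4, 2.2, 1.5]        # illustrative documentation scores
success_rate = [0.55, 0.72, 0.81, 0.50, 0.35]  # illustrative success rates

r, p_value = pearsonr(doc_quality, success_rate)
print(f"r = {r:.2f}, p = {p_value:.3f}, significant = {p_value < 0.05}")
```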
- **Primary:** How does documentation quality correlate with LLM code generation success?
- **Secondary:**
  - Which documentation aspects most impact LLM performance?
  - Do different LLM providers show varying sensitivity to documentation quality?
  - What documentation quality threshold ensures reliable code generation?
  - How do complex vs. simple APIs respond to documentation improvements?
- Evidence-based documentation improvement priorities
- ROI justification for documentation investments
- Specific guidelines for LLM-friendly documentation
- Better understanding of documentation quality impact
- Improved code generation success rates
- More efficient API integration workflows
- Empirical data on LLM performance factors
- Methodology for similar studies
- Baseline metrics for future research
- Adding a new API:
  - Update `experiment_config.json` with the new API's details (a hypothetical entry is sketched after this list)
  - Define the expected documentation quality level
  - Run the experiment with the expanded API set
- Adding a new LLM provider:
  - Implement the provider in `llm_testing_infrastructure.py`
  - Add the provider configuration
  - Update the analysis framework for the new provider
- Adding new tasks:
  - Define the tasks in `standardized_tasks.py`
  - Specify requirements and success criteria
  - Update the validation system for the new task types
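The sketch below shows what registering a new API entry might look like. The field names are placeholders, since the actual schema of `experiment_config.json` is defined by the repository and may differ.

```python
# Hypothetical: append a new API entry to experiment_config.json.
import json

new_api = {
    "name": "Example REST API",            # placeholder, not a study subject
    "base_url": "https://api.example.com/v1",
    "auth_type": "api_key",
    "expected_doc_quality": "good",        # expected documentation quality level
}

with open("experiment_config.json") as f:
    config = json.load(f)

config.setdefault("apis", []).append(new_api)

with open("experiment_config.json", "w") as f:
    json.dump(config, f, indent=2)
```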
- Fork the repository
- Create a feature branch
- Implement improvements or extensions
- Add tests and documentation
- Submit a pull request
This project is licensed under the MIT License - see the LICENSE file for details.
- API providers for public documentation access
- LLM providers for research-friendly APIs
- Open source community for analysis tools
For questions, suggestions, or collaboration opportunities, please open an issue or contact the research team.
This experiment aims to bridge the gap between documentation quality and AI-assisted development, providing actionable insights for the developer community.