An automated code review tool based on large language models, providing professional and comprehensive code quality analysis. Supports multi-repository monitoring, automated reviews, and intelligent analysis.
- Trigger Condition: Automatically executed when pushing to any branch
- Smart Filtering: Automatically skips documentation changes (`.md`, `.txt`) and merge commits (see the sketch after this list)
- Manual Trigger Support: A specific commit SHA can be specified for review
- Scheduled Execution: Daily automatic scan at 2:00 AM UTC+8 (6:00 PM UTC)
- Multi-Repository Support: Simultaneously monitors all repositories in the configuration list
- Smart Limiting: Process up to 3 latest commits per repository
- Avoid Duplication: Automatically skips already reviewed commits
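
The filtering described above can be pictured as a small helper that inspects the commit message and the list of changed files. This is an illustrative sketch only; the helper name and exact rules are assumptions, not the actual implementation in `ai_code_review.py`.

```python
# Illustrative sketch: the real logic lives in scripts/ai_code_review.py;
# the helper name and the exact skip rules here are assumptions.
DOC_EXTENSIONS = (".md", ".txt")

def should_skip_commit(commit_message: str, changed_files: list[str]) -> bool:
    """Skip merge commits and commits that only touch documentation files."""
    if commit_message.startswith("Merge "):        # merge commits are not reviewed
        return True
    # Skip when every changed file is a documentation file
    return bool(changed_files) and all(
        f.lower().endswith(DOC_EXTENSIONS) for f in changed_files
    )

print(should_skip_commit("Update docs", ["README.md", "notes.txt"]))  # True
print(should_skip_commit("Fix bug", ["app.py", "README.md"]))         # False
```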
- Security Analysis: SQL injection, XSS, CSRF vulnerability detection
- Performance Evaluation: Algorithm complexity, database optimization, memory usage
- Code Quality: Readability, maintainability, SOLID principles checking
- Test Coverage: Unit test suggestions, boundary condition checking
- Best Practices: Language-specific standards, design pattern evaluation
Supports code review reports in 10 languages:
- 🇹🇼 Traditional Chinese (`zh-TW`)
- 🇨🇳 Simplified Chinese (`zh-CN`)
- 🇺🇸 English (`en`)
- 🇯🇵 Japanese (`ja`)
- 🇰🇷 Korean (`ko`)
- 🇫🇷 French (`fr`)
- 🇩🇪 German (`de`)
- 🇪🇸 Spanish (`es`)
- 🇵🇹 Portuguese (`pt`)
- 🇷🇺 Russian (`ru`)
- Parallel Processing: Multi-file simultaneous review for improved processing speed
- Smart Segmentation: Large changes automatically segmented (when exceeding 300KB)
- Configuration Caching: Reduces redundant configuration loading
- API Rate Control: Avoids frequency limitations
- Fallback Models: Supports multiple backup LLM models for service reliability (sketched below)
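
The fallback behavior amounts to trying models in order with a simple backoff between attempts. The sketch below is a rough illustration under the assumption of a caller-supplied `call_llm` function; it is not the project's actual retry logic.

```python
import time
from typing import Callable

def review_with_fallback(
    prompt: str,
    models: list[str],                      # primary model first, then fallback_models
    call_llm: Callable[[str, str], str],    # hypothetical client: (model, prompt) -> text
    max_retries: int = 2,
) -> str:
    """Try each model in order, with a simple exponential backoff between attempts."""
    last_error: Exception | None = None
    for model in models:
        for attempt in range(max_retries):
            try:
                return call_llm(model, prompt)
            except Exception as exc:         # timeout, rate limit, model unavailable, ...
                last_error = exc
                time.sleep(2 ** attempt)     # crude backoff to stay under API rate limits
    raise RuntimeError(f"All models failed, last error: {last_error}")
```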
```text
PR-Agent/
├── .github/workflows/
│   ├── pr-review.yml          # Push-triggered code review workflow
│   └── scheduled-review.yml   # Scheduled multi-repository scan workflow
├── scripts/
│   ├── ai_code_review.py      # Main review script (830+ lines)
│   └── test_config.py         # Configuration validation test script
├── config.json                # System configuration file
├── CONFIG.md                  # Detailed configuration documentation
├── README-TW.md               # Traditional Chinese documentation
└── README.md                  # English documentation (this project)
```
```json
{
"model": {
"name": "Llama-4-Maverick-17B-128E-Instruct-FP8",
"fallback_models": [
"Llama-3.3-Nemotron-Super-49B-v1",
"Llama-3.3-70B-Instruct-MI210",
"Llama-3.3-70B-Instruct-Gaudi3"
],
"max_tokens": 32768,
"temperature": 0.2,
"timeout": 90
},
"projects": {
"enabled_repos": [
"owner/repo1",
"owner/repo2"
],
"default_repo": "owner/main-repo"
},
"review": {
"max_diff_size": 150000,
"large_diff_threshold": 300000,
"chunk_max_tokens": 8192,
"max_files_detail": 8,
"overview_max_tokens": 12288,
"response_language": "en"
},
"filters": {
"ignored_extensions": [".md", ".txt", ".yml", ".yaml", ".json", ".lock", ".png", ".jpg", ".gif", ".svg"],
"ignored_paths": ["docs/", "documentation/", ".github/", "node_modules/", "dist/", "build/", ".vscode/"],
"code_extensions": [".py", ".js", ".ts", ".jsx", ".tsx", ".java", ".cpp", ".c", ".go", ".rs", ".php", ".rb", ".cs", ".swift", ".kt"]
},
"prompts": {
"include_line_numbers": true,
"detailed_analysis": true,
"security_focus": true,
"performance_analysis": true
}
}
```

For detailed configuration instructions, see CONFIG.md.
- CRITICAL: Security vulnerabilities, data loss risks, system failures
- MAJOR: Performance issues, design flaws, breaking changes
- MINOR: Code style, optimization opportunities, suggestions
Supports any service compatible with OpenAI API format:
- OpenAI: GPT-4, GPT-3.5
- Anthropic: Claude series
- Open Source Solutions: Ollama, vLLM, Text Generation Inference
- Other Providers: Any service supporting OpenAI API format
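
Because any OpenAI-compatible endpoint works, switching providers is mostly a matter of changing the base URL, API key, and model name. Below is a minimal sketch using the `requests` library against the standard `/chat/completions` endpoint; the model name and token limit are placeholders, not values from this project.

```python
import os
import requests

def chat_completion(prompt: str) -> str:
    """Call any OpenAI-compatible /chat/completions endpoint.

    Uses the same values the workflows expect as secrets
    (OPENAI_BASE_URL, OPENAI_KEY); the model name is a placeholder.
    """
    base_url = os.environ["OPENAI_BASE_URL"].rstrip("/")
    response = requests.post(
        f"{base_url}/chat/completions",
        headers={
            "Authorization": f"Bearer {os.environ['OPENAI_KEY']}",
            "Content-Type": "application/json",
        },
        json={
            "model": "your-model-name",
            "messages": [{"role": "user", "content": prompt}],
            "max_tokens": 512,
            "temperature": 0.2,
        },
        timeout=90,
    )
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"]
```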
Full support: Python, JavaScript, TypeScript, Java, C++, C, Go, Rust, PHP, Ruby, C#, Swift, Kotlin
Since this system requires cross-repository operations (accessing other repositories from the PR-Agent repository), you must use a Personal Access Token.
Go to repository Settings > Secrets and variables > Actions and add:
| Secret Name | Description | Example |
|---|---|---|
| `GH_TOKEN` | GitHub Personal Access Token | `ghp_xxxxx...` |
| `OPENAI_KEY` | LLM Service API Key | `sk-xxxxx...` |
| `OPENAI_BASE_URL` | LLM Service Base URL | `https://api.xxx.com/v1` |
Important Notes:
- The above 3 Secrets need to be manually configured
- `GITHUB_SHA`, `GITHUB_REPOSITORY`, etc. are built-in environment variables automatically provided by GitHub Actions (see the sketch below)
- You do not need to (and cannot) configure these built-in variables manually
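
Inside the review script, both the manually configured secrets and the built-in variables are simply read from the environment. A minimal sketch of what that typically looks like (the variable handling details are illustrative):

```python
import os

# Secrets configured manually in the repository settings
gh_token = os.environ["GH_TOKEN"]               # GitHub Personal Access Token
openai_key = os.environ["OPENAI_KEY"]           # LLM service API key
openai_base_url = os.environ["OPENAI_BASE_URL"]

# Built-in variables provided automatically by GitHub Actions
commit_sha = os.environ.get("GITHUB_SHA", "")
repository = os.environ.get("GITHUB_REPOSITORY", "")   # e.g. "owner/repo"
```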
- Go to GitHub Settings > Personal access tokens > Tokens (classic)
- Click "Generate new token" > "Generate new token (classic)"
- Configure Token Information:
  - Note: `AI Code Review Token`
  - Expiration: Recommend 90 days (adjust as needed)
  - Select scopes: check the following permissions:
    - ✅ `repo` - Full repository permissions (including private repositories)
    - ✅ `write:discussion` - Discussion write permissions (optional)
- Click "Generate token"
- Important: Immediately copy the token and save it to a secure location
Permission Explanation:
- The `repo` permission includes the ability to create issues
- Supports cross-repository operations: review issues can be created in any repository in the `enabled_repos` list
Security Reminder:
- The token has full repository permissions for your account; keep it secure
- Update token regularly (recommend every 90 days)
- If token is compromised, immediately revoke it in GitHub settings
A minimal configuration:

```json
{
  "model": {
    "name": "your-model-name"
  },
  "projects": {
    "enabled_repos": ["owner/repo-name"],
    "default_repo": "owner/repo-name"
  },
  "review": {
    "response_language": "en"
  }
}
```

Monitoring multiple repositories:

```json
{
  "projects": {
    "enabled_repos": [
      "org/project1",
      "org/project2",
      "user/personal-project"
    ],
    "default_repo": "org/project1"
  }
}
```

Using a wildcard in `enabled_repos`:

```json
{
  "projects": {
    "enabled_repos": ["*"]
  }
}
```

Before running the main script, you can first run configuration validation:
```bash
python scripts/test_config.py
```

Successful output example:

```text
Testing configuration file...
✅ Configuration validation passed
Model: Llama-4-Maverick-17B-128E-Instruct-FP8
Language: en
Enabled repos: 4 repositories
Max tokens: 32,768
Temperature: 0.2
Configuration test completed successfully!
```
Submit changes to trigger automatic review:
```bash
git add .
git commit -m "Add AI code review configuration"
git push origin main
```

Go to the Actions page to view execution results.
Trigger Conditions:
- ✅ Push to any branch
- ✅ Manual trigger (`workflow_dispatch`)
- ❌ Skip when modifying documentation files (`.md`, `.gitignore`, `LICENSE`, `docs/**`)
Workflow Process:
- Environment Setup → Check out the code, set up Python 3.11, install dependencies
- Configuration Validation → Run `test_config.py` to validate the configuration file
- Code Review → Execute the main review script
- Publish Results → Create an Issue in the target repository with the review results (see the sketch after this list)
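
Publishing the result boils down to one call to the GitHub Issues API. The sketch below uses `requests` directly; the function name is hypothetical, but the endpoint, title format, and labels match what this README describes elsewhere.

```python
import os
import requests

def publish_review_issue(repo: str, commit_sha: str, report_markdown: str) -> str:
    """Create a review Issue in the target repository via the GitHub REST API.

    Illustrative sketch; `repo` is of the form "owner/repo".
    """
    response = requests.post(
        f"https://api.github.com/repos/{repo}/issues",
        headers={
            "Authorization": f"token {os.environ['GH_TOKEN']}",
            "Accept": "application/vnd.github+json",
        },
        json={
            "title": f"AI Code Review - Commit {commit_sha[:8]}",
            "body": report_markdown,
            "labels": ["ai-code-review", "automated"],
        },
        timeout=30,
    )
    response.raise_for_status()
    return response.json()["html_url"]       # link to the created issue
```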
Execution Time:
- Scheduled Trigger: Daily at 2:00 AM UTC+8 (6:00 PM UTC)
- Manual Trigger: Can customize scan time range, maximum commits per repository, and concurrency
Scan Logic:
- Repository Traversal → Scan all repositories in `enabled_repos`
- Time Filtering → Only process commits from the last 24 hours (adjustable)
- Duplicate Check → Automatically skip commits that already have a review Issue (see the sketch after this list)
- Parallel Processing → Process multiple repositories and commits simultaneously
- Quantity Limit → Process up to the 3 latest commits per repository (adjustable)
- Concurrency Limit → Set `SCAN_CONCURRENCY` to control parallel repository scans (default: 4)
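
A rough sketch of the time filtering and duplicate check described above, using the GitHub REST and Search APIs; the helper names and the exact search query are assumptions, not the script's real implementation.

```python
import os
from datetime import datetime, timedelta, timezone
import requests

API = "https://api.github.com"
HEADERS = {
    "Authorization": f"token {os.environ['GH_TOKEN']}",
    "Accept": "application/vnd.github+json",
}

def recent_commits(repo: str, hours: int = 24, limit: int = 3) -> list[dict]:
    """Time filtering + quantity limit: up to `limit` commits from the last `hours` hours."""
    since = (datetime.now(timezone.utc) - timedelta(hours=hours)).isoformat()
    resp = requests.get(f"{API}/repos/{repo}/commits",
                        headers=HEADERS, params={"since": since}, timeout=30)
    resp.raise_for_status()
    return resp.json()[:limit]

def already_reviewed(repo: str, sha: str) -> bool:
    """Duplicate check: look for an existing issue whose title mentions this commit."""
    query = f"repo:{repo} label:ai-code-review {sha[:8]} in:title"
    resp = requests.get(f"{API}/search/issues",
                        headers=HEADERS, params={"q": query}, timeout=30)
    resp.raise_for_status()
    return resp.json()["total_count"] > 0
```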
- Project Check → Confirm the project is in the allow list
- Token Permission Test → Verify GitHub Token permissions and type
- Commit Analysis → Get the change content and statistics
- Smart Filtering → Skip documentation changes and merge commits
- Strategy Selection → Choose the review method based on change size (sketched below):
  - Small Changes (< 150KB): Full analysis
  - Large Changes (> 300KB): Segmented processing, focusing on the 8 files with the most changes
- AI Analysis → Execute the comprehensive code review
- Publish Results → Create a review Issue in the target repository
- Threshold Detection: Trigger segmented mode when the change exceeds 300,000 characters (~300KB)
- Focus Review: Prioritize the 8 files with the most changes
- Parallel Processing: Process multiple file segments simultaneously for improved efficiency
- Consolidated Report: Generate an overall overview and specific recommendations
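
A hedged sketch of the size-based strategy selection; the threshold and file-count keys mirror the `review` section of `config.json`, while the function itself and its return format are purely illustrative.

```python
def select_review_strategy(diff_text: str, files: list[dict], config: dict) -> dict:
    """Pick a review mode based on change size (illustrative sketch only)."""
    review_cfg = config["review"]
    if len(diff_text) > review_cfg["large_diff_threshold"]:      # e.g. 300,000 chars
        # Segmented mode: focus on the files with the most changes
        top_files = sorted(files, key=lambda f: f.get("changes", 0), reverse=True)
        return {"mode": "segmented", "files": top_files[:review_cfg["max_files_detail"]]}
    return {"mode": "full", "files": files}                      # small/medium changes
```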
Model configuration:

```json
{
  "model": {
    "name": "primary-model",
    "fallback_models": ["backup-model-1", "backup-model-2"],
    "max_tokens": 32768,
    "temperature": 0.1,
    "timeout": 120
  }
}
```

Filter configuration:

```json
{
  "filters": {
    "ignored_extensions": [".md", ".txt", ".png", ".jpg"],
    "ignored_paths": ["docs/", "dist/", ".vscode/", "tests/fixtures/"],
    "code_extensions": [".py", ".js", ".ts", ".java", ".go"]
  }
}
```

Review size and detail settings:

```json
{
  "review": {
    "max_diff_size": 200000,
    "large_diff_threshold": 500000,
    "chunk_max_tokens": 10240,
    "max_files_detail": 10,
    "overview_max_tokens": 16384
  }
}
```

Prompt settings:

```json
{
  "prompts": {
    "include_line_numbers": true,
    "detailed_analysis": true,
    "security_focus": true,
    "performance_analysis": true
  }
}
```

**🤖 AI Code Review - Commit abc12345**
## AI Code Review Report
**Review Time**: 2024-01-20 14:30:25 UTC+8
**Commit**: [abc12345](https://github.com/owner/repo/commit/abc12345)
**Author**: John Doe
**Message**: Implement user authentication system
**Model**: Llama-4-Maverick-17B-128E-Instruct-FP8
**Change Statistics**: 15 files, +342 lines, -89 lines
---
## Security Analysis
### CRITICAL Issues
- **SQL Injection Risk** (user_service.py:42)
- Direct string concatenation to build SQL queries
- Recommendation: Use parameterized queries or ORM
### MAJOR Issues
- **Hardcoded API Key** (config.py:15)
- Sensitive information should not be hardcoded in source code
- Recommendation: Use environment variables or secure configuration management
## ⚡ Performance Analysis
### MAJOR Issues
- **N+1 Query Problem** (user_repository.py:78)
- Executing database queries in a loop
- Recommendation: Use batch queries or lazy loading
### MINOR Optimizations
- **Connection Pool Configuration** (database.py:25)
- Consider configuring connection pool for improved performance
- Recommendation: Set appropriate maximum connection count
## Code Quality
### POSITIVE Strengths
- ✅ Good error handling implementation
- ✅ Clear function naming and comments
- ✅ Appropriate unit test coverage
### MINOR Suggestions
- **Variable Naming** (auth_controller.py:156)
- `tmp_var` could use a more descriptive name
- Recommendation: `temporary_session_data`
## 🧪 Testing Suggestions
- Add boundary condition tests (null values, extreme values)
- Consider adding integration tests covering authentication flow
- Recommend testing exception handling logic
## Summary and Recommendations
### Priority Actions (High Risk)
1. Fix SQL injection vulnerability
2. Remove hardcoded API keys
3. Resolve N+1 query performance issue
### Follow-up Optimizations (Medium Risk)
- Improve variable naming conventions
- Add more test cases
- Consider performance optimization strategies
---
### Review Notes
- This review is automatically generated by AI; please combine it with human judgment
- Recommend prioritizing CRITICAL and MAJOR issues by severity
- If you have questions or need further discussion, please comment below this issue
### Related Links
- [View Commit Changes](https://github.com/owner/repo/commit/abc12345)
- [View File Diff](https://github.com/owner/repo/commit/abc12345.diff)
- [Project Configuration](https://github.com/owner/repo/blob/main/config.json)
---
*Generated by [PR-Agent](https://github.com/sheng1111/AI-Code-Review-Agent)*

Review Report Features:
- Structured Analysis: Categorized by security, performance, and quality
- 💬 Team Discussion Support: The Issue format facilitates collaborative communication
- 🏷️ Automatic Label Classification: `ai-code-review` and `automated` labels
- Searchable and Filterable: Easy historical tracking and management
- ✅ Track Resolution Status: Issues can be marked as resolved
- Detailed Statistics: Includes change statistics and file information
- High Accuracy Requirements: Use GPT-4 or Claude-3
- Cost Considerations: Use open-source models like Llama series
- Chinese Optimization: Choose models with better Chinese support
- Reliability Considerations: Configure multiple fallback models to ensure service availability
- Appropriate max_tokens: Adjust based on model capabilities (recommend 16K-32K)
- Reasonable temperature setting: 0.1-0.3 suitable for code review
- Concurrency control: Avoid exceeding API rate limits
- Project scope control: Only enable automatic review for important projects
- Adjust `SCAN_CONCURRENCY`: Increase thread count to reduce scheduled scan time
- Set appropriate file size limits (recommend full analysis within 150KB)
- Filter non-critical file types (exclude images, documents, configuration files)
- Use lower-cost models (open-source models or smaller parameter models)
- Limit concurrent file processing (focus analysis on 8 files for large changes)
- Adjust scheduled scan frequency (default daily, adjust based on needs)
- Group Management: Categorize related projects for batch configuration
- Priority Setting: Important projects can be set for more frequent scanning
- Permission Management: Ensure PAT has appropriate permissions for all target repositories
- Monitoring and Alerts: Regularly check review execution status and error logs
Checklist:
- GitHub Secrets configured correctly (`GH_TOKEN`, `OPENAI_KEY`, `OPENAI_BASE_URL`)
- Token has sufficient permissions (needs the `repo` permission to create issues)
- LLM service responding normally
- Project is in the `enabled_repos` list
- Check Actions execution logs for errors
- Check the target repository's Issues page for issues labeled `ai-code-review`
- Confirm changes are not in the filter list (non-documentation files)
Common Causes and Solutions:
- Fine-grained PAT cross-repository limitations: Switch to Classic Personal Access Token
- Insufficient token permissions: Ensure the token includes the `repo` permission (not just `public_repo`)
- Token expired: Check and update the expired PAT
- Organization setting restrictions: Check organization's Token policy settings
- Repository doesn't exist or no permission: Confirm repository name is correct and PAT owner has permissions
Effective Strategies:
- Adjust Token Limits: Lower `max_tokens` and `chunk_max_tokens`
- Filter More Files: Expand `ignored_extensions` and `ignored_paths`
- Use Cheaper Models: Choose lower-cost LLM services
- Limit Processing Scope: Reduce the `max_files_detail` count
- Adjust Scan Frequency: Lower the scheduled scan frequency (e.g., weekly)
- Set Size Limits: Lower `max_diff_size` to skip very large changes
Complete Support List:
- Mainstream Languages: Python, JavaScript, TypeScript, Java, C++, C
- Emerging Languages: Go, Rust, Swift, Kotlin
- Scripting Languages: PHP, Ruby, Shell
- Enterprise Languages: C#, VB.NET
- Configuration Languages: YAML, JSON (adjustable via `code_extensions`)
Prompt Configuration Adjustment:
```jsonc
{
  "prompts": {
    "include_line_numbers": true,     // Include line number information
    "detailed_analysis": false,       // Simplify analysis to reduce costs
    "security_focus": true,           // Strengthen security checks
    "performance_analysis": false     // Disable performance analysis for speed
  }
}
```

Check Items:
- Workflow Enabled: Confirm `scheduled-review.yml` is enabled
- Timezone Settings: Confirm the cron expression matches the expected time
- Repository Activity: GitHub may pause scheduled tasks for inactive repositories
- Manual Testing: Use `workflow_dispatch` to manually trigger tests
- Permission Check: Confirm Actions has execution permissions
Problem Symptoms:
```text
Error: HttpError: Not Found
```
or
```text
Error: 403 Forbidden
```
Diagnostic Steps:
```bash
# Test Token basic permissions
curl -H "Authorization: token YOUR_TOKEN" \
  https://api.github.com/user

# Test specific repository permissions
curl -H "Authorization: token YOUR_TOKEN" \
  https://api.github.com/repos/USERNAME/REPO_NAME

# Test Issues creation permissions
curl -X POST \
  -H "Authorization: token YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  https://api.github.com/repos/USERNAME/REPO_NAME/issues \
  -d '{"title":"Test Issue","body":"Test"}'
```

Solutions:
- Use Classic PAT: Avoid Fine-grained PAT cross-repository limitations
- Ensure repo permissions: Must include the full `repo` permission
- Check repository settings: Confirm the target repository has the Issues feature enabled
Common Errors:
```text
Error: Connection timeout
```
or
```text
Error: Invalid API key
```
Test API Connection:
```bash
curl -X POST YOUR_BASE_URL/chat/completions \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "your-model-name",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 10
  }'
```

Solutions:
- Check Network Connection: Ensure server can connect to LLM service
- Verify API Key: Confirm key is valid and not expired
- Check Model Name: Confirm model name matches what provider offers
- Adjust Timeout: Increase timeout setting value
Validate Configuration Syntax:
```bash
# Check JSON syntax
python -c "import json; print('✅ Valid JSON' if json.load(open('config.json')) else '❌ Invalid JSON')"

# Run full configuration test
python scripts/test_config.py
```

Common Configuration Errors:
- JSON syntax errors (missing commas, quotes)
- Missing required fields
- Values outside limits (e.g., temperature > 2.0)
- Incorrect repository name format
Problem Symptoms: Timeout or out of memory
Solution Strategies:
```jsonc
{
  "review": {
    "max_diff_size": 100000,          // Lower single processing size
    "large_diff_threshold": 200000,   // Lower segmentation threshold
    "chunk_max_tokens": 4096,         // Reduce tokens per segment
    "max_files_detail": 5             // Reduce detailed analysis file count
  }
}
```

Repository Permission Settings:
- Go to Settings > Actions > General
- Set Workflow permissions to "Read and write permissions"
- Check "Allow GitHub Actions to create and approve pull requests"
Organization Level Permissions:
- Confirm organization allows Personal Access Token
- Check organization's Actions usage policies
- Processing Time: Average time per review
- Token Usage: API call cost statistics
- Success Rate: Review completion percentage
- Error Rate: Failed request classification statistics
- Coverage Rate: Reviewed commits vs total commits ratio
View GitHub Actions execution logs:
```text
Testing configuration file...
✅ Configuration validation passed
Model: Llama-4-Maverick-17B-128E-Instruct-FP8
Language: en
Enabled repos: 4 repositories
Max tokens: 32,768
Temperature: 0.2
Testing GitHub Token permissions...
Token is valid, user: sheng1111
Token type: Classic
Classic PAT scopes: ['repo', 'workflow']
SUCCESS: Token has 'repo' permission for cross-repository operations
Configuration Summary:
Model: Llama-4-Maverick-17B-128E-Instruct-FP8
Fallback Models: 3 configured
Max Tokens: 32,768
Temperature: 0.2
Large Diff Threshold: 300,000 chars
Response Language: en
Enabled Repositories: 4
Review statistics: 5 files changed
Change size: 12,450 chars, using full analysis
AI code review completed successfully
Review issue created: https://github.com/owner/repo/issues/123
```
- Monthly: Check PAT expiration time, update in advance
- Quarterly: Review LLM service costs and usage
- Quarterly: Update configuration to adapt to new project needs
- Semi-annually: Evaluate model performance, consider upgrades
- Semi-annually: Clean up expired review Issues
- Monitor API Usage: Set usage alerts
- Adjust Scan Strategy: High frequency for important projects, low frequency for general projects
- Optimize Filter Rules: Continuously improve ignore lists
- Choose Appropriate Models: Balance cost and quality
This project is open-sourced under the MIT License, and community contributions and improvements are welcome.
- Python 3.11+: Primary development language
- GitHub Actions: CI/CD automation platform
- OpenAI API Compatible Format: LLM service interface standard
- Requests Library: HTTP request handling
You are welcome to submit Issues and Pull Requests:
- Bug Reports: Describe detailed steps to reproduce the issue
- Feature Suggestions: Explain use cases and expected effects
- Code Contributions: Follow existing code style
- Documentation Improvements: Help improve usage instructions
Project Maintainer: @sheng1111 | Last Updated: June 2025 | Version: v1.0