A comprehensive, industry-standard ETL (Extract, Transform, Load) pipeline for processing stock market data, calculating profit/loss, and generating financial analysis reports using Medallion Architecture.
- Overview
- Features
- Architecture
- Quick Start
- Installation
- Configuration
- Usage
- Data Layers
- Portfolio Management
- Development
- API Documentation
- Contributing
- Roadmap
- License
- Support
Portfolio Tracker is an ETL pipeline built on the Medallion architecture that turns trading data into portfolio analytics. It extracts data from trading accounts (Upstox and others), transforms it, and loads it into tiered storage (BRONZE, SILVER, and GOLD layers) that serves as the basis for data management, analytics, and trading decisions.
- Data-Driven Trading: Make informed decisions based on comprehensive portfolio analysis
- Automated Processing: Set it and forget it; the automated ETL pipeline handles everything
- Performance Tracking: Real-time P&L calculation and portfolio performance metrics
- Multi-User Support: Track portfolios for multiple users and accounts
- Production Ready: Industry-standard code with comprehensive logging and error handling
- Scalable Design: Medallion architecture that grows with your data needs
- Multi-Layer Architecture: Progressive data refinement through Bronze → Silver → Gold → API layers
- Portfolio Management: Track stocks across multiple users and trading accounts
- P&L Calculation: Automated profit/loss calculation with FIFO position matching
- Brokerage & Taxes: Accurate calculation of brokerage, STT, stamp duty, and GST
- Multiple Asset Classes: Support for equities, futures, options, and commodities
- Data Validation: Schema-based data contracts ensure data quality
- Comprehensive Logging: Structured logging for debugging and monitoring
- Type Safety: Full type hints throughout the codebase for better IDE support
- Error Handling: Custom exceptions for better error management
- CLI Interface: Flexible command-line interface with multiple options
- Web Dashboard: Interactive visualizations and insights (coming soon)
- Exchange Integration: Connect with Upstox and other trading platforms
- Real-Time Data: Process real-time and historical trading data
- Dividend Tracking: Track and analyze dividend income
- Trend Analysis: Visualize portfolio performance over time
- Position Management: Track open and closed positions with detailed metrics
- Risk Analysis: Understand your portfolio risk exposure
- Medallion Architecture: Industry-standard lakehouse architecture
- Python-Powered: Built with modern Python (3.9+)
- Modular Design: Clean, maintainable, and extensible codebase
- Data Contracts: Schema validation for data quality
- Documentation: Comprehensive guides and API documentation
- Testing Ready: Structure supports comprehensive test coverage
Portfolio Tracker implements the Medallion Architecture, a data design pattern used to organize data in a lakehouse logically:
┌─────────────┐
│   SOURCE    │  Raw data files (CSV, Excel, JSON)
└──────┬──────┘
       │
       ▼
┌─────────────┐
│   BRONZE    │  Harmonized data with standardized schema
└──────┬──────┘  - Data type conversion
       │         - Column name normalization
       │         - Basic validation
       ▼
┌─────────────┐
│   SILVER    │  Cleansed, validated, and enriched data
└──────┬──────┘  - Data quality checks
       │         - Duplicate removal
       │         - Enrichment with reference data
       ▼
┌─────────────┐
│    GOLD     │  Business-ready aggregated analytics
└──────┬──────┘  - Portfolio calculations
       │         - P&L computation
       │         - Aggregations and metrics
       ▼
┌─────────────┐
│     API     │  JSON endpoints for frontend consumption
└─────────────┘  - User-specific data
                 - Optimized for web/mobile apps
Portfolio Tracker
│
├── StockETL/                # Core ETL Package
│   ├── ETL_BRONZE/          # Bronze layer processors
│   ├── ETL_SILVER/          # Silver layer processors
│   ├── ETL_GOLD/            # Gold layer processors
│   ├── ETL_API/             # API generation
│   ├── portfolio/           # Portfolio management
│   ├── common_utility/      # Shared utilities
│   ├── logger.py            # Logging system
│   ├── exceptions.py        # Custom exceptions
│   └── constants.py         # Constants & config
│
├── DATA/                    # Data Storage
│   ├── SOURCE/              # Raw input files
│   ├── BRONZE/              # Harmonized data
│   ├── SILVER/              # Cleansed data
│   ├── GOLD/                # Analytics-ready data
│   ├── API/                 # JSON outputs
│   ├── CONFIG/              # Data contracts
│   └── logs/                # Application logs
│
└── src/                     # Frontend Application
    └── pages/               # React/Vue pages
Get started in 5 minutes with Portfolio Tracker:
# 1. Clone the repository
git clone https://github.com/PtPrashantTripathi/PortfolioTracker.git
cd PortfolioTracker
# 2. Install dependencies
pip install -r requirements.txt
# 3. Configure environment
cp .env.example .env
# Edit .env and set PROJECT_DIR
# 4. Run the pipeline
python -m StockETL
# 5. Check the results
ls -la DATA/API/
For a detailed guide, see QUICKSTART.md
- Python 3.9 or higher (pandas 2.2, required below, does not support earlier versions)
- 2GB RAM minimum (4GB recommended)
- 500MB disk space for data storage
pandas >= 2.2.2
python-dotenv >= 1.0.1
pydantic >= 2.0.0
numpy >= 1.24.0
# Clone the repository
git clone https://github.com/PtPrashantTripathi/PortfolioTracker.git
cd PortfolioTracker
# Create virtual environment (optional but recommended)
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Install in development mode
pip install -e .
# Alternatively, install the package with pip
pip install StockETL
# Or pull and run the Docker image
docker pull ptprashanttripathi/portfolio-tracker
docker run -v $(pwd)/DATA:/app/DATA ptprashanttripathi/portfolio-tracker
- Copy the environment template:
cp .env.example .env
- Edit .env with your settings:
# Required
PROJECT_DIR=/absolute/path/to/your/PortfolioTracker
# Optional
LOG_LEVEL=INFO
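As a rough illustration of how these two settings might be consumed, the sketch below validates them from a plain mapping. `Settings` and `load_settings` are hypothetical names, not part of StockETL; in practice python-dotenv's `load_dotenv()` would typically populate `os.environ` from `.env` before this runs, and it is left out here so the example needs only the standard library.

```python
import os
from pathlib import Path
from typing import Mapping, NamedTuple


class Settings(NamedTuple):
    """Hypothetical container for the two settings shown above."""
    project_dir: Path
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read PROJECT_DIR (required) and LOG_LEVEL (optional, defaults to INFO)."""
    project_dir = env.get("PROJECT_DIR")
    if not project_dir:
        raise ValueError("PROJECT_DIR must be set in the environment or .env file")
    return Settings(Path(project_dir), env.get("LOG_LEVEL", "INFO").upper())


# Example with an explicit mapping instead of the real environment
settings = load_settings({"PROJECT_DIR": "/absolute/path/to/your/PortfolioTracker"})
print(settings.project_dir, settings.log_level)
```

Failing early on a missing `PROJECT_DIR` keeps path errors out of the later layers.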
Ensure the following structure exists (created automatically):
DATA/
├── SOURCE/                  # Place your raw data files here
│   ├── Symbol/              # Stock master data (CSV)
│   ├── TradeHistory/        # Trade transactions (CSV/Excel)
│   ├── Dividend/            # Dividend data
│   └── Holding/             # Current holdings
│
├── BRONZE/                  # Harmonized data (auto-generated)
├── SILVER/                  # Cleansed data (auto-generated)
├── GOLD/                    # Analytics data (auto-generated)
├── API/                     # JSON outputs (auto-generated)
├── CONFIG/                  # Data contracts & schemas
│   └── DATA_CONTRACTS/      # JSON schema files
└── logs/                    # Application logs
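The pipeline is described as creating this layout automatically. For readers who want to prepare it by hand, a minimal sketch with `pathlib` follows; the folder names come from the tree above, while `ensure_data_dirs` is a hypothetical helper and not the project's own code.

```python
import tempfile
from pathlib import Path
from typing import List

# Folder names taken from the directory tree; the helper itself is illustrative.
DATA_DIRS = [
    "SOURCE/Symbol", "SOURCE/TradeHistory", "SOURCE/Dividend", "SOURCE/Holding",
    "BRONZE", "SILVER", "GOLD", "API", "CONFIG/DATA_CONTRACTS", "logs",
]


def ensure_data_dirs(project_dir: Path) -> List[Path]:
    """Create every DATA sub-directory if it does not exist yet."""
    created = []
    for rel in DATA_DIRS:
        path = project_dir / "DATA" / rel
        path.mkdir(parents=True, exist_ok=True)  # safe to call repeatedly
        created.append(path)
    return created


# Demonstrate against a throwaway directory rather than a real project
with tempfile.TemporaryDirectory() as tmp:
    paths = ensure_data_dirs(Path(tmp))
    print(len(paths), "directories ready")
```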
Create JSON schema files in DATA/CONFIG/DATA_CONTRACTS/:
{
"data_schema": [
{
"col_name": "symbol",
"data_type": "string"
},
{
"col_name": "price",
"data_type": "float64"
},
{
"col_name": "quantity",
"data_type": "int64"
}
],
"order_by": ["date", "symbol"]
}
Run the complete pipeline:
python -m StockETL
Run specific layers:
# Run only Bronze and Silver
python -m StockETL --layers bronze silver
# Run Gold and API layers
python -m StockETL --layers gold api
Control logging:
# Verbose output
python -m StockETL --verbose
# Quiet mode (errors only)
python -m StockETL --quiet
# Custom log level
python -m StockETL --log-level DEBUG
View help:
python -m StockETL --help
Run ETL layers individually:
from StockETL import ETL_BRONZE, ETL_SILVER, ETL_GOLD, ETL_API
# Bronze Layer - Data Harmonization
ETL_BRONZE.Symbol.run()
ETL_BRONZE.TradeHistory.run()
ETL_BRONZE.StockData.run()
# Silver Layer - Data Cleansing
ETL_SILVER.Symbol.run()
ETL_SILVER.StockPrice.run()
ETL_SILVER.StockEvents.run()
ETL_SILVER.TradeHistory.run()
# Gold Layer - Analytics
ETL_GOLD.Portfolio.run()
ETL_GOLD.Dividend.run()
# API Layer - JSON Generation
ETL_API.API.run()
Use the Portfolio Manager:
from StockETL.portfolio import Portfolio
# Create a portfolio
portfolio = Portfolio()
# Process trades
trade_data = {
"username": "investor_01",
"datetime": "2024-01-15 10:30:00",
"exchange": "NSE",
"segment": "EQ",
"symbol": "RELIANCE",
"scrip_name": "Reliance Industries",
"side": "BUY",
"quantity": 100,
"price": 2500.50,
"amount": 250050.00,
}
portfolio.trade(trade_data)
# Get current holdings
holdings = portfolio.get_current_holding()
print(f"Current Holdings: {len(holdings)} positions")
# Get P&L
pnl = portfolio.get_pnl()
total_pnl = sum(p['pnl_amount'] for p in pnl)
print(f"Total P&L: ₹{total_pnl:,.2f}")
# Get holding history
history = portfolio.get_holding_history()
print(f"Total Records: {len(history)}")
Custom Logging:
from StockETL.logger import get_logger
import logging
# Get a logger for your module
logger = get_logger(__name__)
# Use it
logger.info("Processing started")
logger.debug("Detailed debug information")
logger.warning("Warning message")
logger.error("Error occurred")
Purpose (SOURCE layer): Raw data ingestion point
Input: CSV, Excel, JSON files from trading platforms
Process: None (raw data)
Output: Original files stored for audit trail
Example Files:
- TradeHistory_2024.csv - Your trade transactions
- Symbol_Master.csv - Stock master data
- Dividend_2024.xlsx - Dividend records
Purpose (BRONZE layer): Data harmonization and standardization
Process:
- Column name normalization
- Data type conversion
- Basic validation
- Duplicate removal
Output: Standardized CSV files
Example Transformation:
Before (SOURCE):
"Stock Name", "Qty", "Price"
"RELIANCE", "100", "2500.50"
After (BRONZE):
"scrip_name", "quantity", "price"
"RELIANCE", 100, 2500.50
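The transformation above can be sketched with the standard library alone. This is a simplified illustration, not the project's Bronze processor: the `RENAME` mapping and the `harmonize` function are hypothetical, and the `CASTERS` table stands in for the `data_type` names used in the data contracts.

```python
import csv
import io

# Hypothetical mapping from source headers to contract column names
RENAME = {"Stock Name": "scrip_name", "Qty": "quantity", "Price": "price"}
# Plain-Python stand-ins for the contract's data_type values
CASTERS = {"string": str, "int64": int, "float64": float}
CONTRACT = [
    {"col_name": "scrip_name", "data_type": "string"},
    {"col_name": "quantity", "data_type": "int64"},
    {"col_name": "price", "data_type": "float64"},
]


def harmonize(raw_csv: str) -> list:
    """Rename columns and cast each value to the type named in the contract."""
    reader = csv.DictReader(io.StringIO(raw_csv), skipinitialspace=True)
    rows = []
    for raw in reader:
        renamed = {RENAME[key.strip()]: value.strip() for key, value in raw.items()}
        rows.append({
            col["col_name"]: CASTERS[col["data_type"]](renamed[col["col_name"]])
            for col in CONTRACT
        })
    return rows


source = '"Stock Name", "Qty", "Price"\n"RELIANCE", "100", "2500.50"\n'
print(harmonize(source))
```

A value that cannot be cast raises `ValueError`, which is one simple way to surface rows that violate the contract.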
Purpose (SILVER layer): Data cleansing and enrichment
Process:
- Data quality checks
- Missing value handling
- Reference data enrichment
- Business rule application
Output: Cleansed and enriched CSV files
Features:
- Stock price history
- Corporate actions (splits, bonuses)
- Symbol standardization
- Trade validation
Purpose (GOLD layer): Business-ready analytics
Process:
- Portfolio calculations
- P&L computation
- Aggregations
- Metrics calculation
Output: Analytics-ready CSV files
Reports Generated:
- Current holdings with unrealized P&L
- Closed positions with realized P&L
- Portfolio trends over time
- Dividend summary
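To make the holdings report concrete, here is a small sketch of the unrealized P&L arithmetic for one position. The field names are borrowed from the API examples in this README; the `unrealized_pnl` function is hypothetical, and reporting a percentage for open holdings is an assumption.

```python
def unrealized_pnl(quantity: int, buy_price: float, close_price: float) -> dict:
    """Value a holding at cost and at the latest close, then take the difference."""
    amount = round(quantity * buy_price, 2)          # cost of the position
    close_amount = round(quantity * close_price, 2)  # value at the latest close
    pnl_amount = round(close_amount - amount, 2)
    pnl_percentage = round(pnl_amount / amount * 100, 2) if amount else 0.0
    return {
        "quantity": quantity,
        "amount": amount,
        "close_amount": close_amount,
        "pnl_amount": pnl_amount,
        "pnl_percentage": pnl_percentage,
    }


# 50 shares bought at 2500.50 and last closed at 2600.00
row = unrealized_pnl(50, 2500.50, 2600.00)
print(row)
```

Brokerage and taxes are ignored here; a realized P&L figure would also subtract those charges.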
Purpose (API layer): Frontend-ready JSON endpoints
Process:
- User-specific data segregation
- JSON formatting
- API structure creation
Output: JSON files per user
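One way the per-user segregation could look is sketched below. It groups records by `username` and wraps each group in the `data` / `load_timestamp` envelope documented in this README. `build_user_payloads` is a hypothetical helper, the sample records are made up, and dropping `username` from each row is an assumption.

```python
import json
from collections import defaultdict
from datetime import datetime


def build_user_payloads(records, now=None):
    """Group records by username and wrap each group in the API envelope."""
    now = now or datetime.now()
    by_user = defaultdict(list)
    for record in records:
        row = dict(record)                       # copy so the input is not mutated
        by_user[row.pop("username")].append(row)
    stamp = now.strftime("%Y-%m-%d %H:%M:%S")
    return {
        user: {"data": rows, "load_timestamp": stamp}
        for user, rows in by_user.items()
    }


# Made-up sample records for two users
records = [
    {"username": "investor_01", "symbol": "RELIANCE", "quantity": 50},
    {"username": "investor_02", "symbol": "TCS", "quantity": 10},
]
payloads = build_user_payloads(records, now=datetime(2024, 12, 29, 10, 30, 45))
print(json.dumps(payloads["investor_01"], indent=2))
```

Each payload could then be written with `json.dump` to that user's folder under DATA/API/.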
API Endpoints:
DATA/API/username/
├── current_holding_data.json   # Current positions
├── holding_trands_data.json    # Portfolio trends
├── profit_loss_data.json       # P&L summary
└── dividend_data.json          # Dividend history
The portfolio manager uses FIFO (First In, First Out) for position matching:
# Example: Buy and Sell matching
portfolio.trade({
"side": "BUY",
"quantity": 100,
"price": 2500
})
portfolio.trade({
"side": "SELL",
"quantity": 50,
"price": 2600
})
# Automatically matches and calculates P&L
Accurate brokerage calculation including:
- Brokerage Charges: Exchange-specific rates
- STT/CTT: Securities Transaction Tax
- Stamp Duty: State stamp duty
- Exchange Charges: NSE/BSE transaction charges
- SEBI Charges: Regulatory charges
- GST: 18% on applicable charges
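A minimal sketch of how these components might be combined for one trade follows. The `total_charges` function is hypothetical and every input amount is an invented placeholder, since real rates vary by broker, segment, and exchange. Applying GST to brokerage, exchange, and SEBI charges only (not to STT or stamp duty) is an assumption about what "applicable charges" means here.

```python
def total_charges(brokerage, stt, stamp_duty, exchange_charges, sebi_charges,
                  gst_rate=0.18):
    """Sum the charge components and add GST on the assumed taxable subset."""
    # Assumption: GST applies to brokerage, exchange and SEBI charges only
    taxable = brokerage + exchange_charges + sebi_charges
    gst = round(taxable * gst_rate, 2)
    total = round(
        brokerage + stt + stamp_duty + exchange_charges + sebi_charges + gst, 2
    )
    return {"gst": gst, "total": total}


# Placeholder amounts in rupees, for illustration only
charges = total_charges(
    brokerage=20.0, stt=250.0, stamp_duty=37.5,
    exchange_charges=8.0, sebi_charges=2.0,
)
print(charges)
```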
- Equities (EQ): Delivery and Intraday
- Futures (FO): Stock and Index futures
- Options (FO): Call and Put options
- Commodities: MCX instruments (planned)
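To illustrate the FIFO position matching used by the portfolio manager, the sketch below matches each SELL against the oldest open BUY lots first. `FifoBook` is a hypothetical, simplified class (one symbol, long positions only, no charges) and is not the `Portfolio` implementation.

```python
from collections import deque


class FifoBook:
    """Hypothetical FIFO matcher for a single symbol, long positions only."""

    def __init__(self):
        self.open_lots = deque()  # each lot is [quantity, price], oldest first
        self.realized = []        # one entry per matched slice

    def trade(self, side, quantity, price):
        if side == "BUY":
            self.open_lots.append([quantity, price])
            return
        if side != "SELL":
            raise ValueError(f"Unknown side: {side}")
        remaining = quantity
        while remaining > 0:
            if not self.open_lots:
                raise ValueError("SELL quantity exceeds open position")
            lot = self.open_lots[0]            # oldest lot is consumed first
            matched = min(remaining, lot[0])
            self.realized.append({
                "quantity": matched,
                "open_price": lot[1],
                "close_price": price,
                "pnl_amount": round((price - lot[1]) * matched, 2),
            })
            lot[0] -= matched
            remaining -= matched
            if lot[0] == 0:
                self.open_lots.popleft()       # lot fully closed


book = FifoBook()
book.trade("BUY", 100, 2500)
book.trade("SELL", 50, 2600)
print(book.realized)         # matched slice of 50 shares
print(list(book.open_lots))  # 50 shares of the original lot stay open
```

A deque keeps removal of the oldest lot cheap, which matters once a position has been built from many small buys.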
The codebase follows industry best practices:
- PEP 8 Compliance: Python style guide
- Type Hints: Full type annotations
- Docstrings: Comprehensive documentation
- Error Handling: Proper exception handling
- Logging: Structured logging throughout
- Modularity: Clear separation of concerns
# Run all tests
pytest
# Run with coverage
pytest --cov=StockETL --cov-report=html
# Run specific tests
pytest tests/test_portfolio.py
# Format code with Black
black StockETL/
# Sort imports with isort
isort StockETL/
# Check with flake8
flake8 StockETL/
# Install development dependencies
pip install -e ".[dev,testing]"
# Setup pre-commit hooks
pre-commit install
# Run pre-commit checks
pre-commit run --all-files
Portfolio Tracker uses comprehensive structured logging:
- DEBUG: Detailed diagnostic information
- INFO: General informational messages
- WARNING: Warning messages for potential issues
- ERROR: Error messages for failures
- CRITICAL: Critical errors requiring immediate attention
Configure via environment variables or CLI:
# Set log level
export LOG_LEVEL=DEBUG
# Or use CLI
python -m StockETL --log-level DEBUG --log-file DATA/logs/debug.log
Logs are written to:
- Console: Real-time feedback
- File: Persistent logging with rotation
Example log entry:
2024-12-29 10:30:45 - StockETL.ETL_BRONZE.Symbol - INFO - Processing file: Symbol_Master.csv
2024-12-29 10:30:46 - StockETL.ETL_BRONZE.Symbol - DEBUG - Processed 1250 rows
2024-12-29 10:30:46 - StockETL.ETL_BRONZE.Symbol - INFO - Symbol processing complete
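A logger that produces entries in this shape, with console output and an optional rotating file, might be configured roughly as follows. `build_logger` is a hypothetical function and not the project's `get_logger`; the format string is inferred from the example entries, and the rotation size and backup count are assumed values.

```python
import logging
import sys
from logging.handlers import RotatingFileHandler

# Format inferred from the example log entries
LOG_FORMAT = "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def build_logger(name, level="INFO", log_file=None):
    """Return a logger with a console handler and an optional rotating file."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    logger.propagate = False          # avoid duplicate lines via the root logger
    if not logger.handlers:           # attach handlers only once per logger
        formatter = logging.Formatter(LOG_FORMAT, DATE_FORMAT)
        console = logging.StreamHandler(sys.stdout)
        console.setFormatter(formatter)
        logger.addHandler(console)
        if log_file:
            # Assumed rotation policy: about 1 MB per file, 3 backups
            rotating = RotatingFileHandler(
                log_file, maxBytes=1_000_000, backupCount=3
            )
            rotating.setFormatter(formatter)
            logger.addHandler(rotating)
    return logger


logger = build_logger("StockETL.ETL_BRONZE.Symbol", level="DEBUG")
logger.info("Processing file: Symbol_Master.csv")
```

Checking `logger.handlers` before adding new ones prevents repeated calls from duplicating every log line.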
All API endpoints return JSON in this format:
{
"data": [...],
"load_timestamp": "2024-12-29 10:30:45"
}
File: DATA/API/{username}/current_holding_data.json
{
"data": [
{
"symbol": "RELIANCE",
"scrip_name": "Reliance Industries",
"quantity": 50,
"price": 2500.50,
"amount": 125025.00,
"close_price": 2600.00,
"close_amount": 130000.00,
"pnl_amount": 4975.00,
"side": "BUY",
"datetime": "2024-01-15 10:30:00"
}
],
"load_timestamp": "2024-12-29 10:30:45"
}
File: DATA/API/{username}/profit_loss_data.json
{
"data": [
{
"symbol": "RELIANCE",
"quantity": 50,
"open_price": 2500.50,
"close_price": 2600.00,
"pnl_amount": 4975.00,
"pnl_percentage": 3.98,
"brokerage": 125.50,
"position": "LONG",
"open_datetime": "2024-01-15 10:30:00",
"close_datetime": "2024-01-20 14:00:00"
}
],
"load_timestamp": "2024-12-29 10:30:45"
}
File: DATA/API/{username}/holding_trands_data.json
{
"data": [
{
"date": "2024-01-15",
"open": 125025.00,
"high": 130000.00,
"low": 123000.00,
"close": 127500.00,
"holding": 127500.00
}
],
"load_timestamp": "2024-12-29 10:30:45"
}
We welcome contributions! Please see our CONTRIBUTING.md for details.
- Fork the repository
- Create a feature branch: git checkout -b feature/amazing-feature
- Make your changes
- Add tests if applicable
- Commit your changes: git commit -m 'feat: add amazing feature'
- Push to branch: git push origin feature/amazing-feature
- Open a Pull Request
- Follow PEP 8 style guide
- Add type hints to all functions
- Write comprehensive docstrings
- Include tests for new features
- Update documentation as needed
- Multi-layer ETL architecture
- Portfolio management
- P&L calculation
- Comprehensive logging
- CLI interface
- Unit test coverage > 80%
- Integration tests
- Real-time data ingestion
- WebSocket support
- Advanced analytics
- Performance optimizations
- Docker containerization
- RESTful API server with FastAPI
- Web dashboard (React/Vue)
- Mobile app
- Multi-exchange support (BSE, MCX)
- Cloud deployment (AWS/GCP)
- Machine learning predictions
- Risk analysis tools
- Alert notifications
- Social trading features
- Tax optimization suggestions
This project is licensed under the MIT License - see the LICENSE file for details.
MIT License
Copyright (c) 2023-2025 Pt. Prashant Tripathi
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
Pt. Prashant Tripathi
- Email: ptprashanttripathi@outlook.com
- GitHub: @PtPrashantTripathi
- LinkedIn: Pt. Prashant Tripathi
- Website: ptprashanttripathi.github.io
- Thanks to all contributors and the open-source community
- Inspired by modern data engineering practices and Medallion architecture
- Built with ❤️ for the Indian stock market trading community
- Special thanks to Databricks for the Medallion architecture pattern
- Documentation: Check QUICKSTART.md and this README
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: ptprashanttripathi@outlook.com
Join our community to:
- Share trading strategies
- Report bugs and request features
- Contribute to the project
- Learn from other traders
When reporting issues, please include:
- Clear description of the problem
- Steps to reproduce
- Expected vs actual behavior
- Python version and OS
- Relevant log files
- Screenshots if applicable
Version: 0.8.12
Status: Production Ready
Last Updated: December 2025
Made with ❤️ by Pt. Prashant Tripathi
⭐ Star this repo if you find it helpful!