OptiPick analyzes Amazon product reviews to provide comprehensive sentiment analysis, feature insights, and data-driven purchase recommendations.

amanop29/Optipick

OptiPick - Amazon Product Review Analyzer 🛒

OptiPick is an advanced analytics tool that helps users make informed purchasing decisions by analyzing Amazon product reviews. It combines web scraping, sentiment analysis, and machine learning to provide comprehensive insights into product reviews, feature sentiment, and buying trends. The tool offers multi-level sentiment classification, aspect-based analysis, and AI-powered summaries to give users a deep understanding of product feedback and customer experiences.

🌟 Features

Product Analysis

  • Smart URL Processing: Handles both full Amazon URLs and short links (amzn.in)
  • Product Details Extraction: Gets product title, price, ratings, images, descriptions, and more
  • Review Scraping: Fetches up to 200 reviews per product with metadata (date, country, verification status)

Sentiment Analysis

  • NLTK VADER Sentiment: Lexicon-based compound sentiment score for each review (-1 to +1)
  • Multi-level Classification: Positive, Negative, and Neutral categorization
  • NPS Calculation: Net Promoter Score derived from 5-star ratings
  • Feature Keywords: TF-IDF based extraction of most discussed product features
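
The NPS bullet above can be illustrated as follows. This is a minimal sketch that assumes a common mapping from star ratings to NPS groups (5 stars = promoter, 4 = passive, 1-3 = detractor); the mapping OptiPick actually uses is not stated here, and `nps_from_stars` is a hypothetical helper name.

```python
from typing import Iterable


def nps_from_stars(ratings: Iterable[int]) -> float:
    """Return a Net Promoter Score in [-100, 100] from 1-5 star ratings.

    Assumed mapping (not confirmed by the project):
    5 = promoter, 4 = passive, 1-3 = detractor.
    """
    ratings = list(ratings)
    if not ratings:
        return 0.0  # no reviews: report a neutral score
    promoters = sum(1 for r in ratings if r >= 5)
    detractors = sum(1 for r in ratings if r <= 3)
    # NPS = % promoters minus % detractors
    return 100.0 * (promoters - detractors) / len(ratings)
```

With this mapping, four reviews rated 5, 5, 4, and 1 give two promoters and one detractor, so the score is 100 * (2 - 1) / 4 = 25.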

Advanced Analytics

  • Aspect-Based Sentiment: Identifies specific product aspects and their sentiment
  • Word Cloud Generation: Visual representation of review content
  • Review Complexity Analysis: Sentence structure, subjectivity, and writing patterns
  • Monthly Trend Analysis: Sentiment trends over time
  • GPT-Powered Insights: Optional AI analysis using OpenAI API

User Interface

  • Modern Streamlit Interface: Clean, responsive design with dark theme
  • Tabbed Navigation: Dashboard, Reviews, Compare, AI Summary, and Advanced NLP
  • Product Comparison: Side-by-side analysis of two products
  • Category Search: Browse and analyze product categories
  • Interactive Charts: Trend visualization and sentiment distribution

🚀 Quick Start

Prerequisites

  • Python 3.8+
  • Apify account with API token
  • OpenAI API key (optional, for AI summaries)

Installation

  1. Clone the repository

    git clone <repository-url>
    cd optipick
  2. Install dependencies

    pip install -r requirements.txt
  3. Set up environment variables

    Create a .streamlit/secrets.toml file:

    APIFY_TOKEN = "your_apify_token_here"
    OPENAI_API_KEY = "your_openai_key_here"  # Optional

    Or set environment variables:

    export APIFY_TOKEN="your_apify_token_here"
    export OPENAI_API_KEY="your_openai_key_here"
  4. Run the application

    streamlit run app.py
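
Step 3 allows credentials to live either in a secrets file or in environment variables. A minimal sketch of a lookup that checks both, assuming a hypothetical helper named `get_secret`; how app.py actually reads its credentials is not shown in this README.

```python
import os
from typing import Mapping, Optional


def get_secret(name: str, secrets: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Look up a credential in `secrets` first, then in the environment.

    `secrets` can be any mapping; returns None when the name is not set
    in either place.
    """
    if secrets is not None and name in secrets:
        return secrets[name]
    return os.environ.get(name)
```

In a Streamlit app, passing `st.secrets` as the mapping is one option. Because OPENAI_API_KEY is optional, a caller would likely treat a `None` result for it as "AI summaries disabled" rather than as an error.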

🔧 Configuration

Apify Actors Used

  • Product Details: XVDTQc4a7MDTqSTMJ - Extracts product information
  • Reviews: R8WeJwLuzLZ6g4Bkk - Scrapes customer reviews
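
Calling one of these actors with the apify-client package might look like the sketch below. `run_actor` is a hypothetical helper, not a function from this repository, and the fields each actor expects in `run_input` depend on its input schema, which this README does not document.

```python
from typing import Any, Dict, List


def run_actor(client: Any, actor_id: str, run_input: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Run an Apify actor and return the items of its default dataset.

    `client` is expected to be an `apify_client.ApifyClient` instance.
    """
    run = client.actor(actor_id).call(run_input=run_input)
    if not run:
        return []  # the call returned no run information
    dataset = client.dataset(run["defaultDatasetId"])
    return list(dataset.iterate_items())


# Usage (requires `pip install apify-client` and a valid token):
# from apify_client import ApifyClient
# client = ApifyClient("your_apify_token_here")
# run_input = {...}  # fill in according to the actor's input schema
# reviews = run_actor(client, "R8WeJwLuzLZ6g4Bkk", run_input)
```

Taking the client as a parameter keeps the helper easy to exercise with a stub object, without network access or a token.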

Supported URLs

  • Full Amazon URLs: https://www.amazon.com/dp/ASIN or https://www.amazon.in/dp/ASIN
  • Short URLs: https://amzn.in/...
  • Category/Search URLs: https://www.amazon.com/s?k=keyword
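
The three URL shapes above can be told apart with the standard library. This is a sketch under stated assumptions: `classify_url` and `extract_asin` are hypothetical names, only the /dp/ASIN path form listed above is recognized, and short links are identified by host alone without being resolved.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Amazon ASINs are 10 alphanumeric characters; this sketch only looks for
# the /dp/<ASIN> form shown in the list above.
_ASIN_RE = re.compile(r"/dp/([A-Z0-9]{10})(?:[/?]|$)")


def classify_url(url: str) -> str:
    """Return 'short', 'search', 'product', or 'unknown' for a URL."""
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    if host == "amzn.in":
        return "short"
    if "amazon." in host:
        if parsed.path.rstrip("/") == "/s":
            return "search"
        if _ASIN_RE.search(parsed.path):
            return "product"
    return "unknown"


def extract_asin(url: str) -> Optional[str]:
    """Return the ASIN from a /dp/<ASIN> product URL, or None."""
    match = _ASIN_RE.search(urlparse(url).path)
    return match.group(1) if match else None
```

A short link would still need to be expanded (for example by following its redirect) before an ASIN can be extracted from it.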

πŸ“ Project Structure

optipick/
├── app.py               # Main Streamlit application
├── nlp_utils.py         # Advanced NLP processing functions
├── components.py        # UI components and utilities
├── scraper.py           # Web scraping utilities
├── analyzer.py          # Data analysis functions
├── summarizer.py        # Text summarization utilities
├── utils.py             # General utility functions
├── requirements.txt     # Python dependencies
└── README.md            # This file

🎯 Usage Guide

1. Single Product Analysis

  1. Paste an Amazon product URL in the sidebar
  2. Adjust max reviews (20-200)
  3. Click "Fetch"
  4. Explore the different tabs for insights

2. Product Comparison

  1. Analyze first product (Product A)
  2. Add second product URL in "Compare" section
  3. Click "Fetch B"
  4. View side-by-side comparison in Compare tab

3. Category Search

  1. Go to "Category Search" tab in sidebar
  2. Enter Amazon search/category URL
  3. Set max products to analyze
  4. Browse results in the displayed table

📊 Analytics Features

Dashboard Tab

  • Product Header: Image, title, brand, pricing, ratings
  • Sentiment Metrics: Positive, negative, neutral counts with NPS
  • Monthly Trends: Time-series chart of sentiment over time
  • Review Highlights: Best, worst, and most informative reviews
  • Feature Keywords: Most discussed product aspects
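
The Monthly Trends chart needs sentiment grouped by calendar month. One way to compute that series is sketched below, assuming each review has already been reduced to a (date, compound score) pair; `monthly_mean_sentiment` is a hypothetical name, and the app may aggregate differently (for example with pandas).

```python
from collections import defaultdict
from datetime import date
from statistics import mean
from typing import Dict, Iterable, List, Tuple


def monthly_mean_sentiment(rows: Iterable[Tuple[date, float]]) -> Dict[str, float]:
    """Average the sentiment score per 'YYYY-MM' month, in chronological order."""
    buckets: Dict[str, List[float]] = defaultdict(list)
    for day, score in rows:
        buckets[day.strftime("%Y-%m")].append(score)
    # 'YYYY-MM' keys sort chronologically when sorted as strings.
    return {month: mean(scores) for month, scores in sorted(buckets.items())}
```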

Reviews Tab

  • Complete Review Table: All scraped reviews with ratings, sentiment scores
  • Sortable Columns: Date, country, verification status, sentiment
  • Detailed Metadata: Review URLs, user verification status

Advanced NLP Tab

  • Aspect Analysis: Product features with associated sentiment
  • Word Cloud: Visual representation of review content
  • Complexity Stats: Writing patterns and review characteristics
  • Detailed Sentiment: 5-level sentiment classification
  • Key Phrases: Important terms and their frequency
  • GPT Analysis: AI-powered insights (requires OpenAI API)

πŸ” Technical Details

Sentiment Analysis Pipeline

  1. Text Preprocessing: URL removal, normalization, cleaning
  2. VADER Scoring: Compound sentiment scores (-1 to +1)
  3. Classification: Threshold-based positive/negative/neutral labeling
  4. Feature Extraction: TF-IDF based keyword identification
  5. Aspect Mining: Entity and sentiment association

Data Processing

  • Date Parsing: Multiple format support for review dates
  • Price Handling: Currency normalization and formatting
  • Review Filtering: Removes empty or invalid reviews
  • Deduplication: Prevents duplicate review entries
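
Date parsing, filtering, and deduplication from the list above could be implemented roughly as follows. The date formats and the `"text"` field name are assumptions chosen for illustration; the formats and keys in the scraped data may differ.

```python
from datetime import date, datetime
from typing import Dict, Iterable, List, Optional

# Candidate formats are assumptions; extend them to match the scraped data.
_DATE_FORMATS = ("%Y-%m-%d", "%d %B %Y", "%B %d, %Y")


def parse_review_date(raw: str) -> Optional[date]:
    """Try each known format in turn; return None if none matches."""
    for fmt in _DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date()
        except ValueError:
            continue
    return None


def clean_reviews(reviews: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Drop reviews with empty text and repeats of the same normalized text."""
    seen = set()
    kept = []
    for review in reviews:
        text = (review.get("text") or "").strip()
        key = " ".join(text.lower().split())  # case- and whitespace-insensitive
        if not key or key in seen:
            continue
        seen.add(key)
        kept.append(review)
    return kept
```

Deduplicating on normalized text rather than on a review ID is a design choice for this sketch; if the scraper returns stable review IDs, keying on those would be stricter.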

Performance Optimizations

  • Streamlit Caching: Cached sentiment analyzer and computations
  • Chunked Processing: Handles large review datasets efficiently
  • Error Handling: Graceful fallbacks for API failures
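
Chunked processing can be as simple as a generator that yields fixed-size batches. The sketch below uses only the standard library; `chunked` is a hypothetical helper and the batch size the app uses is not stated here.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most `size` items from `items`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk
```

For the Streamlit caching bullet, the usual approach is to decorate the function that builds the sentiment analyzer with `st.cache_resource` so it is constructed once per server process.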

πŸ›‘οΈ Error Handling

The application includes robust error handling for:

  • Invalid or inaccessible URLs
  • Apify API failures and rate limits
  • Missing or malformed data
  • Network connectivity issues
  • OpenAI API errors (graceful degradation)

📋 Requirements

Core Dependencies

streamlit>=1.28.0
pandas>=1.5.0
numpy>=1.21.0
nltk>=3.8
scikit-learn>=1.1.0
apify-client>=1.4.0
requests>=2.28.0
altair>=4.2.0
wordcloud>=1.9.0
textblob>=0.17.0

Optional Dependencies

openai>=1.0.0  # For AI summaries
matplotlib>=3.6.0  # For additional visualizations

🚦 API Limits

Apify Limits

  • Review scraping: Up to 200 reviews per product
  • Category scraping: Up to 200 products per search
  • Rate limits apply based on your Apify subscription

OpenAI Limits

  • Used only for AI summaries (optional feature)
  • Minimal token usage per analysis
  • Graceful fallback to rule-based summaries
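
The fallback behavior can be pictured as a wrapper that tries the AI summarizer and drops back to a rule-based summary on any failure. Both functions below are hypothetical illustrations; in particular, the rule used here (quote the longest reviews) is a stand-in, not the logic in summarizer.py.

```python
from typing import Callable, List, Optional


def rule_based_summary(reviews: List[str], max_items: int = 2) -> str:
    """Stand-in fallback: quote the longest reviews, truncated to 120 chars."""
    if not reviews:
        return "No reviews available."
    longest = sorted(reviews, key=len, reverse=True)[:max_items]
    return " | ".join(r[:120] for r in longest)


def summarize(
    reviews: List[str],
    ai_summarizer: Optional[Callable[[List[str]], str]] = None,
) -> str:
    """Use `ai_summarizer` when provided; fall back if it is missing or fails."""
    if ai_summarizer is not None:
        try:
            return ai_summarizer(reviews)
        except Exception:
            pass  # API error, rate limit, timeout: fall through to the rules
    return rule_based_summary(reviews)
```

Passing the summarizer in as a callable means the app still works without an OpenAI key: the caller simply leaves `ai_summarizer` unset.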

🔧 Customization

Adding New Sentiment Models

Extend the nlp_utils.py file to add custom sentiment analysis models.

Custom Scrapers

Modify scraper.py to add support for additional e-commerce platforms.

UI Themes

Customize the CSS in app.py for different visual themes.

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

πŸ“ License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ™ Acknowledgments

  • Streamlit for the amazing web framework
  • NLTK for natural language processing tools
  • Apify for reliable web scraping infrastructure
  • OpenAI for advanced AI capabilities
  • scikit-learn for machine learning utilities

📞 Support

For support, questions, or feature requests:

  1. Open an issue on GitHub
  2. Check existing documentation
  3. Review the troubleshooting section

🔄 Version History

v1.0.0 (Current)

  • Initial release with full functionality
  • Support for Amazon product analysis
  • Advanced NLP features
  • Product comparison capabilities
  • Modern UI with dark theme

Built with ❤️ using Python and Streamlit
