This project showcases SQL-based Exploratory Data Analysis (EDA) on a structured retail data warehouse.
It includes scripts for database creation, metadata exploration, dimensional profiling, date analysis, metric computation, magnitude breakdowns, and ranking using advanced SQL functions.
Creates the DataWarehouseAnalytics database, defines the gold schema, and loads dimension & fact tables via BULK INSERT.
Tables:
gold.dim_customersgold.dim_productsgold.fact_sales
:contentReference[oaicite:0]{index=0}
Explores metadata using INFORMATION_SCHEMA.
- Lists all tables
- Inspects columns and data types
Useful for schema validation before EDA.
:contentReference[oaicite:1]{index=1}
Profiles dimension tables (dim_customers, dim_products).
Uses DISTINCT + ORDER BY to understand:
- Countries
- Categories & subcategories
- Product diversity
:contentReference[oaicite:2]{index=2}
Analyzes temporal boundaries using MIN(), MAX(), DATEDIFF().
Includes:
- First/last order date
- Youngest/oldest customers
:contentReference[oaicite:3]{index=3}
Computes key business metrics:
- Total sales, quantity, orders
- Average price
- Distinct customers
- Combined metric report via
UNION ALL
:contentReference[oaicite:4]{index=4}
Breakdown of data magnitude using GROUP BY:
- Customers by country/gender
- Products by category
- Average cost per category
- Revenue by category & customer
- Item distribution across regions
:contentReference[oaicite:5]{index=5}
Ranks products and customers using:
TOPRANK()ROW_NUMBER()
Includes:- Top & bottom performing products
- Top revenue-generating customers
- Customers with fewest orders
:contentReference[oaicite:6]{index=6}
- Fully SQL-based, no external tools required
- Clear logical progression from schema → dimensions → dates → metrics → magnitude → ranking
- Uses advanced SQL: window functions, grouping, distinct profiling, metadata queries
- Excellent template for analytics portfolios and data engineering foundations
- Run
00_init_database.sqlto create and populate the warehouse. - Execute each exploration script in order:
- Use any SQL client supporting T-SQL (Azure Data Studio, SSMS).
- Data warehousing fundamentals
- SQL EDA methodology
- Window functions (RANK, ROW_NUMBER)
- Aggregations & grouping
- Dimensional modeling analysis
- Schema introspection
- Retail analytics logic
- Built as a personal analytics project inspired by coursework and real-world data warehouse approaches.
- Suitable for demonstrating SQL proficiency in analytics, BI, and data engineering.
D.G.A. DINETH HIRUSHA