[Q2-P2] Document SDK performance characteristics and best practices #26

@karlwaldman

Description

## Problem

Users don't know:

  • How long queries should take
  • When to use custom timeouts
  • What's "normal" performance

## Solution

Add comprehensive performance documentation:

# Performance Guide

## Expected Response Times

### Historical Queries

| Date Range | Expected Time | Recommended Timeout |
|------------|---------------|---------------------|
| 1 day      | 1-2 seconds   | 30 seconds          |
| 1 week     | 5-10 seconds  | 30 seconds          |
| 1 month    | 15-25 seconds | 60 seconds          |
| 1 year     | 60-90 seconds | 120 seconds         |
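
The table above can be turned into a small helper that picks a timeout from the span of the query. This is a hypothetical convenience function sketched from the table, not part of the SDK:

```python
from datetime import date

def recommended_timeout(start: date, end: date) -> int:
    """Map a query's date span to the recommended timeout in seconds.

    Hypothetical helper derived from the table above; not an SDK API.
    """
    days = (end - start).days
    if days <= 7:
        return 30
    if days <= 31:
        return 60
    if days <= 366:
        return 120
    # Beyond one year: use a larger custom timeout (see "Custom Timeouts")
    return 180

# Example: a one-month query gets a 60 second timeout
recommended_timeout(date(2024, 1, 1), date(2024, 2, 1))  # → 60
```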

### Custom Timeouts

For queries longer than 1 year:

```python
# 5 years of data - use a 3 minute timeout
historical = client.historical.get(
    commodity='WTI_USD',
    start_date='2020-01-01',
    end_date='2024-12-31',
    timeout=180
)
```

## Best Practices

### 1. Use Appropriate Date Ranges

**Good**: Request only the data you need

```python
from datetime import datetime, timedelta

# Get the last week of data
historical = client.historical.get(
    commodity='WTI_USD',
    start_date=(datetime.now() - timedelta(days=7)).date(),
    end_date=datetime.now().date()
)
```

**Bad**: Request a full year when you only need a week

```python
# DON'T DO THIS - 10x slower than necessary
historical = client.historical.get(
    commodity='WTI_USD',
    start_date='2024-01-01',
    end_date='2024-12-31'
)
# ...then filter client-side: [p for p in historical.data if ...]
```

### 2. Use Pagination for Large Datasets

```python
# Memory-efficient iteration: fetch and process one page at a time
for page in client.historical.iter_pages(
    commodity='WTI_USD',
    start_date='2024-01-01',
    per_page=1000
):
    process_batch(page)  # Process 1000 records at a time
```

### 3. Choose Appropriate Intervals

```python
# Daily interval for a year (365 records)
historical = client.historical.get(..., interval='daily')  # Fast

# Raw interval for a year (1M+ records)
historical = client.historical.get(..., interval='raw')    # Very slow!
```

## Troubleshooting

### Timeout Errors

If you're getting timeout errors:

  1. Check your date range - is it larger than expected?
  2. Increase timeout for large queries
  3. Use pagination instead of getting all data at once
  4. Consider using a coarser interval (daily instead of raw)
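
Steps 1-2 can be sketched as a small retry wrapper that reissues the same query with escalating timeouts. The exception type the SDK raises on timeout is an assumption here; substitute the real one when writing this against the SDK:

```python
def get_with_retry(fetch, timeouts=(30, 60, 120)):
    """Call fetch(timeout=...) with escalating timeouts until one succeeds.

    `fetch` is any callable taking a `timeout` keyword; `TimeoutError`
    is a stand-in for whatever exception the SDK raises on timeout.
    """
    last_error = None
    for timeout in timeouts:
        try:
            return fetch(timeout=timeout)
        except TimeoutError as exc:  # assumed timeout exception class
            last_error = exc
    raise last_error

# Usage (hypothetical):
# historical = get_with_retry(
#     lambda timeout: client.historical.get(
#         commodity='WTI_USD',
#         start_date='2024-01-01',
#         end_date='2024-12-31',
#         timeout=timeout,
#     )
# )
```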

## Acceptance Criteria

- [ ] Performance guide added to docs
- [ ] Best practices documented
- [ ] Troubleshooting guide added
- [ ] Examples updated with performance notes
- [ ] README links to performance guide

## Estimated Effort

**Time**: 3 hours
