This repo contains time series forecasting methods for water quality analysis

knowledge-computing/Aquasense

Aquasense

This repo contains multiple time series forecasting methods for water quality analysis.

Building on these baselines, we propose a spatially informed, clustering-based masking strategy for self-supervised pretraining that explicitly incorporates spatial relationships among water sensors. Specifically, the approach groups geographically nearby sensors into clusters, then masks and reconstructs entire clusters during pretraining. Because a masked sensor's nearest neighbors are hidden along with it, the model cannot rely on local interpolation and is pushed to learn long-range dependencies.
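The idea of masking whole spatial clusters can be illustrated with a minimal sketch. This is not the repository's implementation: the clustering rule (connected components of a distance-threshold graph), the zero-fill masking, and all function names and coordinates below are assumptions made for illustration. Only the 75 km threshold echoes the `distance_threshold_km` setting listed in this README.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat, lon):
    """Pairwise great-circle distances (km) between sensors given in degrees."""
    lat, lon = np.radians(lat), np.radians(lon)
    dlat = lat[:, None] - lat[None, :]
    dlon = lon[:, None] - lon[None, :]
    a = (np.sin(dlat / 2) ** 2
         + np.cos(lat)[:, None] * np.cos(lat)[None, :] * np.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))


def cluster_by_distance(dist_km, threshold_km):
    """Assumed clustering rule: connected components of the graph that
    links every pair of sensors closer than threshold_km."""
    n = dist_km.shape[0]
    labels = np.full(n, -1)
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            near = (dist_km[i] <= threshold_km) & (labels == -1)
            for j in np.flatnonzero(near):
                labels[j] = current
                stack.append(j)
        current += 1
    return labels


def mask_clusters(x, labels, mask_ratio, rng):
    """Hide whole clusters, in random order, until at least mask_ratio of
    the sensors are masked. x has shape (time, sensors)."""
    masked = np.zeros(labels.shape[0], dtype=bool)
    for c in rng.permutation(np.unique(labels)):
        if masked.mean() >= mask_ratio:
            break
        masked |= labels == c
    x_masked = x.copy()
    x_masked[:, masked] = 0.0  # zero-fill is an assumption; a mask token is also common
    return x_masked, masked


# Hypothetical coordinates: three sensors in one area, two in another.
lat = np.array([34.0, 34.1, 34.2, 37.0, 37.1])
lon = np.array([-118.0, -118.1, -118.0, -122.0, -122.1])

rng = np.random.default_rng(0)
labels = cluster_by_distance(haversine_km(lat, lon), threshold_km=75)
x = rng.normal(size=(48, 5))
x_masked, masked = mask_clusters(x, labels, mask_ratio=0.4, rng=rng)
```

With random masking, a hidden sensor usually has a visible neighbor a few kilometers away. Here every sensor within the threshold is hidden together, so reconstruction has to draw on more distant sensors.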

Quick Start

1. Validate Setup

module load python3
source activate multimae
export PYTHONPATH=$PWD:$PYTHONPATH

2. Run Experiments

Full Experiment (All Models)

python run_unified_experiments.py
  • Models: LSTM, STD-MAE Random, STD-MAE Distance

Quick Test (Reduced Epochs)

python run_unified_experiments.py --quick
  • Duration: ~30-60 minutes (10 pretrain + 5 downstream epochs)
  • Purpose: Test pipeline functionality

Specific Models Only

# Run only LSTM and Random masking
python run_unified_experiments.py --models lstm,random

# Run only Distance masking
python run_unified_experiments.py --models distance
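For scripted runs, the commands above can be assembled from Python. The sketch below is a hypothetical wrapper, not part of the repository; it uses only the `--models` and `--quick` flags and the model keys shown in this README, and it assumes (unverified) that the two flags can be combined.

```python
import subprocess
import sys

# Model keys as they appear in the example commands above.
KNOWN_MODELS = ("lstm", "random", "distance")


def build_command(models=None, quick=False):
    """Return the argument list for one run_unified_experiments.py call."""
    cmd = [sys.executable, "run_unified_experiments.py"]
    if models:
        unknown = [m for m in models if m not in KNOWN_MODELS]
        if unknown:
            raise ValueError(f"unknown model keys: {unknown}")
        cmd += ["--models", ",".join(models)]
    if quick:
        # Assumption: --quick may be passed together with --models.
        cmd.append("--quick")
    return cmd


def run_experiments(models=None, quick=False):
    """Launch the run from the repository root; raises if the script fails."""
    subprocess.run(build_command(models, quick), check=True)
```

Calling `run_experiments(["lstm", "random"])` would then be equivalent to the first command above, run with the current Python interpreter.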

What Gets Generated

Results

experiments/unified_comparison_[name]_[timestamp]/
├── hyperparameters.json          # All hyperparameters used
├── detailed_results.json         # Complete experimental results
├── comparison_summary.txt         # Human-readable summary
├── lstm_results/                  # LSTM outputs
│   ├── mae_matrix.csv
│   ├── r2_matrix.csv
│   ├── training_losses.csv
│   ├── heatmaps.png
│   └── forecast_vs_truth_grid.png
├── stdmae_random_results/         # Random masking outputs
│   └── [visualization files]
└── stdmae_distance_results/       # Distance masking outputs
    └── [visualization files]
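A small helper can locate the newest run directory and read its files back. The sketch below relies only on the directory and file names in the tree above; the helper names are hypothetical, and because the internal layout of the JSON and CSV files is not documented here, the contents are returned unparsed.

```python
import csv
import json
from pathlib import Path


def latest_run(root="experiments"):
    """Return the most recently modified unified_comparison_* directory,
    or None if no run exists yet."""
    runs = [p for p in Path(root).glob("unified_comparison_*") if p.is_dir()]
    return max(runs, key=lambda p: p.stat().st_mtime, default=None)


def load_run(run_dir):
    """Read the hyperparameters and the LSTM MAE matrix of one run.

    The CSV is returned as raw rows of strings because its header and
    index layout are not specified in this README."""
    run_dir = Path(run_dir)
    with open(run_dir / "hyperparameters.json") as f:
        hyperparameters = json.load(f)
    with open(run_dir / "lstm_results" / "mae_matrix.csv", newline="") as f:
        mae_rows = list(csv.reader(f))
    return hyperparameters, mae_rows
```

Typical use would be `hyperparameters, mae_rows = load_run(latest_run())` after a run has finished.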

Metrics Generated

  • LSTM: MAE, R², MSE (per-sensor and overall)
  • STD-MAE: Test MSE from downstream evaluation (we adapt STD-MAE from link)
  • Visualizations: Heatmaps, forecast comparisons, training curves
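For reference, the three LSTM metrics can be computed per sensor and overall as in the sketch below. It uses the standard definitions and is not taken from the repository; in particular, averaging per-sensor R² to get an overall value is an assumption, and the function name and sample numbers are made up for illustration.

```python
import numpy as np


def forecast_metrics(y_true, y_pred):
    """Compute MAE, MSE and R² for arrays of shape (samples, sensors)."""
    err = y_pred - y_true
    mae = np.abs(err).mean(axis=0)
    mse = (err ** 2).mean(axis=0)
    ss_res = (err ** 2).sum(axis=0)
    ss_tot = ((y_true - y_true.mean(axis=0)) ** 2).sum(axis=0)
    r2 = 1.0 - ss_res / ss_tot  # undefined for a sensor whose truth is constant
    return {
        "per_sensor": {"mae": mae, "mse": mse, "r2": r2},
        "overall": {
            "mae": float(np.abs(err).mean()),
            "mse": float((err ** 2).mean()),
            "r2": float(r2.mean()),  # assumption: mean of per-sensor R²
        },
    }


# Made-up values for two sensors over three time steps.
y_true = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
y_pred = np.array([[1.0, 12.0], [2.0, 18.0], [3.0, 33.0]])
metrics = forecast_metrics(y_true, y_pred)
```

In this example the first sensor is predicted exactly, so its MAE and MSE are 0 and its R² is 1, while the second sensor carries all of the error.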

pred_len = 24 # Prediction sequence length
hidden = 128 # Hidden layer size
distance_threshold_km = 75 # Distance masking threshold
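To make the role of `pred_len` concrete, the sketch below cuts a multi-sensor series into input/target pairs whose targets span `pred_len` steps. The input length `seq_len`, the function name, and the synthetic series are assumptions made for illustration; the README does not state how the repository builds its windows.

```python
import numpy as np

pred_len = 24  # prediction length from the settings above
seq_len = 96   # hypothetical input length, not given in this README


def make_windows(series, seq_len, pred_len):
    """Slide over a (time, sensors) array and return
    inputs of shape (n, seq_len, sensors) and
    targets of shape (n, pred_len, sensors)."""
    n = series.shape[0] - seq_len - pred_len + 1
    if n <= 0:
        raise ValueError("series is too short for the requested window lengths")
    inputs = np.stack([series[i:i + seq_len] for i in range(n)])
    targets = np.stack(
        [series[i + seq_len:i + seq_len + pred_len] for i in range(n)]
    )
    return inputs, targets


# Synthetic series: 200 time steps for 3 sensors.
series = np.arange(200 * 3, dtype=float).reshape(200, 3)
inputs, targets = make_windows(series, seq_len, pred_len)
```

Each target window starts at the step immediately after its input window ends, so a model trained on these pairs forecasts the next `pred_len` steps for every sensor.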
