GEO Benchmark

Framework for evaluating LLM performance on climate data prediction using global geographic meshes.

Overview

Creates geographic coordinate grids, queries LLMs for climate data, and compares results against ERA5 climatology.
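As a rough illustration of the first step, the sketch below builds a regular global latitude/longitude grid at a given resolution. It is not taken from geo_mesh_processor.py: the function name make_grid and the choice of cell-centre coordinates are assumptions made for this example, and land/ocean detection is left out.

```python
def make_grid(resolution_deg):
    """Return (lat, lon) cell-centre pairs for a regular global grid.

    Hypothetical helper: the real mesh generator may place points
    differently (for example on cell corners) and adds land/ocean flags.
    """
    n_lat = round(180 / resolution_deg)
    n_lon = round(360 / resolution_deg)
    # Require the resolution to divide the globe into whole cells.
    if resolution_deg <= 0 or abs(n_lat * resolution_deg - 180) > 1e-9:
        raise ValueError("resolution must be positive and divide 180 evenly")

    points = []
    for i in range(n_lat):
        lat = -90 + resolution_deg * (i + 0.5)
        for j in range(n_lon):
            lon = -180 + resolution_deg * (j + 0.5)
            points.append((lat, lon))
    return points


grid = make_grid(20)
print(len(grid), grid[0], grid[-1])
```

At 20 degrees this gives 9 latitude rows by 18 longitude columns, so the number of LLM queries grows quickly as the resolution is refined.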

Features

Mesh generation with land/ocean detection
Multi-provider support: OpenAI, Anthropic Claude, Google Gemini, Ollama
Batch processing with resume capability
Distributed processing for large meshes
ERA5 climatology comparison
YAML configuration
Spatial analysis with population and bathymetry data
Temperature visualization and statistical analysis
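The comparison against ERA5 comes down to error statistics between LLM answers and reference values at the same grid points. The sketch below shows two common ones, root-mean-square error and mean bias, on plain Python lists. The function names are illustrative only, and the repository's analysis scripts may compute these differently (for example with area weighting).

```python
import math


def rmse(predicted, reference):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


def mean_bias(predicted, reference):
    """Mean signed error; positive means the predictions run warm."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(p - r for p, r in zip(predicted, reference)) / len(predicted)


# Made-up temperatures in degrees Celsius, purely for illustration.
llm_values = [15.0, 22.5, -3.0]
era5_values = [14.0, 24.0, -2.5]
print(rmse(llm_values, era5_values), mean_bias(llm_values, era5_values))
```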

Quick Start

Edit config.yaml to specify mesh file, model provider, and parameters, then run python climate_llm_benchmark.py.

Standard processing:

# Generate mesh
python geo_mesh_processor.py 20
# Edit config.yaml: set mesh_file, provider, model

# Run benchmark
python climate_llm_benchmark.py

# Complete analysis
python run_complete_analysis_pipeline.py results/climate_results_20.0deg_r10_simple.json

Distributed processing:

# Generate and split mesh
python geo_mesh_processor.py 1
python split_mesh.py meshes/mesh_data_1.0deg.json 20

# Configure chunk mode in config.yaml
# Run parallel chunks
for i in {1..20}; do python climate_llm_benchmark.py "$i" & done; wait

# Combine results
python combine_results.py "results/climate_results_*_chunk_*_simple.json" results/combined.json
python run_complete_analysis_pipeline.py results/combined.json
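The split-and-combine idea behind this workflow can be sketched in a few lines. This is an assumption-laden illustration, not the logic of split_mesh.py or combine_results.py: the helper names are hypothetical, and the mesh is treated as a plain list of points rather than the project's JSON structure.

```python
def split_into_chunks(points, n_chunks):
    """Split a list into n_chunks contiguous parts of near-equal size."""
    if n_chunks <= 0:
        raise ValueError("n_chunks must be positive")
    base, extra = divmod(len(points), n_chunks)
    chunks, start = [], 0
    for k in range(n_chunks):
        # The first `extra` chunks take one additional point each.
        size = base + (1 if k < extra else 0)
        chunks.append(points[start:start + size])
        start += size
    return chunks


def combine_chunks(chunks):
    """Concatenate per-chunk results back into a single list, in order."""
    return [item for chunk in chunks for item in chunk]


points = list(range(10))            # stand-in for mesh points
chunks = split_into_chunks(points, 3)
print([len(c) for c in chunks])     # chunk sizes
print(combine_chunks(chunks) == points)
```

Contiguous chunks keep each worker's output independent, so the chunks can run in parallel and be concatenated afterwards without reordering.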

Documentation

File Descriptions (file_descriptions.md) - Script and data structure overview
Usage Guide (USAGE.md) - Examples and workflows
Examples (EXAMPLES.md) - Common use cases and commands
Distributed Processing (DISTRIBUTED_PROCESSING.md) - Parallel processing guide

Requirements

Python 3.8+
API keys for chosen provider (OpenAI, Anthropic, Google) or Ollama server for local models
ERA5 climatology data (NetCDF format)
Natural Earth 10m Land shapefile in data/land/

pip install -r requirements.txt

# Optional provider dependencies
pip install langchain-anthropic      # For Claude
pip install langchain-google-genai   # For Gemini
pip install langchain-community      # For Ollama
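Before starting a long run, it can help to confirm that the API key for the chosen provider is available. The sketch below assumes the keys are supplied through the environment variables that the LangChain integrations conventionally read (OPENAI_API_KEY, ANTHROPIC_API_KEY, GOOGLE_API_KEY). Both the provider identifiers and the helper missing_key are hypothetical, and the benchmark may load credentials another way.

```python
import os

# Assumed mapping from provider name to the environment variable that
# holds its API key. Ollama runs locally and needs no key.
REQUIRED_ENV = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "google": "GOOGLE_API_KEY",
    "ollama": None,
}


def missing_key(provider, env=None):
    """Return the name of the unset key variable, or None if all is well."""
    env = os.environ if env is None else env
    var = REQUIRED_ENV[provider.lower()]
    if var is None or env.get(var):
        return None
    return var


problem = missing_key("anthropic")
if problem:
    print(f"Set {problem} before running the benchmark")
```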

Citation

@software{geo_benchmark,
  title={GEO Benchmark: LLM Climate Data Evaluation Framework},
  year={2024},
  url={https://github.com/CliDyn/geo_benchmark}
}
