Spatial-Relational Aggregation Toolkit
Series of tools to operationalize a semantic mapping from a developer-centered civic-data ontology to its representation in a database.
Which is a bunch of fancy words for: it makes a system that can answer statistical questions about a place using a small, standardized, and more-human-friendly vocabulary.
- It defines a hierarchy of geographies. This includes administrative regions or any set of similar features (e.g. parcels, 311 request points),
- Defines where and how to retrieve the set of regions (and their properties) within each feature and represent them in a standard fashion, and
- Defines how regions are described by the source data. Allows for arbitrary queries for properties of regions. (e.g. median age of a neighborhood, price of a parcel, avg parcel price in a town),
- which allows it to provide a standardized way for making queries using spatial aggregation without the need to dig through the underlying data.
SpaceRAT was designed with open data portals in mind (and particularly regional ones). These portals share data about a single place or region but from various organizations, departments, or vendors who each may have their own data publishing requirements.
SpaceRAT allows such portals to give their publishers flexibility while making it easy to unify their data.
- Create
spaceratdatabase- default is in-memory SQLite, can use any SQL database to store model.
- can be on the source data postgres database within its own schema (default
spacerat). This is necessary for full vector tile support.
- Load/update model
- Create/update materialized views for geographic indices using
Geography.query.
SpaceRAT provides a CLI tool to run useful maintenance scripts.
Generate set of basic Questions from a table in the source database to be manually tweaked if necessary.
This will generate one Question per column of the table and dump yaml representations of them in the model
directory.
$ spacerat generate-questions source_id geog_levelUsing the source property-assessments
$ spacerat generate-questions property-assessments parcelLoad model data from file into database. Will overwrite existing data on collision.
$ spacerat update-model [model_dir]Load models from default directory (./models)
$ spacerat update-modelLoad models from custom directory
$ spacerat update-model /path/to/dir/Build/rebuild all geographic indices
$ spacerat update-modelBuild/rebuild for neighborhood and municipality geographic levels
$ spacerat build-geo-indices neighborhood municipalityCreate/update materialized views for geographic indices using Geography.query
$ spacerat build-geo-indices [geog_level ...]Build/rebuild all geographic indices
$ spacerat build-geo-indicesBuild/rebuild for neighborhood and municipality geographic levels
$ spacerat build-geo-indices neighborhood municipalityCreate or update materialized views used for indicator maps. These can then be served as vector tiles for mapping applications.
This will create/update materialized a materialized view for each geog level provided with data from the provided source.
Questions can be specified by passing comma-separated lists of IDs to --include or --exclude. If no specifications
are made, all questions for the source will be used.
$ spacerat populate-maps source_id geog_level ... [--include=question_id ... --exclude=question_id ...]Generate a map of neighborhoods with results for all Questions for property-assessments.
$ spacerat populate-maps property-assessments neighborhoodGenerate maps for neighborhoods and municipalities with only values for fairmarkettotal (fair market assessed value
for land + building)
Note that --exlcude arguments will override any --include arguments.
$ spacerat populate-maps property-assessments --include fairmarkettotal classdesc --exclude classdescInitialize a SpaceRAT configration.
This will...
- set up the SpaceRAT database,
- load your model from files, and
- make any necessary modifications on the source database. (e.g. creating a
spaceratschema, creating geographic indices)
$ spacerat init [--skip-model --skip-geo-indices] Query answers to Questions about Regions.
List available Questions
Get details about a Question
List available geographic levels
Get details on a geographic level including a list of regions.
Get details on a specific region within a geographic level.
Use SpaceRAT as a library in your own code.
e.g.
from spacerat.core import SpaceRAT
from spacerat.helpers import print_records
# initializing without args will use an in-memory sql db to hold the model,and will load the model
# from files found in ./model relative to the current working directory (not necessarily this file)
rat = SpaceRAT()
question = rat.get_question("fair-market-assessed-value")
neighborhoods = rat.get_geog("neighborhood")
shadyside = neighborhoods.get_region("shadyside")
bloomfield = neighborhoods.get_region("bloomfield")
print(question)
# Question(id='fair-market-assessed-value', name='Fair Market Assessed Value', datatype='continuous')
# Get stats on fair market assessment value for Shadyside at current time.
print_records(rat.answer_question(question, shadyside))
# 2024-06-01T00:00:00
# - mean: 516796.628695209
# - mode: 250000.0
# - min: 0.0
# - first_quartile: 147300.0
# - median: 260000.0
# - third_quartile: 426775.0
# - max: 126459400.0
# - stddev: 2985002.463899841
# - sum: 2027909971.0
# - n: 3924
# Same thing but for Bloomfield
print_records(rat.answer_question(question, bloomfield))
# 2024-06-01T00:00:00
# - mean: 228085.7476367803
# - mode: 150000.0
# - min: 0.0
# - first_quartile: 66950.0
# - median: 102700.0
# - third_quartile: 171100.0
# - max: 74945100.0
# - stddev: 1596977.2907357663
# - sum: 796247345.0
# - n: 3491
# Shadyside again but only Residential parcels
print_records(rat.answer_question(question, shadyside, variant="residential"))
# 2024-06-01T00:00:00
# - mean: 300894.54238310707
# - mode: 200000.0
# - min: 0.0
# - first_quartile: 146000.0
# - median: 245000.0
# - third_quartile: 383350.0
# - max: 2500000.0
# - stddev: 238512.67504200496
# - sum: 997465408.0
# - n: 3315
# All 90 Pittsburgh neighborhoods, two parcel questions
print_records(
rat.answer_question(
["parcel-class", "fair-market-assessed-value"], "neighborhood"
)
)
# neighborhood.allegheny center
# - time: 2024-07-01 00:00:00
# - parcel_class__mode: COMMERCIAL
# - parcel_class__n: 63
# - fair_market_assessed_value__mean: 3949404.761904762
# - fair_market_assessed_value__mode: 0.0
# - fair_market_assessed_value__min: 0.0
# - fair_market_assessed_value__first_quartile: 33450.0
# - fair_market_assessed_value__median: 548800.0
# - fair_market_assessed_value__third_quartile: 5588800.0
# - fair_market_assessed_value__max: 22108700.0
# - fair_market_assessed_value__stddev: 5999592.32838388
# - fair_market_assessed_value__sum: 248812500.0
# - fair_market_assessed_value__n: 63
# neighborhood.allegheny west
# - time: 2024-07-01 00:00:00
# - parcel_class__mode: RESIDENTIAL
# - parcel_class__n: 269
# - fair_market_assessed_value__mean: 546734.9442379182
# - fair_market_assessed_value__mode: 54000.0
# - fair_market_assessed_value__min: 0.0
# - fair_market_assessed_value__first_quartile: 117300.0
# - fair_market_assessed_value__median: 203400.0
# - fair_market_assessed_value__third_quartile: 324400.0
# - fair_market_assessed_value__max: 28230300.0
# - fair_market_assessed_value__stddev: 2126818.841172308
# - fair_market_assessed_value__sum: 147071700.0
# - fair_market_assessed_value__n: 269
# neighborhood.allentown
# - time: 2024-07-01 00:00:00
# - parcel_class__mode: RESIDENTIAL
# - parcel_class__n: 1547
# - fair_market_assessed_value__mean: 42152.47899159664
# - fair_market_assessed_value__mode: 400.0
# - fair_market_assessed_value__min: 0.0
# - fair_market_assessed_value__first_quartile: 3300.0
# - fair_market_assessed_value__median: 20000.0
# - fair_market_assessed_value__third_quartile: 32600.0
# - fair_market_assessed_value__max: 10122900.0
# - fair_market_assessed_value__stddev: 276475.42076201265
# - fair_market_assessed_value__sum: 65209885.0
# - fair_market_assessed_value__n: 1547
# ...replace variables with ure write and read roles
grant usage on schema spacerat to $READ_USER;
grant select on all tables in schema spacerat to $READ_USER;
ALTER DEFAULT PRIVILEGES FOR USER $WRITE_USER IN SCHEMA spacerat GRANT SELECT ON TABLES TO $READ_USER;