billimarie/prosecutor-database

U.S. Prosecutor Database

Last Updated: August 14th, 2025

No Maintenance Intended

Looking For Maintainer

The U.S. Prosecutor Database is looking for its next maintainer. If you have a vision of how to create a public, searchable database, ledger, or chain which everyday users can interact with through a simple UI/UX front-end, please submit a PR and we'll make sure this project lives on through your contributions.


Setting Up

To run the app locally on your machine:

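  1. Install Node.js and npm (see the official docs).
  2. Install Meteor from a terminal: npm install -g meteor (see the official docs).
  3. From the repository root, install the project dependencies: npm install
  4. Start the app with meteor run, then open http://localhost:3000/ in your browser.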

Adding Data to Your Local Environment

To play around with data:

  1. While the app is still running, open a second terminal tab and run: meteor mongo
  2. Run show collections and verify that the output lists the Attorneys collection.
  3. Insert a new document, using the .js files in the api folder as a base. The document must contain name, state, and role; otherwise it won't work. Example: db.Attorneys.insertOne({"id": "ag-01","state": "Alabama","name": "Steve Marshall","role": "Attorney General"})
  4. Check the app in your browser; it should refresh automatically.
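
Because a document missing name, state, or role won't work, it can help to check a candidate document before pasting it into the mongo shell. The snippet below is a minimal sketch, not code from this repository: the helper name missingAttorneyFields and the rule that each field must be a non-empty string are assumptions made for illustration.

```javascript
// Hypothetical helper (not part of the repository): lists which of the
// fields named in step 3 are missing or empty in a candidate document.
const REQUIRED_FIELDS = ["name", "state", "role"];

function missingAttorneyFields(doc) {
  return REQUIRED_FIELDS.filter((field) => {
    const value = doc[field];
    // Assumption: each required field should be a non-empty string.
    return typeof value !== "string" || value.trim() === "";
  });
}

// The example document from step 3.
const candidate = {
  id: "ag-01",
  state: "Alabama",
  name: "Steve Marshall",
  role: "Attorney General",
};

const missing = missingAttorneyFields(candidate);
if (missing.length === 0) {
  console.log("ok to insert:", JSON.stringify(candidate));
} else {
  console.log("missing fields:", missing.join(", "));
}
```

If the list comes back empty, the printed JSON can be passed to db.Attorneys.insertOne(...) as in step 3.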

JSON Schema

I've asked ChatGPT to enhance our old JSON schema:

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "US Prosecutor Database Schema",
  "description": "Enhanced schema for storing prosecutor profiles, policies, metrics, campaign statements, provenance, and privacy info for the US Prosecutor Database.",
  "type": "object",
  "properties": {
    "id": {
      "type": "string",
      "pattern": "^[a-z]{2}-[a-z0-9\\-]+-[0-9]{4}$",
      "description": "Unique ID: state (2-letter), jurisdiction slug, year (e.g., ca-los-angeles-2025)"
    },
    "prosecutor": {
      "type": "object",
      "properties": {
        "name": { "type": "string" },
        "role": { "type": "string", "enum": ["District Attorney", "Assistant District Attorney", "State Attorney General", "Deputy AG", "US Attorney"] },
        "jurisdiction": {
          "type": "object",
          "properties": {
            "state": { "type": "string", "minLength": 2, "maxLength": 2 },
            "county": { "type": "string" },
            "city": { "type": "string" }
          },
          "required": ["state"]
        },
        "party_affiliation": { "type": "string", "enum": ["Democrat", "Republican", "Independent", "Nonpartisan", "Other", "Unknown"] },
        "demographics": {
          "type": "object",
          "properties": {
            "gender": { "type": "string", "enum": ["Male", "Female", "Nonbinary", "Other", "Unknown"] },
            "race_ethnicity": { "type": "string" },
            "dob": { "type": "string", "format": "date" }
          }
        },
        "contact": {
          "type": "object",
          "properties": {
            "phone": { "type": "string" },
            "email": { "type": "string", "format": "email" },
            "website": { "type": "string", "format": "uri" },
            "mailing_address": { "type": "string" }
          }
        },
        "term_start": { "type": "string", "format": "date" },
        "term_end": { "type": "string", "format": "date" },
        "photo_url": { "type": "string", "format": "uri" }
      },
      "required": ["name", "role", "jurisdiction"]
    },
    "policies": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "topic": { "type": "string", "description": "Policy category (e.g., 'discovery', 'bail', 'diversion')" },
          "description": { "type": "string" },
          "url": { "type": "string", "format": "uri" },
          "effective_date": { "type": "string", "format": "date" },
          "last_updated": { "type": "string", "format": "date" }
        },
        "required": ["topic", "url"]
      }
    },
    "metrics": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "id": { "type": "string", "description": "Metric identifier (e.g., 'diversion_rate', 'discovery_timeliness')" },
          "name": { "type": "string" },
          "value": { "type": "number" },
          "unit": { "type": "string", "description": "Unit of measurement (%, days, ratio, etc.)" },
          "period": { "type": "string", "description": "Reporting period (e.g., 2025Q2, 2024-01)" },
          "confidence": { "type": "number", "minimum": 0, "maximum": 1 },
          "evidence": {
            "type": "array",
            "items": {
              "type": "object",
              "properties": {
                "url": { "type": "string", "format": "uri" },
                "quote": { "type": "string" },
                "quote_hash": { "type": "string" }
              }
            }
          },
          "caveats": { "type": "array", "items": { "type": "string" } }
        },
        "required": ["id", "value", "period"]
      }
    },
    "campaign_statements": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "date": { "type": "string", "format": "date" },
          "url": { "type": "string", "format": "uri" },
          "statement": { "type": "string" },
          "themes": { "type": "array", "items": { "type": "string" }, "description": "Tags for major policy stances (e.g., 'increase_incarceration', 'expand_diversion')" }
        },
        "required": ["date", "statement"]
      }
    },
    "provenance": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "field": { "type": "string", "description": "Field path in dot notation" },
          "source_url": { "type": "string", "format": "uri" },
          "retrieved_at": { "type": "string", "format": "date-time" },
          "text_span_hash": { "type": "string" },
          "model_confidence": { "type": "number", "minimum": 0, "maximum": 1 }
        },
        "required": ["field", "source_url", "retrieved_at"]
      }
    },
    "privacy": {
      "type": "object",
      "properties": {
        "pii_redaction": { "type": "boolean" },
        "dp_noise_epsilon": { "type": "number", "description": "Differential privacy parameter if noise is added" }
      }
    },
    "last_updated": { "type": "string", "format": "date-time" }
  },
  "required": ["id", "prosecutor", "last_updated"]
}
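
To get a feel for what the schema accepts, here is a rough sketch that hand-checks a few of its rules (the top-level required fields, the id pattern, the role enum, and the 2-character state) without a JSON Schema library. The function name checkRecord and the sample record are illustrative assumptions, not project code or verified data, and a full draft-07 validator would cover far more than this.

```javascript
// Hypothetical hand-rolled checks mirroring a few rules from the schema above.
const ID_PATTERN = /^[a-z]{2}-[a-z0-9\-]+-[0-9]{4}$/;
const ROLES = [
  "District Attorney",
  "Assistant District Attorney",
  "State Attorney General",
  "Deputy AG",
  "US Attorney",
];

function checkRecord(record) {
  const errors = [];
  for (const key of ["id", "prosecutor", "last_updated"]) {
    if (!(key in record)) errors.push(`missing required field: ${key}`);
  }
  if (typeof record.id === "string" && !ID_PATTERN.test(record.id)) {
    errors.push(`id does not match the schema pattern: ${record.id}`);
  }
  const p = record.prosecutor;
  if (p !== null && typeof p === "object") {
    for (const key of ["name", "role", "jurisdiction"]) {
      if (!(key in p)) errors.push(`missing required field: prosecutor.${key}`);
    }
    if ("role" in p && !ROLES.includes(p.role)) {
      errors.push(`role is not in the schema enum: ${p.role}`);
    }
    const j = p.jurisdiction;
    if (j !== null && typeof j === "object") {
      if (typeof j.state !== "string" || j.state.length !== 2) {
        errors.push("prosecutor.jurisdiction.state must be a 2-character string");
      }
    }
  }
  return errors;
}

// Illustrative record only: the id slug and timestamp are made up for the example.
const sample = {
  id: "al-statewide-2025",
  prosecutor: {
    name: "Steve Marshall",
    role: "State Attorney General",
    jurisdiction: { state: "AL" },
  },
  last_updated: "2025-08-14T00:00:00Z",
};

// The flat document from the local-environment example, for comparison.
const oldStyle = { id: "ag-01", state: "Alabama", name: "Steve Marshall", role: "Attorney General" };

console.log(checkRecord(sample));
console.log(checkRecord(oldStyle));
```

Note that the flat document used in the local-environment steps does not satisfy this schema: it has no prosecutor object or last_updated field, and its id does not follow the state-slug-year pattern.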

The enhanced schema adds the following:

  • metrics[] for justice-forward measures with evidence and confidence levels.
  • policies[] to store topic-tagged policy statements with effective dates.
  • campaign_statements[] to track rhetoric vs. practice.
  • provenance[] to record exactly where and when each data point came from (critical for AI-assisted ingestion).
  • privacy controls to indicate whether PII was redacted or differential privacy applied.
  • Stronger validation patterns for IDs, enums, and date formats to avoid messy data.
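
As a rough illustration of how a provenance[] entry might be built during ingestion, the sketch below resolves a dot-notation field path against a record and emits the three fields the schema requires. The helper names resolvePath and makeProvenance are hypothetical, and the source URL is a placeholder rather than a real citation.

```javascript
// Hypothetical helpers (not part of the repository).

// Follow a dot-notation path such as "prosecutor.name" into a record.
// Returns undefined if any segment along the way is missing.
function resolvePath(record, path) {
  return path.split(".").reduce(
    (node, key) => (node !== null && typeof node === "object" ? node[key] : undefined),
    record
  );
}

// Build a provenance entry holding the three fields the schema requires.
function makeProvenance(record, field, sourceUrl, retrievedAt = new Date()) {
  if (resolvePath(record, field) === undefined) {
    throw new Error(`field path not found in record: ${field}`);
  }
  return {
    field,
    source_url: sourceUrl,
    retrieved_at: retrievedAt.toISOString(), // ISO 8601 date-time string
  };
}

const record = { prosecutor: { name: "Steve Marshall" } };
const entry = makeProvenance(
  record,
  "prosecutor.name",
  "https://example.org/source", // placeholder URL, not a real source
  new Date("2025-08-14T00:00:00Z")
);
console.log(entry);
```

Rejecting paths that do not resolve keeps provenance entries from pointing at fields the record never contained. The optional text_span_hash and model_confidence fields are left out here because the schema does not say how they should be computed.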

Production

TBD


Important Links


Community

Maintainers

Would you like to help maintain this project? Email me (link in profile).

Contributors

Interested in contributing to the web app? You'll find dev notes in DOCS.md. Our stack is Node.js, Meteor.js, MongoDB, and Heroku.

We also need help with documentation for the GitHub page: https://billimarie.github.io/prosecutor-database. You can use the DOCS.md & the Hacktoberfest Issue as references to update our outdated GitHub page.


Post-Carceral

Post-Carceral is a digital community of volunteers building civic tech projects (like the US Prosecutor Database) in service of a post-carceral ("beyond prison") world.

Stay Updated

Volunteer

You don't have to be a developer or a prisoners' rights activist to join. We're looking for all types of people with all types of interests & expertise to collaborate with.

Datathons: On Sundays, we hang out remotely and discuss recent prosecutor news, primary results, & campaigns. We also brainstorm new ways to collect data (considering the strange logic of the prosecutorial system, especially as it differs between localities & regions). If you'd like to join, send me an email.


License

The USPD is an open-source community project built to house data about current and previous US prosecutors (Copyright (c) 2017-2020 Billimarie Lubiano Robinson). It is licensed under the GNU GPLv3. This means you are able to use, modify, & distribute USPD as long as the following conditions are met:

  • Acknowledge the original source (this repository & its contributors)
  • Apply the same license & copyright usage
  • Make public any changes, updates, or improvements upon USPD

For more information, please view the LICENSE.md file.
