Refactor race data architecture for scalability #22
Introduces a new architecture for race data that scales to 100,000+ races:
- Split data into individual race modules (courses/*.ts) and route files (routes/*.json)
- Add lazy loading for heavy GPS route data
- Create type-safe registry with filtering, search, and categorization
- Add compatibility layer to maintain backwards compatibility during migration
- Include migration script to convert existing marathon-data.json
Architecture:
- types.ts: Type definitions for races, routes, regions, tiers
- registry.ts: Central registry with sync metadata and async route loading
- helpers.ts: Utilities for validation, inference, and code generation
- compat.ts: Backwards-compatible wrapper around existing data
- courses/: Individual TypeScript files per race
- routes/: JSON files with GPS thumbnail points
https://claude.ai/code/session_014sbqjK25p1uWsCbf2sp7xz
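The split between synchronous metadata and asynchronous route loading described in this commit could look roughly like the following. This is a minimal sketch, not the PR's registry.ts: the interface fields, the `cachedRoute` memoization, and the demo race are assumptions made for illustration. Only the names `registerRace`, `getRaceMetadata`, and `loadRaceRoute` appear in this PR.

```typescript
// Hypothetical, simplified shapes; the PR's real types.ts is not shown in this thread.
interface RaceMetadata {
  id: string;
  name: string;
  distance: number; // km
}

interface RaceRouteData {
  raceId: string;
  thumbnailPoints: Array<{ lat: number; lng: number }>;
}

interface RegistryEntry {
  metadata: RaceMetadata;
  loadRoute: () => Promise<RaceRouteData>;
  cachedRoute?: Promise<RaceRouteData>; // assumption: memoize the first load
}

const registry = new Map<string, RegistryEntry>();

function registerRace(
  metadata: RaceMetadata,
  loadRoute: () => Promise<RaceRouteData>
): void {
  registry.set(metadata.id, { metadata, loadRoute });
}

// Metadata is small and available synchronously.
function getRaceMetadata(id: string): RaceMetadata | undefined {
  return registry.get(id)?.metadata;
}

// Route data is requested on first use; later calls reuse the same promise.
function loadRaceRoute(id: string): Promise<RaceRouteData> {
  const entry = registry.get(id);
  if (!entry) {
    return Promise.reject(new Error(`Unknown race id: ${id}`));
  }
  if (!entry.cachedRoute) {
    entry.cachedRoute = entry.loadRoute();
  }
  return entry.cachedRoute;
}

// Demo registration with a counter to show that the loader runs only once.
let loadCount = 0;
registerRace({ id: "demo", name: "Demo Marathon", distance: 42.195 }, async () => {
  loadCount += 1;
  return { raceId: "demo", thumbnailPoints: [{ lat: 0, lng: 0 }] };
});
```

In a real course module the loader would more likely be a dynamic import of the route JSON, so that a bundler can place the GPS points in a separate chunk; whether the PR does exactly that is not visible here.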
Adds support for half marathons, 10Ks, 5Ks, and ultras:
- Add RaceType: 'marathon' | 'half-marathon' | '10k' | '5k' | 'ultra' | 'other'
- Add RACE_DISTANCES constant with standard distances in km
- Add inferRaceType() helper to auto-detect type from distance
- Add raceType to RaceMetadata and RaceSummary
- Add filtering by raceType and distance range
- Add helper functions: getMarathons(), getHalfMarathons(), get10Ks(), get5Ks(), getUltras()
- Update compat layer and example courses
Usage:
const halfs = getHalfMarathons();
const races = filterRaces({ raceType: ['marathon', 'half-marathon'] });
https://claude.ai/code/session_014sbqjK25p1uWsCbf2sp7xz
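Filtering by race type and distance range, as listed in this commit, might be sketched as follows. The PR's `filterRaces` takes only a filter object and presumably reads from the registry; this version takes the race list explicitly so that it is self-contained. The `minDistance` and `maxDistance` option names and the sample races are assumptions.

```typescript
type RaceType = "marathon" | "half-marathon" | "10k" | "5k" | "ultra" | "other";

// Assumed subset of the PR's RaceSummary.
interface RaceSummary {
  id: string;
  raceType: RaceType;
  distance: number; // km
}

// Hypothetical filter options; only raceType is confirmed by the commit message.
interface RaceFilter {
  raceType?: RaceType[];
  minDistance?: number; // km, inclusive
  maxDistance?: number; // km, inclusive
}

function filterRaces(races: RaceSummary[], filter: RaceFilter): RaceSummary[] {
  return races.filter((race) => {
    const allowedTypes = filter.raceType;
    if (allowedTypes && !allowedTypes.some((t) => t === race.raceType)) return false;
    if (filter.minDistance !== undefined && race.distance < filter.minDistance) return false;
    if (filter.maxDistance !== undefined && race.distance > filter.maxDistance) return false;
    return true;
  });
}

// Made-up sample data for illustration.
const sample: RaceSummary[] = [
  { id: "boston", raceType: "marathon", distance: 42.195 },
  { id: "city-half", raceType: "half-marathon", distance: 21.0975 },
  { id: "park-5k", raceType: "5k", distance: 5 },
];

const longRaces = filterRaces(sample, { raceType: ["marathon", "half-marathon"] });
const under25 = filterRaces(sample, { maxDistance: 25 });
```

Each criterion is optional, so an empty filter object returns every race.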
✅ Deploy Preview for trainpace ready!
Pull request overview
This PR introduces a scalable race data architecture to replace the monolithic marathon-data.json approach. The new system uses modular TypeScript files for race metadata with lazy-loaded route data stored in separate JSON files. A global registry manages race registration, and a compatibility layer maintains backwards compatibility during migration.
Changes:
- Introduces comprehensive type system for race metadata, routes, and registry entries with support for multiple race types
- Implements registry-based architecture with filtering, searching, and sorting capabilities
- Provides migration script to convert existing marathon-data.json to new modular format
- Includes helper utilities for region/tier inference, validation, and code generation
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 21 comments.
| File | Description |
|---|---|
| vite-project/src/data/races/types.ts | Defines type system for race metadata, routes, regions, tiers, and registry entries |
| vite-project/src/data/races/registry.ts | Implements central registry with race registration, metadata queries, and route loading |
| vite-project/src/data/races/helpers.ts | Provides utilities for region/tier inference, validation, and transformation |
| vite-project/src/data/races/index.ts | Main module entry point with exports and backwards compatibility layer |
| vite-project/src/data/races/compat.ts | Compatibility layer that auto-registers races from legacy marathon-data.json |
| vite-project/src/data/races/courses/boston.ts | Example migrated race file for Boston Marathon with metadata and route loader |
| vite-project/src/data/races/courses/oslo.ts | Example migrated race file for Oslo Marathon with metadata and route loader |
| vite-project/src/data/races/routes/boston-route.json | Separate route data for Boston Marathon |
| vite-project/src/data/races/routes/oslo-route.json | Separate route data for Oslo Marathon |
| scripts/migrate-marathon-data.ts | Migration script to convert marathon-data.json to new modular format |
    for (const [id, data] of Object.entries(legacyData)) {
      const metadata = {
        id,
        name: data.name,
        city: data.city,
        country: data.country,
        region: inferRegion(data.country),
        tier: inferTier(data.name, data.elevationGain),
        distance: data.distance,
        elevationGain: data.elevationGain,
        elevationLoss: data.elevationLoss,
        startElevation: data.startElevation,
        endElevation: data.endElevation,
        slug: data.slug,
        raceDate: data.raceDate,
        website: data.website,
        description: data.description,
        tips: data.tips || [],
        paceStrategy: data.paceStrategy || {
          type: "even-pace",
          summary: "Run at consistent effort throughout.",
          segments: [],
        },
        fuelingNotes: data.fuelingNotes || "",
        faq: data.faq || [],
      };

      const routeData = {
        raceId: id,
        thumbnailPoints: data.thumbnailPoints || [],
      };

      registerRace(metadata as any, async () => routeData);
    }
The migrateFromLegacyData function uses 'as any' type assertion when registering metadata (line 245). This bypasses TypeScript's type checking and could hide missing or incorrect fields. Since you're constructing the metadata object inline, it should be possible to provide proper typing or at least validate the constructed metadata using the validateMetadata helper function before registration.
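One way to drop the `as any` is to build the metadata through a function whose return type is the real interface and which rejects incomplete legacy entries. The sketch below is an assumption-laden illustration: `LegacyRace`, `buildMetadata`, and the reduced `RaceMetadata` are hypothetical, and the PR's own `validateMetadata` helper (whose signature is not shown in this thread) may work differently.

```typescript
// Assumed subset of RaceMetadata, for illustration only.
interface RaceMetadata {
  id: string;
  name: string;
  country: string;
  distance: number; // km
}

// Hypothetical shape of one legacy marathon-data.json entry: fields may be absent.
interface LegacyRace {
  name?: string;
  country?: string;
  distance?: number;
}

// Returns a fully typed RaceMetadata or throws with the race id in the message,
// so no type assertion is needed at the registerRace call site.
function buildMetadata(id: string, data: LegacyRace): RaceMetadata {
  const { name, country, distance } = data;
  if (!name) throw new Error(`[${id}] name is missing`);
  if (!country) throw new Error(`[${id}] country is missing`);
  if (distance === undefined || !Number.isFinite(distance) || distance <= 0) {
    throw new Error(`[${id}] distance must be a positive number`);
  }
  return { id, name, country, distance };
}

const boston = buildMetadata("boston", {
  name: "Boston Marathon",
  country: "USA",
  distance: 42.195,
});
```

Because the checks narrow each field before the object literal is built, the compiler verifies the result against `RaceMetadata` instead of trusting an assertion.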
        }
      }

      // Default to Europe if unknown
The inferRegion function defaults to "europe" when a country is not recognized (line 94). This assumption may not be appropriate for all unrecognized countries. Consider either returning a more neutral default, throwing an error, or logging a warning when defaulting to help identify missing region mappings.
Suggested change:

    - // Default to Europe if unknown
    + // Default to Europe if unknown; log a warning to highlight missing mappings
    + console.warn(
    +   `[inferRegion] Unrecognized country/location "${country}". ` +
    +     'Defaulting region to "europe". Consider updating REGION_MAP.'
    + );
      // Default to gold for established races
      return "gold";
The inferTier function defaults to "gold" for all unrecognized races (line 169). This default may not accurately represent the tier of unknown races and could lead to incorrect categorization. Consider using a more conservative default like "bronze" or requiring explicit tier specification for races that don't match the predefined lists.
Suggested change:

    - // Default to gold for established races
    - return "gold";
    + // Use a conservative default tier for unrecognized races
    + return "bronze";
    import "./compat";

    // ============================================================================
    // Individual Race Modules (Optional - for fully migrated races)
    // ============================================================================

    // Uncomment these as you migrate races to the new format:
    // import "./courses/boston";
    // import "./courses/oslo";
The index.ts file imports the compat layer at line 142, which will auto-register all races from marathon-data.json. However, the commented-out imports for boston and oslo at lines 149-150 would cause duplicate registrations if uncommented, since these races are already in marathon-data.json and will be registered by compat.ts. The PR description states boston and oslo are "fully migrated as reference implementations", but they're not actually being imported. Either import them and handle the duplicate registration issue, or clarify in the comments that these should only be uncommented after removing those races from marathon-data.json.
      const metadata = {
        id,
        name: data.name,
        city: data.city,
        country: data.country,
        region,
        tier,
        distance: data.distance,
        elevationGain: data.elevationGain,
        elevationLoss: data.elevationLoss,
        startElevation: data.startElevation,
        endElevation: data.endElevation,
        slug: data.slug,
        raceDate: data.raceDate,
        website: data.website,
        description: data.description,
        tips: data.tips || [],
        paceStrategy: data.paceStrategy || {
          type: "even-pace",
          summary: "Run at consistent effort throughout the race.",
          segments: [],
        },
        fuelingNotes: data.fuelingNotes || "",
        faq: data.faq || [],
        keywords: generateKeywords(data),
        lastUpdated: new Date().toISOString().split("T")[0],
      };
The migration script's generateCourseFile function is missing the raceType field in the metadata object. This field is required by the RaceMetadata interface (line 132 of types.ts) but is not being populated during migration. This will cause TypeScript errors and runtime issues when the migrated files are used. Add raceType field using inferRaceType(data.distance) similar to how it's done in compat.ts line 63.
      return id
        .replace(/-/g, "_")
        .replace(/[^a-zA-Z0-9_]/g, "")
        .replace(/^(\d)/, "_$1"); // Prefix with underscore if starts with number
The toVariableName function in helpers.ts produces snake_case output (e.g., "boston_marathon"), but the migration script's toVariableName function produces camelCase (e.g., "bostonMarathon"). This inconsistency will cause issues if both functions are used interchangeably or if the helper's generateRaceFileContent is used. The implementations should be aligned to produce the same output format.
Suggested change:

    - return id
    -   .replace(/-/g, "_")
    -   .replace(/[^a-zA-Z0-9_]/g, "")
    -   .replace(/^(\d)/, "_$1"); // Prefix with underscore if starts with number
    + // Normalize to words separated by non-alphanumeric characters
    + const cleaned = id.replace(/[^a-zA-Z0-9]+/g, " ").trim();
    + if (!cleaned) {
    +   return "_";
    + }
    + const words = cleaned.split(/\s+/);
    + // First word lowercase, subsequent words capitalized -> camelCase
    + let variableName = words
    +   .map((word, index) => {
    +     const lower = word.toLowerCase();
    +     if (index === 0) {
    +       return lower;
    +     }
    +     return lower.charAt(0).toUpperCase() + lower.slice(1);
    +   })
    +   .join("");
    + // Ensure only valid identifier characters remain
    + variableName = variableName.replace(/[^a-zA-Z0-9]/g, "");
    + // Prefix with underscore if starts with number
    + variableName = variableName.replace(/^(\d)/, "_$1");
    + return variableName;
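To keep helpers.ts and the migration script from drifting apart again, both could import one shared implementation. The following is a compact sketch of the camelCase behavior proposed above, written as a standalone function; where such a shared helper would live in the repository is an assumption.

```typescript
// Converts a race id such as "boston-marathon" into a camelCase identifier.
function toVariableName(id: string): string {
  // Split on any run of characters that cannot appear in the identifier.
  const words = id.split(/[^a-zA-Z0-9]+/).filter((word) => word.length > 0);
  if (words.length === 0) {
    return "_";
  }
  const camel = words
    .map((word, index) => {
      const lower = word.toLowerCase();
      return index === 0 ? lower : lower.charAt(0).toUpperCase() + lower.slice(1);
    })
    .join("");
  // Identifiers cannot start with a digit, so prefix an underscore.
  return /^\d/.test(camel) ? `_${camel}` : camel;
}

const fromSlug = toVariableName("boston-marathon"); // "bostonMarathon"
const fromYear = toVariableName("2024-oslo"); // "_2024Oslo"
```

This sketch does not guard against ids that collide with reserved words (for example an id of "new"); a real helper may need a suffix or prefix for those.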
      "london",
      "berlin",
      "tokyo",
      "sydney",
Sydney is listed in the WORLD_MAJORS array. Sydney was not one of the original six World Marathon Majors (Tokyo, Boston, London, Berlin, Chicago, and New York); it was added as a seventh Major starting with the 2025 race. If the tier classification is meant to cover only the original six, remove Sydney from this list to avoid incorrect tier classification; otherwise keep it and note when it joined.
Suggested change:

    - "sydney",
    export function registerRace(
      metadata: RaceMetadata,
      routeLoader: () => Promise<RaceRouteData>
    ): void {
      registry.set(metadata.id, {
        metadata,
        loadRoute: routeLoader,
        routeLoaded: false,
      });
    }
The registerRace function silently overwrites existing registrations with the same ID. If both the compat layer and individual race files register the same race (e.g., boston.ts and compat.ts both try to register "boston"), the last one wins without warning. This could lead to unexpected behavior during migration. Consider adding a check to warn or throw an error when attempting to register a race ID that already exists, or document this "last-one-wins" behavior explicitly.
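A duplicate check of the kind suggested here might look like the sketch below. The `overwrite` option is hypothetical and not part of the PR; it is included only to show how a migrated course file could deliberately replace the entry registered by the compat layer. The interfaces are reduced stand-ins for the PR's types.

```typescript
// Reduced stand-ins for the PR's types.
interface RaceMetadata {
  id: string;
  name: string;
}

interface RaceRouteData {
  raceId: string;
  thumbnailPoints: Array<{ lat: number; lng: number }>;
}

interface RegistryEntry {
  metadata: RaceMetadata;
  loadRoute: () => Promise<RaceRouteData>;
  routeLoaded: boolean;
}

const registry = new Map<string, RegistryEntry>();

function registerRace(
  metadata: RaceMetadata,
  routeLoader: () => Promise<RaceRouteData>,
  options: { overwrite?: boolean } = {} // hypothetical opt-in for replacement
): void {
  if (registry.has(metadata.id) && !options.overwrite) {
    throw new Error(
      `Race "${metadata.id}" is already registered. ` +
        "Pass { overwrite: true } to replace it intentionally."
    );
  }
  registry.set(metadata.id, {
    metadata,
    loadRoute: routeLoader,
    routeLoaded: false,
  });
}
```

Throwing makes an accidental double registration fail at import time; logging a warning instead would be the softer alternative the comment mentions.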
    export function simplifyRoute(points: RoutePoint[], targetCount: number): RoutePoint[] {
      if (points.length <= targetCount) {
        return points;
      }

      const result: RoutePoint[] = [points[0]]; // Always include start
      const step = (points.length - 1) / (targetCount - 1);

      for (let i = 1; i < targetCount - 1; i++) {
        const index = Math.round(i * step);
        result.push(points[index]);
      }

      result.push(points[points.length - 1]); // Always include end
      return result;
    }
The simplifyRoute function has a potential division by zero error when targetCount is 1. The expression (points.length - 1) / (targetCount - 1) on line 311 will result in division by zero if targetCount is 1. This should be handled as an edge case, either by returning just the first point or by requiring targetCount to be at least 2.
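Note that in JavaScript the division does not throw: `step` becomes `Infinity`, the loop is skipped, and the function returns two points (start and end) even though one was requested. A guarded version could be sketched as follows; returning only the start point for a `targetCount` of 1 and an empty array for non-positive counts are assumptions, since the comment leaves the choice open.

```typescript
interface RoutePoint {
  lat: number;
  lng: number;
}

function simplifyRoute(points: RoutePoint[], targetCount: number): RoutePoint[] {
  // Assumed edge-case policy: nothing requested or nothing available -> empty.
  if (targetCount <= 0 || points.length === 0) {
    return [];
  }
  // Assumed edge-case policy: a single point means the start of the route.
  if (targetCount === 1) {
    return [points[0]];
  }
  if (points.length <= targetCount) {
    return points;
  }

  const result: RoutePoint[] = [points[0]]; // Always include start
  const step = (points.length - 1) / (targetCount - 1); // targetCount >= 2 here

  for (let i = 1; i < targetCount - 1; i++) {
    result.push(points[Math.round(i * step)]);
  }

  result.push(points[points.length - 1]); // Always include end
  return result;
}

// Ten synthetic points whose latitude equals their index.
const demoPoints: RoutePoint[] = Array.from({ length: 10 }, (_, i) => ({ lat: i, lng: 0 }));
const four = simplifyRoute(demoPoints, 4);
```

With ten input points and a target of four, the step is 3, so the sampled indices are 0, 3, 6, and 9.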
      if (distanceKm <= 5.5) return "5k";
      if (distanceKm <= 11) return "10k";
      if (distanceKm <= 22) return "half-marathon";
      if (distanceKm <= 43) return "marathon";
      if (distanceKm > 43) return "ultra";
      return "other";
The condition 'distanceKm > 43' is always true.
Suggested change:

    - if (distanceKm <= 5.5) return "5k";
    - if (distanceKm <= 11) return "10k";
    - if (distanceKm <= 22) return "half-marathon";
    - if (distanceKm <= 43) return "marathon";
    - if (distanceKm > 43) return "ultra";
    - return "other";
    + // Handle non-finite values explicitly to keep "other" as the fallback
    + if (!Number.isFinite(distanceKm)) return "other";
    + if (distanceKm <= 5.5) return "5k";
    + if (distanceKm <= 11) return "10k";
    + if (distanceKm <= 22) return "half-marathon";
    + if (distanceKm <= 43) return "marathon";
    + return "ultra";

Summary
This PR introduces a new scalable race data architecture that replaces the monolithic marathon-data.json with a modular system supporting 100,000+ races without performance degradation. The new system uses individual TypeScript files per race with lazy-loaded route data and automatic registry management.
Key Changes
- courses/ directory with auto-registration via a global registry
- routes/ as JSON, loaded only when needed for better performance
- scripts/migrate-marathon-data.ts automatically converts existing marathon-data.json to the new format
- compat.ts provides drop-in replacement for old imports during migration period
- helpers.ts with region/tier inference, validation, and code generation functions
- types.ts for metadata, routes, and registry entries
- registry.ts manages race registration and provides query APIs (filtering, searching, sorting)
Implementation Details
- import { marathonData } from '@/data/races/compat' with zero changes
Migration Path
- getRaceMetadata, loadRaceRoute, etc.)
Benefits
https://claude.ai/code/session_014sbqjK25p1uWsCbf2sp7xz