Refactor race data architecture for scalability #22

Open
aleexwong wants to merge 2 commits into main from claude/marathon-data-architecture-3qSIg

Conversation

@aleexwong
Owner

Summary

This PR introduces a new scalable race data architecture that replaces the monolithic marathon-data.json with a modular system supporting 100,000+ races without performance degradation. The new system uses individual TypeScript files per race with lazy-loaded route data and automatic registry management.

Key Changes

  • New modular architecture: Individual race files in courses/ directory with auto-registration via a global registry
  • Lazy-loaded route data: Route points stored separately in routes/ as JSON, loaded only when needed for better performance
  • Migration script: scripts/migrate-marathon-data.ts automatically converts existing marathon-data.json to the new format
  • Backwards compatibility layer: compat.ts provides drop-in replacement for old imports during migration period
  • Helper utilities: helpers.ts with region/tier inference, validation, and code generation functions
  • Type system: Comprehensive TypeScript types in types.ts for metadata, routes, and registry entries
  • Registry system: registry.ts manages race registration and provides query APIs (filtering, searching, sorting)
  • Example races: Boston and Oslo marathons fully migrated as reference implementations

Implementation Details

  • Region/Tier Inference: Automatic classification based on race name and elevation characteristics
  • Code Generation: Migration script generates properly formatted TypeScript files with full metadata and JSDoc comments
  • Lazy Loading: Route data uses dynamic imports for code splitting; metadata is always synchronously available
  • Backwards Compatibility: Existing code can use import { marathonData } from '@/data/races/compat' with zero changes
  • Extensible Design: New races can be added by creating a course file and route JSON; auto-registration happens on import
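
A minimal sketch of the auto-registration and lazy-loading pattern described above. The trimmed-down types, the registry shape, and the inline loader are assumptions for illustration, not the PR's actual code; a real course file would presumably use a dynamic `import()` of its route JSON, which is replaced here by a plain async function so the sketch runs standalone.

```typescript
// Hypothetical, trimmed-down types; the PR's types.ts carries many more fields.
interface RaceMetadata {
  id: string;
  name: string;
  distance: number; // km
}

interface RoutePoint {
  lat: number;
  lng: number;
}

interface RaceRouteData {
  raceId: string;
  thumbnailPoints: RoutePoint[];
}

interface RegistryEntry {
  metadata: RaceMetadata;
  loadRoute: () => Promise<RaceRouteData>;
}

const registry = new Map<string, RegistryEntry>();

// Metadata is stored eagerly; the route stays a function until someone calls it.
function registerRace(
  metadata: RaceMetadata,
  loadRoute: () => Promise<RaceRouteData>
): void {
  registry.set(metadata.id, { metadata, loadRoute });
}

// Synchronous lookup: no route data is touched.
function getRaceMetadata(id: string): RaceMetadata | undefined {
  return registry.get(id)?.metadata;
}

// Asynchronous lookup: runs the loader on demand.
async function loadRaceRoute(id: string): Promise<RaceRouteData | undefined> {
  const entry = registry.get(id);
  return entry ? entry.loadRoute() : undefined;
}

// What a course file might do at import time. With a bundler, the loader could be
// () => import("../routes/boston-route.json").then((m) => m.default)
let routeLoads = 0; // counts loader calls, only to show that loading is deferred
registerRace({ id: "boston", name: "Boston Marathon", distance: 42.195 }, async () => {
  routeLoads += 1;
  return { raceId: "boston", thumbnailPoints: [{ lat: 0, lng: 0 }] }; // placeholder point
});
```

Calling `getRaceMetadata("boston")` returns immediately and leaves `routeLoads` at zero; only `loadRaceRoute("boston")` invokes the loader.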

Migration Path

  1. Run migration script to generate individual race files from existing data
  2. Gradually replace old imports with new API (getRaceMetadata, loadRaceRoute, etc.)
  3. Remove compatibility layer once all components are migrated
  4. Future races added directly in new format

Benefits

  • ✅ Scales to 100,000+ races without performance issues
  • ✅ Smaller bundle size through lazy loading
  • ✅ Better code organization and maintainability
  • ✅ Type-safe race data with comprehensive metadata
  • ✅ Powerful filtering, searching, and sorting capabilities
  • ✅ Zero breaking changes during migration period

https://claude.ai/code/session_014sbqjK25p1uWsCbf2sp7xz

Introduces a new architecture for race data that scales to 100,000+ races:

- Split data into individual race modules (courses/*.ts) and route files (routes/*.json)
- Add lazy loading for heavy GPS route data
- Create type-safe registry with filtering, search, and categorization
- Add compatibility layer to maintain backwards compatibility during migration
- Include migration script to convert existing marathon-data.json

Architecture:
- types.ts: Type definitions for races, routes, regions, tiers
- registry.ts: Central registry with sync metadata and async route loading
- helpers.ts: Utilities for validation, inference, and code generation
- compat.ts: Backwards-compatible wrapper around existing data
- courses/: Individual TypeScript files per race
- routes/: JSON files with GPS thumbnail points

https://claude.ai/code/session_014sbqjK25p1uWsCbf2sp7xz
Adds support for half marathons, 10Ks, 5Ks, and ultras:

- Add RaceType: 'marathon' | 'half-marathon' | '10k' | '5k' | 'ultra' | 'other'
- Add RACE_DISTANCES constant with standard distances in km
- Add inferRaceType() helper to auto-detect type from distance
- Add raceType to RaceMetadata and RaceSummary
- Add filtering by raceType and distance range
- Add helper functions: getMarathons(), getHalfMarathons(), get10Ks(), get5Ks(), getUltras()
- Update compat layer and example courses

Usage:
  const halfs = getHalfMarathons();
  const races = filterRaces({ raceType: ['marathon', 'half-marathon'] });
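
For illustration, a minimal sketch of how type inference and a `raceType`/distance filter could fit together. The thresholds mirror the `inferRaceType` snippet quoted later in this review; the `RaceSummary` fields, the `FilterOptions` shape, the explicit `races` argument, and the sample entries are assumptions, not the PR's actual definitions (the PR's `filterRaces` reads from the registry instead).

```typescript
type RaceType = "marathon" | "half-marathon" | "10k" | "5k" | "ultra" | "other";

// Hypothetical subset of the summary type.
interface RaceSummary {
  id: string;
  distance: number; // km
  raceType: RaceType;
}

// Hypothetical filter options: a list of accepted types and a distance range in km.
interface FilterOptions {
  raceType?: RaceType[];
  minDistance?: number;
  maxDistance?: number;
}

function inferRaceType(distanceKm: number): RaceType {
  if (!Number.isFinite(distanceKm)) return "other";
  if (distanceKm <= 5.5) return "5k";
  if (distanceKm <= 11) return "10k";
  if (distanceKm <= 22) return "half-marathon";
  if (distanceKm <= 43) return "marathon";
  return "ultra";
}

// Unlike the PR's API, this sketch takes the race list as an explicit argument.
function filterRaces(races: RaceSummary[], options: FilterOptions = {}): RaceSummary[] {
  const { raceType, minDistance = 0, maxDistance = Infinity } = options;
  return races.filter(
    (race) =>
      (raceType === undefined || raceType.some((t) => t === race.raceType)) &&
      race.distance >= minDistance &&
      race.distance <= maxDistance
  );
}

// Sample entries with nominal distances and made-up ids.
const sample: RaceSummary[] = [5, 10, 21.0975, 42.195, 50].map((distance, i) => ({
  id: `race-${i}`,
  distance,
  raceType: inferRaceType(distance),
}));

const longRaces = filterRaces(sample, { raceType: ["marathon", "half-marathon"] });
const shortRaces = filterRaces(sample, { maxDistance: 11 });
```

Here `longRaces` holds the half marathon and marathon entries, and `shortRaces` holds the 5 km and 10 km entries.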

https://claude.ai/code/session_014sbqjK25p1uWsCbf2sp7xz
Copilot AI review requested due to automatic review settings January 29, 2026 16:22
@vercel

vercel bot commented Jan 29, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project | Deployment | Review | Updated (UTC)
trainpace | Ready | Preview, Comment | Jan 29, 2026 4:22pm

@netlify

netlify bot commented Jan 29, 2026

Deploy Preview for trainpace ready!

Name | Link
🔨 Latest commit | 8728888
🔍 Latest deploy log | https://app.netlify.com/projects/trainpace/deploys/697b895707cf890009803d62
😎 Deploy Preview | https://deploy-preview-22--trainpace.netlify.app
Lighthouse
1 path audited
Performance: 64
Accessibility: 89
Best Practices: 92
SEO: 100
PWA: 60

Contributor

Copilot AI left a comment

Pull request overview

This PR introduces a scalable race data architecture to replace the monolithic marathon-data.json approach. The new system uses modular TypeScript files for race metadata with lazy-loaded route data stored in separate JSON files. A global registry manages race registration, and a compatibility layer maintains backwards compatibility during migration.

Changes:

  • Introduces comprehensive type system for race metadata, routes, and registry entries with support for multiple race types
  • Implements registry-based architecture with filtering, searching, and sorting capabilities
  • Provides migration script to convert existing marathon-data.json to new modular format
  • Includes helper utilities for region/tier inference, validation, and code generation

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 21 comments.

File | Description
vite-project/src/data/races/types.ts | Defines type system for race metadata, routes, regions, tiers, and registry entries
vite-project/src/data/races/registry.ts | Implements central registry with race registration, metadata queries, and route loading
vite-project/src/data/races/helpers.ts | Provides utilities for region/tier inference, validation, and transformation
vite-project/src/data/races/index.ts | Main module entry point with exports and backwards compatibility layer
vite-project/src/data/races/compat.ts | Compatibility layer that auto-registers races from legacy marathon-data.json
vite-project/src/data/races/courses/boston.ts | Example migrated race file for Boston Marathon with metadata and route loader
vite-project/src/data/races/courses/oslo.ts | Example migrated race file for Oslo Marathon with metadata and route loader
vite-project/src/data/races/routes/boston-route.json | Separate route data for Boston Marathon
vite-project/src/data/races/routes/oslo-route.json | Separate route data for Oslo Marathon
scripts/migrate-marathon-data.ts | Migration script to convert marathon-data.json to new modular format

Comment on lines +213 to +246
for (const [id, data] of Object.entries(legacyData)) {
  const metadata = {
    id,
    name: data.name,
    city: data.city,
    country: data.country,
    region: inferRegion(data.country),
    tier: inferTier(data.name, data.elevationGain),
    distance: data.distance,
    elevationGain: data.elevationGain,
    elevationLoss: data.elevationLoss,
    startElevation: data.startElevation,
    endElevation: data.endElevation,
    slug: data.slug,
    raceDate: data.raceDate,
    website: data.website,
    description: data.description,
    tips: data.tips || [],
    paceStrategy: data.paceStrategy || {
      type: "even-pace",
      summary: "Run at consistent effort throughout.",
      segments: [],
    },
    fuelingNotes: data.fuelingNotes || "",
    faq: data.faq || [],
  };

  const routeData = {
    raceId: id,
    thumbnailPoints: data.thumbnailPoints || [],
  };

  registerRace(metadata as any, async () => routeData);
}

Copilot AI Jan 29, 2026

The migrateFromLegacyData function uses 'as any' type assertion when registering metadata (line 245). This bypasses TypeScript's type checking and could hide missing or incorrect fields. Since you're constructing the metadata object inline, it should be possible to provide proper typing or at least validate the constructed metadata using the validateMetadata helper function before registration.
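
One way to act on this, sketched under assumptions: build the object against a declared type and run a validation pass before registering, so problems surface at migration time. The `RaceMetadata` subset, the `LegacyRace` shape, the `toMetadata` wrapper, and the `validateMetadata` signature below are hypothetical stand-ins; the PR's actual helper may look different.

```typescript
// Hypothetical subset of the real metadata type.
interface RaceMetadata {
  id: string;
  name: string;
  country: string;
  distance: number; // km
  tips: string[];
}

// Legacy records are loosely typed, so every field is treated as optional.
interface LegacyRace {
  name?: string;
  country?: string;
  distance?: number;
  tips?: string[];
}

// Hypothetical validator: returns a list of problems instead of throwing.
function validateMetadata(candidate: Partial<RaceMetadata>): string[] {
  const errors: string[] = [];
  if (!candidate.id) errors.push("id is required");
  if (!candidate.name) errors.push("name is required");
  if (!candidate.country) errors.push("country is required");
  if (typeof candidate.distance !== "number" || !(candidate.distance > 0)) {
    errors.push("distance must be a positive number");
  }
  return errors;
}

function toMetadata(id: string, data: LegacyRace): RaceMetadata {
  const candidate: Partial<RaceMetadata> = {
    id,
    name: data.name,
    country: data.country,
    distance: data.distance,
    tips: data.tips ?? [],
  };
  const errors = validateMetadata(candidate);
  if (errors.length > 0) {
    throw new Error(`Invalid metadata for "${id}": ${errors.join("; ")}`);
  }
  // The only cast left sits behind the validator, which has checked the required fields.
  return candidate as RaceMetadata;
}
```

The result of `toMetadata` can then be passed to `registerRace` without `as any`, and a legacy record with a missing field fails loudly instead of being registered silently.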

Copilot uses AI. Check for mistakes.
}
}

// Default to Europe if unknown

Copilot AI Jan 29, 2026

The inferRegion function defaults to "europe" when a country is not recognized (line 94). This assumption may not be appropriate for all unrecognized countries. Consider either returning a more neutral default, throwing an error, or logging a warning when defaulting to help identify missing region mappings.

Suggested change
-  // Default to Europe if unknown
+  // Default to Europe if unknown; log a warning to highlight missing mappings
+  console.warn(
+    `[inferRegion] Unrecognized country/location "${country}". ` +
+      'Defaulting region to "europe". Consider updating REGION_MAP.'
+  );

Comment on lines +168 to +169
// Default to gold for established races
return "gold";

Copilot AI Jan 29, 2026

The inferTier function defaults to "gold" for all unrecognized races (line 169). This default may not accurately represent the tier of unknown races and could lead to incorrect categorization. Consider using a more conservative default like "bronze" or requiring explicit tier specification for races that don't match the predefined lists.

Suggested change
-  // Default to gold for established races
-  return "gold";
+  // Use a conservative default tier for unrecognized races
+  return "bronze";

Comment on lines +142 to +150
import "./compat";

// ============================================================================
// Individual Race Modules (Optional - for fully migrated races)
// ============================================================================

// Uncomment these as you migrate races to the new format:
// import "./courses/boston";
// import "./courses/oslo";

Copilot AI Jan 29, 2026

The index.ts file imports the compat layer at line 142, which will auto-register all races from marathon-data.json. However, the commented-out imports for boston and oslo at lines 149-150 would cause duplicate registrations if uncommented, since these races are already in marathon-data.json and will be registered by compat.ts. The PR description states boston and oslo are "fully migrated as reference implementations", but they're not actually being imported. Either import them and handle the duplicate registration issue, or clarify in the comments that these should only be uncommented after removing those races from marathon-data.json.

Comment on lines +124 to +150
const metadata = {
  id,
  name: data.name,
  city: data.city,
  country: data.country,
  region,
  tier,
  distance: data.distance,
  elevationGain: data.elevationGain,
  elevationLoss: data.elevationLoss,
  startElevation: data.startElevation,
  endElevation: data.endElevation,
  slug: data.slug,
  raceDate: data.raceDate,
  website: data.website,
  description: data.description,
  tips: data.tips || [],
  paceStrategy: data.paceStrategy || {
    type: "even-pace",
    summary: "Run at consistent effort throughout the race.",
    segments: [],
  },
  fuelingNotes: data.fuelingNotes || "",
  faq: data.faq || [],
  keywords: generateKeywords(data),
  lastUpdated: new Date().toISOString().split("T")[0],
};

Copilot AI Jan 29, 2026

The migration script's generateCourseFile function is missing the raceType field in the metadata object. This field is required by the RaceMetadata interface (line 132 of types.ts) but is not being populated during migration. This will cause TypeScript errors and runtime issues when the migrated files are used. Add raceType field using inferRaceType(data.distance) similar to how it's done in compat.ts line 63.

Comment on lines +286 to +289
  return id
    .replace(/-/g, "_")
    .replace(/[^a-zA-Z0-9_]/g, "")
    .replace(/^(\d)/, "_$1"); // Prefix with underscore if starts with number

Copilot AI Jan 29, 2026

The toVariableName function in helpers.ts produces snake_case output (e.g., "boston_marathon"), but the migration script's toVariableName function produces camelCase (e.g., "bostonMarathon"). This inconsistency will cause issues if both functions are used interchangeably or if the helper's generateRaceFileContent is used. The implementations should be aligned to produce the same output format.

Suggested change
-  return id
-    .replace(/-/g, "_")
-    .replace(/[^a-zA-Z0-9_]/g, "")
-    .replace(/^(\d)/, "_$1"); // Prefix with underscore if starts with number
+  // Normalize to words separated by non-alphanumeric characters
+  const cleaned = id.replace(/[^a-zA-Z0-9]+/g, " ").trim();
+  if (!cleaned) {
+    return "_";
+  }
+  const words = cleaned.split(/\s+/);
+  // First word lowercase, subsequent words capitalized -> camelCase
+  let variableName = words
+    .map((word, index) => {
+      const lower = word.toLowerCase();
+      if (index === 0) {
+        return lower;
+      }
+      return lower.charAt(0).toUpperCase() + lower.slice(1);
+    })
+    .join("");
+  // Ensure only valid identifier characters remain
+  variableName = variableName.replace(/[^a-zA-Z0-9]/g, "");
+  // Prefix with underscore if starts with number
+  variableName = variableName.replace(/^(\d)/, "_$1");
+  return variableName;

"london",
"berlin",
"tokyo",
"sydney",

Copilot AI Jan 29, 2026

Sydney is listed in the WORLD_MAJORS array alongside the six original World Marathon Majors (Tokyo, Boston, London, Berlin, Chicago, and New York). Sydney was added as the seventh Major starting with its 2025 race, so the entry is correct for current data and does not need to be removed.

Comment on lines +39 to +48
export function registerRace(
  metadata: RaceMetadata,
  routeLoader: () => Promise<RaceRouteData>
): void {
  registry.set(metadata.id, {
    metadata,
    loadRoute: routeLoader,
    routeLoaded: false,
  });
}

Copilot AI Jan 29, 2026

The registerRace function silently overwrites existing registrations with the same ID. If both the compat layer and individual race files register the same race (e.g., boston.ts and compat.ts both try to register "boston"), the last one wins without warning. This could lead to unexpected behavior during migration. Consider adding a check to warn or throw an error when attempting to register a race ID that already exists, or document this "last-one-wins" behavior explicitly.
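
A minimal sketch of one possible guard, assuming a caller-selectable policy for duplicate IDs. The `DuplicatePolicy` and `RegisterResult` names, the return value, and the trimmed-down types are hypothetical and not part of the PR.

```typescript
// Hypothetical, trimmed-down types for illustration.
interface RaceMetadata {
  id: string;
  name: string;
}

type RouteLoader = () => Promise<unknown>;

// How to react when an id is registered twice (names are assumptions).
type DuplicatePolicy = "throw" | "replace" | "keep-first";
type RegisterResult = "added" | "replaced" | "skipped";

const registry = new Map<string, { metadata: RaceMetadata; loadRoute: RouteLoader }>();

function registerRace(
  metadata: RaceMetadata,
  loadRoute: RouteLoader,
  onDuplicate: DuplicatePolicy = "replace"
): RegisterResult {
  const exists = registry.has(metadata.id);
  if (exists && onDuplicate === "throw") {
    throw new Error(`Race "${metadata.id}" is already registered`);
  }
  if (exists && onDuplicate === "keep-first") {
    return "skipped";
  }
  registry.set(metadata.id, { metadata, loadRoute });
  // Reporting "replaced" lets the caller log or assert instead of overwriting silently.
  return exists ? "replaced" : "added";
}
```

With this shape, the compat layer could register with `"keep-first"` so that a fully migrated course file always wins, while tests could use `"throw"` to catch accidental double registration.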

Comment on lines +305 to +320
export function simplifyRoute(points: RoutePoint[], targetCount: number): RoutePoint[] {
  if (points.length <= targetCount) {
    return points;
  }

  const result: RoutePoint[] = [points[0]]; // Always include start
  const step = (points.length - 1) / (targetCount - 1);

  for (let i = 1; i < targetCount - 1; i++) {
    const index = Math.round(i * step);
    result.push(points[index]);
  }

  result.push(points[points.length - 1]); // Always include end
  return result;
}

Copilot AI Jan 29, 2026

The simplifyRoute function mishandles targetCount values below 2. When targetCount is 1, the expression (points.length - 1) / (targetCount - 1) on line 311 divides by zero. JavaScript does not throw in this case: step becomes Infinity, the loop body never runs, and the function returns two points (start and end) instead of one. This should be handled as an explicit edge case, either by returning just the first point or by requiring targetCount to be at least 2.
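
A minimal sketch of one way to handle the edge case, assuming that returning only the start point is acceptable for a targetCount of 1 and that non-positive or fractional counts should be rejected. The `RoutePoint` shape and the choice of `RangeError` are assumptions.

```typescript
// Hypothetical point shape; the PR's RoutePoint may carry more fields.
interface RoutePoint {
  lat: number;
  lng: number;
}

function simplifyRoute(points: RoutePoint[], targetCount: number): RoutePoint[] {
  // Reject counts that cannot produce a meaningful result.
  if (!Number.isInteger(targetCount) || targetCount < 1) {
    throw new RangeError("targetCount must be a positive integer");
  }
  if (points.length <= targetCount) {
    return points;
  }
  // Avoid dividing by (targetCount - 1) === 0 below.
  if (targetCount === 1) {
    return [points[0]];
  }

  // Evenly spaced indices; i = 0 maps to the start and i = targetCount - 1 to the end.
  const step = (points.length - 1) / (targetCount - 1);
  const result: RoutePoint[] = [];
  for (let i = 0; i < targetCount; i++) {
    result.push(points[Math.round(i * step)]);
  }
  return result;
}
```

For an 11-point route, `simplifyRoute(points, 3)` picks indices 0, 5, and 10, and `simplifyRoute(points, 1)` returns only the first point.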

Comment on lines +91 to +96
if (distanceKm <= 5.5) return "5k";
if (distanceKm <= 11) return "10k";
if (distanceKm <= 22) return "half-marathon";
if (distanceKm <= 43) return "marathon";
if (distanceKm > 43) return "ultra";
return "other";

Copilot AI Jan 29, 2026

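The condition 'distanceKm > 43' is true for every value that reaches it except NaN, because the earlier branches have already returned for all values up to 43. The check is therefore redundant, and the final return "other" is reachable only when distanceKm is NaN.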

Suggested change
-  if (distanceKm <= 5.5) return "5k";
-  if (distanceKm <= 11) return "10k";
-  if (distanceKm <= 22) return "half-marathon";
-  if (distanceKm <= 43) return "marathon";
-  if (distanceKm > 43) return "ultra";
-  return "other";
+  // Handle non-finite values explicitly to keep "other" as the fallback
+  if (!Number.isFinite(distanceKm)) return "other";
+  if (distanceKm <= 5.5) return "5k";
+  if (distanceKm <= 11) return "10k";
+  if (distanceKm <= 22) return "half-marathon";
+  if (distanceKm <= 43) return "marathon";
+  return "ultra";

2 participants