Add CircleMUD parser, multi-format support, and normalized JSON output #4

Open
daiverd wants to merge 9 commits into ctoth:master from daiverd:add-circlemud-parser

daiverd commented Feb 19, 2026

Summary

This PR adds comprehensive MUD area file parsing capabilities:

  • CircleMUD parser: Parse CircleMUD split-file format (.wld, .mob, .obj, .zon, .shp)
  • Additional format support: ROT, Envy, SMAUG-WD formats
  • Normalized JSON output: Unified schema across all MUD formats
  • Tolerant parsing mode: Extract partial data from incompatible files
  • Batch conversion tool: Convert entire directories of area files

New Formats Supported

Format      Description
CircleMUD   Directory-based format with separate files per data type
ROT         Realms of Thera format with extra mob flags
Envy        Envy MUD format with level ranges in braces
SMAUG-WD    SMAUG variant with key-value area metadata

Normalized Output

The normalized format provides a unified schema across all MUD formats:

  • Flags as lowercase string lists instead of enum strings
  • Consistent field names across formats
  • Original data preserved in the 'original' field for lossless round-trip
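
As a rough illustration of the first and third points, the sketch below turns an enum-style flag string into a lowercase list and keeps the raw record alongside it. The names `normalize_flags` and `normalize_mob`, the `act_flags` key, and the pipe separator are assumptions made for this example; they are not taken from normalizer.py.

```python
from typing import Any


def normalize_flags(raw: str, sep: str = "|") -> list[str]:
    """Split an enum-style flag string into a list of lowercase names."""
    return [part.strip().lower() for part in raw.split(sep) if part.strip()]


def normalize_mob(raw_mob: dict[str, Any]) -> dict[str, Any]:
    """Build a normalized record, keeping the untouched input under 'original'."""
    return {
        # 'act_flags' is a hypothetical source field name.
        "flags": normalize_flags(raw_mob.get("act_flags", "")),
        # Shallow copy so later edits to the input do not leak into the output.
        "original": dict(raw_mob),
    }


mob = normalize_mob({"act_flags": "SENTINEL|AGGRESSIVE"})
print(mob["flags"])
```

Keeping the source record next to the normalized fields is what would let a consumer write the area back out in its original format.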

Batch Conversion

# Convert all areas to normalized JSON
python convert_all.py --normalized --tolerant --continue-on-error

# Results: 259 .are files + 63 CircleMUD zones = 322 areas converted
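
The effect of --continue-on-error can be pictured as a loop that records failures instead of stopping at the first one. This is only a sketch under assumptions: `convert_many` and its `convert` callback are hypothetical names, and the real convert_all.py may be structured quite differently.

```python
from typing import Callable, Iterable


def convert_many(
    paths: Iterable[str],
    convert: Callable[[str], None],
    continue_on_error: bool = True,
) -> tuple[int, list[str]]:
    """Run convert() on each path and return (success_count, failed_paths)."""
    converted = 0
    failed: list[str] = []
    for path in paths:
        try:
            convert(path)
            converted += 1
        except Exception:
            if not continue_on_error:
                raise  # stop the batch at the first failure
            failed.append(path)  # remember the failure and keep going
    return converted, failed
```

Returning the failed paths makes it straightforward to report a summary line such as the converted and failed totals at the end of a run.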

Key Changes

  • circlemud.py: New CircleMUD parser
  • normalized.py: Normalized data classes
  • normalizer.py: Format-specific normalizers
  • convert_all.py: Batch conversion script
  • area_reader.py: Added SmaugMob.read(), tolerant parsing, partial parse fallback
  • constants.py: Extended enums for SMAUG compatibility

Test plan

  • All 96 existing tests pass
  • Batch conversion succeeds for 322/343 area files
  • Normalized output verified for ROM, Merc, SMAUG, CircleMUD formats

🤖 Generated with Claude Code

David Sexton and others added 9 commits January 25, 2026 21:07
- New circlemud.py module for parsing CircleMUD split-file format
- Supports .wld, .mob, .obj, .zon, .shp files
- Handles alphanumeric VNUMs (e.g., QQ00, XX74)
- Same API pattern as existing parsers (load_sections, as_dict, as_json)
- Update README with CircleMUD usage and output format documentation
- Add RotMob and RotAreaFile for Realms of Thera format
- Add SmaugWdAreaFile with SmaugWdMob, SmaugWdItem, SmaugWdRoom, SmaugWdReset
- Fix +- number pattern handling (negative after plus)
- Fix VNum type handling in reader
- Add section handlers: areadata, resetmsg, author, ranges, flags
- Fix Reset/MercReset arg parsing for M, P, G, R commands
- Add convert_all.py batch conversion script
- Add validate_samples.py for testing

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add EnvyAreaFile, EnvyMob, EnvyItem, EnvyRoom classes for Envy format
- Normalize CRLF to LF on file load to handle Windows line endings
- Add read_dice_or_number() for flexible dice/number parsing
- Add read_to_blank_line() for multi-line descriptions without tildes
- Fix Envy object extra descriptions (single-line, no tilde terminator)
- Fix Envy room extra descriptions (use standard tilde terminator)
- Add S marker handling for room termination
- Fix exits serialization (convert OrderedDict to list)
- Update convert_all.py with Envy format detection and parser order

Converts 11 additional Envy files (192 total success, up from 181).
Remaining 6 Envy failures due to format variants or data corruption.
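
For readers unfamiliar with the two input quirks handled here, the sketch below shows one way to normalize line endings and to accept either a dice expression or a plain number. It works on a single token rather than a reader object, and `parse_dice_or_number`, `normalize_newlines`, and the `NdS+B` dice grammar are assumptions for illustration, not the signature of read_dice_or_number() in this PR.

```python
import re

# Assumed dice grammar: COUNT 'd' SIDES, optionally followed by '+' BONUS.
DICE_RE = re.compile(r"^(\d+)[dD](\d+)(?:\+(-?\d+))?$")


def normalize_newlines(text: str) -> str:
    """Convert Windows (CRLF) line endings to LF."""
    return text.replace("\r\n", "\n")


def parse_dice_or_number(token: str):
    """Return (count, sides, bonus) for a dice token, or an int otherwise."""
    match = DICE_RE.match(token)
    if match:
        count, sides, bonus = match.groups()
        return int(count), int(sides), int(bonus or 0)
    return int(token)  # raises ValueError if the token is neither form


print(parse_dice_or_number("3d8+10"))
print(parse_dice_or_number("42"))
```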

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Fix read_until() to detect missing terminators and raise ParseError
  instead of setting index to -1 and looping infinitely
- Add MOBprog (>trigger....|) skipping in SmaugWdMob.read
- Re-enable SMAUG-WD parser in convert_all.py

Now converts 25 SMAUG-WD files (217 total success, 77.5%).
Only AsylumGrounds.are fails due to malformed object missing ~.
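
The read_until() failure mode is easy to reproduce in isolation: `str.find` returns -1 when the terminator is absent, and using that value as the next read position rewinds the reader. The sketch below shows the general shape of such a guard. The function signature and the local `ParseError` class are assumptions for the example and may not match area_reader.py.

```python
class ParseError(Exception):
    """Raised when the input does not match the expected format."""


def read_until(data: str, index: int, terminator: str = "~"):
    """Return (text, new_index) for the span from index up to the terminator."""
    end = data.find(terminator, index)
    if end == -1:
        # str.find signals "not found" with -1. Using that as the new index
        # would move the reader backwards, so fail loudly instead.
        raise ParseError(f"missing {terminator!r} after offset {index}")
    return data[index:end], end + len(terminator)


text, pos = read_until("A dark room~\nExits", 0)
print(repr(text), pos)
```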

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add normalized.py with unified data classes (NormalizedMob, NormalizedItem,
  NormalizedRoom, NormalizedArea) for cross-format compatibility
- Add normalizer.py with format-specific normalizers that convert ROM, Merc,
  SMAUG, Envy, and CircleMUD data to a common schema
- Add --normalized/-n flag to convert_all.py for normalized output
- Add --tolerant/-t flag for partial parsing when full parse fails
- Add SmaugMob.read() method for proper SMAUG mob format parsing
- Fix read_vnum() to properly detect section headers without consuming them
- Extend EXIT_DIRECTIONS and SECTOR_TYPES enums for SMAUG compatibility
- Add room prog (>) handling in Room.read_metadata()

Normalized output provides unified field names and types across all MUD
formats while preserving original data in the 'original' field.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add documentation for all supported formats (ROM, Merc, SMAUG, Envy, ROT, SMAUG-WD, CircleMUD)
- Document normalized output format with unified schema
- Add batch conversion tool (convert_all.py) usage and options
- Document tolerant parsing mode for partial data extraction
- Add normalized vs raw output comparison table
- Add architecture section showing project structure

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>