Kinvert commented Jan 13, 2026

My first env, built with Claude: WW2 dogfighting. Planned work:

  • Improve policy results
  • 3D planes
  • Human controllable
  • Ballistics
  • Enemy maneuvers
  • Enemy policy
  • Improved physics

REWARD SYSTEM: Simplified from 9+ terms to 6
- ADDED: aim_scale (continuous aiming reward based on aim quality; see the sketch after this list)
- KEPT: closing_scale, neg_g, stall, rudder, speed_min
- REMOVED: tail_scale, tracking, firing_solution, roll, aileron,
  bias, approach, level (these either trapped the policy in "don't maneuver" behavior or were redundant)
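A minimal sketch of what a continuous aim term could look like, assuming a dot-product formulation; the `Plane` struct, its field names, and the quadratic falloff are illustrative assumptions, not the PR's actual code:

```c
#include <math.h>

// Hypothetical plane state; field names are assumptions for illustration.
typedef struct {
    float pos[3];  // world position
    float fwd[3];  // unit forward (nose) vector in world frame
} Plane;

// Continuous aim reward: maximal when the nose points exactly at the
// target, falling off smoothly as aim error grows. A sketch, not the PR's code.
static float aim_reward(const Plane* self, const Plane* target, float aim_scale) {
    float to_tgt[3] = {
        target->pos[0] - self->pos[0],
        target->pos[1] - self->pos[1],
        target->pos[2] - self->pos[2],
    };
    float dist = sqrtf(to_tgt[0]*to_tgt[0] + to_tgt[1]*to_tgt[1] + to_tgt[2]*to_tgt[2]);
    if (dist < 1e-6f) return 0.0f;
    float inv_dist = 1.0f / dist;  // one division, reused as multiplications
    // Cosine of the angle between the nose and the line of sight.
    float aim_cos = (self->fwd[0]*to_tgt[0] + self->fwd[1]*to_tgt[1]
                   + self->fwd[2]*to_tgt[2]) * inv_dist;
    if (aim_cos <= 0.0f) return 0.0f;      // target behind the nose: no reward
    return aim_scale * aim_cos * aim_cos;  // square to sharpen falloff near perfect aim
}
```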

OBSERVATION SCHEME 1: Replaced OBS_CONTROL_ERROR with OBS_PURSUIT
- 13 obs instead of 17 (removed the spoon-fed control errors)
- Added energy state (normalized potential + kinetic energy)
- Body-frame target azimuth/elevation instead of control errors
- Target pitch/roll/aspect for energy-aware pursuit decisions (a sketch of the azimuth/elevation and energy observations follows below)
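A rough sketch of how the body-frame azimuth/elevation and energy observations might be computed. The body-basis representation, the normalization constants, and all names here are assumptions for illustration:

```c
#include <math.h>

#define PI_F      3.14159265358979f
#define GRAVITY   9.81f
#define MAX_ALT   5000.0f  // assumed normalization ceiling (illustrative)
#define MAX_SPEED 200.0f   // assumed max airspeed in m/s (illustrative)

// Rotate a world-frame vector into the body frame, given the plane's
// orthonormal basis (right, up, forward). Basis fields are assumptions.
static void world_to_body(const float right[3], const float up[3],
                          const float fwd[3], const float v[3], float out[3]) {
    out[0] = right[0]*v[0] + right[1]*v[1] + right[2]*v[2];
    out[1] = up[0]*v[0]    + up[1]*v[1]    + up[2]*v[2];
    out[2] = fwd[0]*v[0]   + fwd[1]*v[1]   + fwd[2]*v[2];
}

// Body-frame azimuth/elevation of the target, each normalized to [-1, 1].
// rel_body is the target offset already rotated by world_to_body().
static void target_az_el(const float rel_body[3], float* az, float* el) {
    *az = atan2f(rel_body[0], rel_body[2]) * (1.0f / PI_F);
    float dist = sqrtf(rel_body[0]*rel_body[0] + rel_body[1]*rel_body[1]
                     + rel_body[2]*rel_body[2]);
    *el = (dist > 1e-6f) ? asinf(rel_body[1] / dist) * (2.0f / PI_F) : 0.0f;
}

// Specific energy (potential + kinetic), normalized to roughly [0, 1].
static float energy_obs(float altitude, float speed) {
    float e     = GRAVITY * altitude + 0.5f * speed * speed;
    float e_max = GRAVITY * MAX_ALT  + 0.5f * MAX_SPEED * MAX_SPEED;
    return e / e_max;
}
```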

CURRICULUM: Performance-based advancement instead of episode counts
- REMOVED: episodes_per_stage
- ADDED: advance_threshold, demote_threshold, eval_window (see the sketch below)
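A sketch of how these parameters could drive stage transitions. The parameter names come from this PR; the struct layout, the win-rate metric, and the window reset are illustrative assumptions:

```c
// Performance-based curriculum: advance when the success rate over the last
// eval_window episodes clears advance_threshold, demote when it drops below
// demote_threshold. A sketch under assumed semantics, not the PR's code.
typedef struct {
    int   stage;
    int   num_stages;
    float advance_threshold;  // e.g. 0.8
    float demote_threshold;   // e.g. 0.2
    int   eval_window;        // e.g. 100 episodes
    int   wins;               // wins within the current window
    int   episodes;           // episodes seen in the current window
} Curriculum;

static void curriculum_update(Curriculum* c, int episode_won) {
    c->wins     += episode_won;
    c->episodes += 1;
    if (c->episodes < c->eval_window) return;  // keep collecting

    float win_rate = (float)c->wins / (float)c->episodes;
    if (win_rate >= c->advance_threshold && c->stage < c->num_stages - 1) {
        c->stage += 1;  // policy mastered this stage: make it harder
    } else if (win_rate <= c->demote_threshold && c->stage > 0) {
        c->stage -= 1;  // policy is struggling: back off one stage
    }
    c->wins = 0;  // reset the evaluation window
    c->episodes = 0;
}
```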

PERFORMANCE: Replaced repeated divisions with multiplications by precomputed reciprocals (example below)
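An example of this pattern: hoist the reciprocal out of the per-agent loop so the hot path multiplies instead of divides. The function and variable names are illustrative, not the PR's actual code:

```c
// Normalize agent positions for observations. One division outside the
// loop replaces three divisions per agent inside it.
static void normalize_positions(float* obs, const float* pos,
                                int num_agents, float world_size) {
    float inv_world = 1.0f / world_size;  // one division, done once
    for (int i = 0; i < num_agents; i++) {
        obs[3*i + 0] = pos[3*i + 0] * inv_world;  // was: pos / world_size
        obs[3*i + 1] = pos[3*i + 1] * inv_world;
        obs[3*i + 2] = pos[3*i + 2] * inv_world;
    }
}
```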
CLEANUP: Removed dead code from the env struct and the reward accumulators