Kinvert commented Jan 13, 2026

My first env, built with Claude: WW2 dogfighting. Planned work:

  • Improve policy results
  • 3D planes
  • Human controllable
  • Ballistics
  • Enemy maneuvers
  • Enemy policy
  • Improved physics

REWARD SYSTEM: Simplified from 9+ terms to 6
- ADDED: aim_scale (continuous aiming reward based on aim quality; see the sketch after this list)
- KEPT: closing_scale, neg_g, stall, rudder, speed_min
- REMOVED: tail_scale, tracking, firing_solution, roll, aileron,
  bias, approach, level (these either trapped the policy in "don't maneuver" behavior or were redundant)
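A minimal sketch of what a continuous aim term could look like, assuming a dot-product formulation; the `Plane` struct, its field names, and the quadratic falloff are illustrative assumptions, not the PR's actual code:

```c
#include <math.h>

// Hypothetical plane state; field names are assumptions for illustration.
typedef struct {
    float pos[3];  // world position
    float fwd[3];  // unit forward (nose) vector in world frame
} Plane;

// Continuous aim reward: maximal when the nose points exactly at the
// target, falling off smoothly as aim error grows. A sketch, not the PR's code.
static float aim_reward(const Plane* self, const Plane* target, float aim_scale) {
    float to_tgt[3] = {
        target->pos[0] - self->pos[0],
        target->pos[1] - self->pos[1],
        target->pos[2] - self->pos[2],
    };
    float dist = sqrtf(to_tgt[0]*to_tgt[0] + to_tgt[1]*to_tgt[1] + to_tgt[2]*to_tgt[2]);
    if (dist < 1e-6f) return 0.0f;
    float inv_dist = 1.0f / dist;  // one division, reused as multiplications
    // Cosine of the angle between the nose and the line of sight.
    float aim_cos = (self->fwd[0]*to_tgt[0] + self->fwd[1]*to_tgt[1]
                   + self->fwd[2]*to_tgt[2]) * inv_dist;
    if (aim_cos <= 0.0f) return 0.0f;      // target behind the nose: no reward
    return aim_scale * aim_cos * aim_cos;  // square to sharpen falloff near perfect aim
}
```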

OBSERVATION SCHEME 1: Replaced OBS_CONTROL_ERROR with OBS_PURSUIT
- 13 obs instead of 17 (removed the spoon-fed control errors)
- Added energy state (normalized potential + kinetic energy)
- Body-frame target azimuth/elevation instead of control errors
- Target pitch/roll/aspect for energy-aware pursuit decisions (a sketch of the azimuth/elevation and energy observations follows below)
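A rough sketch of how the body-frame azimuth/elevation and energy observations might be computed. The body-basis representation, the normalization constants, and all names here are assumptions for illustration:

```c
#include <math.h>

#define PI_F      3.14159265358979f
#define GRAVITY   9.81f
#define MAX_ALT   5000.0f  // assumed normalization ceiling (illustrative)
#define MAX_SPEED 200.0f   // assumed max airspeed in m/s (illustrative)

// Rotate a world-frame vector into the body frame, given the plane's
// orthonormal basis (right, up, forward). Basis fields are assumptions.
static void world_to_body(const float right[3], const float up[3],
                          const float fwd[3], const float v[3], float out[3]) {
    out[0] = right[0]*v[0] + right[1]*v[1] + right[2]*v[2];
    out[1] = up[0]*v[0]    + up[1]*v[1]    + up[2]*v[2];
    out[2] = fwd[0]*v[0]   + fwd[1]*v[1]   + fwd[2]*v[2];
}

// Body-frame azimuth/elevation of the target, each normalized to [-1, 1].
// rel_body is the target offset already rotated by world_to_body().
static void target_az_el(const float rel_body[3], float* az, float* el) {
    *az = atan2f(rel_body[0], rel_body[2]) * (1.0f / PI_F);
    float dist = sqrtf(rel_body[0]*rel_body[0] + rel_body[1]*rel_body[1]
                     + rel_body[2]*rel_body[2]);
    *el = (dist > 1e-6f) ? asinf(rel_body[1] / dist) * (2.0f / PI_F) : 0.0f;
}

// Specific energy (potential + kinetic), normalized to roughly [0, 1].
static float energy_obs(float altitude, float speed) {
    float e     = GRAVITY * altitude + 0.5f * speed * speed;
    float e_max = GRAVITY * MAX_ALT  + 0.5f * MAX_SPEED * MAX_SPEED;
    return e / e_max;
}
```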

CURRICULUM: Performance-based advancement instead of episode counts
- REMOVED: episodes_per_stage
- ADDED: advance_threshold, demote_threshold, eval_window (see the sketch below)
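A sketch of how these parameters could drive stage transitions. The parameter names come from this PR; the struct layout, the win-rate metric, and the window reset are illustrative assumptions:

```c
// Performance-based curriculum: advance when the success rate over the last
// eval_window episodes clears advance_threshold, demote when it drops below
// demote_threshold. A sketch under assumed semantics, not the PR's code.
typedef struct {
    int   stage;
    int   num_stages;
    float advance_threshold;  // e.g. 0.8
    float demote_threshold;   // e.g. 0.2
    int   eval_window;        // e.g. 100 episodes
    int   wins;               // wins within the current window
    int   episodes;           // episodes seen in the current window
} Curriculum;

static void curriculum_update(Curriculum* c, int episode_won) {
    c->wins     += episode_won;
    c->episodes += 1;
    if (c->episodes < c->eval_window) return;  // keep collecting

    float win_rate = (float)c->wins / (float)c->episodes;
    if (win_rate >= c->advance_threshold && c->stage < c->num_stages - 1) {
        c->stage += 1;  // policy mastered this stage: make it harder
    } else if (win_rate <= c->demote_threshold && c->stage > 0) {
        c->stage -= 1;  // policy is struggling: back off one stage
    }
    c->wins = 0;  // reset the evaluation window
    c->episodes = 0;
}
```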

PERFORMANCE: Replaced repeated divisions with multiplications by precomputed reciprocals (example below)
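An example of this pattern: hoist the reciprocal out of the per-agent loop so the hot path multiplies instead of divides. The function and variable names are illustrative, not the PR's actual code:

```c
// Normalize agent positions for observations. One division outside the
// loop replaces three divisions per agent inside it.
static void normalize_positions(float* obs, const float* pos,
                                int num_agents, float world_size) {
    float inv_world = 1.0f / world_size;  // one division, done once
    for (int i = 0; i < num_agents; i++) {
        obs[3*i + 0] = pos[3*i + 0] * inv_world;  // was: pos / world_size
        obs[3*i + 1] = pos[3*i + 1] * inv_world;
        obs[3*i + 2] = pos[3*i + 2] * inv_world;
    }
}
```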
CLEANUP: Removed dead code from the env struct and the reward accumulators