The Non-Separability Constraint: A unifying framework for understanding and detecting AI alignment failures
optimization coupling risk-management ai-alignment system-health-check goodhart-s-law red-teaming-tools reward-hacking ai-safety-research instrumental-convergence mesa-optimization multi-agent-miscoordination seperability-assumption
Updated Feb 9, 2026