Projection-based risk surfaces for advanced AI systems: stability classes and failure modes.
Safety boundaries and risk assessments for advanced AI systems are typically defined in a specific analytical frame — a particular set of metrics, test conditions, and evaluation criteria. The structural problem is that these boundaries are not stable under projection: when the system is observed or operates in a different frame (different context, scale, or deployment condition), the safety surfaces shift in ways that invalidate the original assessment.
This is not a testing coverage problem. It is a structural property of systems with emergent behavior: the risk surface itself changes shape depending on the projection through which it is observed. A system that appears safe under one set of evaluation criteria may exhibit entirely different risk characteristics when deployed in a context that projects onto a different region of the stability space.
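The projection effect can be made concrete with a toy model. The code below is purely illustrative and is not the framework's actual diagnostic: the risk function, context ranges, and threshold are all invented assumptions. It shows how a capability limit certified as "safe" under a narrow evaluation frame contracts when the deployment frame projects onto a wider context range.

```python
# Toy illustration (assumed risk model, not the SORT framework itself):
# a scalar "risk" over a two-dimensional space of capability x and
# deployment context y. An evaluation frame observes the system through a
# projection -- here, a fixed set of contexts -- and certifies a capability
# level as safe if worst-case projected risk stays below a threshold.

def risk(x: float, y: float) -> float:
    """Hypothetical risk surface: risk grows with capability x and is
    amplified nonlinearly by context y (an assumed emergent interaction)."""
    return 0.1 * x + 0.05 * x * y * y

def certified_capability_limit(contexts: list[float], threshold: float = 1.0) -> float:
    """Largest capability x (scanned on a grid) whose worst-case risk
    over the observed contexts stays below the threshold."""
    limit = 0.0
    for i in range(1001):
        x = i / 100
        if max(risk(x, y) for y in contexts) < threshold:
            limit = x
    return limit

# Frame A: narrow evaluation contexts (e.g. lab test conditions).
lab_limit = certified_capability_limit([0.0, 0.5, 1.0])
# Frame B: deployment projects onto a wider context range.
field_limit = certified_capability_limit([0.0, 1.0, 2.0, 3.0])

# The safety boundary established in frame A does not survive the
# projection into frame B: the certified limit contracts.
print(lab_limit, field_limit)
```

The same system, under the same threshold, yields two different "safe" capability boundaries; neither frame is wrong, but neither assessment transfers to the other, which is the structural point.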
This application addresses advanced AI systems where capability, safety, and risk are interrelated in non-linear ways. The relevant system boundary includes model behavior, deployment environments, evaluation frameworks, and the interaction between capability development and safety constraints.
The structural challenge is particularly acute for systems approaching or crossing capability thresholds where emergent properties alter the risk landscape. Safety assessments performed at one capability level may not transfer to the next, creating a moving target that static risk frameworks cannot capture.
This application provides structural analysis of safety and risk surfaces across projection frames, identifying conditions under which safety boundaries shift or collapse. The diagnostic output maps stability classes — regions of the capability-safety space where risk properties are structurally stable — and identifies failure modes associated with transitions between classes.
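A minimal sketch of what a stability-class map might look like, again with an invented risk model and invented class labels (nothing here is the actual diagnostic output): partition a capability grid into regions where the qualitative risk regime is stable, and report the boundaries between regions as candidate failure modes.

```python
# Toy sketch (assumed model): partition capability levels into stability
# classes -- regions where risk properties are qualitatively stable --
# and flag the transitions between classes.

def risk(x: float, y: float) -> float:
    """Hypothetical risk surface with an emergent term that activates
    past an assumed capability threshold x = 3."""
    return 0.1 * x + 0.02 * max(0.0, x - 3.0) ** 2 * y

def classify(x: float, y: float, threshold: float = 1.0) -> str:
    """Assign a (hypothetical) stability class at capability x, context y."""
    if risk(x, y) >= threshold:
        return "unsafe"
    return "emergent" if x > 3.0 else "stable-linear"

# Scan a capability grid at a fixed reference context and record where
# the class changes: these transitions are the candidate failure modes.
grid = [i / 10 for i in range(101)]
classes = [classify(x, 2.0) for x in grid]
transitions = [(round(grid[i], 1), classes[i - 1], classes[i])
               for i in range(1, len(grid)) if classes[i] != classes[i - 1]]
print(transitions)
```

Within each class, a safety assessment is structurally stable; the transitions are exactly where an assessment performed on one side of the boundary stops being informative about the other.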
As AI systems grow in capability, the structural relationship between capability and safety becomes the determining factor for responsible deployment. Static safety assessments that do not account for projection effects provide false assurance. This application enables structurally grounded safety analysis that remains valid across deployment conditions and capability levels.
The SORT framework addresses this application through four structural dimensions, each providing a distinct analytical layer:

- Safety boundaries are not stable under projection.
- Emergent effects shift risk surfaces.
- Structural projection of safety surfaces and stability classes.
- Safety strategy, risk assessment, failure mode analysis.