Structural analysis of information loss in mapping internal computation to external explanation, identifying interpretability limits.
Interpretability methods attempt to map a model's internal computations onto human-understandable explanations. The structural problem is that this mapping is a projection — a dimensionality reduction from the high-dimensional internal representation space to the lower-dimensional explanation space — and projections inherently lose information. The question is not whether information is lost, but what information is lost, whether the loss is acceptable for the intended purpose, and whether the explanation preserves the structurally important properties of the internal computation.
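To make the projection framing concrete, here is a minimal sketch that treats a linear explanation method as a rank-k projection of internal activations and measures how much representational variance falls outside the explanation space. It assumes synthetic Gaussian activations and a PCA-style basis; the dimensions and variable names are illustrative, not taken from any particular interpretability tool.

```python
# Minimal sketch: explanation as a lossy linear projection.
# All dimensions and names are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
d_internal, d_explain, n = 512, 16, 1000   # internal dim >> explanation dim

# Simulated internal representations (one row per example).
H = rng.normal(size=(n, d_internal))
Hc = H - H.mean(axis=0)

# Fit a rank-d_explain basis via SVD (equivalent to PCA on centered data).
_, _, Vt = np.linalg.svd(Hc, full_matrices=False)
P = Vt[:d_explain]                          # explanation basis, shape (16, 512)

E = Hc @ P.T                                # human-facing coordinates
H_hat = E @ P                               # best rank-16 reconstruction

# Information loss: the variance the explanation space cannot express.
loss = ((Hc - H_hat) ** 2).sum() / (Hc ** 2).sum()
print(f"fraction of variance lost in projection: {loss:.1%}")
```

The aggregate number is not the point; the structural question is whether the discarded directions carry behaviorally relevant information, which a variance total alone cannot answer.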
Current interpretability approaches often assume that explanation fidelity can be improved incrementally. The structural perspective reveals that certain aspects of internal computation may be fundamentally non-projectable — they exist in dimensions of the internal space that have no counterpart in the explanation space, creating irreducible interpretability limits.
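The non-projectability claim can be shown directly in a deliberately contrived linear setup: if a behavioral direction is orthogonal to the explanation basis, two internal states receive identical explanations while the model treats them differently. Everything below (the basis, the readout, the magnitudes) is a hypothetical construction for illustration.

```python
# Sketch: a behavioral direction invisible to the explanation space.
# Hypothetical construction; not any specific model or method.
import numpy as np

rng = np.random.default_rng(1)
d_internal, d_explain = 512, 16

# An arbitrary orthonormal explanation basis, shape (16, 512).
Q, _ = np.linalg.qr(rng.normal(size=(d_internal, d_explain)))
P = Q.T

# A model readout direction chosen orthogonal to span(P).
w = rng.normal(size=d_internal)
w -= P.T @ (P @ w)                 # remove any component inside span(P)
w /= np.linalg.norm(w)

def model(x):
    return float(x @ w)            # behavior depends only on the blind direction

h = rng.normal(size=d_internal)
h_pert = h + 3.0 * w               # perturb along the blind direction only

print(np.allclose(P @ h, P @ h_pert))   # True: identical explanations
print(model(h), model(h_pert))          # outputs differ by 3.0
```

No refinement of the explanation within span(P) recovers this difference; closing the gap requires changing the explanation space itself, which is the sense in which the limit is structural rather than incremental.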
This application operates at the interface between model internals and human understanding, addressing interpretability methods, explanation systems, and audit requirements. The relevant system boundary includes the model's internal representation space, the explanation or interpretability method, the target explanation space, and the decisions that depend on explanation accuracy.
As AI regulation increasingly requires model interpretability, understanding the structural limits of explanation becomes essential. Organizations need to know what can and cannot be explained, and to design governance frameworks that account for these structural limits rather than assuming that all internal behavior is interpretable.
The SORT framework addresses this application through four structural dimensions, each providing a distinct analytical layer:

- Explanations do not fully reflect the internal computations they describe.
- Structural information is lost in the projection from internal representation to external explanation.
- Diagnostics of the representation-projection gap (a minimal sketch follows this list).
- Interpretability strategy, explanation design, and audit requirements.
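As a concrete instance of the diagnostics dimension, the sketch below computes a per-example residual: the fraction of each internal activation's norm that falls outside the span of a linear explanation basis. The function name, threshold, and data are illustrative assumptions rather than an established metric.

```python
# Sketch of a representation-projection gap diagnostic.
# Names, threshold, and data are illustrative assumptions.
import numpy as np

def projection_gap(H, P):
    """Fraction of each row's norm that lies outside span(P)."""
    H_hat = (H @ P.T) @ P          # what the explanation basis can account for
    resid = np.linalg.norm(H - H_hat, axis=1)
    return resid / np.linalg.norm(H, axis=1)

rng = np.random.default_rng(2)
Q, _ = np.linalg.qr(rng.normal(size=(512, 16)))
P = Q.T                            # explanation basis, shape (16, 512)
H = rng.normal(size=(100, 512))    # internal activations, one row per input

gap = projection_gap(H, P)
flagged = int((gap > 0.9).sum())   # inputs whose state is mostly unexplained
print(f"mean gap: {gap.mean():.2f}; flagged {flagged} of {len(gap)} inputs")
```

A diagnostic like this measures how large the unexplained component is, not what it does; for audit purposes, flagged inputs mark decisions that fall outside the warranted scope of the explanation.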