
On the left: we’ve trained a “black box” credit risk model, and our ProxyML surrogate reveals a surprising result – foreign_worker is one of the most important contributors to a prediction. A closer look at the data shows that nearly all of our training samples are foreign workers.
On the right: we’ve retrained our black box model without the foreign_worker feature. Arguably, these feature contributions make more sense for understanding the “bad” credit risk prediction.
ProxyML uses linear surrogates, which means local explanations are mathematically exact with respect to the surrogate: the sum of feature contributions plus the intercept equals the surrogate's output exactly, with no approximation and no sampling error.
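
To make the additivity property concrete, here is a minimal sketch in plain Python. It does not use ProxyML's API. The coefficients, intercept, feature names (`duration`, `credit_amount`, `age`), and the `black_box` function are all hypothetical, chosen only for illustration. It shows that the contributions plus the intercept reconstruct the linear surrogate's output, and that the gap between the surrogate and the black box is a separate quantity.

```python
import math

# Hypothetical linear surrogate: coefficients and intercept are made up
# for illustration and do not come from ProxyML or any real data set.
INTERCEPT = 0.20
COEFFICIENTS = {"duration": 0.015, "credit_amount": 0.00004, "age": -0.004}


def black_box(x):
    """Stand-in for an opaque model: a nonlinear score in (0, 1)."""
    z = 0.08 * x["duration"] + 0.0002 * x["credit_amount"] - 0.03 * x["age"] - 1.0
    return 1.0 / (1.0 + math.exp(-z))


def surrogate_predict(x):
    """Linear surrogate output: intercept plus weighted feature values."""
    return INTERCEPT + sum(COEFFICIENTS[name] * x[name] for name in COEFFICIENTS)


def contributions(x):
    """Per-feature contribution: coefficient times feature value."""
    return {name: COEFFICIENTS[name] * x[name] for name in COEFFICIENTS}


sample = {"duration": 24, "credit_amount": 5000, "age": 30}
contrib = contributions(sample)
reconstructed = INTERCEPT + sum(contrib.values())

# Additivity holds against the surrogate's own output (up to float rounding).
assert math.isclose(reconstructed, surrogate_predict(sample))

# The gap to the black box is a different quantity: the surrogate's fidelity error.
fidelity_gap = abs(black_box(sample) - surrogate_predict(sample))
print(contrib, reconstructed, fidelity_gap)
```

The assertion passes by construction for any linear surrogate, which is the sense in which the explanation is exact. The `fidelity_gap` value is what tells you how far the surrogate's explanation can be trusted as a description of the black box itself.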