What are Integrated Gradients?

Integrated Gradients

Integrated gradients are a technique for attributing the predictions of a classification model to input features. It can be used to visualize the relationship between input features and model predictions. It is a local method that helps account for each individual prediction. For example, in the Fashion MNIST dataset, if we take the image of a shoe, then the positive attributions are the pixels of the image which make a positive influence on the model classifying the image as a shoe. The integrated gradient method is mainly used to identify errors in the model, where corrections can be made to improve the accuracy of the model.


Sign up for our monthly newsletter, The Drift.