Deep Neural Network Interpretability and Visualization

A Venn diagram of the relationship between visualization and interpreting deep learning:


Deep learning interpretability is a sub-area of deep learning research. It focuses on understanding how deep models work and why they succeed or fail, so that we can trust them in real applications, defend them against adversarial attacks, and design more capable models. Visualization is one way to study interpretability, for example by examining a network's activation patterns and learned features; such studies correspond to region 1 in the Venn diagram. Beyond visualization, other approaches, such as local interpretable approximations and hypothesis functions, are also effective for producing explanations. Conversely, visualization is not restricted to interpretation: it can also expose a network's graph structure or peek into the training process to facilitate debugging. These works correspond to region 2 in the diagram.
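To make "studying a network's activation patterns" concrete, here is a minimal sketch in NumPy: a tiny hand-built two-layer network whose hidden ReLU activations are recorded and inspected. The weights, sizes, and the `forward` helper are all illustrative assumptions, not part of the original post; real interpretability work would apply the same idea to a trained model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical weights for a toy 4 -> 3 -> 2 network (illustration only).
W1 = rng.normal(size=(4, 3))
W2 = rng.normal(size=(3, 2))

def forward(x):
    """Return the network output and the hidden-layer activations."""
    hidden = np.maximum(0.0, x @ W1)  # ReLU hidden activations
    return hidden @ W2, hidden

x = rng.normal(size=(5, 4))           # a small batch of 5 inputs
_, hidden = forward(x)

# The activation pattern: which hidden units fire (are nonzero) per input.
pattern = hidden > 0
print(pattern.astype(int))
```

Comparing such patterns across inputs (e.g., which units fire for which class) is one of the simplest visualization-based interpretability probes; libraries like Matplotlib can then render `hidden` as a heatmap.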

It is important to form a clear picture of the relationships between these areas so that one can steer a research direction more effectively. For example, for a visualization researcher, interpretability is not the only lens for studying deep neural networks: network structure, gradient flow, and debugging are all valuable avenues to pursue.

Written on February 4, 2018