This kind of analysis does not seem to be used as much in the era of large models, so this section is only a brief overview.
It mainly covers the CAM and Grad-CAM visualization methods, plus the concept of the saliency map.
Learning Deep Features for Discriminative Localization
For a given image, let $f_k(x, y)$ denote the activation of unit $k$ in the last convolutional layer at spatial location $(x, y)$. For unit $k$, the result of global average pooling is $F_k = \sum_{x,y} f_k(x, y)$. Thus, for a given class $c$, the input to the softmax is $S_c = \sum_k w_k^c F_k$, where $w_k^c$ is the weight of unit $k$ for class $c$; essentially, $w_k^c$ indicates the importance of $F_k$ for class $c$.
The softmax output for class $c$ is then $P_c = \exp(S_c) / \sum_c \exp(S_c)$.
Plugging $F_k = \sum_{x,y} f_k(x, y)$ into the class score $S_c$ gives
$$S_c = \sum_k w_k^c \sum_{x,y} f_k(x, y) = \sum_{x,y} \sum_k w_k^c f_k(x, y).$$
Define $M_c$ as the class activation map for class $c$, with each spatial element $M_c(x, y) = \sum_k w_k^c f_k(x, y)$, so that $S_c = \sum_{x,y} M_c(x, y)$. Hence $M_c(x, y)$ directly indicates how important the activation at spatial grid $(x, y)$ is for classifying the image into class $c$.
So the class activation map here is derived from the convolutional feature maps, and the weights come from the classifier that sits on top of global average pooling.
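As a sketch of how $M_c$ can be computed in PyTorch, assuming a hypothetical model split into a `features` conv backbone followed by GAP and a single `fc` layer (these attribute names are illustrative, not from the paper):

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def compute_cam(model, x, target_class):
    # f_k(x, y): activations of the last conv layer, shape (1, K, H, W)
    feature_maps = model.features(x)           # assumed conv backbone
    # w_k^c: the classifier weights for class c, shape (K,)
    weights = model.fc.weight[target_class]    # assumed GAP -> single fc head
    # M_c(x, y) = sum_k w_k^c * f_k(x, y)
    cam = torch.einsum('k,khw->hw', weights, feature_maps[0])
    # upsample to the input resolution and rescale to [0, 1] for display
    cam = F.interpolate(cam[None, None], size=x.shape[-2:],
                        mode='bilinear', align_corners=False)[0, 0]
    return (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
```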
Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization
CAM places requirements on the network architecture: it cannot be used if the model does not adopt GAP, and it can only visualize the last convolutional layer.
The paper proposes a technique for producing "visual explanations" for decisions from a large class of Convolutional Neural Network (CNN)-based models, making them more transparent and interpretable.
Their approach, Gradient-weighted Class Activation Mapping (Grad-CAM), uses the gradients of any target concept ('dog' in a classification network, or a sequence of words in a captioning network) flowing into the final convolutional layer to produce a coarse localization map highlighting the important regions in the image for predicting the concept.
Unlike previous approaches, Grad-CAM is applicable to a wide variety of CNN model families: (1) CNNs with fully connected layers (e.g. VGG), (2) CNNs used for structured outputs (e.g. captioning), (3) CNNs used in tasks with multimodal inputs (e.g. visual question answering) or reinforcement learning, all without architectural changes or retraining.
Here $A^k$ still denotes the feature maps; the difference from CAM is that the weights are no longer learned parameters but are derived from gradients.
To obtain the class-discriminative localization map $L^c_{\text{Grad-CAM}} \in \mathbb{R}^{u \times v}$ of width $u$ and height $v$ for any class $c$, first compute the gradient of the score $y^c$ for class $c$ (before the softmax) with respect to the feature map activations $A^k$ of a convolutional layer, i.e. $\frac{\partial y^c}{\partial A^k}$. These gradients are global-average-pooled to obtain the neuron importance weights, and the weighted combination of feature maps is passed through a ReLU:
$$\alpha_k^c = \frac{1}{Z}\sum_i\sum_j \frac{\partial y^c}{\partial A^k_{ij}}, \qquad L^c_{\text{Grad-CAM}} = \mathrm{ReLU}\Big(\sum_k \alpha_k^c A^k\Big)$$
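A minimal PyTorch sketch of these two steps, assuming a hypothetical `features`/`classifier` split so the feature maps stay in the autograd graph (the split and the head are assumptions for illustration):

```python
import torch
import torch.nn.functional as F

def grad_cam(model, x, target_class):
    model.eval()
    x = x.cuda()
    # A^k: feature maps of the chosen conv layer, kept in the graph
    feature_maps = model.features(x)            # (1, K, H, W), assumed split
    feature_maps.retain_grad()
    scores = model.classifier(feature_maps)     # assumed head; y^c before softmax
    scores[0, target_class].backward()
    # alpha_k^c = (1/Z) * sum_ij dy^c / dA^k_ij  (GAP over the gradients)
    alphas = feature_maps.grad[0].mean(dim=(1, 2))   # (K,)
    # L^c = ReLU(sum_k alpha_k^c * A^k)
    cam = F.relu(torch.einsum('k,khw->hw', alphas, feature_maps[0]))
    cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
    return cam.detach()
```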
Grad-CAM++: Generalized Gradient-based Visual Explanations for Deep Convolutional Networks
The work in this paper is mainly inspired by two algorithms, CAM and Grad-CAM, both of which are widely used today.
Both CAM and Grad-CAM rest on the basic assumption that the final score $Y^c$ for a particular class $c$ can be written as a linear combination of the global-average-pooled feature maps $A^k$ of the last convolutional layer: $Y^c = \sum_k w_k^c \sum_i \sum_j A^k_{ij}$.
Building on the gradient-based visualization techniques of Grad-CAM and CAM, the paper proposes a generalized method called Grad-CAM++, formulated by explicitly modeling the contribution of each pixel in the CNN feature maps to the final output.
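Concretely, the Grad-CAM++ paper derives the class-specific weights from pixel-wise coefficients $\alpha$ applied to the positive partial derivatives:
$$w_k^c = \sum_i\sum_j \alpha_{ij}^{kc}\,\mathrm{ReLU}\!\left(\frac{\partial Y^c}{\partial A^k_{ij}}\right), \qquad \alpha_{ij}^{kc} = \frac{\frac{\partial^2 Y^c}{(\partial A^k_{ij})^2}}{2\frac{\partial^2 Y^c}{(\partial A^k_{ij})^2} + \sum_a\sum_b A^k_{ab}\frac{\partial^3 Y^c}{(\partial A^k_{ij})^3}}$$
so each pixel gets its own weight $\alpha_{ij}^{kc}$ instead of the uniform $1/Z$ of plain global average pooling.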
Smooth Grad-CAM++: An Enhanced Inference Level Visualization Technique for Deep Convolutional Neural Network Models
Noise is added to the input.
Because noise is added, the parts of the feature map that are robust to the noise stand out, and these parts are considered more important. The method can also be used to compare how strongly different image locations activate a given neuron, which makes a "choosing neurons" operation possible; see the sketch below.
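A sketch of the smoothing idea, reusing the `grad_cam` sketch above and averaging maps over noisy copies of the input. Note this is a simplification: the noise scale and sample count are assumptions, and the actual paper smooths the (higher-order) derivatives inside the Grad-CAM++ weights rather than the final maps.

```python
import torch

def smooth_map(x, model, target_class, cam_fn, n=25, sigma=0.1):
    # Average saliency/CAM maps over noisy copies of the input;
    # regions that survive the averaging are robust to noise.
    maps = []
    for _ in range(n):
        x_noisy = x + torch.randn_like(x) * sigma
        maps.append(cam_fn(model, x_noisy, target_class))
    return torch.stack(maps).mean(dim=0)
```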
Saliency map
Visualizing Neural Networks using Saliency Maps in PyTorch | by Aditya Rastogi | DataDrivenInvestor
The gradient of the loss with respect to the input image represents saliency: it shows how much each input pixel influences the prediction.
```python
import torch
import matplotlib.pyplot as plt

def normalize(image):
    return (image - image.min()) / (image.max() - image.min())
    # return torch.log(image) / torch.log(image.max())

def compute_saliency_maps(x, y, model):
    model.eval()
    x = x.cuda()
    # we want the gradient of the input x
    x.requires_grad_()
    y_pred = model(x)
    loss_func = torch.nn.CrossEntropyLoss()
    loss = loss_func(y_pred, y.cuda())
    loss.backward()
    # saliencies = x.grad.abs().detach().cpu()
    # take the maximum gradient magnitude over the channel dimension
    saliencies, _ = torch.max(x.grad.abs().detach().cpu(), dim=1)
    # We need to normalize each image, because their gradients might vary in scale
    saliencies = torch.stack([normalize(item) for item in saliencies])
    return saliencies

# images, labels = train_set.getbatch(img_indices)
saliencies = compute_saliency_maps(images, labels, model)

# visualize
fig, axs = plt.subplots(2, len(img_indices), figsize=(15, 8))
for row, target in enumerate([images, saliencies]):
    for column, img in enumerate(target):
        if row == 0:
            axs[row][column].imshow(img.permute(1, 2, 0).numpy())
            # What is permute?
            # In pytorch, the dimensions of an image tensor mean (channels, height, width)
            # In matplotlib, they mean (height, width, channels)
            # permute rearranges the dimensions of a tensor
            # For example, img.permute(1, 2, 0) means:
            # - dimension 0 of the result is dimension 1 of the original (height)
            # - dimension 1 of the result is dimension 2 of the original (width)
            # - dimension 2 of the result is dimension 0 of the original (channels)
        else:
            axs[row][column].imshow(img.numpy(), cmap=plt.cm.hot)
plt.show()
plt.close()
```
Smooth Grad
https://arxiv.org/pdf/1706.03825.pdf
The Smooth Grad method randomly adds noise to the image, obtains a different heatmap for each noisy copy, and averages these heatmaps to get a result that is more resistant to noisy gradients.
In short: run several iterations; in each one, add random noise to the input and forward it through the model (note that the noisy input must also carry gradients), then compute the loss and accumulate the input gradient; finally, normalize the accumulated gradient.
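A minimal sketch of this loop, reusing the `normalize` helper from the saliency-map code above; the noise scale `sigma` and the iteration count are illustrative assumptions:

```python
import torch

def smooth_grad(x, y, model, epoch=50, sigma=0.4):
    model.eval()
    smoothed = torch.zeros_like(x)
    for _ in range(epoch):
        # add Gaussian noise to the input; the noisy copy must require grad
        x_noisy = (x + torch.randn_like(x) * sigma).cuda()
        x_noisy.requires_grad_()
        y_pred = model(x_noisy)
        loss = torch.nn.functional.cross_entropy(y_pred, y.cuda())
        loss.backward()
        # accumulate the input gradient across noisy copies
        smoothed += x_noisy.grad.abs().detach().cpu()
    # average over iterations, collapse channels, normalize per image
    smoothed /= epoch
    saliencies, _ = torch.max(smoothed, dim=1)
    return torch.stack([normalize(item) for item in saliencies])
```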
Filter Explanation
Deep Dream: Visualizing the features learnt by Convolutional Networks in PyTorch (reurl.cc)
Visualize by using a hook to capture the output of a convolutional layer.
```python
import torch
from torch.optim import Adam

def normalize(image):
    return (image - image.min()) / (image.max() - image.min())

layer_activations = None

def filter_explanation(x, model, cnnid, filterid, iteration=100, lr=1):
    # x: input image
    # cnnid: cnn layer id
    # filterid: which filter
    model.eval()

    def hook(model, input, output):
        global layer_activations
        layer_activations = output

    hook_handle = model.cnn[cnnid].register_forward_hook(hook)
    # When the model forwards through layer[cnnid], it calls the hook function first
    # The hook function saves the output of layer[cnnid]
    # After forwarding, we'll have the loss and the layer activation

    # Filter activation: x passing the filter will generate the activation map
    model(x.cuda())  # forward
    # Based on the filterid given as the function argument, pick up the specific filter's activation map
    # We just need to plot it, so we can detach it from the graph and save it as a cpu tensor
    filter_activations = layer_activations[:, filterid, :, :].detach().cpu()

    # Filter visualization: find the image that activates the filter the most
    x = x.cuda()
    x.requires_grad_()
    # input image gradient
    optimizer = Adam([x], lr=lr)
    # Use the optimizer to modify the input image to amplify the filter activation
    for iter in range(iteration):
        optimizer.zero_grad()
        model(x)
        objective = -layer_activations[:, filterid, :, :].sum()
        # We want to maximize the summation of the filter activation,
        # so we add a negative sign
        objective.backward()
        # Calculate the partial derivative of the filter activation w.r.t. the input image
        optimizer.step()
        # Modify the input image to maximize the filter activation
    filter_visualizations = x.detach().cpu().squeeze()

    # Don't forget to remove the hook
    hook_handle.remove()
    # The hook stays registered on the model, so you have to remove it after use
    # Just register a new hook if you want to use it again
    return filter_activations, filter_visualizations
```
```python
images, labels = train_set.getbatch(img_indices)
filter_activations, filter_visualizations = filter_explanation(
    images, model, cnnid=6, filterid=0, iteration=100, lr=0.1)

fig, axs = plt.subplots(3, len(img_indices), figsize=(15, 8))
# Plot original images
for i, img in enumerate(images):
    axs[0][i].imshow(img.permute(1, 2, 0))
# Plot filter activations
for i, img in enumerate(filter_activations):
    axs[1][i].imshow(normalize(img))
# Plot filter visualizations
for i, img in enumerate(filter_visualizations):
    axs[2][i].imshow(normalize(img.permute(1, 2, 0)))
plt.show()
plt.close()
```
filter_activations are the activation maps of the convolutional layer (its output for the chosen filter), not the layer's weights; they are captured through the hook via `filter_activations = layer_activations[:, filterid, :, :].detach().cpu()`.
register_forward_hook
The hook can modify the layer's output; modifying the input inside the hook has no effect on the forward pass.
register_full_backward_hook
The hook can modify grad_input,
the gradient with respect to the layer's inputs (at forward time); this is the gradient that continues to propagate backward during backpropagation.
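A small self-contained sketch of both hook types on a toy layer (the ReLU-like clamp and the 0.5 scaling are just illustrative modifications):

```python
import torch
import torch.nn as nn

layer = nn.Conv2d(3, 8, kernel_size=3, padding=1)

def forward_hook(module, inputs, output):
    # Returning a tensor replaces the layer's output;
    # mutating `inputs` here would NOT affect the forward pass
    return output.clamp(min=0)

def backward_hook(module, grad_input, grad_output):
    # Returning a tuple replaces grad_input, the gradient w.r.t. the
    # layer's forward inputs that keeps flowing backward
    return tuple(g * 0.5 if g is not None else None for g in grad_input)

fh = layer.register_forward_hook(forward_hook)
bh = layer.register_full_backward_hook(backward_hook)

x = torch.randn(1, 3, 16, 16, requires_grad=True)
layer(x).sum().backward()
print(x.grad.shape)  # the gradient reaching the input was scaled by the hook

fh.remove()
bh.remove()
```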
filter_visualizations is obtained by letting the input image carry gradients and optimizing it over many iterations against the loss `-layer_activations[:, filterid, :, :].sum()`, so that the filter's activation becomes as large as possible.
Integrated Gradients
Judging from the code, the image is scaled step by step from a baseline (e.g. an all-black image) up to the original input, which looks like approximating an integral; the gradient is computed at each step and the gradients are summed up.
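A minimal sketch of integrated gradients along those lines, wrapped in an `IntegratedGradients` class; the all-zero baseline, the step count, and the `model`/`images`/`labels` variables are assumptions carried over from the sections above:

```python
import torch

class IntegratedGradients():
    def __init__(self, model):
        self.model = model
        self.model.eval()

    def generate_gradients(self, x, target_class, baseline=None, steps=50):
        # default baseline: an all-zero (black) image
        if baseline is None:
            baseline = torch.zeros_like(x)
        accumulated = torch.zeros_like(x)
        for i in range(1, steps + 1):
            # interpolate between the baseline and the input
            x_step = baseline + (x - baseline) * i / steps
            # put the image to cuda and track its gradient
            x_step = x_step.cuda()
            x_step.requires_grad_()
            score = self.model(x_step)[0, target_class]
            score.backward()
            accumulated += x_step.grad.detach().cpu()
        # average the gradients and scale by the input-baseline difference
        return (x - baseline) * accumulated / steps

# ig = IntegratedGradients(model)
# attributions = ig.generate_gradients(images[0:1], labels[0])
```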