Heat Maps

汉无为 2022-04-03


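The previous article covered feature map visualization and its code. This one covers heat map visualization and serves as a usage guide.
It explains the principles and shortcomings of CAM and GradCAM, shows how to use GradCAM to produce heat map visualizations, and touches on heat map visualization for other kinds of tasks and models, such as object detection and transformer-based models.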

How heat map visualization works

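When a neural network classifies an image, we do not know which parts of the image it based its prediction on. A heat map helps answer that question: it shows how much each region of the image contributes to the model's output, rendered as a picture that resembles an isotherm map.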

Heat map visualization methods have evolved from CAM to GradCAM to GradCAM++. GradCAM is the one most commonly used.

CAM

CAM paper: Learning Deep Features for Discriminative Localization

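CAM takes the row of fully connected layer weights that produces the score for class C, denoted W, and uses it to compute a weighted sum of the feature maps that feed the global average pooling (GAP) layer. Because those feature maps are smaller than the input image, the weighted sum is then upsampled, which gives the Class Activation Map.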
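
The weighted sum described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the paper's code: the names `feature_maps` and `fc_weights`, the toy shapes, and the nearest-neighbour upsampling (chosen to keep the example dependency-free) are all assumptions.

```python
import numpy as np


def class_activation_map(feature_maps, fc_weights, class_idx, scale):
    """Toy CAM: weight each feature map by the FC weight of one class.

    feature_maps: (K, H, W) output of the last conv layer (before GAP).
    fc_weights:   (num_classes, K) weights of the FC layer after GAP.
    scale:        integer upsampling factor (nearest neighbour, illustration only).
    """
    w = fc_weights[class_idx]                     # (K,)
    cam = np.tensordot(w, feature_maps, axes=1)   # (H, W): weighted sum over channels
    # Upsample towards the (hypothetical) input resolution.
    return np.repeat(np.repeat(cam, scale, axis=0), scale, axis=1)


rng = np.random.default_rng(0)
feature_maps = rng.random((4, 3, 3))        # K=4 channels, 3x3 spatial
fc_weights = rng.standard_normal((2, 4))    # 2 classes
cam = class_activation_map(feature_maps, fc_weights, class_idx=1, scale=2)
print(cam.shape)  # (6, 6)
```

Before upsampling, the spatial mean of the map equals the class score produced by GAP + FC (ignoring the bias), which is why the map can be read as a per-location breakdown of that score.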

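CAM has a drawback: it requires the model to have the structure CNN + GAP + FC + Softmax. To visualize an existing model that has no GAP layer, you have to modify its structure and retrain it, which is cumbersome. If the model is large and the retrained version cannot match the original accuracy, the visualization loses its meaning.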

Grad-CAM is an improved version that addresses this drawback.

GradCAM

Grad-CAM paper: Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization

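The main advantage of Grad-CAM is that the existing model structure no longer has to be modified and nothing has to be retrained: the visualization is computed directly on the original model.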

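Like CAM, it works on the last layer of feature maps from the CNN feature extractor. For the class C to be visualized, the output score for class C is backpropagated to the last layer of feature maps, which gives the gradient of that score with respect to every pixel of every feature map. Averaging the gradients over each feature map (global average pooling) gives one weighting coefficient per feature map. The paper notes that coefficients obtained this way are almost equivalent to the coefficients in CAM. The feature maps are then combined in a weighted sum, rectified with ReLU, and upsampled.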

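ReLU is applied because negative values can be regarded as irrelevant to recognizing class C and may belong to other classes, whereas positive values are the ones that contribute to recognizing C.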
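
The near-equivalence with CAM can be checked numerically on a toy GAP + FC head. The sketch below is an illustration under assumptions, not code from the original repository: the function names, the shapes, and the finite-difference gradient are all made up for the demonstration. It estimates the gradient of one class score with respect to every feature-map pixel, averages it per channel, and compares the result with the FC weights.

```python
import numpy as np


def class_score(feature_maps, fc_weights, class_idx):
    """Score of one class for a toy CNN head: GAP followed by a linear layer."""
    pooled = feature_maps.mean(axis=(1, 2))          # (K,)
    return float(fc_weights[class_idx] @ pooled)


def numerical_gradient(feature_maps, fc_weights, class_idx, eps=1e-6):
    """Central finite-difference gradient of the class score w.r.t. each pixel."""
    grad = np.zeros_like(feature_maps)
    for idx in np.ndindex(feature_maps.shape):
        plus, minus = feature_maps.copy(), feature_maps.copy()
        plus[idx] += eps
        minus[idx] -= eps
        grad[idx] = (class_score(plus, fc_weights, class_idx)
                     - class_score(minus, fc_weights, class_idx)) / (2 * eps)
    return grad


rng = np.random.default_rng(0)
feature_maps = rng.standard_normal((3, 4, 4))   # K=3 channels, 4x4 spatial
fc_weights = rng.standard_normal((5, 3))        # 5 classes
c = 2

grads = numerical_gradient(feature_maps, fc_weights, c)
alpha = grads.mean(axis=(1, 2))                 # Grad-CAM weights: GAP of the gradients
Z = feature_maps.shape[1] * feature_maps.shape[2]

# For a GAP + FC head, alpha matches the CAM weights up to the constant factor 1/Z.
print(np.allclose(alpha * Z, fc_weights[c], atol=1e-6))

# ReLU keeps only the regions that push the class score up.
grad_cam = np.maximum((alpha[:, None, None] * feature_maps).sum(axis=0), 0)
```

For architectures without GAP the gradients are no longer constant over each feature map, which is the case Grad-CAM was designed to handle.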

The formulas are as follows:
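
For reference, these are the two Grad-CAM equations as they are usually stated. Here A^k is the k-th feature map of the last convolutional layer, y^c is the score for class c before softmax, and Z is the number of pixels in one feature map.

```latex
\alpha_k^c = \frac{1}{Z} \sum_i \sum_j \frac{\partial y^c}{\partial A_{ij}^k}
\qquad
L^c_{\text{Grad-CAM}} = \mathrm{ReLU}\!\left( \sum_k \alpha_k^c A^k \right)
```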

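Grad-CAM itself has an improved version, Grad-CAM++, which localizes more accurately and is better suited to images that contain several objects of the same class, for example seven or eight people. The improvement is a new way of computing the weighting coefficients. It is more complex and is not covered here.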

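How to use GradCAM

The code here implements the GradCAM paper. The original repository also contains many other CAM variants, so the GradCAM part has been extracted here with usage notes.

Original code: https://github.com/jacobgil/pytorch-grad-cam/tree/master/pytorch_grad_cam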

Code for this tutorial: https://github.com/CV-Tech-Guide/Visualize-feature-maps-and-heatmap

Usage

It is simple to use. You mainly need to know what each function does.

import matplotlib.pyplot as plt
import numpy as np
import torch
from torchvision import models

# GradCAM, image_proprecess and show_cam_on_image are defined in the sections below.

if __name__ == '__main__':
    imgs_path = 'path/to/image.png'
    model = models.mobilenet_v3_large(pretrained=True)
    model.load_state_dict(torch.load('model.pth'))
    model = model.cuda().eval()

    # target_layers lists the layers to visualize; here, the last feature layer.
    target_layers = [model.features[-1]]
    img, data = image_proprecess(imgs_path)
    data = data.cuda()

    cam = GradCAM(model=model, target_layers=target_layers)
    # Class to visualize. If None, the class with the highest predicted
    # probability is used.
    target_category = None

    grayscale_cam = cam(input_tensor=data, target_category=target_category)
    grayscale_cam = grayscale_cam[0, :]
    # img is an RGB PIL image, so request an RGB heat map.
    visualization = show_cam_on_image(np.array(img) / 255., grayscale_cam, use_rgb=True)
    plt.imshow(visualization)
    plt.axis('off')
    plt.savefig('path/to/gradcam_image.jpg')

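As the code above shows, you only need to set the input image, the model, the layer to visualize, and the class to visualize. Everything else can be used as is.

The details are described below.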

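Data preprocessing

As in the feature map visualization code, the image is resized, converted to a tensor, and normalized. With a single image, a batch dimension has to be added so that the tensor is four-dimensional.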

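import torch
from PIL import Image
from torchvision import transforms


def image_proprecess(img_path):
    # Force three channels so Normalize also works for grayscale or RGBA files.
    img = Image.open(img_path).convert('RGB')
    data_transforms = transforms.Compose([
        # Bicubic interpolation (the integer code 3 in older torchvision).
        transforms.Resize((384, 384), interpolation=transforms.InterpolationMode.BICUBIC),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ])
    data = data_transforms(img)
    data = torch.unsqueeze(data, 0)  # add the batch dimension: (1, 3, 384, 384)
    img_resize = img.resize((384, 384))
    return img_resize, data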
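
To make the tensor shapes concrete, here is a NumPy-only sketch of what `ToTensor`, `Normalize`, and `unsqueeze` do to an image that has already been resized. It is an illustration under assumptions, not the torchvision implementation: the function name `to_model_input` and the random test image are hypothetical.

```python
import numpy as np

IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def to_model_input(rgb_uint8):
    """Convert an (H, W, 3) uint8 RGB image to a (1, 3, H, W) float32 array."""
    x = rgb_uint8.astype(np.float32) / 255.0     # ToTensor: scale to [0, 1] ...
    x = x.transpose(2, 0, 1)                     # ... and reorder HWC -> CHW
    # Normalize: subtract the mean and divide by the std, channel by channel.
    x = (x - IMAGENET_MEAN[:, None, None]) / IMAGENET_STD[:, None, None]
    return x[None, ...]                          # unsqueeze(0): add the batch axis


image = np.random.default_rng(0).integers(0, 256, size=(384, 384, 3), dtype=np.uint8)
batch = to_model_input(image)
print(batch.shape)  # (1, 3, 384, 384)
```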

GradCAM

The GradCAM class wraps the principle described in the first section, so once that principle is understood the code is easy to follow.

import cv2
import numpy as np


class GradCAM:
   def __init__(self, model, target_layers, reshape_transform=None):
       self.model = model.eval()
       self.target_layers = target_layers
       self.reshape_transform = reshape_transform
       self.activations_and_grads = ActivationsAndGradients(
           self.model, target_layers, reshape_transform)

   ''' Get a vector of weights for every channel in the target layer.
      Methods that return weights channels,
      will typically need to only implement this function. '''

   @staticmethod
   def get_cam_weights(grads):
       return np.mean(grads, axis=(2, 3), keepdims=True)

   @staticmethod
   def get_loss(output, target_category):
       loss = 0
       for i in range(len(target_category)):
           loss = loss + output[i, target_category[i]]
       return loss

   def get_cam_image(self, activations, grads):
       weights = self.get_cam_weights(grads)
       weighted_activations = weights * activations
       cam = weighted_activations.sum(axis=1)

       return cam

   @staticmethod
   def get_target_width_height(input_tensor):
       width, height = input_tensor.size(-1), input_tensor.size(-2)
       return width, height

   def compute_cam_per_layer(self, input_tensor):
       activations_list = [a.cpu().data.numpy()
                           for a in self.activations_and_grads.activations]
       grads_list = [g.cpu().data.numpy()
                     for g in self.activations_and_grads.gradients]
       target_size = self.get_target_width_height(input_tensor)

       cam_per_target_layer = []
       # Loop over the saliency image from every layer

       for layer_activations, layer_grads in zip(activations_list, grads_list):
           cam = self.get_cam_image(layer_activations, layer_grads)
           cam[cam < 0] = 0  # ReLU: discard negative contributions before min-max scaling
           scaled = self.scale_cam_image(cam, target_size)
           cam_per_target_layer.append(scaled[:, None, :])

       return cam_per_target_layer

   def aggregate_multi_layers(self, cam_per_target_layer):
       cam_per_target_layer = np.concatenate(cam_per_target_layer, axis=1)
       cam_per_target_layer = np.maximum(cam_per_target_layer, 0)
       result = np.mean(cam_per_target_layer, axis=1)
       return self.scale_cam_image(result)

   @staticmethod
   def scale_cam_image(cam, target_size=None):
       result = []
       for img in cam:
           img = img - np.min(img)
           img = img / (1e-7 + np.max(img))
           if target_size is not None:
               img = cv2.resize(img, target_size)
           result.append(img)
       result = np.float32(result)

       return result

   def __call__(self, input_tensor, target_category=None):
       # Forward pass: get the network's output logits (before softmax)
       output = self.activations_and_grads(input_tensor)
       if isinstance(target_category, int):
           target_category = [target_category] * input_tensor.size(0)

       if target_category is None:
           target_category = np.argmax(output.cpu().data.numpy(), axis=-1)
           print(f'category id: {target_category}')
       else:
           assert (len(target_category) == input_tensor.size(0))

       self.model.zero_grad()
       loss = self.get_loss(output, target_category)
       loss.backward(retain_graph=True)

       # In most of the saliency attribution papers, the saliency is
       # computed with a single target layer.
       # Commonly it is the last convolutional layer.
       # Here we support passing a list with multiple target layers.
       # It will compute the saliency image for every image,
       # and then aggregate them (with a default mean aggregation).
       # This gives you more flexibility in case you just want to
       # use all conv layers for example, all Batchnorm layers,
       # or something else.
       cam_per_layer = self.compute_cam_per_layer(input_tensor)
       return self.aggregate_multi_layers(cam_per_layer)

   def __del__(self):
       self.activations_and_grads.release()

   def __enter__(self):
       return self

   def __exit__(self, exc_type, exc_value, exc_tb):
       self.activations_and_grads.release()
       if isinstance(exc_value, IndexError):
           # Handle IndexError here...
           print(
               f'An exception occurred in CAM with block: {exc_type}. Message: {exc_value}')
           return True

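In brief, the class does the following. The forward pass records the activations of the target layers and produces the logits. The score of the target class (either the class you specify or the one with the highest prediction) is used as the loss and backpropagated to obtain the gradients. The gradients are averaged over the spatial dimensions to give one weight per feature map. Each feature map is multiplied by its weight and the results are summed over the channel dimension, negative values are set to zero, and the result is normalized and resized to the input size. This is the CAM. Adding it to the original image gives the final heat map.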
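
The same sequence can be traced on toy arrays. The following sketch mirrors `get_cam_weights`, `get_cam_image`, the ReLU step, and the min-max part of `scale_cam_image` using NumPy only. The activations and gradients are made-up numbers, and the resize to the input size is left out.

```python
import numpy as np

# Made-up activations and gradients for one image: (batch, channels, H, W).
activations = np.array([[[[1.0, 2.0], [3.0, 4.0]],
                         [[4.0, 3.0], [2.0, 1.0]]]])
grads = np.array([[[[0.5, 0.5], [0.5, 0.5]],          # channel 0 supports the class
                   [[-1.0, -1.0], [-1.0, -1.0]]]])    # channel 1 argues against it

weights = grads.mean(axis=(2, 3), keepdims=True)   # (1, 2, 1, 1): one weight per channel
cam = (weights * activations).sum(axis=1)          # (1, 2, 2): weighted sum over channels
cam = np.maximum(cam, 0)                           # ReLU

# Min-max normalisation to [0, 1], done per image as in scale_cam_image.
cam = cam - cam.min(axis=(1, 2), keepdims=True)
cam = cam / (1e-7 + cam.max(axis=(1, 2), keepdims=True))
print(cam[0])
```

Only the bottom-right pixel survives, because it is the only location where the supporting channel outweighs the opposing one.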

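Using the GradCAM class comes down to two steps: construct it, then call it. Construction takes the network and the layers to visualize. The call takes the input image and the class to visualize.

The call returns the grayscale region map.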

cam = GradCAM(model=model, target_layers=target_layers)
# Class to visualize. If None, the class with the highest predicted probability is used.
target_category = None

grayscale_cam = cam(input_tensor=data, target_category=target_category)
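
What `target_category=None` resolves to, and which scalar gets backpropagated, can be shown with plain NumPy. The logits below are invented for illustration. The logic follows `__call__` and `get_loss` in the class above.

```python
import numpy as np

# Invented logits for a batch of two images and four classes.
logits = np.array([[0.1, 2.3, -1.0, 0.4],
                   [1.7, 0.2, 0.9, 3.1]])

# target_category=None: use the top-scoring class of each image.
target_category = np.argmax(logits, axis=-1)

# The backpropagated scalar is the sum of the selected logits (before softmax).
loss = sum(logits[i, c] for i, c in enumerate(target_category))

# An int target is repeated for every image in the batch.
fixed_target = [2] * logits.shape[0]
fixed_loss = sum(logits[i, c] for i, c in enumerate(fixed_target))
```

Passing an explicit class is useful when you want to see what evidence the model finds for a class it did not predict.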

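The activations and gradients produced during inference are captured mainly by the following class, which is not described in detail here.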

class ActivationsAndGradients:
  ''' Class for extracting activations and
  registering gradients from targeted intermediate layers '''

  def __init__(self, model, target_layers, reshape_transform):
      self.model = model
      self.gradients = []
      self.activations = []
      self.reshape_transform = reshape_transform
      self.handles = []
      for target_layer in target_layers:
          self.handles.append(
              target_layer.register_forward_hook(
                  self.save_activation))
          # Backward compatibility with older pytorch versions:
          if hasattr(target_layer, 'register_full_backward_hook'):
              self.handles.append(
                  target_layer.register_full_backward_hook(
                      self.save_gradient))
          else:
              self.handles.append(
                  target_layer.register_backward_hook(
                      self.save_gradient))

  def save_activation(self, module, input, output):
      activation = output
      if self.reshape_transform is not None:
          activation = self.reshape_transform(activation)
      self.activations.append(activation.cpu().detach())

  def save_gradient(self, module, grad_input, grad_output):
      # Gradients are computed in reverse order
      grad = grad_output[0]
      if self.reshape_transform is not None:
          grad = self.reshape_transform(grad)
      self.gradients = [grad.cpu().detach()] + self.gradients

  def __call__(self, x):
      self.gradients = []
      self.activations = []
      return self.model(x)

  def release(self):
      for handle in self.handles:
          handle.remove()

The last step is to overlay the map returned by GradCAM on the original image, which the function below does.

import cv2
import numpy as np


def show_cam_on_image(img: np.ndarray,
                      mask: np.ndarray,
                      use_rgb: bool = False,
                      colormap: int = cv2.COLORMAP_JET) -> np.ndarray:
  ''' This function overlays the cam mask on the image as a heatmap.
  By default the heatmap is in BGR format.
  :param img: The base image in RGB or BGR format, float values in [0, 1].
  :param mask: The cam mask.
  :param use_rgb: Whether to use an RGB or BGR heatmap, this should be set to True if 'img' is in RGB format.
  :param colormap: The OpenCV colormap to be used.
  :returns: The default image with the cam overlay.
  '''

  heatmap = cv2.applyColorMap(np.uint8(255 * mask), colormap)
  if use_rgb:
      heatmap = cv2.cvtColor(heatmap, cv2.COLOR_BGR2RGB)
  heatmap = np.float32(heatmap) / 255

  if np.max(img) > 1:
      raise Exception(
          'The input image should be np.float32 in the range [0, 1]')

  cam = heatmap + img
  cam = cam / np.max(cam)
  return np.uint8(255 * cam)
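
The blending step is just addition followed by renormalisation. Here is a dependency-free sketch of it, assuming a hypothetical blue-to-red ramp in place of OpenCV's JET colormap. The function name `overlay` is likewise made up for this example.

```python
import numpy as np


def overlay(img, mask):
    """Blend a [0, 1] float image (H, W, 3) with a [0, 1] mask (H, W).

    Stand-in for show_cam_on_image: the blue-to-red ramp below is a
    hypothetical replacement for cv2.applyColorMap, for illustration only.
    """
    if img.max() > 1:
        raise ValueError("img must be float in the range [0, 1]")
    heatmap = np.stack([mask, np.zeros_like(mask), 1.0 - mask], axis=-1)  # RGB
    cam = heatmap + img
    cam = cam / cam.max()            # renormalise so the brightest value is 1
    return np.uint8(255 * cam)


img = np.full((2, 2, 3), 0.5)                 # flat grey image
mask = np.array([[0.0, 0.25], [0.5, 1.0]])    # CAM values
out = overlay(img, mask)
print(out.shape, out.dtype)  # (2, 2, 3) uint8
```

The range check matters because an image left in 0 to 255 would swamp the heat map, whose values never exceed 1.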