Guided Grad-CAM

February 3, 2019    CNN Visualization

  • Gradient-based backpropagation passes gradients only through positions where the ReLU outputs are positive.

  • Gradient-based Deconvnet passes only the gradients that are themselves greater than 0.

  • In contrast, gradient-based guided backpropagation keeps, for a specific class (the one with the highest probability), only the gradients (w.r.t. the last conv) where $1)$ the gradient is greater than 0 and $2)$ the ReLU output is positive, and multiplies them into the feature maps (a weighted average); see the sketch after the formula below.

\[\text{gradients}= \frac{\partial y_{\text{label}_i}}{\partial\,\text{last conv layer}}, \quad \text{where } (\text{gradients}> 0)\ \&\ (\text{relu output} > 0)\]
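  • As a quick illustration, here is a minimal NumPy sketch (toy values, not taken from the network) contrasting the three masking rules:

      import numpy as np

      relu_out = np.array([0.0, 1.5, 2.0, 0.0])   # forward ReLU outputs
      grad_in  = np.array([1.0, -2.0, 3.0, 4.0])  # gradient arriving from above

      backprop = grad_in * (relu_out > 0)                   # [ 0. -2.  3.  0.]
      deconv   = grad_in * (grad_in > 0)                    # [ 1.  0.  3.  4.]
      guided   = grad_in * (relu_out > 0) * (grad_in > 0)   # [ 0.  0.  3.  0.]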
  • Gradient function for guided backprop:

      @ops.RegisterGradient("GuidedRelu")
      def _GuidedReluGrad(op, grad):
          # gradient: backpropagated Relu gradients
          # features: the outputs of Relu operation
          return tf.where(0. < grad, 
                          gen_nn_ops.relu_grad(gradients= grad,
                                               features = op.outputs[0]), 
                           tf.zeros(tf.shape(grad)))
    

Implementation

  • The full code is referenced here.
%matplotlib inline
import matplotlib.pyplot as plt
import tensorflow as tf
import numpy as np
import os

import cv2
from imagenet_classes import class_names

# image pre-processing
from imageio import imread
from PIL import Image

# relu gradient
from tensorflow.python.framework import ops
from tensorflow.python.ops import gen_nn_ops


  • Load the pretrained weights.
  • The vgg16_weights.npz file is attached here.
# Assign weight file.
weight_file_path = 'vgg16_weights.npz'
# number of classes
n_labels = 1000      
pretrained_weights = dict(np.load(weight_file_path, encoding='bytes'))
print(class_names[0:5],'...','\n')
print('number of classes: ', len(class_names))
['tench, Tinca tinca', 'goldfish, Carassius auratus', 'great white shark, white shark, man-eater, man-eating shark, Carcharodon carcharias', 'tiger shark, Galeocerdo cuvieri', 'hammerhead, hammerhead shark'] ... 

number of classes:  1000
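
  • The arrays in the weight file are keyed by layer name with _W/_b suffixes (e.g. conv1_1_W, fc6_b), which is how the layer builders below look them up; a quick way to inspect:
# parameter names follow the '<layer>_W' / '<layer>_b' convention
print(sorted(pretrained_weights.keys())[:6])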



  • Image preprocessing
    • The network's input image size is fixed at $224 \times 224$.
img = imread('tmp.jpeg')
print('shape:',img.shape)
shape: (960, 720, 3)
img = imread('tmp.jpeg')
# to PIL
img = Image.fromarray(img).resize((224, 224))
plt.imshow(img)
plt.axis('off')
(-0.5, 223.5, 223.5, -0.5)

# to numpy
img = np.array(img)
img.shape
(224, 224, 3)
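
  • For reuse, the loading steps above can be folded into a small helper (a sketch; load_image is a hypothetical name, not in the original code):
# hypothetical helper: load an image and resize it for VGG input
def load_image(path, size=224):
    img = imread(path)
    img = Image.fromarray(img).resize((size, size))
    return np.array(img)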



VGG net

  • Create the default graph.
  • Build the pre-trained VGG net.
graph = tf.get_default_graph()
  • The graph is empty at this point, but every operation created from now on is added to it.
graph.get_operations()
[]


  • Guided-backpropagation-based: compute only the gradients (w.r.t. the last conv) where $1)$ the gradients are greater than 0 and $2)$ the ReLU outputs are positive.
  • {'Relu': 'GuidedRelu'}: replaces the gradient computation of Relu with GuidedRelu.
@ops.RegisterGradient("GuidedRelu")
def _GuidedReluGrad(op, grad):
    # gradient: backpropagated Relu gradients
    # features: the outputs of Relu operation
    return tf.where(0. < grad, 
                    gen_nn_ops.relu_grad(gradients= grad,
                                         features = op.outputs[0]), 
                     tf.zeros(tf.shape(grad)))
def conv_layer(graph, inputs, name, stride = 1):    

    with tf.variable_scope(name) as scope:
        
        # The weights are retrieved according to how they are stored in arrays
        w = pretrained_weights[name+'_W']
        b = pretrained_weights[name+'_b']
        
        conv_weights = tf.get_variable(
                "W",
                shape=w.shape,
                initializer=tf.constant_initializer(w)
                )
        conv_biases = tf.get_variable(
                "b",
                shape=b.shape,
                initializer=tf.constant_initializer(b)
                )

        conv = tf.nn.conv2d(inputs, conv_weights, [1,stride,stride,1], padding='SAME')
        bias = tf.nn.bias_add(conv, conv_biases)
        
        with graph.gradient_override_map({'Relu': 'GuidedRelu'}):
            relu = tf.nn.relu(bias, name=name)
        
    return relu  


image_mean = [103.939, 116.779, 123.68]
epsilon = 1e-4
# Define Placeholders for images and labels
images_tf = tf.placeholder( tf.float32, [None, 224, 224, 3], name="images")
labels_tf = tf.placeholder( tf.int32, [None], name='labels')
r, g, b = tf.split(images_tf,[1,1,1] , 3)
print(r)
print(g)
print(b)
Tensor("split:0", shape=(?, 224, 224, 1), dtype=float32)
Tensor("split:1", shape=(?, 224, 224, 1), dtype=float32)
Tensor("split:2", shape=(?, 224, 224, 1), dtype=float32)


image = tf.concat([b-image_mean[0],g-image_mean[1], r-image_mean[2]],3) # RGB -> BGR, minus the BGR channel means (Caffe VGG convention)
image
<tf.Tensor 'concat:0' shape=(?, 224, 224, 3) dtype=float32>


  • Conv1_1 output size: $(224-3+2)/1 +1 = 224$
relu1_1 = conv_layer(graph, image, "conv1_1" )
relu1_1
<tf.Tensor 'conv1_1/conv1_1:0' shape=(?, 224, 224, 64) dtype=float32>
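
  • The same arithmetic, $(n-k+2p)/s + 1$, governs every conv and pooling layer below; a tiny helper (illustrative only, not in the original code) makes it explicit:
# output size of a conv/pool layer: (n - k + 2p) / s + 1
def out_size(n, k, p, s):
    return (n - k + 2 * p) // s + 1

out_size(224, k=3, p=1, s=1)   # 224 (3x3 conv, SAME padding)
out_size(224, k=2, p=0, s=2)   # 112 (2x2 max-pool, stride 2)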


  • Conv1_2 output size: $(224-3+2)/1 +1 = 224$
relu1_2 = conv_layer(graph, relu1_1, "conv1_2" )
relu1_2
<tf.Tensor 'conv1_2/conv1_2:0' shape=(?, 224, 224, 64) dtype=float32>


  • pool1 output size: $(224-2)/2 +1 = 112$
pool1 = tf.nn.max_pool(relu1_2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME', name='pool1')
pool1
<tf.Tensor 'pool1:0' shape=(?, 112, 112, 64) dtype=float32>


  • Conv2_1 output size: $(112-3+2)/1 +1 = 112$
relu2_1 = conv_layer(graph, pool1, "conv2_1")   
relu2_1
<tf.Tensor 'conv2_1/conv2_1:0' shape=(?, 112, 112, 128) dtype=float32>


  • Conv2_2 output size: $(112-3+2)/1 +1 = 112$
relu2_2 = conv_layer(graph, relu2_1, "conv2_2")
relu2_2
<tf.Tensor 'conv2_2/conv2_2:0' shape=(?, 112, 112, 128) dtype=float32>


  • pool2 output size: $(112-2)/2 +1 = 56$
pool2 = tf.nn.max_pool(relu2_2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1],padding='SAME', name='pool2')
pool2
<tf.Tensor 'pool2:0' shape=(?, 56, 56, 128) dtype=float32>


  • Conv3_1 output size: $(56-3+2)/1 +1 = 56$
relu3_1 = conv_layer(graph, pool2, "conv3_1")
relu3_1
<tf.Tensor 'conv3_1/conv3_1:0' shape=(?, 56, 56, 256) dtype=float32>


  • Conv3_2 output size: $(56-3+2)/1 +1 = 56$
relu3_2 = conv_layer(graph, relu3_1, "conv3_2")
relu3_2
<tf.Tensor 'conv3_2/conv3_2:0' shape=(?, 56, 56, 256) dtype=float32>


  • Conv3_3 output size: $(56-3+2)/1 +1 = 56$
relu3_3 = conv_layer(graph, relu3_2, "conv3_3")
relu3_3
<tf.Tensor 'conv3_3/conv3_3:0' shape=(?, 56, 56, 256) dtype=float32>


  • pool3 output size: $(56-2)/2 +1 = 28$
pool3 = tf.nn.max_pool(relu3_3, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1],
                       padding='SAME', name='pool3')
pool3
<tf.Tensor 'pool3:0' shape=(?, 28, 28, 256) dtype=float32>


  • Conv4_1 output size: $(28-3+2)/1 +1 = 28$
relu4_1 = conv_layer(graph, pool3, "conv4_1")
relu4_1
<tf.Tensor 'conv4_1/conv4_1:0' shape=(?, 28, 28, 512) dtype=float32>


  • Conv4_2 output size: $(28-3+2)/1 +1 = 28$
relu4_2 = conv_layer(graph, relu4_1, "conv4_2")
relu4_2
<tf.Tensor 'conv4_2/conv4_2:0' shape=(?, 28, 28, 512) dtype=float32>


  • Conv4_3 output size: $(28-3+2)/1 +1 = 28$
relu4_3 = conv_layer(graph, relu4_2, "conv4_3")
relu4_3
<tf.Tensor 'conv4_3/conv4_3:0' shape=(?, 28, 28, 512) dtype=float32>


  • pool4 output size: $(28-2)/2 +1 = 14$
pool4 = tf.nn.max_pool(relu4_3, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1],
                       padding='SAME', name='pool4')
pool4
<tf.Tensor 'pool4:0' shape=(?, 14, 14, 512) dtype=float32>


  • Conv5_1 output size: $(14-3+2)/1 +1 = 14$
relu5_1 = conv_layer(graph, pool4, "conv5_1")
relu5_1
<tf.Tensor 'conv5_1/conv5_1:0' shape=(?, 14, 14, 512) dtype=float32>


  • Conv5_2 output size: $(14-3+2)/1 +1 = 14$
relu5_2 = conv_layer(graph, relu5_1, "conv5_2")
relu5_2
<tf.Tensor 'conv5_2/conv5_2:0' shape=(?, 14, 14, 512) dtype=float32>


  • Conv5_3 output size: $(14-3+2)/1 +1 = 14$
relu5_3 = conv_layer(graph, relu5_2, "conv5_3")
relu5_3
<tf.Tensor 'conv5_3/conv5_3:0' shape=(?, 14, 14, 512) dtype=float32>


  • pool5 output size: $(14-2)/2 +1 = 7$
pool5 = tf.nn.max_pool(relu5_3, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1],
                       padding='SAME', name='pool5')
pool5
<tf.Tensor 'pool5:0' shape=(?, 7, 7, 512) dtype=float32>


  • Fully connected Layer 1
with tf.variable_scope('fc1') as scope:                        

    w = pretrained_weights['fc6_W']
    b = pretrained_weights['fc6_b']

    fc_weights = tf.get_variable("W", shape=w.shape, 
    	initializer=tf.constant_initializer(w))
    fc_biases  = tf.get_variable("b", shape=b.shape, 
    	initializer=tf.constant_initializer(b))           

    # flatten dim 
    shape = int(np.prod(pool5.get_shape()[1:])) #25088 
    pool5_flat = tf.reshape(pool5, [-1, shape])

    fc1l = tf.nn.bias_add(tf.matmul(pool5_flat, fc_weights),
     fc_biases)

    with graph.gradient_override_map({'Relu': 'GuidedRelu'}):
        fc1 = tf.nn.relu(fc1l)

    fc1 = tf.nn.dropout(fc1, keep_prob = 1.0) # optional: keep_prob=1.0 disables dropout
fc1
<tf.Tensor 'fc1/Relu:0' shape=(?, 4096) dtype=float32>


  • Fully connected Layer 2
with tf.variable_scope('fc2') as scope: 

    w = pretrained_weights['fc7_W']
    b = pretrained_weights['fc7_b']

    fc_weights = tf.get_variable("W", shape=w.shape, 
    	initializer=tf.constant_initializer(w))
    fc_biases  = tf.get_variable("b", shape=b.shape, 
    	initializer=tf.constant_initializer(b))           

    fc2l = tf.nn.bias_add(tf.matmul(fc1, fc_weights), 
    	fc_biases)

    with graph.gradient_override_map({'Relu': 'GuidedRelu'}):
        fc2 = tf.nn.relu(fc2l)
    fc2 = tf.nn.dropout(fc2, keep_prob = 1.0)
fc2
<tf.Tensor 'fc2/Relu:0' shape=(?, 4096) dtype=float32>


  • Fully connected Layer 3

with tf.variable_scope('fc3') as scope:

    w = pretrained_weights['fc8_W']
    b = pretrained_weights['fc8_b']

    fc_weights = tf.get_variable("W", shape=w.shape, 
    	initializer=tf.constant_initializer(w))
    fc_biases  = tf.get_variable("b", shape=b.shape, 
    	initializer=tf.constant_initializer(b))         

    output = tf.nn.bias_add(tf.matmul(fc2, fc_weights), 
    	fc_biases)
output
<tf.Tensor 'fc3/BiasAdd:0' shape=(?, 1000) dtype=float32>
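
  • The three FC blocks repeat one pattern; they could be factored into a helper analogous to conv_layer (a sketch; fc_layer is a hypothetical name, not part of the original code):
def fc_layer(graph, inputs, name, weight_key, relu=True):
    # look up the pretrained arrays, e.g. weight_key='fc6' -> 'fc6_W', 'fc6_b'
    w = pretrained_weights[weight_key + '_W']
    b = pretrained_weights[weight_key + '_b']
    with tf.variable_scope(name):
        fc_weights = tf.get_variable("W", shape=w.shape,
                                     initializer=tf.constant_initializer(w))
        fc_biases = tf.get_variable("b", shape=b.shape,
                                    initializer=tf.constant_initializer(b))
        out = tf.nn.bias_add(tf.matmul(inputs, fc_weights), fc_biases)
        if relu:
            # use the guided-backprop gradient for the ReLU, as above
            with graph.gradient_override_map({'Relu': 'GuidedRelu'}):
                out = tf.nn.relu(out)
    return out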


Inference

  • Retrieve the last conv layer and the final output from the graph.
last_conv_layer = graph.get_tensor_by_name('conv5_3/conv5_3:0')
last_conv_layer
<tf.Tensor 'conv5_3/conv5_3:0' shape=(?, 14, 14, 512) dtype=float32>


output = graph.get_tensor_by_name('fc3/BiasAdd:0')
output
<tf.Tensor 'fc3/BiasAdd:0' shape=(?, 1000) dtype=float32>


  • Compute the gradients (w.r.t. the last conv feature maps) for the class with the highest probability.
\[\text{grad}= \frac{\partial y_{\text{label}_i}}{\partial\,\text{last conv layer}}, \quad \text{where } (\text{gradients}> 0)\ \&\ (\text{relu output} > 0)\]
gradient = tf.gradients(output[:,tf.squeeze(labels_tf,-1)], last_conv_layer)[0]

gradient
<tf.Tensor 'gradients/pool5_grad/MaxPoolGrad:0' shape=(?, 14, 14, 512) dtype=float32>
\[\text{norm}_i = \frac{\text{grad}_i}{\sqrt{\frac{1}{n}\sum_i \text{grad}_i^2} + \epsilon}, \quad \epsilon = 10^{-5}\]
norm_grads = tf.div(gradient, tf.sqrt(tf.reduce_mean(tf.square(gradient))) + tf.constant(1e-5))
norm_grads
<tf.Tensor 'div:0' shape=(?, 14, 14, 512) dtype=float32>


sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())
logits_classes = sess.run(output, 
                          feed_dict={images_tf: np.expand_dims(img, axis = 0)}
                         )
logits_classes.shape
(1, 1000)


  • Remove redundant axis
pred = np.squeeze(logits_classes, axis=0)
pred.shape
(1000,)
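
  • These values are raw logits rather than probabilities; if probabilities are wanted, a softmax can be applied first (a quick sketch):
# numerically stabilized softmax over the logits
probs = np.exp(pred - pred.max())
probs = probs / probs.sum()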


  • np.argsort returns indices ordered from the smallest value to the largest.
  • $[::-1]$ reverses the order to largest-first.
  • Keep the six classes with the highest logits; label_1 is the top-1 class and label_2 the 6th-ranked one.
pred = (np.argsort(pred)[::-1])[0:6]
pred
array([258, 279, 222, 257, 270, 250])
label_1 = pred[0]
label_1
258
label_2 = pred[5]
label_2
250



Gradient Class Activation Maps

fmaps = last_conv_layer
gradients= norm_grads 
height = 224 # upsampled height
width = 224 # upsampled width
num_fmaps = 512 # number of feature map for last conv


gradients.shape.as_list()
[None, 14, 14, 512]


  • Average each channel over its spatial dimensions (global average pooling).
weights = tf.reduce_mean(gradients, axis=(1,2))
weights.shape.as_list()
[None, 512]
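
  • Up to the gradient normalization applied above, this global average pooling is exactly the Grad-CAM channel weight from the paper:
\[\alpha_k^{c} = \frac{1}{Z}\sum_{i}\sum_{j} \frac{\partial y^{c}}{\partial A_{ij}^{k}}\]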


  • Resize bilinear

    • [None, 14, 14, 512] -> [None, 224, 224, 512]
fmaps_resized = tf.image.resize_bilinear(fmaps, [height, width] )
fmaps_resized
<tf.Tensor 'ResizeBilinear:0' shape=(?, 224, 224, 512) dtype=float32>


  • 4D tensor $\rightarrow$ 3D tensor
fmaps_reshaped = tf.reshape(fmaps_resized, [-1, height*width, num_fmaps]) 
fmaps_reshaped
<tf.Tensor 'Reshape:0' shape=(?, 50176, 512) dtype=float32>


  • 2D tensor $\rightarrow$ 3D tensor
label_w = tf.reshape( weights, [-1, num_fmaps, 1])
label_w
<tf.Tensor 'Reshape_1:0' shape=(?, 512, 1) dtype=float32>


  • Batch multiplication
    • Last conv feature maps $\times$ GAP of the gradients for the class
classmap = tf.matmul(fmaps_reshaped, label_w )
classmap
<tf.Tensor 'MatMul:0' shape=(?, 50176, 1) dtype=float32>
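
  • The batched matmul is the weighted sum of the feature maps over channels, i.e. the Grad-CAM map (note that, unlike the original paper, no final ReLU is applied here, so the maps below contain negative values):
\[L^{c} = \sum_{k} \alpha_k^{c} A^{k}\]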


  • Restore the original image size.
classmap = tf.reshape( classmap, [-1, height, width] )
classmap
<tf.Tensor 'Reshape_2:0' shape=(?, 224, 224) dtype=float32>


  • Compute the class maps.
class_map1 = sess.run(classmap, feed_dict={ images_tf: np.expand_dims(img, axis = 0),labels_tf: [label_1]})
class_map2 = sess.run(classmap, feed_dict={ images_tf: np.expand_dims(img, axis = 0),labels_tf: [label_2]})

print(class_map1.shape)
print(class_map2.shape)

(1, 224, 224)
(1, 224, 224)


class_map1 = np.squeeze(class_map1, axis= 0)
class_map2 = np.squeeze(class_map2, axis= 0)

print(class_map1.shape)
print(class_map2.shape)
(224, 224)
(224, 224)


class_map1
array([[  11.326534,   10.256481,    9.186434, ..., -139.50922 ,
        -139.50922 , -139.50922 ],
       [  11.11174 ,   10.066767,    9.021786, ..., -128.73962 ,
        -128.73962 , -128.73962 ],
       [  10.896954,    9.877047,    8.857143, ..., -117.97    ,
        -117.97    , -117.97    ],
       ...,
       [  60.042534,   56.245735,   52.44895 , ...,   75.687744,
          75.687744,   75.687744],
       [  60.042534,   56.245735,   52.44895 , ...,   75.687744,
          75.687744,   75.687744],
       [  60.042534,   56.245735,   52.44895 , ...,   75.687744,
          75.687744,   75.687744]], dtype=float32)


class_map2
array([[   9.975134,    9.379017,    8.782894, ..., -125.02382 ,
        -125.02382 , -125.02382 ],
       [  10.092278,    9.504257,    8.916239, ..., -116.4028  ,
        -116.4028  , -116.4028  ],
       [  10.209423,    9.629507,    9.049589, ..., -107.7818  ,
        -107.7818  , -107.7818  ],
       ...,
       [  66.320564,   62.716473,   59.11238 , ...,   62.742596,
          62.742596,   62.742596],
       [  66.320564,   62.716473,   59.11238 , ...,   62.742596,
          62.742596,   62.742596],
       [  66.320564,   62.716473,   59.11238 , ...,   62.742596,
          62.742596,   62.742596]], dtype=float32)



Visualize

def normalize(img):
    """Normalize the image range for visualization"""
    return np.uint8((img - img.min()) / (img.max()-img.min())*255)
fig, axs = plt.subplots(1,2, figsize=(10,10))
axs[0].imshow(img)
axs[0].imshow(normalize(class_map1), cmap=plt.cm.jet, alpha=0.5, interpolation='nearest')
axs[0].set_title('1st class: %s' %class_names[label_1])
axs[0].axis('off')

axs[1].imshow(img)
axs[1].imshow(normalize(class_map2), cmap=plt.cm.jet, alpha=0.5, interpolation='nearest')
axs[1].set_title('6th class: %s' %class_names[label_2])
axs[1].axis('off')
(-0.5, 223.5, 223.5, -0.5)

  • Find the contours and draw a bounding box around the one with the largest area.
heatmap = class_map1
threshold = 0.3
# Binarize the heatmap
_, thresholded_heatmap = cv2.threshold(heatmap, threshold * heatmap.max(), 1, cv2.THRESH_BINARY)
plt.imshow(thresholded_heatmap)
<matplotlib.image.AxesImage at 0x158bfab00>

# findContours requires an 8-bit (uint8) image
print('Before:',thresholded_heatmap.dtype)
thresholded_heatmap = cv2.convertScaleAbs(thresholded_heatmap)
print('After:',thresholded_heatmap.dtype)
Before: float32
After: uint8


contours, _ = cv2.findContours(thresholded_heatmap, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
print('number of contours:',len(contours))
number of contours: 1


  • Compute the area of each contour.
contour_areas = []   
for i, c in enumerate(contours):
    contour_areas.append(cv2.contourArea(c))
# sort the contours by area, largest first
sorted_contours = sorted(zip(contour_areas, contours), key=lambda x:x[0], reverse=True)
# select the contour with the largest area
biggest_contour = sorted_contours[0][1]
# -1 : draw all the given contours
# (255, 255, 255): color
# 3 : thickness
contour_image = cv2.drawContours(img.copy(), [biggest_contour], -1, (255, 255, 255), 3)
plt.imshow(contour_image)
<matplotlib.image.AxesImage at 0x15c3bf5c0>

x,y,w,h = cv2.boundingRect(biggest_contour)
x,y,w,h
(88, 24, 136, 200)


box_image = cv2.rectangle(img.copy(), (x,y), (x+w, y+h), (0, 255,0), 2)
plt.imshow(box_image)
plt.axis('off')
(-0.5, 223.5, 223.5, -0.5)
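
  • Optionally, the box can be annotated with the predicted class name (an illustrative addition, not in the original code):
# write the (shortened) class name just above the box
labeled = box_image.copy()
cv2.putText(labeled, class_names[label_1].split(',')[0], (x, max(y - 5, 10)),
            cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 1)
plt.imshow(labeled)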


Saliency map

  • Compute the gradients with respect to the input image instead of the last conv layer.
\[\frac{\partial y_{\text{label}_i}}{\partial\, \text{input } \textbf{x}}\]
gradient_bp = tf.gradients(output[:,tf.squeeze(labels_tf,-1)], images_tf)[0]
gradient_bp
<tf.Tensor 'gradients_1/split_grad/concat:0' shape=(?, 224, 224, 3) dtype=float32>


  • Normalize the gradients.
norm_grads_bp = tf.div(gradient_bp, tf.sqrt(tf.reduce_mean(tf.square(gradient_bp))) + tf.constant(1e-5))
norm_grads_bp
<tf.Tensor 'div_1:0' shape=(?, 224, 224, 3) dtype=float32>


# Gradients computation
grads_weights1 = sess.run(norm_grads_bp, feed_dict={images_tf: np.expand_dims(img, axis = 0),
                                               labels_tf: [label_1]})
grads_weights2 = sess.run(norm_grads_bp, feed_dict={images_tf: np.expand_dims(img, axis = 0),
                                               labels_tf: [label_2]})

grads_weights1 = np.squeeze(grads_weights1)
grads_weights2 = np.squeeze(grads_weights2)


  • Min-max normalize to uint8.
fig, axs = plt.subplots(1,2, figsize=(10,10))
axs[0].imshow(normalize(grads_weights1), cmap=plt.cm.jet, alpha=0.5, interpolation='nearest')
axs[0].set_title('1st class: %s' %class_names[label_1])
axs[0].axis('off')

axs[1].imshow(normalize(grads_weights2), cmap=plt.cm.jet, alpha=0.5, interpolation='nearest')
axs[1].set_title('6th class: %s' %class_names[label_2])
axs[1].axis('off')
(-0.5, 223.5, 223.5, -0.5)

  • Variables used for the saliency map:

    • gradients w.r.t. the input image
    • gradients w.r.t. the last conv feature maps
def saliency_normalize(img):
    """Normalize the image range for visualization"""
    return (img - img.min()) / (img.max()-img.min())
# range 0 ~ 1
saliency_normalize(class_map1)
array([[0.16219318, 0.16104256, 0.15989193, ..., 0.        , 0.        ,
        0.        ],
       [0.16196221, 0.16083856, 0.15971489, ..., 0.01158051, 0.01158051,
        0.01158051],
       [0.16173124, 0.16063455, 0.15953785, ..., 0.02316104, 0.02316104,
        0.02316104],
       ...,
       [0.21457733, 0.21049462, 0.20641196, ..., 0.23140056, 0.23140056,
        0.23140056],
       [0.21457733, 0.21049462, 0.20641196, ..., 0.23140056, 0.23140056,
        0.23140056],
       [0.21457733, 0.21049462, 0.20641196, ..., 0.23140056, 0.23140056,
        0.23140056]], dtype=float32)


  • Multiply the class activation map ($\frac{\partial \textbf{y}}{\partial \text{last conv}}$), normalized to $0 \sim 1$, into each channel of the gradient image ($\frac{\partial \textbf{y}}{\partial \textbf{x}}$).
gradBGR1 = normalize(grads_weights1)
# VGG16 uses BGR internally, so we manually change BGR to RGB
gradRGB_cam1 = np.dstack((
    normalize(gradBGR1[:, :, 2]* saliency_normalize(class_map1)),
    normalize(gradBGR1[:, :, 1]* saliency_normalize(class_map1)),
    normalize(gradBGR1[:, :, 0]* saliency_normalize(class_map1))
))

gradBGR2 = normalize(grads_weights2)
# VGG16 uses BGR internally, so we manually change BGR to RGB
gradRGB_cam2 = np.dstack((
    normalize(gradBGR2[:, :, 2]* saliency_normalize(class_map2)),
    normalize(gradBGR2[:, :, 1]* saliency_normalize(class_map2)),
    normalize(gradBGR2[:, :, 0]* saliency_normalize(class_map2))
))
fig, axs = plt.subplots(1,2, figsize=(10,10))
axs[0].imshow(gradRGB_cam1, cmap=plt.cm.jet, alpha=0.5, interpolation='nearest')
axs[0].set_title('1st class: %s' %class_names[label_1])
axs[0].axis('off')

axs[1].imshow(gradRGB_cam2, cmap=plt.cm.jet, alpha=0.5, interpolation='nearest')
axs[1].set_title('6th class: %s' %class_names[label_2])
axs[1].axis('off')
(-0.5, 223.5, 223.5, -0.5)
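
  • This channel-wise product of the guided-backprop gradients and the class activation map is the Guided Grad-CAM of the paper:
\[\text{Guided Grad-CAM} = \frac{\partial y^{c}}{\partial \textbf{x}}\bigg|_{\text{guided}} \odot\ L^{c}\]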


Fine-tuning

loss_tf = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(logits=output, labels= labels_tf ), name='loss_tf')
loss_tf
<tf.Tensor 'loss_tf:0' shape=() dtype=float32>
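
  • Per example, sparse_softmax_cross_entropy_with_logits computes the negative log-softmax of the true class:
\[\mathcal{L}_i = -\log \frac{e^{o_{i,\,y_i}}}{\sum_{c} e^{o_{i,\,c}}}\]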


tf.trainable_variables() 
[<tf.Variable 'conv1_1/W:0' shape=(3, 3, 3, 64) dtype=float32_ref>,
 <tf.Variable 'conv1_1/b:0' shape=(64,) dtype=float32_ref>,
 <tf.Variable 'conv1_2/W:0' shape=(3, 3, 64, 64) dtype=float32_ref>,
 <tf.Variable 'conv1_2/b:0' shape=(64,) dtype=float32_ref>,
 <tf.Variable 'conv2_1/W:0' shape=(3, 3, 64, 128) dtype=float32_ref>,
 <tf.Variable 'conv2_1/b:0' shape=(128,) dtype=float32_ref>,
 <tf.Variable 'conv2_2/W:0' shape=(3, 3, 128, 128) dtype=float32_ref>,
 <tf.Variable 'conv2_2/b:0' shape=(128,) dtype=float32_ref>,
 <tf.Variable 'conv3_1/W:0' shape=(3, 3, 128, 256) dtype=float32_ref>,
 <tf.Variable 'conv3_1/b:0' shape=(256,) dtype=float32_ref>,
 <tf.Variable 'conv3_2/W:0' shape=(3, 3, 256, 256) dtype=float32_ref>,
 <tf.Variable 'conv3_2/b:0' shape=(256,) dtype=float32_ref>,
 <tf.Variable 'conv3_3/W:0' shape=(3, 3, 256, 256) dtype=float32_ref>,
 <tf.Variable 'conv3_3/b:0' shape=(256,) dtype=float32_ref>,
 <tf.Variable 'conv4_1/W:0' shape=(3, 3, 256, 512) dtype=float32_ref>,
 <tf.Variable 'conv4_1/b:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'conv4_2/W:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'conv4_2/b:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'conv4_3/W:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'conv4_3/b:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'conv5_1/W:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'conv5_1/b:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'conv5_2/W:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'conv5_2/b:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'conv5_3/W:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'conv5_3/b:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'fc1/W:0' shape=(25088, 4096) dtype=float32_ref>,
 <tf.Variable 'fc1/b:0' shape=(4096,) dtype=float32_ref>,
 <tf.Variable 'fc2/W:0' shape=(4096, 4096) dtype=float32_ref>,
 <tf.Variable 'fc2/b:0' shape=(4096,) dtype=float32_ref>,
 <tf.Variable 'fc3/W:0' shape=(4096, 1000) dtype=float32_ref>,
 <tf.Variable 'fc3/b:0' shape=(1000,) dtype=float32_ref>]


  • Note that a filter object is a one-shot iterator: once consumed, it is exhausted, which is why it is re-created below.
weights_only = filter( lambda x: x.name.endswith('W:0'), tf.trainable_variables() )
list(weights_only)
[<tf.Variable 'conv1_1/W:0' shape=(3, 3, 3, 64) dtype=float32_ref>,
 <tf.Variable 'conv1_2/W:0' shape=(3, 3, 64, 64) dtype=float32_ref>,
 <tf.Variable 'conv2_1/W:0' shape=(3, 3, 64, 128) dtype=float32_ref>,
 <tf.Variable 'conv2_2/W:0' shape=(3, 3, 128, 128) dtype=float32_ref>,
 <tf.Variable 'conv3_1/W:0' shape=(3, 3, 128, 256) dtype=float32_ref>,
 <tf.Variable 'conv3_2/W:0' shape=(3, 3, 256, 256) dtype=float32_ref>,
 <tf.Variable 'conv3_3/W:0' shape=(3, 3, 256, 256) dtype=float32_ref>,
 <tf.Variable 'conv4_1/W:0' shape=(3, 3, 256, 512) dtype=float32_ref>,
 <tf.Variable 'conv4_2/W:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'conv4_3/W:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'conv5_1/W:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'conv5_2/W:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'conv5_3/W:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'fc1/W:0' shape=(25088, 4096) dtype=float32_ref>,
 <tf.Variable 'fc2/W:0' shape=(4096, 4096) dtype=float32_ref>,
 <tf.Variable 'fc3/W:0' shape=(4096, 1000) dtype=float32_ref>]


weights_only = filter( lambda x: x.name.endswith('W:0'), tf.trainable_variables() )
[tf.nn.l2_loss(x) for x in list(weights_only)]
[<tf.Tensor 'L2Loss:0' shape=() dtype=float32>,
 <tf.Tensor 'L2Loss_1:0' shape=() dtype=float32>,
 <tf.Tensor 'L2Loss_2:0' shape=() dtype=float32>,
 <tf.Tensor 'L2Loss_3:0' shape=() dtype=float32>,
 <tf.Tensor 'L2Loss_4:0' shape=() dtype=float32>,
 <tf.Tensor 'L2Loss_5:0' shape=() dtype=float32>,
 <tf.Tensor 'L2Loss_6:0' shape=() dtype=float32>,
 <tf.Tensor 'L2Loss_7:0' shape=() dtype=float32>,
 <tf.Tensor 'L2Loss_8:0' shape=() dtype=float32>,
 <tf.Tensor 'L2Loss_9:0' shape=() dtype=float32>,
 <tf.Tensor 'L2Loss_10:0' shape=() dtype=float32>,
 <tf.Tensor 'L2Loss_11:0' shape=() dtype=float32>,
 <tf.Tensor 'L2Loss_12:0' shape=() dtype=float32>,
 <tf.Tensor 'L2Loss_13:0' shape=() dtype=float32>,
 <tf.Tensor 'L2Loss_14:0' shape=() dtype=float32>,
 <tf.Tensor 'L2Loss_15:0' shape=() dtype=float32>]


weights_only = filter( lambda x: x.name.endswith('W:0'), tf.trainable_variables() )
weight_decay = tf.reduce_sum(tf.stack([tf.nn.l2_loss(x) for x in weights_only])) * 0.0005 # decay rate
weight_decay
<tf.Tensor 'mul:0' shape=() dtype=float32>


  • Use the loss below as the objective for training.
loss_tf += weight_decay
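
  • A minimal training step might then look like this (the optimizer and learning rate are assumptions, not from the original post):
# hypothetical fine-tuning step; optimizer and learning rate are illustrative
optimizer = tf.train.MomentumOptimizer(learning_rate=1e-4, momentum=0.9)
train_op = optimizer.minimize(loss_tf)
# sess.run(train_op, feed_dict={images_tf: batch_images, labels_tf: batch_labels})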


  • The accuracy op can be defined as follows.
tf.argmax(output, 1)
<tf.Tensor 'ArgMax:0' shape=(?,) dtype=int64>


  • The output dtype of argmax can also be changed (here to tf.int32 to match labels_tf).
correct_pred = tf.equal(tf.argmax(output, 1, output_type=tf.int32), labels_tf)
correct_pred
<tf.Tensor 'Equal:0' shape=(?,) dtype=bool>


accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))
accuracy
<tf.Tensor 'Mean_3:0' shape=() dtype=float32>

Reference

https://github.com/waleedgondal/weakly_supervised_localizations_tf

https://github.com/insikk/Grad-CAM-tensorflow/blob/master/utils.py

