基於TensorFlow Object Detection API進行遷移學習訓練自己的人臉檢測模型(二)


前言

已完成數據預處理工作,具體參照:

基於TensorFlow Object Detection API進行遷移學習訓練自己的人臉檢測模型(一)

設置配置文件

新建目錄face_faster_rcnn

將上文已完成預數據處理的目錄data移動至face_faster_rcnn目錄下,

並在face_faster_rcnn目錄下創建face_label.pbtxt文件,內容如下:

item {
  id: 1
  name: 'face'
}

在已下載的TensorFlow Object Detection API目錄下搜索faster_rcnn_inception_v2_coco.config,具體目錄models-master\research\object_detection\samples\configs,將其拷貝至face_faster_rcnn目錄下

 

在存儲庫中,faster_rcnn_inception_v2_coco.config文件用來訓練人工神經網絡的配置文件。該文件基於pet檢測器。
在本例中,num_classes的數量仍然是一個,因為只有人臉才會被識別。

 


變量fine_tune_checkpoint用於指示以前模型的路徑以獲得學習。微調檢查點文件(fine tune checkpoint file)在應用轉移學習上被使用。轉移學習是一種機器學習方法,它專注於將從一個問題中獲得的知識應用到另一個問題上。

 


在類train_input_reader中,用帶有TFRecord文件的鏈接以訓練模型。在配置文件中,需要將其自定義到正確的位置。
變量label_map_path包含索引ID和名稱。使用這個文件,0被用作占位符,所以我們從數字1開始。


驗證有兩個很重要的變量。在eval_config類中的變量 num_examples用於設置示例的數量。
eval_input_reader類描述了驗證數據的位置。在這個位置也有一條路徑。

 


此外,還可以改變學習速度、批量大小和其他設置。現在,我保留了默認設置。

完整的faster_rcnn_inception_v2_coco.config

# Faster R-CNN with Inception v2, configuration for MSCOCO Dataset.
# Users should configure the fine_tune_checkpoint field in the train config as
# well as the label_map_path and input_path fields in the train_input_reader and
# eval_input_reader. Search for "PATH_TO_BE_CONFIGURED" to find the fields that
# should be configured.


model {
  faster_rcnn {
    num_classes: 1
    image_resizer {
      keep_aspect_ratio_resizer {
        min_dimension: 600
        max_dimension: 1024
      }
    }
    feature_extractor {
      type: 'faster_rcnn_inception_v2'
      first_stage_features_stride: 16
    }
    first_stage_anchor_generator {
      grid_anchor_generator {
        scales: [0.25, 0.5, 1.0, 2.0]
        aspect_ratios: [0.5, 1.0, 2.0]
        height_stride: 16
        width_stride: 16
      }
    }
    first_stage_box_predictor_conv_hyperparams {
      op: CONV
      regularizer {
        l2_regularizer {
          weight: 0.0
        }
      }
      initializer {
        truncated_normal_initializer {
          stddev: 0.01
        }
      }
    }
    first_stage_nms_score_threshold: 0.0
    first_stage_nms_iou_threshold: 0.7
    first_stage_max_proposals: 300
    first_stage_localization_loss_weight: 2.0
    first_stage_objectness_loss_weight: 1.0
    initial_crop_size: 14
    maxpool_kernel_size: 2
    maxpool_stride: 2
    second_stage_box_predictor {
      mask_rcnn_box_predictor {
        use_dropout: false
        dropout_keep_probability: 1.0
        fc_hyperparams {
          op: FC
          regularizer {
            l2_regularizer {
              weight: 0.0
            }
          }
          initializer {
            variance_scaling_initializer {
              factor: 1.0
              uniform: true
              mode: FAN_AVG
            }
          }
        }
      }
    }
    second_stage_post_processing {
      batch_non_max_suppression {
        score_threshold: 0.0
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 300
      }
      score_converter: SOFTMAX
    }
    second_stage_localization_loss_weight: 2.0
    second_stage_classification_loss_weight: 1.0
  }
}

train_config: {
  batch_size: 1
  optimizer {
    momentum_optimizer: {
      learning_rate: {
        manual_step_learning_rate {
          initial_learning_rate: 0.0002
          schedule {
            step: 900000
            learning_rate: .00002
          }
          schedule {
            step: 1200000
            learning_rate: .000002
          }
        }
      }
      momentum_optimizer_value: 0.9
    }
    use_moving_average: false
  }
  gradient_clipping_by_norm: 10.0
  fine_tune_checkpoint: "data/faster_rcnn_inception_v2_coco_2018_01_28/model.ckpt"
  from_detection_checkpoint: true
  # Note: The below line limits the training process to 200K steps, which we
  # empirically found to be sufficient enough to train the COCO dataset. This
  # effectively bypasses the learning rate schedule (the learning rate will
  # never decay). Remove the below line to train indefinitely.
  num_steps: 200000
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
}

train_input_reader: {
  tf_record_input_reader {
    input_path: "data/train.record"
  }
  label_map_path: "face_label.pbtxt"
}

eval_config: {
  num_examples: 8000
  # Note: The below line limits the evaluation process to 10 evaluations.
  # Remove the below line to evaluate indefinitely.
  max_evals: 10
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: "data/val.record"
  }
  label_map_path: "face_label.pbtxt"
  shuffle: false
  num_readers: 1
}

訓練

將models-master\research目錄下的object_detection文件夾整個拷貝至face_faster_rcnn目錄下,並將object_detection目錄下的train.py、export_inference_graph.py、eval.py三個文件拷貝至face_faster_rcnn目錄,至此目錄結構全部完成

紅框可忽略,為已訓練的模型

現在,它將開始真正的工作。計算機將從人臉檢測數據集中學習並建立一個神經網絡。當我在CPU上模擬訓練時,需要幾天的時間才能得到一個好的結果。但強大的Nvidia顯卡可以將時間縮短為幾個小時。

python train.py --logtostderr --train_dir=data/ --pipeline_config_path=faster_rcnn_inception_v2_coco.config  --train_dir=model_output

運行會出現以下異常

ValueError: Tried to convert 't' to a tensor and failed. Error: Argument must be a dense tensor: range(0, 3) - got shape [3], but wanted [].

可能是python3兼容性問題

把object_detection/utils/learning_schedules.py文件的 第167-169行由

#修改167-169行
rate_index = tf.reduce_max(tf.where(tf.greater_equal(global_step, boundaries),
                                    range(num_boundaries),
                                    [0] * num_boundaries))
#修改成
rate_index = tf.reduce_max(tf.where(tf.greater_equal(global_step, boundaries),
                                    list(range(num_boundaries)),
                                    [0] * num_boundaries))

Tensorboard深入的了解了學習過程。該工具是Tensorflow的一部分,且可以自動安裝。

tensorboard --logdir=model_output

 

檢查點轉換為protobuf變為可運行的模型

在帶有計算機視覺庫Tensorflow目標識別檢測中使用該模型。 以下命令提供了模型存儲庫的位置和最后一個檢查點。文件夾文件夾將包含frozen_inference_graph.pb

python export_inference_graph.py --input_type image_tensor --pipeline_config_path faster_rcnn_inception_v2_coco.config --trained_checkpoint_prefix model_output/model.ckpt-11543 --output_directory face_faster_rcnn_model/

打包模型

tar zcvf face_faster_rcnn_model.tar.gz face_faster_rcnn_model

預測新圖片

編寫ImageTest.py

# coding: utf-8
import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile

from collections import defaultdict
from io import StringIO
from matplotlib import pyplot as plt
from PIL import Image

# This is needed since the notebook is stored in the object_detection folder.
#sys.path.append("..")
from object_detection.utils import ops as utils_ops

if tf.__version__ < '1.4.0':
    raise ImportError('Please upgrade your tensorflow installation to v1.4.* or later!')
  
# This is needed to display the images.
from object_detection.utils import label_map_util

from object_detection.utils import visualization_utils as vis_util

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
##Model preparation##

# What model to download.
MODEL_NAME = 'face_faster_rcnn_model'
MODEL_FILE = MODEL_NAME + '.tar.gz'
DOWNLOAD_BASE = 'http://download.tensorflow.org/models/object_detection/'

# Path to frozen detection graph. This is the actual model that is used for the object detection.
PATH_TO_CKPT = MODEL_NAME + '/frozen_inference_graph.pb'

# List of the strings that is used to add correct label for each box.
PATH_TO_LABELS = os.path.join('data', 'face_label.pbtxt')

NUM_CLASSES = 1

## Download Model##
#opener = urllib.request.URLopener()
#opener.retrieve(DOWNLOAD_BASE + MODEL_FILE, MODEL_FILE)
tar_file = tarfile.open(MODEL_FILE)
for file in tar_file.getmembers():
  file_name = os.path.basename(file.name)
  if 'frozen_inference_graph.pb' in file_name:
    tar_file.extract(file, os.getcwd())
    
## Load a (frozen) Tensorflow model into memory.
detection_graph = tf.Graph()
with detection_graph.as_default():
  od_graph_def = tf.GraphDef()
  with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
    serialized_graph = fid.read()
    od_graph_def.ParseFromString(serialized_graph)
    tf.import_graph_def(od_graph_def, name='')
    
## Loading label map
label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
category_index = label_map_util.create_category_index(categories)

def load_image_into_numpy_array(image):
  (im_width, im_height) = image.size
  return np.array(image.getdata()).reshape(
      (im_height, im_width, 3)).astype(np.uint8)
      
# For the sake of simplicity we will use only 2 images:
# image1.jpg
# image2.jpg
# If you want to test the code with your images, just add path to the images to the TEST_IMAGE_PATHS.
PATH_TO_TEST_IMAGES_DIR = 'test_images'
TEST_IMAGE_PATHS = [ os.path.join(PATH_TO_TEST_IMAGES_DIR, 'image{}.jpg'.format(i)) for i in range(1, 3) ]

# Size, in inches, of the output images.
IMAGE_SIZE = (12, 8)

def run_inference_for_single_image(image, graph):
  with graph.as_default():
    with tf.Session() as sess:
      # 獲取輸入和輸出張量
      ops = tf.get_default_graph().get_operations()
      all_tensor_names = {output.name for op in ops for output in op.outputs}
      tensor_dict = {}
      for key in [
          'num_detections', 'detection_boxes', 'detection_scores',
          'detection_classes', 'detection_masks'
      ]:
        tensor_name = key + ':0'
        if tensor_name in all_tensor_names:
          tensor_dict[key] = tf.get_default_graph().get_tensor_by_name(
              tensor_name)
      if 'detection_masks' in tensor_dict:
        # 以下處理僅針對單個圖像
        detection_boxes = tf.squeeze(tensor_dict['detection_boxes'], [0])
        detection_masks = tf.squeeze(tensor_dict['detection_masks'], [0])
        # 需要重從框坐標轉換成圖像坐標,並適合圖像大小。
        real_num_detection = tf.cast(tensor_dict['num_detections'][0], tf.int32)
        detection_boxes = tf.slice(detection_boxes, [0, 0], [real_num_detection, -1])
        detection_masks = tf.slice(detection_masks, [0, 0, 0], [real_num_detection, -1, -1])
        detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
            detection_masks, detection_boxes, image.shape[0], image.shape[1])
        detection_masks_reframed = tf.cast(
            tf.greater(detection_masks_reframed, 0.5), tf.uint8)
        # 通過添加批次尺寸來遵循慣例
        tensor_dict['detection_masks'] = tf.expand_dims(
            detection_masks_reframed, 0)
      image_tensor = tf.get_default_graph().get_tensor_by_name('image_tensor:0')

      # Run inference
      output_dict = sess.run(tensor_dict,
                             feed_dict={image_tensor: np.expand_dims(image, 0)})

      # 所有輸出都是FLUAT32 NUMPY數組,以適應轉換類型
      output_dict['num_detections'] = int(output_dict['num_detections'][0])
      output_dict['detection_classes'] = output_dict[
          'detection_classes'][0].astype(np.uint8)
      output_dict['detection_boxes'] = output_dict['detection_boxes'][0]
      output_dict['detection_scores'] = output_dict['detection_scores'][0]
      if 'detection_masks' in output_dict:
        output_dict['detection_masks'] = output_dict['detection_masks'][0]
  return output_dict              

for image_path in TEST_IMAGE_PATHS:
  image = Image.open(image_path)
  # 這個array在之后會被用來准備為圖片加上框和標簽
  image_np = load_image_into_numpy_array(image)
  # 擴展維度,因為模型期望圖像具有形狀:: [1, None, None, 3]
  image_np_expanded = np.expand_dims(image_np, axis=0)
  # 執行偵測任務
  output_dict = run_inference_for_single_image(image_np, detection_graph)
  # 檢測結果的可視化
  vis_util.visualize_boxes_and_labels_on_image_array(
      image_np,
      output_dict['detection_boxes'],
      output_dict['detection_classes'],
      output_dict['detection_scores'],
      category_index,
      instance_masks=output_dict.get('detection_masks'),
      use_normalized_coordinates=True,
      line_thickness=8)
  plt.figure(figsize=IMAGE_SIZE)
  plt.imshow(image_np)
  plt.show()
  

新建test_images文件夾,並放入圖片

修改MODEL_NAME,PATH_TO_LABELS

python ImageTest.py

效果

評估

除了用於Tensorflow目標識別檢測訓練的數據外,還有一個評估數據集。基於此評估數據集,可以計算精度。對於我的模型,我計算了精度(平均精度)。我以14392步的速度獲得了83.80%的分數(epochs)。對於這個過程,Tensorflow有一個腳本,使它可以在Tensorboard中看到分數是多少。除了訓練之外,建議你運行評估過程。

python eval.py --logtostderr --pipeline_config_path=faster_rcnn_inception_v2_coco.config  --checkpoint_dir=model_output --eval_dir=eval

檢測視頻人臉

新建CameraFaceTest.py

# coding: utf-8

# # Object Detection Demo
# Welcome to the object detection inference walkthrough!  This notebook will walk you step by step through the process of using a pre-trained model to detect objects in an image. Make sure to follow the [installation instructions](https://github.com/tensorflow/models/blob/master/object_detection/g3doc/installation.md) before you start.

# # Imports

# In[1]:

import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile

from collections import defaultdict
from io import StringIO
from matplotlib import pyplot as plt
from PIL import Image

import cv2                  #add 20170825
#此處修改rtsp路徑
cap = cv2.VideoCapture(1)   #add 20170825

# ## Env setup

# In[2]:                                  #delete 20170825
# This is needed to display the images.    #delete 20170825
#get_ipython().magic('matplotlib inline')   #delete 20170825

# This is needed since the notebook is stored in the object_detection folder.  
sys.path.append("..")


# ## Object detection imports
# Here are the imports from the object detection module.

# In[3]:

from object_detection.utils import label_map_util

from object_detection.utils import visualization_utils as vis_util


# # Model preparation 

# ## Variables
# 
# Any model exported using the `export_inference_graph.py` tool can be loaded here simply by changing `PATH_TO_CKPT` to point to a new .pb file.  
# 
# By default we use an "SSD with Mobilenet" model here. See the [detection model zoo](https://github.com/tensorflow/models/blob/master/object_detection/g3doc/detection_model_zoo.md) for a list of other models that can be run out-of-the-box with varying speeds and accuracies.

# In[4]:

# What model to download.

#此處修改模型路徑

#模型地址:https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md
MODEL_NAME = 'faster_rcnn_inception_v2_coco_2018_01_28'
MODEL_FILE = MODEL_NAME + '.tar.gz'
DOWNLOAD_BASE = 'http://download.tensorflow.org/models/object_detection/'

# Path to frozen detection graph. This is the actual model that is used for the object detection.
PATH_TO_CKPT = MODEL_NAME + '/frozen_inference_graph.pb'

# List of the strings that is used to add correct label for each box.
PATH_TO_LABELS = os.path.join('data', 'face_label.pbtxt')

NUM_CLASSES = 1


# ## Download Model

# In[5]:

#opener = urllib.request.URLopener()
#opener.retrieve(DOWNLOAD_BASE + MODEL_FILE, MODEL_FILE)
tar_file = tarfile.open(MODEL_FILE)
for file in tar_file.getmembers():
  file_name = os.path.basename(file.name)
  if 'frozen_inference_graph.pb' in file_name:
    tar_file.extract(file, os.getcwd())


# ## Load a (frozen) Tensorflow model into memory.

# In[6]:

detection_graph = tf.Graph()
with detection_graph.as_default():
  od_graph_def = tf.GraphDef()
  with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
    serialized_graph = fid.read()
    od_graph_def.ParseFromString(serialized_graph)
    tf.import_graph_def(od_graph_def, name='')


# ## Loading label map
# Label maps map indices to category names, so that when our convolution network predicts `5`, we know that this corresponds to `airplane`.  Here we use internal utility functions, but anything that returns a dictionary mapping integers to appropriate string labels would be fine

# In[7]:

label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
category_index = label_map_util.create_category_index(categories)

gpu_options = tf.GPUOptions(allow_growth=True)
#gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.2)
with detection_graph.as_default():
  with tf.Session(graph=detection_graph,config=tf.ConfigProto(gpu_options=gpu_options)) as sess:
    while True:    
      ret, image_np = cap.read()
      
      # 擴展維度,應為模型期待: [1, None, None, 3]
      image_np_expanded = np.expand_dims(image_np, axis=0)
      image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
      # 每個框代表一個物體被偵測到
      boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
       #每個分值代表偵測到物體的可信度.  
      scores = detection_graph.get_tensor_by_name('detection_scores:0')
      classes = detection_graph.get_tensor_by_name('detection_classes:0')
      num_detections = detection_graph.get_tensor_by_name('num_detections:0')
      # 執行偵測任務.  
      (boxes, scores, classes, num_detections) = sess.run(
          [boxes, scores, classes, num_detections],
          feed_dict={image_tensor: image_np_expanded})
      # 檢測結果的可視化
      vis_util.visualize_boxes_and_labels_on_image_array(
          image_np,
          np.squeeze(boxes),
          np.squeeze(classes).astype(np.int32),
          np.squeeze(scores),
          category_index,
          use_normalized_coordinates=True,
          line_thickness=8)
      cv2.imshow('object detection', cv2.resize(image_np,(800,600)))
      if cv2.waitKey(25) & 0xFF ==ord('q'):
        cv2.destroyAllWindows()
        break




注意!

本站转载的文章为个人学习借鉴使用,本站对版权不负任何法律责任。如果侵犯了您的隐私权益,请联系我们删除。



 
粤ICP备14056181号  © 2014-2021 ITdaan.com