
Machine Learning Notes...

 行走在理想边缘, published 2022-05-23 from Sichuan

        Learn how to train a custom deep learning model for object detection with Keras and TensorFlow.

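I. Setting up the environment

        1. First install TensorFlow 2. For reference, see "Machine Learning Notes - Installing tensorflow-gpu 2.2 + CUDA 10 + cuDNN 7.6.5 on Windows 10" on bashendixie5's CSDN blog: https://blog.csdn.net/bashendixie5/article/details/110260615

        2. Download the file vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5 in advance and place it in the folder C:\Users\<user>\.keras\models, where <user> is your Windows user name. Keras then loads the weights from this folder instead of downloading them during training.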
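
A quick check like the following can confirm that the weights file is where Keras will look for it. This is only a minimal sketch: the helper names `keras_models_dir` and `weights_cached` are made up for illustration, and it assumes the default cache location `.keras/models` under the user's home folder.

```python
from pathlib import Path

# File name taken from the step above.
WEIGHTS_NAME = "vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5"


def keras_models_dir(home=None):
    """Return <home>/.keras/models; `home` defaults to the current user's home."""
    base = Path(home) if home is not None else Path.home()
    return base / ".keras" / "models"


def weights_cached(home=None):
    """True if the VGG16 'notop' weights file is present in the cache folder."""
    return (keras_models_dir(home) / WEIGHTS_NAME).is_file()


if __name__ == "__main__":
    print("cache folder:", keras_models_dir())
    print("weights present:", weights_cached())
```

If the second line prints `False`, the file is missing or sits in a different folder.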

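II. Preparing the dataset

        1. First install an annotation tool. This walkthrough uses labelImg (GitHub: tzutalin/labelImg, https://github.com/tzutalin/labelImg). The repository describes installation in detail; if a package fails to install, download it locally and install it from the file. Then launch the tool.

        2. Annotate the images; each annotated image produces an XML file.

        3. Convert the XML files to CSV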

    import glob
    import xml.etree.ElementTree as ET

    import pandas as pd


    def xml_to_csv(path):
        xml_list = []
        # Read every annotation file in the folder
        for xml_file in glob.glob(path + '/*.xml'):
            tree = ET.parse(xml_file)
            root = tree.getroot()
            for member in root.findall('object'):
                # Fields are read by position: size[0]/size[1] are width/height,
                # member[0] is the class name, member[4] is the bounding box
                # (xmin, ymin, xmax, ymax)
                value = (root.find('filename').text,
                         int(root.find('size')[0].text),
                         int(root.find('size')[1].text),
                         member[0].text,
                         int(member[4][0].text),
                         int(member[4][1].text),
                         int(member[4][2].text),
                         int(member[4][3].text)
                         )
                xml_list.append(value)
        column_name = ['filename', 'width', 'height', 'class', 'xmin', 'ymin', 'xmax', 'ymax']

        # Split the rows into a training set and a validation set. A 3:1 ratio
        # is common; this script keeps 67% for training (about 2:1).
        split = int(len(xml_list) * 0.67)
        train_list = xml_list[:split]
        eval_list = xml_list[split:]

        # Save both sets as CSV files next to the annotations
        train_df = pd.DataFrame(train_list, columns=column_name)
        eval_df = pd.DataFrame(eval_list, columns=column_name)
        train_df.to_csv(path + '/train_peaches.csv', index=False)
        eval_df.to_csv(path + '/eval_peaches.csv', index=False)


    def main():
        path = 'C:/Users/zxc/Desktop/labelImg'
        xml_to_csv(path)
        print('Successfully converted xml to csv.')


    if __name__ == '__main__':
        main()
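
The conversion script reads fields by position (`member[4][0]` and so on), which depends on the order in which the annotation tool writes its elements. As a hedged alternative, the same fields can be looked up by tag name. The sketch below assumes the Pascal VOC layout that labelImg writes by default; the sample file name, class name, and coordinates are invented, and `rows_from_annotation` is a hypothetical helper, not part of the original script.

```python
import xml.etree.ElementTree as ET

# A small annotation in Pascal VOC layout. All values are made up.
SAMPLE_XML = """
<annotation>
    <filename>peach_001.jpg</filename>
    <size><width>640</width><height>480</height><depth>3</depth></size>
    <object>
        <name>peach</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox><xmin>120</xmin><ymin>80</ymin><xmax>360</xmax><ymax>300</ymax></bndbox>
    </object>
</annotation>
"""


def rows_from_annotation(root):
    """Yield (filename, width, height, class, xmin, ymin, xmax, ymax) per object."""
    filename = root.findtext("filename")
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        yield (
            filename, width, height, obj.findtext("name"),
            int(box.findtext("xmin")), int(box.findtext("ymin")),
            int(box.findtext("xmax")), int(box.findtext("ymax")),
        )


rows = list(rows_from_annotation(ET.fromstring(SAMPLE_XML.strip())))
print(rows)
```

Looking fields up by name keeps working if an element is added or reordered, at the cost of a few more lines.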

III. Training code

1. Configuration file

        config.py

    # import the necessary packages
    import os

    # define the base path to the input dataset and then use it to derive
    # the path to the images directory and annotation CSV file
    BASE_PATH = "C:/Users/zxc/Desktop/labelImg"
    # the images sit in the same folder as the annotations
    IMAGES_PATH = BASE_PATH
    ANNOTS_PATH = os.path.sep.join([BASE_PATH, "train_peaches.csv"])

    # define the path to the base output directory
    BASE_OUTPUT = "C:/Users/zxc/Desktop/labelImg/output"

    # define the path to the output serialized model, model training plot,
    # and testing image filenames
    MODEL_PATH = os.path.sep.join([BASE_OUTPUT, "detector.h5"])
    PLOT_PATH = os.path.sep.join([BASE_OUTPUT, "plot.png"])
    TEST_FILENAMES = os.path.sep.join([BASE_OUTPUT, "test_images.txt"])

    # initialize our initial learning rate, number of epochs to train
    # for, and the batch size
    INIT_LR = 1e-4
    NUM_EPOCHS = 25
    BATCH_SIZE = 32

2. Training the model

        train.py

    # import the necessary packages
    import config
    from tensorflow.keras.applications import VGG16
    from tensorflow.keras.layers import Flatten
    from tensorflow.keras.layers import Dense
    from tensorflow.keras.layers import Input
    from tensorflow.keras.models import Model
    from tensorflow.keras.optimizers import Adam
    from tensorflow.keras.preprocessing.image import img_to_array
    from tensorflow.keras.preprocessing.image import load_img
    from sklearn.model_selection import train_test_split
    import matplotlib.pyplot as plt
    import numpy as np
    import cv2
    import os

    # load the contents of the CSV annotations file, skipping the header
    # row that the XML-to-CSV script writes
    print("[INFO] loading dataset...")
    with open(config.ANNOTS_PATH) as f:
        rows = f.read().strip().split("\n")[1:]
    # initialize the list of data (images), our target output predictions
    # (bounding box coordinates), along with the filenames of the
    # individual images
    data = []
    targets = []
    filenames = []

    # loop over the rows
    for row in rows:
        # break the row into its columns: filename, width, height, class,
        # xmin, ymin, xmax, ymax; only the filename and the box are used
        (filename, _, _, _, startX, startY, endX, endY) = row.split(",")
        # derive the path to the input image, load the image (in OpenCV
        # format), and grab its dimensions
        imagePath = os.path.sep.join([config.IMAGES_PATH, filename])
        image = cv2.imread(imagePath)
        (h, w) = image.shape[:2]
        # scale the bounding box coordinates relative to the spatial
        # dimensions of the input image
        startX = float(startX) / w
        startY = float(startY) / h
        endX = float(endX) / w
        endY = float(endY) / h
        # load the image and preprocess it
        image = load_img(imagePath, target_size=(224, 224))
        image = img_to_array(image)
        # update our list of data, targets, and filenames
        data.append(image)
        targets.append((startX, startY, endX, endY))
        filenames.append(filename)

    # convert the data and targets to NumPy arrays, scaling the input
    # pixel intensities from the range [0, 255] to [0, 1]
    data = np.array(data, dtype="float32") / 255.0
    targets = np.array(targets, dtype="float32")
    # partition the data into training and testing splits using 90% of
    # the data for training and the remaining 10% for testing
    split = train_test_split(data, targets, filenames, test_size=0.10, random_state=42)
    # unpack the data split
    (trainImages, testImages) = split[:2]
    (trainTargets, testTargets) = split[2:4]
    (trainFilenames, testFilenames) = split[4:]
    # write the testing filenames to disk so that we can use them
    # when evaluating/testing our bounding box regressor; create the
    # output directory first in case it does not exist yet
    print("[INFO] saving testing filenames...")
    os.makedirs(config.BASE_OUTPUT, exist_ok=True)
    with open(config.TEST_FILENAMES, "w") as f:
        f.write("\n".join(testFilenames))

    # load the VGG16 network, ensuring the head FC layers are left off
    vgg = VGG16(weights="imagenet", include_top=False, input_tensor=Input(shape=(224, 224, 3)))
    # freeze all VGG layers so they will *not* be updated during the
    # training process
    vgg.trainable = False
    # flatten the max-pooling output of VGG
    flatten = vgg.output
    flatten = Flatten()(flatten)
    # construct a fully-connected layer header to output the predicted
    # bounding box coordinates; the sigmoid keeps each output in [0, 1],
    # matching the normalized targets
    bboxHead = Dense(128, activation="relu")(flatten)
    bboxHead = Dense(64, activation="relu")(bboxHead)
    bboxHead = Dense(32, activation="relu")(bboxHead)
    bboxHead = Dense(4, activation="sigmoid")(bboxHead)
    # construct the model we will fine-tune for bounding box regression
    model = Model(inputs=vgg.input, outputs=bboxHead)

    # initialize the optimizer, compile the model, and show the model
    # summary
    opt = Adam(learning_rate=config.INIT_LR)
    model.compile(loss="mse", optimizer=opt)
    model.summary()
    # train the network for bounding box regression
    print("[INFO] training bounding box regressor...")
    H = model.fit(
        trainImages, trainTargets,
        validation_data=(testImages, testTargets),
        batch_size=config.BATCH_SIZE,
        epochs=config.NUM_EPOCHS,
        verbose=1)

    # serialize the model to disk
    print("[INFO] saving object detector model...")
    model.save(config.MODEL_PATH, save_format="h5")
    # plot the model training history
    N = config.NUM_EPOCHS
    plt.style.use("ggplot")
    plt.figure()
    plt.plot(np.arange(0, N), H.history["loss"], label="train_loss")
    plt.plot(np.arange(0, N), H.history["val_loss"], label="val_loss")
    plt.title("Bounding Box Regression Loss on Training Set")
    plt.xlabel("Epoch #")
    plt.ylabel("Loss")
    plt.legend(loc="lower left")
    plt.savefig(config.PLOT_PATH)
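
The training script divides each box corner by the image width or height, so the targets lie in [0, 1] regardless of the original image size, which is what a sigmoid output layer can produce. The round trip can be sketched in isolation as below; `normalize_box` and `denormalize_box` are hypothetical helper names and the numbers are purely illustrative.

```python
def normalize_box(box, width, height):
    """Map pixel corners (xmin, ymin, xmax, ymax) to fractions of the image size."""
    xmin, ymin, xmax, ymax = box
    return (xmin / width, ymin / height, xmax / width, ymax / height)


def denormalize_box(box, width, height):
    """Map fractional corners back to integer pixel coordinates."""
    xmin, ymin, xmax, ymax = box
    return (int(xmin * width), int(ymin * height),
            int(xmax * width), int(ymax * height))


# Illustrative numbers: a 640x480 image with a box from (160, 120) to (480, 360).
norm = normalize_box((160, 120, 480, 360), 640, 480)
print(norm)                             # (0.25, 0.25, 0.75, 0.75)
print(denormalize_box(norm, 600, 450))  # (150, 112, 450, 337)
```

Because the box is stored as fractions, it can be drawn on the image at any display size, not only at the 224x224 resolution the network sees.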

3. Running predictions

        predict.py

    # import the necessary packages
    import config
    from tensorflow.keras.preprocessing.image import img_to_array
    from tensorflow.keras.preprocessing.image import load_img
    from tensorflow.keras.models import load_model
    import numpy as np
    import mimetypes
    import argparse
    import imutils
    import cv2
    import os

    # construct the argument parser and parse the arguments
    ap = argparse.ArgumentParser()
    ap.add_argument("-i", "--input", required=True,
        help="path to input image/text file of image filenames")
    args = vars(ap.parse_args())

    # determine the input file type, but assume that we're working with
    # a single input image
    filetype = mimetypes.guess_type(args["input"])[0]
    imagePaths = [args["input"]]
    # if the file type is a text file, then we need to process *multiple*
    # images
    if "text/plain" == filetype:
        # load the filenames in our testing file and initialize our list
        # of image paths
        with open(args["input"]) as f:
            filenames = f.read().strip().split("\n")
        imagePaths = []
        # loop over the filenames
        for f in filenames:
            # construct the full path to the image filename and then
            # update our image paths list
            p = os.path.sep.join([config.IMAGES_PATH, f])
            imagePaths.append(p)

    # load our trained bounding box regressor from disk
    print("[INFO] loading object detector...")
    model = load_model(config.MODEL_PATH)
    # loop over the images that we'll be testing using our bounding box
    # regression model
    for imagePath in imagePaths:
        # load the input image (in Keras format) from disk and preprocess
        # it, scaling the pixel intensities to the range [0, 1]
        image = load_img(imagePath, target_size=(224, 224))
        image = img_to_array(image) / 255.0
        image = np.expand_dims(image, axis=0)

        # make bounding box predictions on the input image
        preds = model.predict(image)[0]
        (startX, startY, endX, endY) = preds
        # load the input image (in OpenCV format), resize it to a width
        # of 600 pixels, and grab its dimensions
        image = cv2.imread(imagePath)
        image = imutils.resize(image, width=600)
        (h, w) = image.shape[:2]
        # scale the predicted bounding box coordinates based on the image
        # dimensions
        startX = int(startX * w)
        startY = int(startY * h)
        endX = int(endX * w)
        endY = int(endY * h)
        # draw the predicted bounding box on the image
        cv2.rectangle(image, (startX, startY), (endX, endY), (0, 255, 0), 2)
        # write the output image to disk; this fixed path is overwritten
        # on every iteration, so only the last result is kept
        cv2.imwrite("C:/Users/csv/Desktop/111.png", image)
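
The prediction script decides between a single image and a list of images purely from the MIME type guessed from the `--input` file name. A small sketch of that decision, separated from the model code, is given below. `expand_input` is a hypothetical helper, and its `listed_names` argument stands in for the lines read from the text file so that the sketch needs no real files.

```python
import mimetypes
import os


def expand_input(input_path, images_dir, listed_names=None):
    """Return the image paths implied by the --input argument.

    A .txt input is treated as a list of file names relative to images_dir;
    anything else is treated as a single image path.
    """
    if mimetypes.guess_type(input_path)[0] == "text/plain":
        return [os.path.join(images_dir, name) for name in (listed_names or [])]
    return [input_path]


print(mimetypes.guess_type("test_images.txt")[0])  # text/plain
print(mimetypes.guess_type("cat.jpg")[0])          # image/jpeg
print(expand_input("cat.jpg", "images"))           # ['cat.jpg']
```

In practice this means the script can be pointed either at one image or at the `test_images.txt` file that the training script writes.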

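4. Training result 1

        I had prepared only a few dozen images, and their subjects have no distinctive features, so the training result was very poor. I am not posting result images because nothing can be made out in them. Training does produce an .h5 file, which can be used for testing.

5. Training result 2

        I prepared 246 cat images and re-annotated the heads. The result was still unsatisfactory, most likely because the dataset is still too small and of mediocre quality.

        Judging from the predictions, the box position is still far off. I will prepare the dataset more carefully later and test again.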
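
One common way to put a number on "far off" is intersection over union (IoU) between the annotated box and the predicted box. This metric is not used in the scripts above; the sketch below is an assumption about how the predictions could be scored, with made-up boxes in normalized coordinates.

```python
def iou(box_a, box_b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Width and height of the overlap are clamped at zero for disjoint boxes.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Made-up boxes, purely for illustration.
truth = (0.0, 0.0, 0.5, 0.5)
pred = (0.25, 0.25, 0.75, 0.75)
print(iou(truth, truth))  # 1.0
print(iou(truth, pred))   # about 0.143 (1/7)
```

An IoU of 1.0 means a perfect match and 0.0 means no overlap; averaging it over the test images gives a single figure to compare between dataset revisions.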
