[Original] ESP32-CAM: Usage and Source Code Analysis

云深无际 2021-11-03


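I think this picture has to go right at the start.

These are some of the related features; the board simply inherits them from the ESP32 chip.

I forgot to mention that this thing weighs 10 g, so it can go on an aircraft. The catch is that its processing speed is really poor.

Here is some more detailed material.

The frame rate is too low; I don't think it has any practical application value.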

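Overview

MTMN consists of three main parts:

Proposal Network (P-Net): proposes candidate bounding boxes and sends them to R-Net;
Refine Network (R-Net): filters the bounding boxes from P-Net;
Output Network (O-Net): outputs the final results, i.e. the accurate bounding box, the confidence coefficient, and the 5-point landmarks.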

API Introduction

box_array_t *face_detect(dl_matrix3du_t *image_matrix, mtmn_config_t *config);

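face_detect() handles the whole face detection task.

The inputs are:

image_matrix: an image of type dl_matrix3du_t
config: the configuration of MTMN. For more details, see the "Advanced Configuration" section.

The output is:

A value of type box_array_t, which contains the face boxes as well as the score and landmarks of each box.

This structure is defined as follows: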

typedef struct tag_box_list
{
    fptp_t *score;
    box_t *box;
    landmark_t *landmark;
    int len;
} box_array_t;

The structure holds the heads of three arrays. Each array has the same length, which is the number of faces in the image.
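To make the "parallel arrays" idea concrete, here is a minimal sketch of how such a result could be walked. The typedefs for fptp_t, box_t and landmark_t are stand-ins that I am assuming for illustration (a float, four box coordinates, and ten landmark coordinates for 5 points); in a real project they come from the esp-face headers. best_face_index() is a hypothetical helper, not part of the library.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in types, assumed for illustration; the real ones come from esp-face. */
typedef float fptp_t;
typedef struct { fptp_t box_p[4]; } box_t;            /* assumed: x1, y1, x2, y2 */
typedef struct { fptp_t landmark_p[10]; } landmark_t; /* assumed: 5 points as x,y pairs */

typedef struct tag_box_list
{
    fptp_t *score;
    box_t *box;
    landmark_t *landmark;
    int len;
} box_array_t;

/* Hypothetical helper: index of the highest-scoring face, or -1 if there is none. */
int best_face_index(const box_array_t *faces)
{
    if (faces == NULL || faces->len <= 0) {
        return -1;
    }
    assert(faces->score != NULL);
    int best = 0;
    for (int i = 1; i < faces->len; i++) {
        /* score[i], box[i] and landmark[i] all describe the same face */
        if (faces->score[i] > faces->score[best]) {
            best = i;
        }
    }
    return best;
}
```

The same index can then be used to read box[i] and landmark[i] for that face.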

Advanced Configuration

face_detect() provides the config parameter for user customization.

box_array_t *face_detect(dl_matrix3du_t *image_matrix, mtmn_config_t *config);

Definition of mtmn_config_t:

typedef struct
{
    float min_face; /// The minimum size of a detectable face
    float pyramid; /// The scale of the gradient scaling for the input images
    int pyramid_times; /// The pyramid resizing times
    threshold_config_t p_threshold; /// The thresholds for P-Net. For details, see the definition of threshold_config_t
    threshold_config_t r_threshold; /// The thresholds for R-Net. For details, see the definition of threshold_config_t
    threshold_config_t o_threshold; /// The thresholds for O-Net. For details, see the definition of threshold_config_t
    mtmn_resize_type type; /// The image resize type. 'pyramid' will lose efficacy, when 'type'==FAST.
} mtmn_config_t;

typedef struct
{
    float score; /// The threshold of confidence coefficient. The candidate bounding boxes with a confidence coefficient lower than the threshold will be filtered out.
    float nms; /// The threshold of NMS. During the Non-Maximum Suppression, the candidate bounding boxes with a overlapping ratio higher than the threshold will be filtered out.
    int candidate_number; /// The maximum number of allowed candidate bounding boxes. Only the first 'candidate_number' of all the candidate bounding boxes will be kept.
} threshold_config_t;

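min_face:

Range: [12, the length of the shortest edge of the original input image).
For an original input image of a fixed size, the smaller min_face is,
the larger the number of generated images of different sizes;
the smaller the minimum size of a detectable face;
the longer the processing takes;
and vice versa.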

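pyramid

Specifies the scale that controls the generated pyramids.
Range: (0, 1)
For an original input image of a fixed size, the larger pyramid is,
the larger the number of generated images of different sizes;
the higher the detection ratio;
the longer the processing takes;
and vice versa.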

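pyramid_times

Specifies the number that controls the generated pyramids.
Range: [1, +inf)
Together with pyramid and min_face, it determines the main detectable face size: this lies within the range [min_face, min_face / pyramid^pyramid_times], where min_face / pyramid^pyramid_times < the length of the shortest edge of the original input image.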
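The upper end of that range is simple arithmetic, so a small sketch may help when picking values. max_main_face() is a hypothetical helper name of my own, not an esp-face function; it just evaluates min_face / pyramid^pyramid_times with a loop.

```c
#include <assert.h>

/* Hypothetical helper: upper end of the main detectable face range,
 * i.e. min_face / pyramid^pyramid_times. */
float max_main_face(float min_face, float pyramid, int pyramid_times)
{
    assert(pyramid > 0.0f && pyramid < 1.0f);
    float size = min_face;
    for (int i = 0; i < pyramid_times; i++) {
        size /= pyramid;   /* divide by the pyramid scale once per level */
    }
    return size;
}
```

With min_face = 80, pyramid = 0.707 and pyramid_times = 4, this works out to roughly 320, so the main detectable face sizes would span about [80, 320] pixels.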

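type

Options: FAST or NORMAL
FAST: pyramid equals 0.707106781 by default. With the same pyramid value, type FAST is faster than type NORMAL.
NORMAL: set type to NORMAL if you want to customize the pyramid value.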

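score threshold

Range: (0, 1)
For an original input image of a fixed size, the larger score is,
the larger the number of candidate bounding boxes that are filtered out;
the lower the detection ratio;
and vice versa.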

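nms threshold

Range: (0, 1)
For an original input image of a fixed size, the larger nms is,
the higher the possibility that an overlapped face is detected;
the larger the number of candidate bounding boxes detected for the same face;
and vice versa.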
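To see why a larger nms value lets more overlapping boxes through, here is a minimal sketch of generic greedy Non-Maximum Suppression using intersection over union (IoU). This is the textbook technique and not necessarily how esp-face implements it; cand_t, iou() and greedy_nms() are names I made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical candidate box for illustration (not an esp-face type). */
typedef struct {
    float x1, y1, x2, y2;
    float score;
} cand_t;

/* Intersection over union of two axis-aligned boxes. */
float iou(const cand_t *a, const cand_t *b)
{
    float ix1 = a->x1 > b->x1 ? a->x1 : b->x1;
    float iy1 = a->y1 > b->y1 ? a->y1 : b->y1;
    float ix2 = a->x2 < b->x2 ? a->x2 : b->x2;
    float iy2 = a->y2 < b->y2 ? a->y2 : b->y2;
    if (ix2 <= ix1 || iy2 <= iy1) {
        return 0.0f;   /* no overlap */
    }
    float inter = (ix2 - ix1) * (iy2 - iy1);
    float area_a = (a->x2 - a->x1) * (a->y2 - a->y1);
    float area_b = (b->x2 - b->x1) * (b->y2 - b->y1);
    return inter / (area_a + area_b - inter);
}

/* keep[i] becomes 1 for survivors and 0 for suppressed boxes.
 * Returns the number of survivors. */
size_t greedy_nms(const cand_t *c, size_t n, float nms_threshold, int *keep)
{
    assert(c != NULL && keep != NULL);
    for (size_t i = 0; i < n; i++) {
        keep[i] = -1;   /* -1 = not decided yet */
    }
    size_t kept = 0;
    for (;;) {
        /* pick the highest-scoring box that is still undecided */
        size_t best = n;
        for (size_t i = 0; i < n; i++) {
            if (keep[i] == -1 && (best == n || c[i].score > c[best].score)) {
                best = i;
            }
        }
        if (best == n) {
            break;      /* every box has been decided */
        }
        keep[best] = 1;
        kept++;
        /* suppress undecided boxes that overlap it by more than the threshold */
        for (size_t j = 0; j < n; j++) {
            if (keep[j] == -1 && iou(&c[best], &c[j]) > nms_threshold) {
                keep[j] = 0;
            }
        }
    }
    return kept;
}
```

For two boxes whose IoU is about 0.68, a threshold of 0.7 keeps both, while 0.5 suppresses the lower-scoring one, which matches the trend described above.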

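candidate number

Specifies the number of output candidate boxes of each network.
Range:
P-Net: [1, 200]
R-Net: [1, 100]
O-Net: [1, 10]
For an original input image of a fixed size,
the larger candidate_number is, the longer the processing takes;
the larger the candidate_number of O-Net is, the larger the number of detected faces;
and vice versa.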
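The score threshold and candidate_number work together on each stage's output: low-confidence boxes are dropped, then the list is capped. A minimal sketch of that idea follows. I am assuming the candidates are ranked by score before the cap is applied, which the text does not state; filter_candidates() is a hypothetical helper and works on bare scores to stay short.

```c
#include <assert.h>
#include <stdlib.h>

/* Comparator for qsort: higher scores first. */
static int by_score_desc(const void *a, const void *b)
{
    float sa = *(const float *)a;
    float sb = *(const float *)b;
    return (sa < sb) - (sa > sb);
}

/* Sorts scores in place (descending), then counts how many leading entries
 * are at or above score_threshold, capped at candidate_number.
 * The survivors are scores[0] .. scores[result - 1]. */
int filter_candidates(float *scores, int n, float score_threshold, int candidate_number)
{
    assert(scores != NULL && n >= 0 && candidate_number >= 0);
    qsort(scores, (size_t)n, sizeof scores[0], by_score_desc);
    int kept = 0;
    while (kept < n && kept < candidate_number && scores[kept] >= score_threshold) {
        kept++;
    }
    return kept;
}
```

Raising score_threshold or lowering candidate_number both shrink the list handed to the next stage, which is why they trade detection ratio against processing time.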

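Users can configure these parameters according to their actual requirements. See also the following recommended configuration for the general scenario (detecting one face):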

mtmn_config.type = FAST;
mtmn_config.min_face = 80;
mtmn_config.pyramid = 0.707;
mtmn_config.pyramid_times = 4;
mtmn_config.p_threshold.score = 0.6;
mtmn_config.p_threshold.nms = 0.7;
mtmn_config.p_threshold.candidate_number = 20;
mtmn_config.r_threshold.score = 0.7;
mtmn_config.r_threshold.nms = 0.7;
mtmn_config.r_threshold.candidate_number = 10;
mtmn_config.o_threshold.score = 0.7;
mtmn_config.o_threshold.nms = 0.7;
mtmn_config.o_threshold.candidate_number = 1;
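Since every parameter has a stated range, a quick sanity check before calling the detector can catch typos. The sketch below re-declares the two structs as stand-ins so that it compiles on its own (in a real build they come from the esp-face headers, and the enum values here are my assumption). mtmn_config_in_range() and threshold_ok() are hypothetical helpers, not library functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in declarations mirroring the structs quoted earlier (assumption). */
typedef enum { FAST = 0, NORMAL = 1 } mtmn_resize_type;

typedef struct {
    float score;
    float nms;
    int candidate_number;
} threshold_config_t;

typedef struct {
    float min_face;
    float pyramid;
    int pyramid_times;
    threshold_config_t p_threshold;
    threshold_config_t r_threshold;
    threshold_config_t o_threshold;
    mtmn_resize_type type;
} mtmn_config_t;

/* Hypothetical helper: true if one stage's thresholds are inside the listed ranges. */
static bool threshold_ok(const threshold_config_t *t, int max_candidates)
{
    return t->score > 0.0f && t->score < 1.0f &&
           t->nms > 0.0f && t->nms < 1.0f &&
           t->candidate_number >= 1 && t->candidate_number <= max_candidates;
}

/* Hypothetical helper: checks a config against the ranges listed in this section.
 * shortest_edge is the shortest side of the input image in pixels. */
bool mtmn_config_in_range(const mtmn_config_t *c, int shortest_edge)
{
    assert(c != NULL);
    return c->min_face >= 12.0f && c->min_face < (float)shortest_edge &&
           c->pyramid > 0.0f && c->pyramid < 1.0f &&
           c->pyramid_times >= 1 &&
           threshold_ok(&c->p_threshold, 200) &&   /* P-Net: [1, 200] */
           threshold_ok(&c->r_threshold, 100) &&   /* R-Net: [1, 100] */
           threshold_ok(&c->o_threshold, 10);      /* O-Net: [1, 10]  */
}
```

The recommended configuration above passes this check for an image whose shortest edge is 240 pixels; setting the O-Net candidate_number to 11 would fail it.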

Model Selection

MTMN now has three versions available:

MTMN lite in quantization (default)
MTMN lite in float
MTMN heavy in quantization

Performance

We use the same configuration and our own test set to evaluate all models. The results are shown below.

mtmn_config.type = FAST;
mtmn_config.pyramid = 0.707;
mtmn_config.min_face = 80;
mtmn_config.pyramid_times = 4;
mtmn_config.p_threshold.score = 0.6;
mtmn_config.p_threshold.nms = 0.7;
mtmn_config.p_threshold.candidate_number = 100;
mtmn_config.r_threshold.score = 0.7;
mtmn_config.r_threshold.nms = 0.7;
mtmn_config.r_threshold.candidate_number = 100;
mtmn_config.o_threshold.score = 0.7;
mtmn_config.o_threshold.nms = 0.7;
mtmn_config.o_threshold.candidate_number = 1;

That is the performance part.

https://github.com/espressif/esp-face/blob/master/face_detection/README.md

Reference material

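This is some of what gets printed once the service is up and stable.

Next comes another server-related print statement.

Because the target board has already been selected in the main file, the matching block in camera_pin is fully highlighted.

Start with the overview: it only includes the camera, WiFi, and pin header files.

What those headers include in turn is something to come back to later.

At the start it configures the serial port: it sets the baud rate and turns on debug output.

Next is the pin configuration, which is the part you need to change when porting with these libraries.

This code is quite clear. Nothing very low-level is written here, such as allocating memory; it only enables a higher resolution, which amounts to switching a feature on.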
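As I recall from the Arduino CameraWebServer example, the "higher resolution" switch depends on whether external PSRAM is present. The sketch below reconstructs that idea from memory and is not copied from the source: the struct and enum are stand-ins for the esp32-camera types, the specific values are my recollection, and choose_frame_settings() is a hypothetical function name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for illustration; the real enum and config struct live in esp32-camera. */
typedef enum { FRAMESIZE_SVGA, FRAMESIZE_UXGA } demo_framesize_t;

typedef struct {
    demo_framesize_t frame_size;
    int jpeg_quality;   /* lower number = higher quality */
    int fb_count;       /* number of frame buffers */
} demo_camera_config_t;

/* Hypothetical helper: pick frame settings based on whether PSRAM was found. */
void choose_frame_settings(demo_camera_config_t *config, bool has_psram)
{
    assert(config != NULL);
    if (has_psram) {
        /* enough memory for a large frame and double buffering */
        config->frame_size = FRAMESIZE_UXGA;
        config->jpeg_quality = 10;
        config->fb_count = 2;
    } else {
        /* internal RAM only: smaller frame, single buffer */
        config->frame_size = FRAMESIZE_SVGA;
        config->jpeg_quality = 12;
        config->fb_count = 1;
    }
}
```

In the Arduino sketch the flag would come from a runtime PSRAM check, and the driver itself takes care of the actual buffer allocation, which is why no memory handling shows up at this level.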

Camera initialization

Steps:

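Finding the camera

Supply the camera clock, initialize the SCCB bus, and hardware-reset the camera
Poll the addresses to find the camera, then read the camera ID and other information over the SCCB bus
Determine the model from the camera ID and bind the corresponding functions (the sensor configuration functions)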

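Initializing the camera

Based on the selected image format and on whether high-speed mode is used, choose the matching DMA BUF handler function and DMA FIFO mode
Initialize the I2S bus and enable the I2S_IN_DONE_INT interrupt, which fires whenever the current DMA receive linked-list descriptor has been processed
Initialize the DMA-related variables (linked-list descriptors, the linked list of data buffers used by DMA, and so on); a single DMA transfer is at most 4 KB, so also work out how many DMA transfers are needed per line
Initialize the data buffers that store the image (added to a linked list) and clear them
Initialize the related semaphores: DMA data capture complete, one frame capture complete, and image data buffer in/out
Create dma_filter_task, which converts DMA data into pixel data and stores it in the image buffer; it also checks for dirty data and decides whether a complete frame has been received
Initialize the vsync IO interrupt: the level toggles at the start and end of every frame
Configure the camera sensor (image size, format, and so on)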
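The "how many DMA transfers per line" step is easy to picture with a little arithmetic. The sketch below is only my illustration of the idea, assuming a line is split into equal chunks that each fit in one DMA transfer; the actual driver may choose its split differently. dma_transfers_per_line() is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: smallest number of equal-sized DMA transfers that
 * covers one line of line_bytes with each transfer at most max_chunk bytes. */
size_t dma_transfers_per_line(size_t line_bytes, size_t max_chunk)
{
    assert(line_bytes > 0 && max_chunk > 0);
    for (size_t n = 1; n <= line_bytes; n++) {
        /* n must divide the line evenly and the chunk must fit in one transfer */
        if (line_bytes % n == 0 && line_bytes / n <= max_chunk) {
            return n;
        }
    }
    return line_bytes;   /* not reached: n == line_bytes always gives 1-byte chunks */
}
```

With a 4096-byte limit, a 3200-byte line fits in one transfer, a 6400-byte line needs two, and a 12800-byte line needs four transfers of 3200 bytes each.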

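There is a little more code left, but since I have not studied much image recognition myself, I won't analyze it any further.

The code goes here.

The last part is the WiFi configuration; the settings made at the start are passed in here through a pointer.

That's it for the configuration part.

Since this is a demo, some of the logic code simply isn't written.

So overall, the framework is in place.

Then these two places are enormous arrays.

These are the header files of the last part of the code base.

As you can see, it pulls in the HTTP server, timer-related code, the camera, the image conversion library, and a camera index?

Plus the Arduino header and three libraries for face recognition.

It starts with macro definitions, whose names are self-explanatory.

Below are two structs.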