1. Transfer Learning with Frozen Layers

As part of the yolov5 training workflow, the authors describe a transfer-learning strategy based on frozen layers. See the official topic: Transfer Learning with Frozen Layers · Issue #1314 · ultralytics/yolov5 · GitHub

In many situations, transfer learning is a very useful technique: it lets you quickly retrain a model on a new dataset without retraining the whole network. Freezing part of the weights and updating only the rest when computing the loss takes less compute and less training time (i.e. reaches convergence faster) than full training. yolov5 implements frozen training by disabling gradient updates for the frozen parameters (requires_grad = False), so the optimizer never touches them. Let's look at the implementation details:

1.1 Layer structure

Print the model's layer structure with the following snippet:

for k, v in model.named_parameters():
    print(k)

# output

model.0.conv.weight
model.0.bn.weight
model.0.bn.bias
model.1.conv.weight
model.1.bn.weight
model.1.bn.bias
model.2.cv1.conv.weight
model.2.cv1.bn.weight
model.2.cv1.bn.bias
model.2.cv2.conv.weight
model.2.cv2.bn.weight
model.2.cv2.bn.bias
model.2.cv3.conv.weight
model.2.cv3.bn.weight
model.2.cv3.bn.bias
model.2.m.0.cv1.conv.weight
model.2.m.0.cv1.bn.weight
model.2.m.0.cv1.bn.bias
model.2.m.0.cv2.conv.weight
model.2.m.0.cv2.bn.weight
model.2.m.0.cv2.bn.bias
model.3.conv.weight
model.3.bn.weight
model.3.bn.bias
model.4.cv1.conv.weight
model.4.cv1.bn.weight
model.4.cv1.bn.bias
model.4.cv2.conv.weight
model.4.cv2.bn.weight
model.4.cv2.bn.bias
model.4.cv3.conv.weight
model.4.cv3.bn.weight
model.4.cv3.bn.bias
model.4.m.0.cv1.conv.weight
model.4.m.0.cv1.bn.weight
model.4.m.0.cv1.bn.bias
model.4.m.0.cv2.conv.weight
model.4.m.0.cv2.bn.weight
model.4.m.0.cv2.bn.bias
model.4.m.1.cv1.conv.weight
model.4.m.1.cv1.bn.weight
model.4.m.1.cv1.bn.bias
model.4.m.1.cv2.conv.weight
model.4.m.1.cv2.bn.weight
model.4.m.1.cv2.bn.bias
model.5.conv.weight
model.5.bn.weight
model.5.bn.bias
model.6.cv1.conv.weight
model.6.cv1.bn.weight
model.6.cv1.bn.bias
model.6.cv2.conv.weight
model.6.cv2.bn.weight
model.6.cv2.bn.bias
model.6.cv3.conv.weight
model.6.cv3.bn.weight
model.6.cv3.bn.bias
model.6.m.0.cv1.conv.weight
model.6.m.0.cv1.bn.weight
model.6.m.0.cv1.bn.bias
model.6.m.0.cv2.conv.weight
model.6.m.0.cv2.bn.weight
model.6.m.0.cv2.bn.bias
model.6.m.1.cv1.conv.weight
model.6.m.1.cv1.bn.weight
model.6.m.1.cv1.bn.bias
model.6.m.1.cv2.conv.weight
model.6.m.1.cv2.bn.weight
model.6.m.1.cv2.bn.bias
model.6.m.2.cv1.conv.weight
model.6.m.2.cv1.bn.weight
model.6.m.2.cv1.bn.bias
model.6.m.2.cv2.conv.weight
model.6.m.2.cv2.bn.weight
model.6.m.2.cv2.bn.bias
model.7.conv.weight
model.7.bn.weight
model.7.bn.bias
model.8.cv1.conv.weight
model.8.cv1.bn.weight
model.8.cv1.bn.bias
model.8.cv2.conv.weight
model.8.cv2.bn.weight
model.8.cv2.bn.bias
model.8.cv3.conv.weight
model.8.cv3.bn.weight
model.8.cv3.bn.bias
model.8.m.0.cv1.conv.weight
model.8.m.0.cv1.bn.weight
model.8.m.0.cv1.bn.bias
model.8.m.0.cv2.conv.weight
model.8.m.0.cv2.bn.weight
model.8.m.0.cv2.bn.bias
model.9.cv1.conv.weight
model.9.cv1.bn.weight
model.9.cv1.bn.bias
model.9.cv2.conv.weight
model.9.cv2.bn.weight
model.9.cv2.bn.bias
model.10.conv.weight
model.10.bn.weight
model.10.bn.bias
model.13.cv1.conv.weight
model.13.cv1.bn.weight
model.13.cv1.bn.bias
model.13.cv2.conv.weight
model.13.cv2.bn.weight
model.13.cv2.bn.bias
model.13.cv3.conv.weight
model.13.cv3.bn.weight
model.13.cv3.bn.bias
model.13.m.0.cv1.conv.weight
model.13.m.0.cv1.bn.weight
model.13.m.0.cv1.bn.bias
model.13.m.0.cv2.conv.weight
model.13.m.0.cv2.bn.weight
model.13.m.0.cv2.bn.bias
model.14.conv.weight
model.14.bn.weight
model.14.bn.bias
model.17.cv1.conv.weight
model.17.cv1.bn.weight
model.17.cv1.bn.bias
model.17.cv2.conv.weight
model.17.cv2.bn.weight
model.17.cv2.bn.bias
model.17.cv3.conv.weight
model.17.cv3.bn.weight
model.17.cv3.bn.bias
model.17.m.0.cv1.conv.weight
model.17.m.0.cv1.bn.weight
model.17.m.0.cv1.bn.bias
model.17.m.0.cv2.conv.weight
model.17.m.0.cv2.bn.weight
model.17.m.0.cv2.bn.bias
model.18.conv.weight
model.18.bn.weight
model.18.bn.bias
model.20.cv1.conv.weight
model.20.cv1.bn.weight
model.20.cv1.bn.bias
model.20.cv2.conv.weight
model.20.cv2.bn.weight
model.20.cv2.bn.bias
model.20.cv3.conv.weight
model.20.cv3.bn.weight
model.20.cv3.bn.bias
model.20.m.0.cv1.conv.weight
model.20.m.0.cv1.bn.weight
model.20.m.0.cv1.bn.bias
model.20.m.0.cv2.conv.weight
model.20.m.0.cv2.bn.weight
model.20.m.0.cv2.bn.bias
model.21.conv.weight
model.21.bn.weight
model.21.bn.bias
model.23.cv1.conv.weight
model.23.cv1.bn.weight
model.23.cv1.bn.bias
model.23.cv2.conv.weight
model.23.cv2.bn.weight
model.23.cv2.bn.bias
model.23.cv3.conv.weight
model.23.cv3.bn.weight
model.23.cv3.bn.bias
model.23.m.0.cv1.conv.weight
model.23.m.0.cv1.bn.weight
model.23.m.0.cv1.bn.bias
model.23.m.0.cv2.conv.weight
model.23.m.0.cv2.bn.weight
model.23.m.0.cv2.bn.bias
model.24.m.0.weight
model.24.m.0.bias
model.24.m.1.weight
model.24.m.1.bias
model.24.m.2.weight
model.24.m.2.bias
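
From these names we can recover the distinct top-level layer indices, which is exactly what the freeze list in the next section is built from. A small sketch of our own (not from the repo):

# collect the distinct top-level layer indices from the parameter names
indices = sorted({int(k.split('.')[1]) for k, _ in model.named_parameters()})
print(indices)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 13, 14, 17, 18, 20, 21, 23, 24]
# parameter-free layers (the Upsample/Concat modules at 11, 12, 15, 16, 19, 22) do not appear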

1.2 During training, matched layers are frozen by disabling their gradients (requires_grad = False), so no updates are computed for them.

# Freeze
freeze = [f'model.{x}.' for x in range(freeze)]  # layers to freeze
for k, v in model.named_parameters():
    v.requires_grad = True  # train all layers
    if any(x in k for x in freeze):
        print(f'freezing {k}')
        v.requires_grad = False
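
To verify what the loop actually froze, here is a small check of our own (not part of train.py), assuming the `model` from above:

# count trainable vs. total parameters after freezing
n_total = sum(p.numel() for p in model.parameters())
n_trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f'trainable parameters: {n_trainable} / {n_total}')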

1.3 Freezing the backbone. The yaml configuration file shows that layers 0-9 form the Backbone, so setting freeze to 10 freezes the backbone during training. Likewise, setting freeze to 24 freezes every layer except the final output convolutions in the Detect() head (model.24).

(yaml configuration file, e.g. models/yolov5s.yaml)

python train.py --freeze 10  # freeze the backbone
python train.py --freeze 24  # freeze all layers except the final output convolutions

Further down the thread, someone described an interesting training procedure, and the author gave a reply; see the linked issue for the exchange.

2. Modifying a trained model (modify as needed)

The weight file yolov5 saves contains more than just the model and its parameters; it also stores several other fields.

As the save-model code shows, the training run stores additional fields alongside the weights:
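
For reference, here is a runnable paraphrase of the checkpoint dictionary that train.py builds (the exact field set varies by version; the stand-in values below are assumptions for this sketch, the real code fills them from live training state):

import copy
from datetime import datetime
import torch
import torch.nn as nn

# stand-ins for the real training state (assumptions for this sketch)
model = nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
epoch, best_fitness, wandb_id = 0, 0.0, None

# roughly the dictionary train.py saves at each checkpoint
ckpt = {'epoch': epoch,
        'best_fitness': best_fitness,
        'model': copy.deepcopy(model).half(),  # model weights (fp16 in the repo)
        'ema': copy.deepcopy(model).half(),    # EMA copy of the weights
        'updates': 0,                          # EMA update counter
        'optimizer': optimizer.state_dict(),   # optimizer state
        'wandb_id': wandb_id,                  # Weights & Biases run id
        'date': datetime.now().isoformat()}
torch.save(ckpt, 'last.pt')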

# epoch – the epoch this checkpoint corresponds to.

# best_fitness – fitness is the quantity we maximize. In YOLOv5 the default fitness function is defined as a weighted combination of metrics: mAP@0.5 contributes 10% of the weight and mAP@0.5:0.95 the remaining 90%, with Precision P and Recall R absent. You can adjust these weights as needed or keep the default definition (a small sketch follows this list).

# model – the saved model.

# ema – exponential moving average. In deep learning, EMA is often applied to a model's parameters to average them over training, which tends to improve test metrics and make the model more robust (a sketch of the idea also follows this list).

# updates – the EMA update counter.

# optimizer – the optimizer state.

# wandb_id – the run id for the wandb visualization tool.
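
As promised, a sketch of the default fitness definition, modeled on the fitness() helper in the repo's utils/metrics.py (treat the exact signature as an assumption):

import numpy as np

def fitness(x):
    # x: (n, 4) array whose rows are [P, R, mAP@0.5, mAP@0.5:0.95]
    w = [0.0, 0.0, 0.1, 0.9]  # weights: P and R contribute nothing
    return (x[:, :4] * w).sum(1)

print(fitness(np.array([[0.7, 0.6, 0.65, 0.45]])))  # 0.1*0.65 + 0.9*0.45 = 0.47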

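And a minimal sketch of the EMA idea itself (the repo wraps this in a ModelEMA helper class; the decay value here is a typical choice, not the repo's exact schedule, which also ramps the decay up early in training):

import copy
import torch
import torch.nn as nn

model = nn.Linear(4, 2)           # stand-in for the detector
ema_model = copy.deepcopy(model)  # EMA copy, updated after each optimizer step
decay = 0.9999

with torch.no_grad():
    for ema_p, p in zip(ema_model.parameters(), model.parameters()):
        ema_p.mul_(decay).add_(p, alpha=1 - decay)  # ema = decay*ema + (1-decay)*w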

We can print the checkpoint's contents:

import torch

model = torch.load("yolov5s.pt")
print(model)

This makes it easy to see exactly what information the checkpoint contains.
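
Printing the whole object is verbose; listing only the top-level keys (our own shortcut) is often enough:

print(list(model.keys()))
# roughly: ['epoch', 'best_fitness', 'model', 'ema', 'updates', 'optimizer', 'wandb_id', 'date']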

We can modify the fields as needed:

import argparse
import torch

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--weights', type=str, default='', help='path to the input weights')
    opt = parser.parse_args()
    model = torch.load(opt.weights, map_location=torch.device('cpu'))  # load the checkpoint
    net = model['model']
    # keep only the useful fields
    ckpt = {'epoch': -1,
            'best_fitness': model['best_fitness'],
            'model': net,
            'ema': None,
            'updates': None,
            'optimizer': None,
            'wandb_id': None,
            'date': model['date']}
    # save the stripped checkpoint
    torch.save(ckpt, 'my_weight.pt')
    print('=========DONE=========')
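
Example invocation (the script name strip_weights.py is ours; the weights path follows yolov5's default output layout):

python strip_weights.py --weights runs/train/exp/weights/best.pt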

3. For an accuracy comparison of transfer learning with frozen layers, as well as GPU utilization figures, see the official results:

Freezing Layers in YOLOv5 | yolov5_tutorial_freeze – Weights & Biases (wandb.ai)