YoloV8 Improvement Strategies: An Original, First-Published Reproduction of Drone-Yolo, with Further Improvements

2023-11-06 18:44 Author: AI小浩

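Abstract

Drone-Yolo has been very successful on drone datasets, with a marked improvement in mAP0.5: an increase of 13.4% on VisDrone2019-test and 17.40% on VisDrone2019-val. In this post I first reproduce Drone-Yolo and then, building on it, add my own improvements for small-object detection.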

Official YoloV8 Results

YOLOv8l summary (fused): 268 layers, 43631280 parameters, 0 gradients, 165.0 GFLOPs
Class           Images  Instances   Box(P       R   mAP50 mAP50-95): 100%|██████████| 29/29 [
all                230       1412   0.922   0.957   0.986     0.737
c17                230        131   0.973   0.992   0.995     0.825
c5                 230         68   0.945       1   0.995     0.836
helicopter         230         43    0.96   0.907   0.951     0.607
c130               230         85   0.984       1   0.995     0.655
f16                230         57   0.955   0.965   0.985     0.669
b2                 230          2   0.704       1   0.995     0.722
other              230         86   0.903   0.942   0.963     0.534
b52                230         70    0.96   0.971   0.978     0.831
kc10               230         62   0.999   0.984    0.99     0.847
command            230         40    0.97       1   0.995     0.811
f15                230        123   0.891       1   0.992     0.701
kc135              230         91   0.971   0.989   0.986     0.712
a10                230         27       1   0.555   0.899     0.456
b1                 230         20   0.972       1   0.995     0.793
aew                230         25   0.945       1    0.99     0.784
f22                230         17   0.913       1   0.995     0.725
p3                 230        105    0.99       1   0.995     0.801
p8                 230          1   0.637       1   0.995     0.597
f35                230         32   0.939   0.938   0.978     0.574
f18                230        125   0.985   0.992   0.987     0.817
v22                230         41   0.983       1   0.995      0.69
su-27              230         31   0.925       1   0.995     0.859
il-38              230         27   0.972       1   0.995     0.811
tu-134             230          1   0.663       1   0.995     0.895
su-33              230          2       1   0.611   0.995     0.796
an-70              230          2   0.766       1   0.995      0.73
tu-22              230         98   0.984       1   0.995     0.831
Speed: 0.2ms preprocess, 3.8ms inference, 0.0ms loss, 0.8ms postprocess per image

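BiC Module

The BiC module takes three inputs and produces a single output, as shown in the figure below:

Following the YoloV6 source code and adapting it to YoloV8, I made some modifications to the BiC module so that it handles the input and output channels. The code is as follows: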

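import torch
import torch.nn as nn
# Assumed import path: Conv and ConvTranspose as defined in the Ultralytics package
from ultralytics.nn.modules.conv import Conv, ConvTranspose


class BiFusion(nn.Module):
    '''BiFusion Block in PAN'''

    def __init__(self, in_channels1, in_channels2, in_channels3, out_channels):
        super().__init__()
        # 1x1 convs project each of the three inputs to out_channels
        self.CV1 = Conv(in_channels1, out_channels, 1, 1)
        self.CV2 = Conv(in_channels2, out_channels, 1, 1)
        self.CV3 = Conv(in_channels3, out_channels, 1, 1)
        # 1x1 conv that fuses the concatenated branches back to out_channels
        self.cv_out = Conv(out_channels * 3, out_channels, 1, 1)
        # ConvTranspose defaults (k=2, s=2) double the spatial size
        self.upsample = ConvTranspose(out_channels, out_channels)
        # 3x3 conv with stride 2 halves the spatial size
        self.downsample = Conv(out_channels, out_channels, 3, 2)

    def forward(self, x):
        # x is a list of three feature maps; x[0] is upsampled and x[2] is
        # downsampled so that both match the resolution of x[1]
        x0 = self.upsample(self.CV1(x[0]))
        x1 = self.CV2(x[1])
        x2 = self.downsample(self.CV3(x[2]))
        return self.cv_out(torch.cat((x0, x1, x2), dim=1))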
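To see why the three branches can be concatenated, it is enough to track spatial sizes. The sketch below is plain Python and not part of the model. It assumes the Ultralytics defaults for the two layers used in BiFusion (ConvTranspose with kernel 2, stride 2, padding 0; Conv with kernel 3, stride 2 and automatic padding of 1), and the feature-map sizes 20, 40 and 80 are hypothetical values for a 640x640 input.

```python
def conv_out(size, kernel, stride, padding):
    """Output length of a standard convolution along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_transpose_out(size, kernel, stride, padding):
    """Output length of a transposed convolution along one spatial axis."""
    return (size - 1) * stride - 2 * padding + kernel


def bifusion_sizes(low_res, target_res, high_res):
    """Spatial sizes of the three BiFusion branches right before torch.cat."""
    x0 = conv_transpose_out(low_res, kernel=2, stride=2, padding=0)  # upsample branch
    x1 = target_res                                                   # 1x1 conv keeps the size
    x2 = conv_out(high_res, kernel=3, stride=2, padding=1)            # downsample branch
    return x0, x1, x2


# Hypothetical sizes: feature maps at strides 32, 16 and 8 of a 640x640 image.
print(bifusion_sizes(20, 40, 80))  # (40, 40, 40)
```

All three branches end up at the same resolution, and each carries out_channels channels, which is why cv_out expects out_channels * 3 input channels.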

Improvement 1

Test Results

YOLOv8l summary (fused): 291 layers, 49370288 parameters, 0 gradients, 194.9 GFLOPs
Class           Images  Instances   Box(P       R   mAP50 mAP50-95): 100%|██████████| 15/15 [00:13<00:00, 1.11it/s]
all                230       1412   0.932   0.966   0.986     0.735
c17                230        131   0.955   0.992   0.995     0.827
c5                 230         68   0.954       1   0.993     0.829
helicopter         230         43   0.902   0.977   0.964     0.585
c130               230         85   0.978   0.941   0.991     0.669
f16                230         57   0.853   0.912   0.957     0.649
b2                 230          2   0.768       1   0.995     0.697
other              230         86   0.954    0.93   0.972     0.557
b52                230         70   0.972   0.957    0.97     0.794
kc10               230         62    0.99   0.968   0.987     0.845
command            230         40   0.975    0.99   0.975     0.785
f15                230        123   0.948   0.976   0.993     0.706
kc135              230         91   0.965   0.989   0.979     0.672
a10                230         27    0.95   0.697   0.951     0.436
b1                 230         20   0.974    0.95   0.988     0.677
aew                230         25   0.926       1    0.99     0.772
f22                230         17   0.925       1   0.995     0.721
p3                 230        105       1   0.977   0.995      0.81
p8                 230          1   0.755       1   0.995     0.697
f35                230         32   0.967   0.906   0.967      0.54
f18                230        125   0.967   0.992   0.992     0.809
v22                230         41   0.989       1   0.995     0.624
su-27              230         31   0.982       1   0.995     0.856
il-38              230         27   0.997       1   0.995     0.801
tu-134             230          1   0.732       1   0.995     0.995
su-33              230          2       1   0.923   0.995     0.796
an-70              230          2   0.809       1   0.995     0.848
tu-22              230         98    0.99       1   0.995     0.834
Speed: 0.1ms preprocess, 7.3ms inference, 0.0ms loss, 2.8ms postprocess per image
Results saved to runs\detect\train3

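Accuracy is essentially unchanged (mAP50 is 0.986 in both runs, and mAP50-95 goes from 0.737 to 0.735), while the computational cost goes up, from 165.0 to 194.9 GFLOPs!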
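The cost side of this comparison can be made explicit with a little arithmetic. The sketch below only uses figures copied from the two "YOLOv8l summary" and "Speed" lines above; the dictionary layout and the helper name relative_change are illustrative choices, not part of any library.

```python
# Figures copied from the official YoloV8 run and the Improvement 1 run above.
baseline = {"params": 43_631_280, "gflops": 165.0, "inference_ms": 3.8, "map50_95": 0.737}
improved = {"params": 49_370_288, "gflops": 194.9, "inference_ms": 7.3, "map50_95": 0.735}


def relative_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


changes = {key: relative_change(baseline[key], improved[key]) for key in baseline}

for key, pct in changes.items():
    print(f"{key:>12}: {pct:+.1f}%")
```

This works out to roughly 13% more parameters and 18% more GFLOPs for a marginally lower mAP50-95. The inference times are only indicative, since the two logs show a different number of validation batches (29 versus 15).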

Improvement 2

Test Results

Results on the Original Dataset

YOLOv8l summary (fused): 343 layers, 46326864 parameters, 0 gradients, 244.4 GFLOPs
Class           Images  Instances   Box(P       R   mAP50 mAP50-95): 100%|██████████| 15/15 [00:05<00:00, 2.51it/s]
all                230       1412   0.906   0.975   0.986      0.73
c17                230        131   0.954   0.992   0.995     0.807
c5                 230         68   0.919   0.997   0.993     0.825
helicopter         230         43   0.818   0.953   0.954     0.582
c130               230         85   0.971   0.965   0.993     0.655
f16                230         57   0.829    0.93   0.967     0.667
b2                 230          2   0.768       1   0.995     0.722
other              230         86   0.881   0.953   0.975     0.539
b52                230         70    0.92   0.943    0.98     0.802
kc10               230         62       1    0.98   0.989     0.824
command            230         40   0.894       1   0.983     0.797
f15                230        123   0.926       1   0.992     0.694
kc135              230         91   0.967   0.989   0.981     0.689
a10                230         27   0.913   0.926   0.935     0.424
b1                 230         20   0.927       1   0.995     0.739
aew                230         25   0.916       1   0.995     0.782
f22                230         17   0.931       1   0.995     0.763
p3                 230        105   0.994   0.971   0.995     0.805
p8                 230          1   0.886       1   0.995     0.796
f35                230         32   0.882   0.875    0.96      0.54
f18                230        125   0.949   0.984   0.989      0.81
v22                230         41   0.968       1   0.995     0.663
su-27              230         31   0.921       1   0.995     0.828
il-38              230         27   0.984       1   0.995     0.803
tu-134             230          1   0.605       1   0.995     0.895
su-33              230          2       1    0.86   0.995     0.697
an-70              230          2   0.759       1   0.995     0.749
tu-22              230         98   0.987       1   0.995     0.816
Speed: 0.1ms preprocess, 6.5ms inference, 0.0ms loss, 0.7ms postprocess per image

Results on the VisDrone2019 Dataset

Class           Images  Instances   Box(P       R   mAP50 mAP50-95): 100%|██████████| 35/35 [00:06<00:00, 5.46it/s]
all                548      38759   0.584   0.469   0.491     0.302
pedestrian         548       8844   0.707   0.477   0.572     0.282
people             548       5125   0.619   0.425   0.464     0.199
bicycle            548       1287   0.315   0.282   0.224     0.104
car                548      14064   0.761   0.843   0.863     0.631
van                548       1975   0.604   0.498   0.527     0.382
truck              548        750   0.572   0.431   0.433       0.3
tricycle           548       1045   0.479   0.407   0.378     0.219
awning-tricycle    548        532   0.374   0.179   0.197      0.12
bus                548        251   0.718   0.614   0.639      0.48
motor              548       4886   0.621   0.561    0.58       0.2

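This is somewhat lower than the result reported in the paper, which comes down to batch size and the number of epochs. I trained for 150 epochs with a batch size of 8; training for 300 epochs, as in the paper, would probably score somewhat higher.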

Improvement 3

Test Results

Class           Images  Instances   Box(P       R   mAP50 mAP50-95): 100%|██████████| 35/35 [00:19<00:00, 1.82it/s]
all                548      38759    0.57   0.473   0.491     0.305
pedestrian         548       8844   0.678   0.504   0.571     0.287
people             548       5125   0.628   0.418   0.466       0.2
bicycle            548       1287    0.37   0.237   0.227     0.112
car                548      14064   0.733   0.852    0.86     0.627
van                548       1975   0.569   0.507   0.517     0.376
truck              548        750   0.532   0.437   0.459     0.319
tricycle           548       1045   0.487   0.395    0.37     0.218
awning-tricycle    548        532    0.37   0.212   0.202      0.13
bus                548        251   0.709   0.618   0.656      0.49

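Compared with the previous model, the overall results are about the same. Look closely, though, and you will find that mAP50 is higher on the smaller objects, while mAP50 on the larger objects drops somewhat! During training, the validation score had already reached 0.512, close to the authors' figure, but the last step includes a fusion operation, and the result after fusion is lower.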
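The per-class comparison can be done mechanically rather than by eye. The sketch below copies the mAP50 column from the two VisDrone2019 logs above and prints the difference for each class; the variable names are illustrative, and the "motor" class is left out because its row does not appear in the later log.

```python
# mAP50 per class on VisDrone2019, copied from the two logs above.
# "motor" is omitted: its row is missing from the later log.
previous_map50 = {
    "pedestrian": 0.572, "people": 0.464, "bicycle": 0.224, "car": 0.863,
    "van": 0.527, "truck": 0.433, "tricycle": 0.378,
    "awning-tricycle": 0.197, "bus": 0.639,
}
current_map50 = {
    "pedestrian": 0.571, "people": 0.466, "bicycle": 0.227, "car": 0.860,
    "van": 0.517, "truck": 0.459, "tricycle": 0.370,
    "awning-tricycle": 0.202, "bus": 0.656,
}

# Positive delta: the later model is better on that class.
deltas = {
    name: round(current_map50[name] - previous_map50[name], 3)
    for name in previous_map50
}

# Print classes from the largest drop to the largest gain.
for name, delta in sorted(deltas.items(), key=lambda item: item[1]):
    print(f"{name:<16} {delta:+.3f}")
```

Sorting by delta makes it easy to see which classes moved and by how much, which is harder to judge from the raw logs.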
