
YoloV8 improvement strategy: an original, first-published reproduction of Drone-Yolo, plus further improvements

2023-11-06 18:44 Author: AI小浩

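Abstract

Drone-Yolo has been very successful on drone datasets, with a marked improvement in mAP0.5: an increase of 13.4% on VisDrone2019-test and 17.40% on VisDrone2019-val. In this post I first reproduce Drone-Yolo and then, on top of Drone-Yolo, add my own improvements for small-object detection.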

Official YoloV8 results

YOLOv8l summary (fused): 268 layers, 43631280 parameters, 0 gradients, 165.0 GFLOPs
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 29/29 [
                   all        230       1412      0.922      0.957      0.986      0.737
                   c17        230        131      0.973      0.992      0.995      0.825
                    c5        230         68      0.945          1      0.995      0.836
            helicopter        230         43       0.96      0.907      0.951      0.607
                  c130        230         85      0.984          1      0.995      0.655
                   f16        230         57      0.955      0.965      0.985      0.669
                    b2        230          2      0.704          1      0.995      0.722
                 other        230         86      0.903      0.942      0.963      0.534
                   b52        230         70       0.96      0.971      0.978      0.831
                  kc10        230         62      0.999      0.984       0.99      0.847
               command        230         40       0.97          1      0.995      0.811
                   f15        230        123      0.891          1      0.992      0.701
                 kc135        230         91      0.971      0.989      0.986      0.712
                   a10        230         27          1      0.555      0.899      0.456
                    b1        230         20      0.972          1      0.995      0.793
                   aew        230         25      0.945          1       0.99      0.784
                   f22        230         17      0.913          1      0.995      0.725
                    p3        230        105       0.99          1      0.995      0.801
                    p8        230          1      0.637          1      0.995      0.597
                   f35        230         32      0.939      0.938      0.978      0.574
                   f18        230        125      0.985      0.992      0.987      0.817
                   v22        230         41      0.983          1      0.995       0.69
                 su-27        230         31      0.925          1      0.995      0.859
                 il-38        230         27      0.972          1      0.995      0.811
                tu-134        230          1      0.663          1      0.995      0.895
                 su-33        230          2          1      0.611      0.995      0.796
                 an-70        230          2      0.766          1      0.995       0.73
                 tu-22        230         98      0.984          1      0.995      0.831
Speed: 0.2ms preprocess, 3.8ms inference, 0.0ms loss, 0.8ms postprocess per image

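BiC module

The BiC module takes three inputs and produces one output, as shown in the figure below:

Following the YoloV6 source code, I adapted the BiC module to YoloV8 so that the input and output channel counts can be set explicitly. The code is as follows: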

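import torch
import torch.nn as nn
from ultralytics.nn.modules import Conv, ConvTranspose


class BiFusion(nn.Module):
    '''BiFusion Block in PAN'''
    def __init__(self, in_channels1, in_channels2, in_channels3, out_channels):
        super().__init__()
        # 1x1 convolutions (kernel 1, stride 1) project each input to out_channels
        self.CV1 = Conv(in_channels1, out_channels, 1, 1)
        self.CV2 = Conv(in_channels2, out_channels, 1, 1)
        self.CV3 = Conv(in_channels3, out_channels, 1, 1)
        # 1x1 convolution that fuses the three concatenated branches
        self.cv_out = Conv(out_channels * 3, out_channels, 1, 1)

        # Transposed convolution; relies on the ConvTranspose defaults for kernel and stride
        self.upsample = ConvTranspose(
            out_channels,
            out_channels,
        )
        # 3x3 convolution with stride 2 halves the spatial size
        self.downsample = Conv(
            out_channels,
            out_channels,
            3,
            2
        )

    def forward(self, x):
        # x is a sequence of three feature maps: x[0] is upsampled,
        # x[1] is kept at its own resolution, x[2] is downsampled
        x0 = self.upsample(self.CV1(x[0]))
        x1 = self.CV2(x[1])
        x2 = self.downsample(self.CV3(x[2]))
        # Concatenate along the channel axis and fuse back to out_channels
        x3 = self.cv_out(torch.cat((x0, x1, x2), dim=1))
        return x3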
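The concatenation in forward only works if all three branches end up with the same spatial size. The following is a minimal sketch of that size arithmetic in plain Python, not part of the original code. It assumes the transposed convolution uses kernel 2, stride 2, padding 0 and that the 3x3 stride-2 convolution pads by 1; both are assumptions about the layer defaults, and the helper names are hypothetical.

```python
def conv_out(size, k, s, p):
    """Output length of a standard convolution along one axis."""
    return (size + 2 * p - k) // s + 1


def conv_transpose_out(size, k, s, p):
    """Output length of a transposed convolution (dilation 1, no output padding)."""
    return (size - 1) * s - 2 * p + k


def bic_branch_sizes(low, mid, high):
    """Side length of each BiFusion branch just before torch.cat.

    low, mid, high are the side lengths of x[0], x[1], x[2].
    Assumed hyperparameters: upsample k=2, s=2, p=0; downsample k=3, s=2, p=1.
    The 1x1 stride-1 convolutions do not change the spatial size.
    """
    up = conv_transpose_out(low, k=2, s=2, p=0)
    down = conv_out(high, k=3, s=2, p=1)
    return up, mid, down


# Example: a 20x20, a 40x40 and an 80x80 feature map all land on 40x40.
print(bic_branch_sizes(20, 40, 80))
```

Under these assumptions the three inputs must sit at consecutive pyramid levels (each twice the resolution of the previous one), otherwise the concatenation fails with a shape mismatch.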

Improvement 1

Test results

YOLOv8l summary (fused): 291 layers, 49370288 parameters, 0 gradients, 194.9 GFLOPs
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 15/15 [00:13<00:00,  1.11it/s]
                   all        230       1412      0.932      0.966      0.986      0.735
                   c17        230        131      0.955      0.992      0.995      0.827
                    c5        230         68      0.954          1      0.993      0.829
            helicopter        230         43      0.902      0.977      0.964      0.585
                  c130        230         85      0.978      0.941      0.991      0.669
                   f16        230         57      0.853      0.912      0.957      0.649
                    b2        230          2      0.768          1      0.995      0.697
                 other        230         86      0.954       0.93      0.972      0.557
                   b52        230         70      0.972      0.957       0.97      0.794
                  kc10        230         62       0.99      0.968      0.987      0.845
               command        230         40      0.975       0.99      0.975      0.785
                   f15        230        123      0.948      0.976      0.993      0.706
                 kc135        230         91      0.965      0.989      0.979      0.672
                   a10        230         27       0.95      0.697      0.951      0.436
                    b1        230         20      0.974       0.95      0.988      0.677
                   aew        230         25      0.926          1       0.99      0.772
                   f22        230         17      0.925          1      0.995      0.721
                    p3        230        105          1      0.977      0.995       0.81
                    p8        230          1      0.755          1      0.995      0.697
                   f35        230         32      0.967      0.906      0.967       0.54
                   f18        230        125      0.967      0.992      0.992      0.809
                   v22        230         41      0.989          1      0.995      0.624
                 su-27        230         31      0.982          1      0.995      0.856
                 il-38        230         27      0.997          1      0.995      0.801
                tu-134        230          1      0.732          1      0.995      0.995
                 su-33        230          2          1      0.923      0.995      0.796
                 an-70        230          2      0.809          1      0.995      0.848
                 tu-22        230         98       0.99          1      0.995      0.834
Speed: 0.1ms preprocess, 7.3ms inference, 0.0ms loss, 2.8ms postprocess per image
Results saved to runs\detect\train3

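Accuracy is essentially unchanged (mAP50 stays at 0.986, mAP50-95 goes from 0.737 to 0.735), while the compute cost rises from 165.0 to 194.9 GFLOPs!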
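To make the cost of this change concrete, the sketch below computes the relative change between the two "summary" lines and the "all" rows of the logs above. The numbers are copied from those logs; the dictionary layout and function name are only illustrative. The two runs report a different number of validation batches (29 vs 15), so the inference-time figure should be read as a rough indication rather than a controlled benchmark.

```python
# Values copied from the two validation logs (official YOLOv8l vs Improvement 1).
baseline = {"params": 43_631_280, "gflops": 165.0, "infer_ms": 3.8, "map50_95": 0.737}
improved = {"params": 49_370_288, "gflops": 194.9, "infer_ms": 7.3, "map50_95": 0.735}


def relative_change(before, after):
    """Percentage change from before to after."""
    return (after - before) / before * 100.0


overhead = {key: relative_change(baseline[key], improved[key]) for key in baseline}

for name, pct in overhead.items():
    print(f"{name}: {pct:+.1f}%")
```

On these figures the model has roughly 13% more parameters and 18% more GFLOPs for a marginally lower mAP50-95.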

Improvement 2

Test results

Test results on the original dataset

YOLOv8l summary (fused): 343 layers, 46326864 parameters, 0 gradients, 244.4 GFLOPs
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 15/15 [00:05<00:00,  2.51it/s]
                   all        230       1412      0.906      0.975      0.986       0.73
                   c17        230        131      0.954      0.992      0.995      0.807
                    c5        230         68      0.919      0.997      0.993      0.825
            helicopter        230         43      0.818      0.953      0.954      0.582
                  c130        230         85      0.971      0.965      0.993      0.655
                   f16        230         57      0.829       0.93      0.967      0.667
                    b2        230          2      0.768          1      0.995      0.722
                 other        230         86      0.881      0.953      0.975      0.539
                   b52        230         70       0.92      0.943       0.98      0.802
                  kc10        230         62          1       0.98      0.989      0.824
               command        230         40      0.894          1      0.983      0.797
                   f15        230        123      0.926          1      0.992      0.694
                 kc135        230         91      0.967      0.989      0.981      0.689
                   a10        230         27      0.913      0.926      0.935      0.424
                    b1        230         20      0.927          1      0.995      0.739
                   aew        230         25      0.916          1      0.995      0.782
                   f22        230         17      0.931          1      0.995      0.763
                    p3        230        105      0.994      0.971      0.995      0.805
                    p8        230          1      0.886          1      0.995      0.796
                   f35        230         32      0.882      0.875       0.96       0.54
                   f18        230        125      0.949      0.984      0.989       0.81
                   v22        230         41      0.968          1      0.995      0.663
                 su-27        230         31      0.921          1      0.995      0.828
                 il-38        230         27      0.984          1      0.995      0.803
                tu-134        230          1      0.605          1      0.995      0.895
                 su-33        230          2          1       0.86      0.995      0.697
                 an-70        230          2      0.759          1      0.995      0.749
                 tu-22        230         98      0.987          1      0.995      0.816
Speed: 0.1ms preprocess, 6.5ms inference, 0.0ms loss, 0.7ms postprocess per image

VisDrone2019 dataset test results

                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 35/35 [00:06<00:00,  5.46it/s]
                   all        548      38759      0.584      0.469      0.491      0.302
            pedestrian        548       8844      0.707      0.477      0.572      0.282
                people        548       5125      0.619      0.425      0.464      0.199
               bicycle        548       1287      0.315      0.282      0.224      0.104
                   car        548      14064      0.761      0.843      0.863      0.631
                   van        548       1975      0.604      0.498      0.527      0.382
                 truck        548        750      0.572      0.431      0.433        0.3
              tricycle        548       1045      0.479      0.407      0.378      0.219
       awning-tricycle        548        532      0.374      0.179      0.197       0.12
                   bus        548        251      0.718      0.614      0.639       0.48
                 motor        548       4886      0.621      0.561       0.58      0.2

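This is somewhat lower than the result in the paper, which is related to the batch size and the number of epochs! I trained for 150 epochs with a batch size of 8. Training for 300 epochs, as in the paper, might give a somewhat higher score.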
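As a rough illustration of how far apart the two training budgets are, the sketch below counts training iterations (batches seen) for 150 and 300 epochs at batch size 8. The training-set size of 6471 images is an assumption about the VisDrone2019-DET train split and is not stated in this post; the function name is hypothetical.

```python
import math

# Assumption: number of images in the VisDrone2019-DET training split.
TRAIN_IMAGES = 6471


def training_iterations(num_images, batch_size, epochs):
    """Total number of batches processed, keeping the last partial batch of each epoch."""
    batches_per_epoch = math.ceil(num_images / batch_size)
    return batches_per_epoch * epochs


mine = training_iterations(TRAIN_IMAGES, batch_size=8, epochs=150)
longer = training_iterations(TRAIN_IMAGES, batch_size=8, epochs=300)
print(mine, longer)
```

At the same batch size, the 300-epoch schedule simply doubles the number of iterations; a different batch size would change the count per epoch as well.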

Improvement 3

Test results

                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 35/35 [00:19<00:00,  1.82it/s]
                   all        548      38759       0.57      0.473      0.491      0.305
            pedestrian        548       8844      0.678      0.504      0.571      0.287
                people        548       5125      0.628      0.418      0.466        0.2
               bicycle        548       1287       0.37      0.237      0.227      0.112
                   car        548      14064      0.733      0.852       0.86      0.627
                   van        548       1975      0.569      0.507      0.517      0.376
                 truck        548        750      0.532      0.437      0.459      0.319
              tricycle        548       1045      0.487      0.395       0.37      0.218
       awning-tricycle        548        532       0.37      0.212      0.202       0.13
                   bus        548        251      0.709      0.618      0.656       0.49

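Compared with the previous model, the overall result is about the same (mAP50 is 0.491 in both runs, mAP50-95 is 0.302 vs 0.305). Looking class by class, however, several small-object classes (people, bicycle, awning-tricycle) score slightly higher on mAP50, while some larger ones (car, van) score lower! During training the validation result had already reached 0.512, close to the figure reported by the Drone-Yolo authors, but the final step fuses the model, and the fused result is lower!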
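Because the per-class differences are small, it can help to tabulate them instead of comparing the two logs by eye. The sketch below does this for mAP50, using values copied from the Improvement 2 and Improvement 3 VisDrone2019 logs; the Improvement 3 log lists no "motor" row, so that class is left out. The variable and function names are only illustrative.

```python
# mAP50 per class, copied from the two VisDrone2019 validation logs.
improvement_2 = {
    "pedestrian": 0.572, "people": 0.464, "bicycle": 0.224, "car": 0.863,
    "van": 0.527, "truck": 0.433, "tricycle": 0.378,
    "awning-tricycle": 0.197, "bus": 0.639, "motor": 0.58,
}
improvement_3 = {
    "pedestrian": 0.571, "people": 0.466, "bicycle": 0.227, "car": 0.86,
    "van": 0.517, "truck": 0.459, "tricycle": 0.37,
    "awning-tricycle": 0.202, "bus": 0.656,
}


def map50_deltas(before, after):
    """Per-class mAP50 change (after - before) for classes present in both runs."""
    return {cls: round(after[cls] - before[cls], 3) for cls in before if cls in after}


deltas = map50_deltas(improvement_2, improvement_3)

# Print from the largest drop to the largest gain.
for cls, delta in sorted(deltas.items(), key=lambda item: item[1]):
    print(f"{cls:16s} {delta:+.3f}")
```

All of the changes are within a few hundredths of mAP50, so a single training run per configuration is probably not enough to separate a real effect from run-to-run noise.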
