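YoloV5 Improvement Strategy: SwiftFormer, First Release Online, an Exclusively Adapted Efficient Additive Attention for Real-Time Mobile Vision
Abstract
SwiftFormer introduces an efficient additive attention mechanism that replaces the quadratic matrix multiplications of conventional self-attention with linear element-wise multiplications, and shows that the key-value interaction can be replaced by a linear layer. Because of this, the attention can be used at every stage of the network without sacrificing accuracy. The resulting model family, named "SwiftFormer", reaches state-of-the-art results in both accuracy and mobile inference speed. One small variant achieves 78.5% top-1 accuracy on ImageNet-1K with only 0.8 ms latency on an iPhone 14, which is more accurate and 2x faster than MobileViT-v2, and the models apply to vision tasks such as classification, detection, and segmentation. Compared with EfficientFormer-L1, SwiftFormer-L1 gains an absolute 1.7% in accuracy at the same latency, without requiring any neural architecture search.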
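To make the mechanism in the abstract concrete, here is a minimal NumPy sketch of single-head efficient additive attention as I understand it from the SwiftFormer description. It is not the code used in this article's improvements: the function and parameter names (`efficient_additive_attention`, `w_q`, `w_k`, `w_a`, `w_proj`, `w_out`) are hypothetical, biases and the batch dimension are omitted, and the normalization choices are assumptions.

```python
import numpy as np


def l2_normalize(x, axis, eps=1e-12):
    """Scale x to unit L2 norm along `axis` (eps guards against division by zero)."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def efficient_additive_attention(x, w_q, w_k, w_a, w_proj, w_out):
    """Single-head additive attention over n tokens (illustrative sketch).

    Shapes: x (n, d_in); w_q, w_k (d_in, d); w_a (d, 1); w_proj, w_out (d, d).
    """
    d = w_q.shape[1]
    q = l2_normalize(x @ w_q, axis=-1)  # (n, d) queries
    k = l2_normalize(x @ w_k, axis=-1)  # (n, d) keys

    # One scalar score per token from a learned vector: cost O(n * d).
    # Normalizing over the token axis is an assumption of this sketch.
    alpha = l2_normalize((q @ w_a) / np.sqrt(d), axis=0)  # (n, 1)

    # Pool all queries into a single global query vector.
    g = (alpha * q).sum(axis=0, keepdims=True)  # (1, d)

    # Element-wise interaction of the global query with every key,
    # followed by a linear layer and a residual connection to the queries.
    # No (n, n) attention matrix is ever formed.
    mixed = (g * k) @ w_proj + q  # (n, d)
    return mixed @ w_out  # (n, d)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d_in, d = 6, 8, 4  # toy sizes chosen for illustration
    x = rng.standard_normal((n, d_in))
    weights = [
        rng.standard_normal(shape)
        for shape in [(d_in, d), (d_in, d), (d, 1), (d, d), (d, d)]
    ]
    print(efficient_additive_attention(x, *weights).shape)
```

The point of the sketch is the cost profile: every step is linear in the number of tokens n, whereas standard self-attention builds an n-by-n score matrix.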

What effect does it have when introduced into YoloV5?
Article link: https://blog.csdn.net/m0_47867638/article/details/133897551?spm=1001.2014.3001.5502
Test results with the official YoloV5 code
YOLOv5l summary: 267 layers, 46275213 parameters, 0 gradients, 108.2 GFLOPs
Class         Images  Instances       P       R   mAP50  mAP50-95: 100%|██████████| 15/15 [00:02<00:00, 5.16it/s]
all              230       1412   0.971    0.93   0.986     0.729
c17              230        131   0.992   0.992   0.995     0.797
c5               230         68   0.953       1   0.994      0.81
helicopter       230         43   0.974   0.907   0.948      0.57
c130             230         85       1   0.981   0.994      0.66
f16              230         57   0.999    0.93   0.975     0.677
b2               230          2   0.971       1   0.995     0.746
other            230         86   0.987   0.915   0.974     0.545
b52              230         70   0.983   0.957   0.981     0.803
kc10             230         62       1   0.977   0.985     0.819
command          230         40   0.971       1   0.986     0.782
f15              230        123   0.992   0.976   0.994     0.655
kc135            230         91   0.988   0.989   0.986     0.699
a10              230         27       1   0.526   0.912     0.391
b1               230         20   0.949       1   0.995     0.719
aew              230         25   0.952       1   0.993     0.781
f22              230         17   0.901       1   0.995     0.763
p3               230        105   0.997    0.99   0.995     0.789
p8               230          1   0.885       1   0.995     0.697
f35              230         32   0.969   0.984   0.985     0.569
f18              230        125   0.974   0.992    0.99     0.806
v22              230         41   0.994       1   0.995     0.641
su-27            230         31   0.987       1   0.995     0.842
il-38            230         27   0.994       1   0.995     0.785
tu-134           230          1   0.879       1   0.995     0.796
su-33            230          2       1       0   0.995     0.846
an-70            230          2   0.943       1   0.995     0.895
tu-22            230         98   0.983       1   0.995     0.788
Improvement 1
Test results
YOLOv5l summary: 490 layers, 28495747 parameters, 0 gradients, 58.5 GFLOPs
Class         Images  Instances       P       R   mAP50  mAP50-95: 100%|██████████| 15/15 [00:03<00:00, 4.49it/s]
all              230       1412   0.911   0.956    0.99     0.712
c17              230        131   0.974       1   0.995     0.816
c5               230         68   0.907       1    0.99     0.824
helicopter       230         43   0.939       1   0.974     0.595
c130             230         85   0.963       1   0.995     0.668
f16              230         57   0.893   0.982   0.985     0.679
b2               230          2   0.767       1   0.995     0.547
other            230         86   0.849   0.965   0.959     0.484
b52              230         70   0.951   0.971   0.984     0.812
kc10             230         62   0.984   0.968   0.985     0.815
command          230         40   0.982       1   0.995     0.774
f15              230        123   0.944       1   0.995     0.658
kc135            230         91   0.969   0.989   0.985     0.659
a10              230         27   0.903   0.963   0.983     0.462
b1               230         20   0.944       1   0.995     0.608
aew              230         25    0.91       1   0.995     0.767
f22              230         17   0.839       1   0.995     0.713
p3               230        105   0.875   0.981   0.992     0.779
p8               230          1    0.62       1   0.995     0.697
f35              230         32    0.91       1   0.994     0.542
f18              230        125   0.981   0.992   0.992     0.809
v22              230         41   0.981       1   0.995     0.729
su-27            230         31   0.898       1   0.995     0.833
il-38            230         27   0.955       1   0.995     0.805
tu-134           230          1   0.918       1   0.995     0.895
su-33            230          2       1       0   0.995     0.647
an-70            230          2   0.764       1   0.995     0.796
tu-22            230         98   0.967       1   0.995     0.796
Improvement 2
Test results
YOLOv5l summary: 307 layers, 58420493 parameters, 0 gradients, 135.0 GFLOPs
Class         Images  Instances       P       R   mAP50  mAP50-95: 100%|██████████| 15/15 [00:03<00:00, 4.43it/s]
all              230       1412   0.977   0.936    0.99     0.733
c17              230        131    0.99       1   0.995     0.815
c5               230         68   0.981       1   0.995     0.835
helicopter       230         43   0.954   0.964   0.977     0.606
c130             230         85       1   0.989   0.995     0.643
f16              230         57       1   0.957   0.981     0.687
b2               230          2   0.948       1   0.995     0.821
other            230         86       1   0.916   0.983     0.548
b52              230         70   0.955   0.957   0.972     0.799
kc10             230         62       1   0.977   0.985     0.812
command          230         40   0.972       1   0.979     0.796
f15              230        123   0.983       1   0.995     0.684
kc135            230         91   0.989   0.958   0.975      0.69
a10              230         27       1   0.555   0.969     0.396
b1               230         20   0.988       1   0.995     0.738
aew              230         25   0.955       1   0.992     0.774
f22              230         17   0.962       1   0.995     0.736
p3               230        105   0.996       1   0.995     0.789
p8               230          1   0.921       1   0.995     0.697
f35              230         32       1       1   0.995     0.511
f18              230        125    0.99   0.992   0.991     0.825
v22              230         41   0.997       1   0.995     0.699
su-27            230         31   0.987       1   0.995     0.822
il-38            230         27   0.993       1   0.995      0.82
tu-134           230          1   0.882       1   0.995     0.796
su-33            230          2       1       0   0.995     0.846
an-70            230          2    0.94       1   0.995     0.796
tu-22            230         98    0.99       1   0.995     0.814
Improvement 3
Test results
Fusing layers...
YOLOv5l summary: 762 layers, 57360717 parameters, 0 gradients, 136.5 GFLOPs
Class         Images  Instances       P       R   mAP50  mAP50-95: 100%|██████████| 15/15 [00:03<00:00, 3.94it/s]
all              230       1412   0.971   0.944   0.983     0.714
c17              230        131   0.983       1   0.995     0.818
c5               230         68   0.971   0.998    0.99     0.811
helicopter       230         43   0.932   0.977   0.971     0.598
c130             230         85   0.998       1   0.995     0.674
f16              230         57   0.995   0.965   0.984      0.67
b2               230          2   0.977       1   0.995     0.697
other            230         86   0.987   0.911   0.961     0.511
b52              230         70   0.987   0.986   0.986     0.811
kc10             230         62   0.997   0.984   0.986     0.815
command          230         40   0.983       1   0.995     0.817
f15              230        123   0.981       1   0.995     0.674
kc135            230         91   0.995   0.989   0.986     0.674
a10              230         27       1   0.779   0.967     0.389
b1               230         20   0.993       1   0.995     0.635
aew              230         25    0.95       1   0.993     0.772
f22              230         17   0.886       1   0.976     0.749
p3               230        105   0.992   0.981   0.995     0.778
p8               230          1   0.889       1   0.995     0.597
f35              230         32   0.968   0.938   0.992     0.546
f18              230        125    0.99   0.992   0.993     0.818
v22              230         41   0.996       1   0.995     0.695
su-27            230         31   0.977       1   0.995     0.832
il-38            230         27   0.987       1   0.995     0.812
tu-134           230          1   0.893       1   0.995     0.796
su-33            230          2       1       0   0.828     0.728
an-70            230          2   0.924       1   0.995     0.796
tu-22            230         98   0.996       1   0.995     0.774
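
The four validation logs above are easier to compare side by side. The short script below is only a reading aid: the numbers are copied from the "YOLOv5l summary" lines and the "all" rows of the logs, while the dictionary layout and the `compare` helper are hypothetical names introduced for illustration.

```python
# Model size and overall ("all" row) metrics copied from the logs above.
runs = {
    "baseline": {"params": 46275213, "gflops": 108.2, "map50": 0.986, "map50_95": 0.729},
    "improvement 1": {"params": 28495747, "gflops": 58.5, "map50": 0.990, "map50_95": 0.712},
    "improvement 2": {"params": 58420493, "gflops": 135.0, "map50": 0.990, "map50_95": 0.733},
    "improvement 3": {"params": 57360717, "gflops": 136.5, "map50": 0.983, "map50_95": 0.714},
}


def compare(run, base):
    """Return relative size changes (percent) and absolute mAP changes vs. base."""
    return {
        "params_pct": 100.0 * (run["params"] / base["params"] - 1.0),
        "gflops_pct": 100.0 * (run["gflops"] / base["gflops"] - 1.0),
        "map50_delta": run["map50"] - base["map50"],
        "map50_95_delta": run["map50_95"] - base["map50_95"],
    }


if __name__ == "__main__":
    base = runs["baseline"]
    for name, run in runs.items():
        if name == "baseline":
            continue
        d = compare(run, base)
        print(
            f"{name}: params {d['params_pct']:+.1f}%, GFLOPs {d['gflops_pct']:+.1f}%, "
            f"mAP50 {d['map50_delta']:+.3f}, mAP50-95 {d['map50_95_delta']:+.3f}"
        )
```

Read this way, Improvement 1 is the lightweight option: it has far fewer parameters and GFLOPs than the baseline, at the cost of 0.017 mAP50-95. Improvement 2 gives the highest mAP50-95 (0.733) but is larger than the baseline.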