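21 binary heads | partial-label BCE | MobileNetV3-L 4.23M | 21 attrs
Training date: 2026-05-04 | 5090-2 dual-GPU parallel training | Source: cvat2 project 5 + project 12 (307K crops)
Goal: lower the FP rate on three key PPE attrs, hard_hat, harness, and safety_vest:
• hard_hat is the most common item on construction sites; false alarms erode trust
• harness is key to compliance for work at height; a high FP count causes alarm fatigue
• safety_vest is the general-purpose attr for reflective-vest detection
With the v503 baseline as the control, we ran 6 weighting / aug variants plus a parallel research-agent comparison group to find the best combination.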
| Version | Hyperparameters | hard_hat FP | hard_hat F1 | hard_hat AP | harness FP | harness F1 | harness AP | safety_vest FP | safety_vest F1 | safety_vest AP | overall mAP | Notes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| v503 baseline | no weighting | 218 | 0.975 | 0.994 | 592 | 0.863 | 0.925 | 284 | 0.921 | 0.957 | 0.9675 | ✗ control |
| v504 negw_harness20 | harness neg×2.0 | 102 | 0.973 | 0.993 | 170 | 0.866 | 0.923 | 170 | 0.924 | 0.959 | 0.9712 | ⭐ deployed main version |
| v504 negw15 | all attr neg×1.5 | 97 | 0.972 | 0.994 | 159 | 0.859 | 0.919 | 155 | 0.928 | 0.957 | 0.9702 | |
| v504a boost15 (agent) | all attr pos_w×1.5 | 102 | 0.975 | 0.994 | 169 | 0.859 | 0.917 | 156 | 0.931 | 0.953 | 0.9732 | |
| v504b boost_aug (agent) | pos_w×1.5 + rot+blur | 94 | 0.976 | 0.993 | 175 | 0.852 | 0.910 | 153 | 0.924 | 0.957 | 0.9694 | |
| v504c h3.0/hat1.3/vst1.3 | harness neg×3.0 + hat/vst neg×1.3 | 71 | 0.979 | 0.995 | 214 | 0.863 | 0.906 | 124 | 0.931 | 0.954 | 0.9698 | 🥇 lowest hat & vst FP |
| v504d h2.5/hat1.5/vst1.5 | harness neg×2.5 + hat/vst neg×1.5 | 92 | 0.977 | 0.994 | 253 | 0.863 | 0.919 | 176 | 0.929 | 0.961 | 0.9690 | |
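
The weighting variants in the table scale the negative (or positive) term of the per-attribute BCE loss. Below is a minimal, framework-free sketch of that idea. It assumes that unannotated attributes are marked `None` and skipped (the partial-label part), that `neg_w` / `pos_w` multiply the negative and positive terms, and that the loss is averaged over labeled attributes. The function name `partial_label_bce` and these conventions are illustrative assumptions, not taken from the training script.

```python
import math

def partial_label_bce(logits, labels, neg_w=None, pos_w=None):
    """Mean weighted BCE over the labeled attributes of one sample.

    logits: dict attr -> raw score (before sigmoid)
    labels: dict attr -> 1, 0, or None (None = not annotated, skipped)
    neg_w / pos_w: dict attr -> multiplier for the negative / positive term
                   (hypothetical interface; unlisted attrs default to 1.0)
    """
    neg_w = neg_w or {}
    pos_w = pos_w or {}
    total, count = 0.0, 0
    for attr, z in logits.items():
        y = labels.get(attr)
        if y is None:
            continue  # partial label: no gradient signal for this attr
        # Numerically stable log(sigmoid(z)).
        if z >= 0:
            log_p = -math.log1p(math.exp(-z))
        else:
            log_p = z - math.log1p(math.exp(z))
        log_not_p = log_p - z  # log(1 - sigmoid(z))
        total += -(pos_w.get(attr, 1.0) * y * log_p
                   + neg_w.get(attr, 1.0) * (1 - y) * log_not_p)
        count += 1
    return total / count if count else 0.0

logits = {"harness": 2.0, "hard_hat": -1.0, "fall": 0.3}
labels = {"harness": 0, "hard_hat": 1, "fall": None}

base = partial_label_bce(logits, labels)
weighted = partial_label_bce(logits, labels, neg_w={"harness": 2.0})
print(f"unweighted={base:.4f}  harness neg x2.0={weighted:.4f}")
```

In this toy sample the model scores harness high while the label is negative, so doubling the harness negative weight doubles the penalty for exactly the kind of error that shows up as a false positive.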
Three key variants are available for public download. Each is 16 MB; the state_dict loads directly with torch.load (includes the 21-attr binary heads):
| Version | Role | best.pt | summary.json |
|---|---|---|---|
| negw_harness20_camaug ⭐ | Main version: lowest harness FP (170) | best.pt | summary |
| a_boost15 | Highest mAP (0.9732) | best.pt | summary |
| c_h30 🥇 | Lowest hat (71) and vest (124) FP | best.pt | summary |

    import torch
    import torch.nn as nn
    import timm

    class GenericClassifier(nn.Module):
        """timm backbone with global average pooling, dropout, and one linear head per attr."""
        def __init__(self, backbone, n_attr, feat_dim):
            super().__init__()
            # num_classes=0 removes the timm classifier so the backbone returns pooled features
            self.backbone = timm.create_model(backbone, pretrained=False, num_classes=0, global_pool="avg")
            self.dropout = nn.Dropout(0.3)
            self.cls = nn.Linear(feat_dim, n_attr)

        def forward(self, x):
            return self.cls(self.dropout(self.backbone(x)))

    # weights_only=False unpickles arbitrary objects, so load only checkpoints you trust
    ckpt = torch.load("factory_ppe_v20260504_negw_harness20_camaug.pt", map_location="cpu", weights_only=False)
    model = GenericClassifier(ckpt["backbone_name"], len(ckpt["attrs"]), ckpt["feat_dim"]).eval()
    model.load_state_dict(ckpt["model_state"])

    # attrs (21): hard_hat, no_head_protection, full_face_mask, face_mask, no_gloves,
    # cotton_gloves, rubber_gloves, no_protective_clothing, cleanroom_suit,
    # splash_proof_gown, safety_vest, safety_shoes, no_safety_shoes, no_sleeves,
    # heartbeat, sleeves, safety_glasses, hair_cover, helmet_goggles, harness, fall
    # Inference: person crop 384×192 → ImageNet normalize → sigmoid over 21 outputs
    # F1-best thresholds are in the ckpt["thresholds"] dict

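
The inference comments above end with 21 sigmoid outputs compared against per-attribute thresholds. Here is a small sketch of that last step in plain Python. It assumes `ckpt["thresholds"]` maps attribute name to a float threshold; the source only says it is a dict, so the key layout is an assumption. The helper name `decode_attrs` and the 0.5 fallback are hypothetical. The threshold values in the example are copied from the per-attribute table later in this report.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def decode_attrs(logits, attrs, thresholds, default_thr=0.5):
    """Return {attr: probability} for attrs whose sigmoid score meets its threshold.

    thresholds is assumed to be keyed by attribute name (hypothetical layout).
    """
    if len(logits) != len(attrs):
        raise ValueError("expected one logit per attribute")
    hits = {}
    for name, z in zip(attrs, logits):
        p = sigmoid(z)
        if p >= thresholds.get(name, default_thr):
            hits[name] = p
    return hits

attrs = ["hard_hat", "safety_vest", "harness"]
thresholds = {"hard_hat": 0.63, "safety_vest": 0.75, "harness": 0.46}
logits = [1.2, 0.4, -0.1]  # made-up model outputs for one person crop

detected = decode_attrs(logits, attrs, thresholds)
print(sorted(detected))
```

With real outputs, `logits` would be one row of `model(x)` converted to a list and `attrs` would be `ckpt["attrs"]`.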
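| Scenario | Recommended version | Rationale |
|---|---|---|
| Lower harness FP (work-at-height compliance) | negw_harness20 ⭐ | harness FP 170, lowest across variants; deployed as the ppe-demo main version |
| Lower both hard_hat and safety_vest FP | v504c (h3.0/hat1.3/vst1.3) | Lowest hat (71) and vst (124) FP, but harness FP of 214 is 44 more than negw_harness20 |
| Highest overall mAP / F1 | v504a boost15 | 0.9732, with the most balanced per-attr metrics |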
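
To put these recommendations in proportion, the relative FP reduction against the v503 baseline can be computed from the comparison table. The sketch below copies the FP counts for the two recommended FP-focused variants from that table; the helper name `fp_reduction` is illustrative. FP counts depend on each model's operating threshold, so treat the percentages as a rough comparison.

```python
def fp_reduction(baseline_fp, variant_fp):
    """Percentage of baseline false positives removed by the variant."""
    return 100.0 * (baseline_fp - variant_fp) / baseline_fp

baseline = {"hard_hat": 218, "harness": 592, "safety_vest": 284}
variants = {
    "negw_harness20": {"hard_hat": 102, "harness": 170, "safety_vest": 170},
    "v504c": {"hard_hat": 71, "harness": 214, "safety_vest": 124},
}

reductions = {
    name: {attr: round(fp_reduction(baseline[attr], fp), 1) for attr, fp in fps.items()}
    for name, fps in variants.items()
}
for name, per_attr in reductions.items():
    print(name, per_attr)
```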
| Attribute | AP | F1 | P | R | thr | TP | FP | FN | Valid |
|---|---|---|---|---|---|---|---|---|---|
| hard_hat | 0.995 | 0.979 | 0.979 | 0.979 | 0.63 | 3367 | 71 | 73 | 6324 |
| no_head_protection | 0.993 | 0.974 | 0.970 | 0.979 | 0.36 | 2786 | 86 | 60 | 6296 |
| full_face_mask | 0.998 | 0.985 | 0.992 | 0.978 | 0.95 | 828 | 7 | 19 | 2754 |
| face_mask | 0.989 | 0.970 | 0.956 | 0.984 | 0.42 | 1140 | 52 | 19 | 1599 |
| no_gloves | 0.999 | 0.990 | 0.990 | 0.991 | 0.42 | 2106 | 21 | 20 | 2829 |
| cotton_gloves | 0.872 | 0.769 | 0.690 | 0.870 | 0.72 | 20 | 9 | 3 | 1789 |
| rubber_gloves | 1.000 | 0.998 | 0.997 | 1.000 | 0.84 | 632 | 2 | 0 | 2424 |
| no_protective_clothing | 1.000 | 0.997 | 0.998 | 0.997 | 0.48 | 2281 | 5 | 8 | 3398 |
| cleanroom_suit | 1.000 | 0.994 | 0.987 | 1.000 | 0.30 | 155 | 2 | 0 | 1910 |
| splash_proof_gown | 1.000 | 1.000 | 1.000 | 1.000 | 0.89 | 540 | 0 | 0 | 2315 |
| safety_vest | 0.954 | 0.931 | 0.929 | 0.932 | 0.75 | 1633 | 124 | 119 | 5850 |
| safety_shoes | 0.851 | 0.854 | 0.864 | 0.844 | 0.93 | 38 | 6 | 7 | 1806 |
| no_safety_shoes | 1.000 | 0.997 | 0.996 | 0.998 | 0.39 | 1637 | 6 | 4 | 1806 |
| no_sleeves | 1.000 | 0.994 | 0.994 | 0.994 | 0.59 | 1728 | 10 | 10 | 1806 |
| heartbeat | 0.834 | 0.819 | 0.782 | 0.860 | 0.89 | 129 | 36 | 21 | 3734 |
| sleeves | 1.000 | 1.000 | 1.000 | 1.000 | 0.18 | 5 | 0 | 0 | 1806 |
| safety_glasses | 0.993 | 0.980 | 0.975 | 0.985 | 0.69 | 581 | 15 | 9 | 1493 |
| hair_cover | 1.000 | 1.000 | 1.000 | 1.000 | 1.00 | 3 | 0 | 0 | 1806 |
| helmet_goggles | 1.000 | 0.996 | 0.992 | 1.000 | 0.10 | 261 | 2 | 0 | 2066 |
| harness | 0.906 | 0.863 | 0.877 | 0.851 | 0.46 | 1521 | 214 | 267 | 13009 |
| fall | 0.982 | 0.967 | 0.964 | 0.970 | 0.88 | 161 | 6 | 5 | 2097 |
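
The P, R, and F1 columns follow directly from the TP / FP / FN counts. A quick check in Python using the hard_hat and harness rows above; the helper name `prf1` is illustrative.

```python
def prf1(tp, fp, fn):
    """Precision, recall, and F1 from confusion counts (0.0 when undefined)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

rows = {
    "hard_hat": (3367, 71, 73),   # TP, FP, FN from the table
    "harness": (1521, 214, 267),
}
for name, counts in rows.items():
    p, r, f1 = prf1(*counts)
    print(f"{name}: P={p:.3f} R={r:.3f} F1={f1:.3f}")
# hard_hat: P=0.979 R=0.979 F1=0.979
# harness: P=0.877 R=0.851 F1=0.863
```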

| Setting | Value |
|---|---|
| Epochs | 30 |
| Batch size | 96 per GPU (×2) |
| LR | 3e-4 cosine |
| Patience | 8 |
| Augmentation | camaug (rotation ±5° + Gaussian blur σ0.5-1.5 + horizontal flip + photometric) |
| train_p9_attr_v7.py | Supports --attr-neg-weight harness=N,hard_hat=M (key=value, multiple attrs) and --attr-pos-weight (boost comparison group) |
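
The `--attr-neg-weight harness=N,hard_hat=M` format can be parsed with a custom argparse type. The sketch below is an assumption about how such a flag might be handled, not the actual code in train_p9_attr_v7.py; `parse_attr_weights` is a hypothetical helper.

```python
import argparse

def parse_attr_weights(text):
    """Parse 'harness=3.0,hard_hat=1.3' into {'harness': 3.0, 'hard_hat': 1.3}."""
    weights = {}
    for item in text.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate trailing commas
        name, sep, value = item.partition("=")
        name = name.strip()
        if not sep or not name:
            raise argparse.ArgumentTypeError(f"expected key=value, got {item!r}")
        try:
            weights[name] = float(value)
        except ValueError:
            raise argparse.ArgumentTypeError(
                f"weight for {name!r} is not a number: {value!r}")
    return weights

parser = argparse.ArgumentParser()
parser.add_argument("--attr-neg-weight", type=parse_attr_weights, default={})
parser.add_argument("--attr-pos-weight", type=parse_attr_weights, default={})

# Example: the weights listed for the v504c row of the comparison table.
args = parser.parse_args(
    ["--attr-neg-weight", "harness=3.0,hard_hat=1.3,safety_vest=1.3"])
print(args.attr_neg_weight)
```

Attributes not listed on the command line are simply absent from the dict, so a training loop can fall back to a weight of 1.0 for them.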
Generated 2026-05-04 | rai-vision-training | kaggle-reports.pages.dev