🦺 Factory PPE — v20260504 | 8hr 自主研究

21 binary heads partial-label BCE MobileNetV3-L 4.23M 21 attrs 訓練日期:2026-05-04 | 5090-2 dual-GPU 平行訓練 | 來源:cvat2 project 5 + project 12(307K crops)

🎯 研究目的

針對「hard_hat / harness / safety_vest」三個關鍵 PPE attr 降低 FP 率: • hard_hat 是工地最常見項目,誤報損信任
• harness 是高處作業合規關鍵,FP 高造成警報疲勞
• safety_vest 是反光背心通用判定 attr
以 v503 baseline 為對照,跑 6 個 weighting / aug 變體 + research agent 平行對照組,找出最佳組合。

📊 主要結論(test split)

harness FP 王者
170
negw_harness20
hard_hat FP 王者
71
v504c
safety_vest FP 王者
124
v504c
mAP 王者
0.9732
v504a boost15
vs baseline (v503)
-67%
hard_hat FP 降幅

📈 7 變體對比(focus 3 attr)

版本超參數 hard_hat harness safety_vest overall
mAP
備註
FPF1APFPF1APFPF1AP
v503 baselineno weighting 2180.9750.994 5920.8630.925 2840.9210.9570.9675✗ 對照組
v504 negw_harness20harness neg×2.0 1020.9730.993 1700.8660.923 1700.9240.9590.9712⭐ 已部署主版
v504 negw15all attr neg×1.5 970.9720.994 1590.8590.919 1550.9280.9570.9702
v504a boost15 (agent)all attr pos_w×1.5 1020.9750.994 1690.8590.917 1560.9310.9530.9732
v504b boost_aug (agent)pos_w×1.5 + rot+blur 940.9760.993 1750.8520.910 1530.9240.9570.9694
v504c h3.0/hat1.3/vst1.3harness neg×3.0 + hat/vst neg×1.3 710.9790.995 2140.8630.906 1240.9310.9540.9698🥇 hat & vst FP 雙冠
v504d h2.5/hat1.5/vst1.5harness neg×2.5 + hat/vst neg×1.5 920.9770.994 2530.8630.919 1760.9290.9610.9690
⚠ baseline v503 跟 v504* 的 manifest 不同(v504 系列 test set 縮一半),所以 v503 raw FP 對比僅供參考。v504* 系列之間的對比是同 test set,內部對比 100% 公平

FP 視覺化(3 個 attr × 7 變體)

🧠 Research agent 核心 insight

「正類 ×1.5」與「負類 ×1.5」F1-best threshold 後幾乎等效 — eval 時 best_thr 自動 calibrate 抹掉 bias 方向,真實有效機制 = per-attr loss emphasis (gradient 加權),跟 weighting 是壓在 positive 還是 negative 無關。

這推翻了一開始「pos_weight ×1.5 會增加 FP」的直覺:BCE pos_weight 確實會把 prediction 往 positive 推,但 F1-best threshold search 會自動 raise threshold 補回來,最終 FP 不變甚至降。weighting 的價值在於「強迫 model 對該 attr 學更精細的 feature」,bias 方向只是 cosmetic。

📦 模型下載(Cloudflare R2)

三個關鍵變體公開下載,皆為 16 MB,state_dict 直接 torch.load 即可(含 21-attr binary heads):

版本角色best.ptsummary.json
negw_harness20_camaug主版 — harness FP 王(170)⬇ best.ptsummary
a_boost15mAP 王(0.9732)⬇ best.ptsummary
c_h30 🥇hat (71) + vest (124) FP 雙冠⬇ best.ptsummary
📋 載入範例(Python,點開)
import torch, timm, torch.nn as nn

class GenericClassifier(nn.Module):
    def __init__(self, backbone, n_attr, feat_dim):
        super().__init__()
        self.backbone = timm.create_model(backbone, pretrained=False, num_classes=0, global_pool="avg")
        self.dropout = nn.Dropout(0.3)
        self.cls = nn.Linear(feat_dim, n_attr)
    def forward(self, x): return self.cls(self.dropout(self.backbone(x)))

ckpt = torch.load("factory_ppe_v20260504_negw_harness20_camaug.pt", weights_only=False)
model = GenericClassifier(ckpt["backbone_name"], len(ckpt["attrs"]), ckpt["feat_dim"]).eval()
model.load_state_dict(ckpt["model_state"])
# attrs (21): hard_hat, no_head_protection, full_face_mask, face_mask, no_gloves,
#   cotton_gloves, rubber_gloves, no_protective_clothing, cleanroom_suit,
#   splash_proof_gown, safety_vest, safety_shoes, no_safety_shoes, no_sleeves,
#   heartbeat, sleeves, safety_glasses, hair_cover, helmet_goggles, harness, fall
# 推論:person crop 384×192 → ImageNet normalize → sigmoid 21 outputs
# F1-best thresholds 在 ckpt["thresholds"] dict 內

🥇 部署建議

場景推薦版本理由
harness 降 FP(高處作業合規)negw_harness20harness FP 170 - 跨變體最低,已部署 ppe-demo 主版
hard_hat + safety_vest 雙料降 FPv504c (h3.0/hat1.3/vst1.3)hat 71、vst 124 雙冠,但 harness FP 214 比 negw_harness20 多 44
整體 mAP / F1 最高v504a boost150.9732,per-attr 指標表現最均衡

🚀 下階段建議(依 ROI 優先序)

  1. 3-seed ensemble(最高 ROI):對 negw_harness20 跑 3 不同 random seed,softmax average → 預期再降 ~10% FP,零 architecture 改動。
  2. harness 改 RoIAlign:從 safety_rope v9 借經驗(CVAT bbox-tight + 1.0/0.2/1.5 外擴 + HD 1280×720),harness 是 fine-grained 結構(鉤子、繩索路徑),RoIAlign 比 person crop resize 更合適。
  3. safety_vest multi-task consistency:safety_vest 跟 sleeves/hi_visibility 視覺重疊,加 consistency loss 強制邏輯互斥可進一步降 FP。
  4. inference temporal smoothing:已在 v6 series 啟用(5-frame median),其他 attr 也可考慮。

📊 v504c 完整 21-attr 指標

AttributeAPF1PRthrTPFPFNValid
hard_hat0.9950.9790.9790.9790.63336771736324
no_head_protection0.9930.9740.9700.9790.36278686606296
full_face_mask0.9980.9850.9920.9780.958287192754
face_mask0.9890.9700.9560.9840.42114052191599
no_gloves0.9990.9900.9900.9910.42210621202829
cotton_gloves0.8720.7690.6900.8700.7220931789
rubber_gloves1.0000.9980.9971.0000.84632202424
no_protective_clothing1.0000.9970.9980.9970.482281583398
cleanroom_suit1.0000.9940.9871.0000.30155201910
splash_proof_gown1.0001.0001.0001.0000.89540002315
safety_vest0.9540.9310.9290.9320.7516331241195850
safety_shoes0.8510.8540.8640.8440.9338671806
no_safety_shoes1.0000.9970.9960.9980.391637641806
no_sleeves1.0000.9940.9940.9940.59172810101806
heartbeat0.8340.8190.7820.8600.8912936213734
sleeves1.0001.0001.0001.0000.185001806
safety_glasses0.9930.9800.9750.9850.695811591493
hair_cover1.0001.0001.0001.0001.003001806
helmet_goggles1.0000.9960.9921.0000.10261202066
harness0.9060.8630.8770.8510.46152121426713009
fall0.9820.9670.9640.9700.88161652097

📈 v504c 訓練曲線

📦 訓練設定

Epochs30
Batch size96 per GPU (×2)
LR3e-4 cosine
Patience8
Augmentationcamaug (rotation ±5° + Gaussian blur σ0.5-1.5 + horizontal flip + photometric)
train_p9_attr_v7.py支援 --attr-neg-weight harness=N,hard_hat=M (key=value 多 attr) + --attr-pos-weight (boost 對照組)

Generated 2026-05-04 | rai-vision-training | kaggle-reports.pages.dev