Training date 2026-05-27 · 5090-2 GPU 0 batch 128 · MobileNetV3-L · first "real" training run after the video-mode bug fix
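The v521 / v522 export iterated over frames[f]. For a video-mode task, meta["frames"] has only 1 entry, so only the first frame of each video was used for training. v526 switches to cvat_helpers.cvat_frame_iter, so every frame of a video-mode task's video now goes into training.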
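The difference between the two iteration strategies can be sketched as follows. This is a minimal illustration, not the actual `cvat_helpers.cvat_frame_iter` implementation: the function names `frames_naive` and `frames_all` are hypothetical, and so is the assumption that the task meta carries a `size` field with the total frame count next to `frames`.

```python
from typing import Dict, Iterator, Tuple


def frames_naive(meta: Dict) -> Iterator[Tuple[int, Dict]]:
    """Old behaviour (sketch): one item per entry in meta["frames"]."""
    for idx, info in enumerate(meta["frames"]):
        yield idx, info


def frames_all(meta: Dict) -> Iterator[Tuple[int, Dict]]:
    """Fixed behaviour (sketch): one item per frame in the task.

    Assumes meta["size"] is the total frame count (hypothetical field here)
    and that a video-mode task shares a single "frames" entry across frames.
    """
    frames = meta["frames"]
    per_frame = len(frames) == meta["size"]
    for idx in range(meta["size"]):
        info = frames[idx] if per_frame else frames[0]
        yield idx, info


# Toy metadata: a 300-frame video task and a 2-image task.
video_meta = {"size": 300, "frames": [{"width": 1920, "height": 1080}]}
image_meta = {
    "size": 2,
    "frames": [{"width": 640, "height": 480}, {"width": 800, "height": 600}],
}

naive_count = len(list(frames_naive(video_meta)))
fixed_count = len(list(frames_all(video_meta)))
print(naive_count, fixed_count)
```

On the toy video task the naive loop yields a single frame while the fixed loop yields all 300. For the image task both loops yield the same frames, which is why the bug only affected video-mode tasks.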
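Result: the manifest jumped from ~9k to 184,951 crops (×20). val_mAP went from 0.977 to 0.960 and test_mAP from 0.957 to 0.946. The drop is because the test set also changed, from an easy subset drawn from the same distribution as frame 1 to a harder, more realistic set spanning entire videos. These are more reliable numbers for production deployment.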
| | v522 | v526 | × |
|---|---|---|---|
| total crops | ~9,000 | 184,951 | ×20 |
| train | — | 141,392 | — |
| val | — | 27,403 | — |
| test | — | 16,156 | — |
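As a quick sanity check on the table above, the following sketch confirms that the three v526 splits add up to the reported total and computes the growth factor and split shares. The v522 total is only given as "~9,000", so the growth factor is approximate.

```python
total_v526 = 184_951
total_v522_approx = 9_000  # "~9,000" in the table, so the ratio is approximate
splits = {"train": 141_392, "val": 27_403, "test": 16_156}

# The three splits should account for every crop in the manifest.
assert sum(splits.values()) == total_v526

growth = total_v526 / total_v522_approx
shares = {name: n / total_v526 for name, n in splits.items()}

print(f"growth: x{growth:.1f}")
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

The splits come out at roughly 76% / 15% / 9% for train / val / test.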
| source | crops |
|---|---|
| other (RAI self-recorded sites / SIEMENS / HONCHUAN / IRODA / FOX, etc.) | 102,383 |
| r2ppe | 25,349 |
| guanxi | 16,984 |
| sh17 | 13,790 |
| raicvat_p26 | 13,041 |
| raicvat_p17 | 10,087 |
| raicvat_p9 | 1,873 |
| cppe5 | 1,444 |
| Attr | AP | F1 | P | R | thr | valid samples |
|---|---|---|---|---|---|---|
| hard_hat | 0.995 | 0.976 | 0.980 | 0.972 | 0.78 | 6,575 |
| no_head_protection | 0.995 | 0.974 | 0.978 | 0.971 | 0.43 | 6,330 |
| full_face_mask | 0.990 | 0.951 | 0.929 | 0.974 | 0.42 | 3,536 |
| face_mask | 0.992 | 0.956 | 0.944 | 0.969 | 0.29 | 1,834 |
| no_gloves | 0.999 | 0.990 | 0.991 | 0.989 | 0.40 | 2,953 |
| cotton_gloves | 0.806 | 0.821 | 0.755 | 0.901 | 0.22 | 2,168 |
| rubber_gloves | 1.000 | 0.997 | 0.994 | 1.000 | 0.13 | 2,576 |
| no_protective_clothing | 1.000 | 0.997 | 0.998 | 0.996 | 0.51 | 3,475 |
| cleanroom_suit | 0.996 | 0.990 | 0.994 | 0.987 | 0.96 | 2,062 |
| splash_proof_gown | 1.000 | 1.000 | 1.000 | 1.000 | 0.99 | 2,467 |
| safety_vest | 0.963 | 0.934 | 0.929 | 0.938 | 0.79 | 5,966 |
| safety_shoes | 0.875 | 0.840 | 0.773 | 0.920 | 0.53 | 1,923 |
| no_safety_shoes | 1.000 | 0.999 | 0.999 | 0.999 | 0.19 | 1,822 |
| no_sleeves | 1.000 | 0.993 | 0.994 | 0.992 | 0.74 | 1,958 |
| heartbeat | 0.889 | 0.861 | 0.844 | 0.878 | 0.94 | 3,771 |
| sleeves | 0.737 | 0.739 | 0.605 | 0.947 | 1.00 | 2,017 |
| safety_glasses | 0.980 | 0.934 | 0.920 | 0.948 | 0.16 | 1,670 |
| hair_cover | 0.767 | 0.882 | 0.815 | 0.962 | 0.00 | 1,958 |
| helmet_goggles | 0.930 | 0.898 | 0.837 | 0.969 | 1.00 | 2,275 |
| harness | 0.919 | 0.865 | 0.877 | 0.853 | 0.71 | 13,064 |
| fall | 0.972 | 0.961 | 0.953 | 0.970 | 0.79 | 2,258 |
| aluminized_apron | 1.000 | 1.000 | 1.000 | 1.000 | 0.88 | 251 |
→ Next priority: add video-site annotations for sleeves / hair_cover / cotton_gloves, or strengthen these attributes individually (e.g. with attr-neg-weight).
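The three attributes named above are the three lowest-AP rows in the per-attribute table. A small sketch that reproduces this ranking from the AP column, and checks one F1 value against the standard formula F1 = 2PR / (P + R), is shown here. The variable names are illustrative only.

```python
# AP column copied from the per-attribute table.
ap = {
    "hard_hat": 0.995, "no_head_protection": 0.995, "full_face_mask": 0.990,
    "face_mask": 0.992, "no_gloves": 0.999, "cotton_gloves": 0.806,
    "rubber_gloves": 1.000, "no_protective_clothing": 1.000,
    "cleanroom_suit": 0.996, "splash_proof_gown": 1.000, "safety_vest": 0.963,
    "safety_shoes": 0.875, "no_safety_shoes": 1.000, "no_sleeves": 1.000,
    "heartbeat": 0.889, "sleeves": 0.737, "safety_glasses": 0.980,
    "hair_cover": 0.767, "helmet_goggles": 0.930, "harness": 0.919,
    "fall": 0.972, "aluminized_apron": 1.000,
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


weakest = sorted(ap, key=ap.get)[:3]      # three lowest-AP attributes
mean_ap = sum(ap.values()) / len(ap)      # unweighted mean over 22 attributes
hair_cover_f1 = f1(0.815, 0.962)          # P and R from the hair_cover row

print(weakest)
print(round(mean_ap, 3), round(hair_cover_f1, 3))
```

The unweighted mean of the AP column is about 0.946, the same as the reported test_mAP. This suggests, though the source does not say so explicitly, that the table reports test-set metrics.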
factory_ppe_v20260526/best.pt

| setting | value |
|---|---|
| backbone | mobilenetv3_large_100.ra_in1k (4.23M params) |
| arch | 22-head BCE + partial-label mask |
| img_size | 384 × 192 |
| batch | 128 |
| epochs | 40 (best ep27, early stopped) |
| patience | 8 |
| lr | 3e-4 |
| wd | 0.01 |
| mixup α | 0.2 |
| aug | camaug (resize-crop + flip + affine + ±5° rot + ColorJitter + GaussianBlur + RandomErasing) |
| attr_neg_weight | 1.0 for all (no per-attr weighting) |

The only change in v526: the export now uses cvat_helpers.cvat_frame_iter, so every frame of a video-mode task's video goes into training (v521/v522 trained on only the first frame).
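To make "BCE + partial-label mask" and attr_neg_weight concrete, here is a minimal pure-Python sketch of a masked binary cross-entropy for one crop. It is not the training code: the function names are hypothetical, unlabeled attributes are represented as `None`, and the way `neg_weight` scales only the negative-label term is an assumption about what attr_neg_weight does. A real training loop would do the same with tensor operations.

```python
import math
from typing import List, Optional, Sequence


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def masked_bce(
    logits: Sequence[float],
    labels: Sequence[Optional[int]],
    neg_weight: Optional[Sequence[float]] = None,
) -> float:
    """Mean BCE over labeled attributes only.

    labels[j] is 1, 0, or None (attribute j not annotated for this crop).
    neg_weight[j] scales the loss of negative labels (assumed semantics).
    """
    if neg_weight is None:
        neg_weight = [1.0] * len(logits)  # matches "1.0 for all" in the config
    total, count = 0.0, 0
    for z, y, w in zip(logits, labels, neg_weight):
        if y is None:
            continue  # masked out: contributes neither loss nor count
        # -log(sigmoid(z)) == softplus(-z); -log(1 - sigmoid(z)) == softplus(z)
        total += softplus(-z) if y == 1 else w * softplus(z)
        count += 1
    return total / count if count else 0.0


# Three attributes: one positive, one unlabeled, one negative.
loss = masked_bce([0.0, 5.0, -5.0], [1, None, 0])
print(loss)
```

With all weights at 1.0 this reduces to ordinary BCE averaged over the labeled attributes. Raising the weight for a single attribute is one way the per-attribute adjustment mentioned earlier could be expressed.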