safety_rope_v20260512 — DINOv3-S binary (v9 reproduction + thr added to ckpt)
cvat2 #10 person bbox + safety_rope_use attr. Schema reverted to the v9 binary (wrong/correct). Unknown samples already filtered out.
The ppe-demo handler can hot-swap the weights directly (the ckpt now carries complete labels + thr).
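As a hedged illustration of what that hot-swap relies on: only `labels` and `thr` are confirmed by this report to live in the ckpt; the dict layout, the `state_dict` key, and the filename below are hypothetical.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the real network; only the 2-class head is
# sketched here, the DINOv3-S backbone + RoIAlign part is omitted.
head = nn.Linear(256, 2)

ckpt = torch.load("safety_rope_v20260512.pt", map_location="cpu")
labels = ckpt["labels"]  # ["wrong", "correct"]; no "unknown" entry anymore
thr = ckpt["thr"]        # 0.0543, stored so the handler load no longer errors

def classify(p_correct: float) -> str:
    # Decision rule seen in the sample-inference captions below:
    # predict "correct" iff P(correct) clears the stored threshold.
    return "correct" if p_correct >= thr else labels[0]
```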
Changes in v20260512 (vs v20260511):
- head: Linear(256, 3) → Linear(256, 2) (v9 schema)
- filter unknown samples (the manifest already has 0 of them, so this is a safe no-op)
- ckpt labels=["wrong","correct"] no longer include unknown
- ckpt now stores thr=0.0543 (neither v9 nor v20260511 saved it, which made the handler fail on load)
- class weights wrong=1.5, correct=1.0 (the original v9 ratio); see the sketch after this list
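The unknown filter and the class weights together, as a minimal sketch. The manifest shape here is hypothetical; the labels, weights, and head size come from the config below.

```python
import torch
import torch.nn as nn

LABELS = ["wrong", "correct"]                    # ckpt label order, no "unknown"
CLASS_WEIGHTS = {"wrong": 1.5, "correct": 1.0}   # original v9 ratio

# Hypothetical manifest shape: a list of {"img", "label"} dicts.
manifest = [
    {"img": "frame_000.jpg", "label": "wrong"},
    {"img": "frame_001.jpg", "label": "correct"},
]

# Drop unknown samples; the current manifest has 0 of them, so this is a no-op.
manifest = [s for s in manifest if s["label"] in LABELS]

# Class-weighted cross-entropy; the weight order must match LABELS.
weight = torch.tensor([CLASS_WEIGHTS[lbl] for lbl in LABELS])
criterion = nn.CrossEntropyLoss(weight=weight)

head = nn.Linear(256, 2)  # back to the v9 binary head
```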
Three-version comparison
|  | v9 (2-class baseline) | v20260511 (3-class arch) | v20260512 (binary reproduction) |
|---|---|---|---|
| test AP | 0.9336 | 0.9581 (mean active) | 0.9308 |
| test F1 | 0.8770 | 0.8956 (mean active) | 0.8567 |
| test accuracy | 0.9070 | 0.9015 | 0.8821 |
| test thr | 0.79 | — | 0.054 |
| val thr (deployed) | — | — | 0.054 |
| head | Linear(256, 2) | Linear(256, 3) | Linear(256, 2) |
| ckpt labels | ["wrong","correct"] | ["unknown","correct","wrong"] | ["wrong","correct"] |
| ckpt thr | missing | missing | 0.0543 ✓ |
Test metrics
| AP | F1 | P | R | accuracy | thr | TP/FP/FN/TN |
|---|---|---|---|---|---|---|
| 0.9308 | 0.8567 | 0.8344 | 0.8803 | 0.8821 | 0.054 | 927/184/126/1393 |
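These numbers are internally consistent, with correct as the positive class: P = TP/(TP+FP) = 927/1111 ≈ 0.8344, R = TP/(TP+FN) = 927/1053 ≈ 0.8803, F1 = 2PR/(P+R) ≈ 0.8567, and accuracy = (TP+TN)/2630 = 2320/2630 ≈ 0.8821.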
Per-class metrics
| class | P | R | F1 | support |
|---|---|---|---|---|
| wrong | 0.9171 | 0.8833 | 0.8999 | 1577 |
| correct | 0.8344 | 0.8803 | 0.8567 | 1053 |
Confusion matrix (rows=true, cols=pred)
| true \ pred | wrong | correct | row total |
|---|---|---|---|
| wrong | 1393 | 184 | 1577 |
| correct | 126 | 927 | 1053 |
| col total | 1519 | 1111 | 2630 |
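The per-class table above can be reproduced from this matrix alone; a small self-contained sketch:

```python
# Confusion matrix from above: rows = true, cols = pred, order = [wrong, correct].
cm = [[1393, 184],
      [126,  927]]
labels = ["wrong", "correct"]

for i, name in enumerate(labels):
    tp = cm[i][i]
    fp = sum(cm[r][i] for r in range(2)) - tp   # column total minus diagonal
    fn = sum(cm[i]) - tp                        # row total minus diagonal
    p, r = tp / (tp + fp), tp / (tp + fn)
    f1 = 2 * p * r / (p + r)
    print(f"{name}: P={p:.4f} R={r:.4f} F1={f1:.4f} support={sum(cm[i])}")
# wrong: P=0.9171 R=0.8833 F1=0.8999 support=1577
# correct: P=0.8344 R=0.8803 F1=0.8567 support=1053
```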
Training history
| ep | train_loss | val_AP | val_F1 | val_acc | val_thr |
|---|---|---|---|---|---|
| 1 | 0.4856 | 0.8960 | 0.8073 | 0.8414 | 0.255 |
| 2 | 0.1859 | 0.9318 | 0.8523 | 0.8707 | 0.114 |
| 3 | 0.1265 | 0.9245 | 0.8371 | 0.8474 | 0.118 |
| 4 | 0.1284 | 0.8976 | 0.8134 | 0.8330 | 0.239 |
| 5 | 0.0923 | 0.9298 | 0.8498 | 0.8642 | 0.350 |
| 6 | 0.0811 | 0.9200 | 0.8316 | 0.8510 | 0.325 |
| 7 | 0.0680 | 0.9196 | 0.8439 | 0.8630 | 0.258 |
| 8 | 0.0660 | 0.9004 | 0.8431 | 0.8665 | 0.245 |
| 9 | 0.0529 | 0.9302 | 0.8585 | 0.8707 | 0.342 |
| 10 | 0.0427 | 0.9202 | 0.8477 | 0.8594 | 0.140 |
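The report doesn't record how the per-epoch val_thr is picked. One plausible recipe, shown here only as a hedged sketch, sweeps P(correct) on the validation split and keeps the F1-maximizing cut:

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

def pick_threshold(y_true: np.ndarray, p_correct: np.ndarray) -> float:
    """Assumed selection rule: the F1-maximizing threshold on val scores."""
    prec, rec, thr = precision_recall_curve(y_true, p_correct)
    f1 = 2 * prec * rec / np.clip(prec + rec, 1e-12, None)
    # precision_recall_curve returns one fewer threshold than (prec, rec) pairs.
    return float(thr[np.argmax(f1[:-1])])
```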
Sample inference (test, 4 per class; box color = truth)
- truth=wrong pred=wrong (P(correct)=0.6%, thr=5.4%)
- truth=wrong pred=wrong (P(correct)=0.0%, thr=5.4%)
- truth=wrong pred=wrong (P(correct)=1.7%, thr=5.4%)
- truth=wrong pred=wrong (P(correct)=0.1%, thr=5.4%)
- truth=correct pred=correct (P(correct)=98.9%, thr=5.4%)
- truth=correct pred=correct (P(correct)=95.5%, thr=5.4%)
- truth=correct pred=correct (P(correct)=99.1%, thr=5.4%)
- truth=correct pred=correct (P(correct)=99.6%, thr=5.4%)

FP audit: top-8 most confident misclassifications (sampled from 300 test imgs)
- truth=correct → pred=wrong (conf=99.9%)
- truth=correct → pred=wrong (conf=99.7%)
- truth=correct → pred=wrong (conf=99.6%)
- truth=correct → pred=wrong (conf=99.1%)
- truth=correct → pred=wrong (conf=98.9%)
- truth=correct → pred=wrong (conf=98.7%)
- truth=correct → pred=wrong (conf=97.7%)
- truth=correct → pred=wrong (conf=97.7%)

Config
{
"version": "v20260512",
"backbone_name": "vit_small_patch16_dinov3",
"arch": "DINOv3-S + RoIAlign + MLP 2-cls (wrong/correct) + photometric + random_erase + camaug — v9 binary復刻",
"params_M": 22.47245,
"img_size": [
1280,
720
],
"feat_ch": 384,
"expand": {
"x": 1.0,
"y_top": 0.2,
"y_bot": 1.5
},
"jitter": {
"center": 0.2,
"size": [
0.7,
1.4
],
"ex_x": [
0.5,
1.5
],
"ex_yt": [
0.5,
2.0
],
"ex_yb": [
0.7,
1.3
]
},
"class_weights": {
"wrong": 1.5,
"correct": 1.0
},
"labels": [
"wrong",
"correct"
],
"thr": 0.054290771484375,
"best_val_AP": 0.9318118478968604,
"best_epoch": 2,
"epochs_run": 10,
"total_train_time_s": 1008.9956252574921,
"test_metrics": {
"ap": 0.9307922369815115,
"acc": 0.8821292775665399,
"p": 0.8343834383438344,
"r": 0.8803418803418803,
"f1": 0.8567467652495379,
"thr": 0.054290771484375,
"tp": 927,
"fp": 184,
"fn": 126,
"tn": 1393,
"n_pos": 1053,
"n_total": 2630,
"per_class": {
"wrong": {
"precision": 0.9170506912442397,
"recall": 0.8833227647431833,
"f1": 0.8998708010335917,
"support": 1577
},
"correct": {
"precision": 0.8343834383438344,
"recall": 0.8803418803418803,
"f1": 0.8567467652495379,
"support": 1053
}
},
"confusion_matrix": [
[
1393,
184
],
[
126,
927
]
]
},
"hyperparams": {
"batch": 8,
"epochs": 20,
"lr": 5e-05,
"wd": 0.01,
"patience": 8
}
}
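For reference, the `expand` block in the config grows the person bbox before the RoIAlign crop: slight headroom above, much more context below the box. The exact crop code isn't shown in this report, so the sketch below assumes each factor is a fraction of the box width/height added on the corresponding side, clamped to the 1280×720 frame:

```python
def expand_bbox(box, img_w=1280, img_h=720,
                ex_x=1.0, ex_yt=0.2, ex_yb=1.5):
    """Assumed semantics of the config's expand x / y_top / y_bot factors."""
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    return (max(0.0, x1 - ex_x * w / 2),          # symmetric horizontal context
            max(0.0, y1 - ex_yt * h),             # slight expansion above
            min(float(img_w), x2 + ex_x * w / 2),
            min(float(img_h), y2 + ex_yb * h))    # much more context below
```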