safety_rope_v20260512: DINOv3-S binary (v9 replica, with thr added to the ckpt)

Data: cvat2 #10, person bbox + safety_rope_use attr. The schema is reverted to the v9 binary scheme (wrong/correct). Unknown samples are filtered out.
The ppe-demo handler can hot-swap the weights directly (the ckpt contains both labels and thr).

Changes in v20260512 (vs v20260511):

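head: Linear(256, 3) → Linear(256, 2) (v9 schema)
filter out unknown samples (the manifest already contains 0 of them, so this is a safe no-op)
ckpt labels=["wrong","correct"], no longer includes unknown
ckpt now stores thr=0.0543 (neither v9 nor v20260511 saved it, which made the handler fail on load)
class weights wrong=1.5, correct=1.0 (the original v9 ratio)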
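
The labels/thr requirement above can be sketched as a small load-time check. This is only an illustration: it assumes the checkpoint is a plain dict with `labels` and `thr` keys next to the weights, and `read_ckpt_meta` is a hypothetical helper, not the ppe-demo handler's actual code.

```python
def read_ckpt_meta(ckpt: dict) -> tuple[list[str], float]:
    """Return (labels, thr) from a checkpoint dict, failing loudly if either is absent."""
    missing = [key for key in ("labels", "thr") if key not in ckpt]
    if missing:
        # v9 and v20260511 would land here, since neither stored thr.
        raise KeyError(f"checkpoint is missing required keys: {missing}")
    labels = list(ckpt["labels"])
    thr = float(ckpt["thr"])
    if len(labels) != 2:
        raise ValueError(f"expected a binary label list, got {labels}")
    if not 0.0 <= thr <= 1.0:
        raise ValueError(f"thr must be a probability, got {thr}")
    return labels, thr


# Stand-in for the dict a real loader would return; weights are omitted here.
ckpt_v20260512 = {"labels": ["wrong", "correct"], "thr": 0.054290771484375}
labels, thr = read_ckpt_meta(ckpt_v20260512)
```

Failing at load time rather than at first inference keeps a checkpoint without thr from being swapped in silently.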

Three-version comparison

	v9 (2-class baseline)	v20260511 (3-class arch)	v20260512 (binary replica)
test AP	0.9336	0.9581 (mean active)	0.9308
test F1	0.8770	0.8956 (mean active)	0.8567
test accuracy	0.9070	0.9015	0.8821
test thr	0.79	—	0.054
val thr (deployed)	—	—	0.054
head	Linear(256, 2)	Linear(256, 3)	Linear(256, 2)
ckpt labels	["wrong","correct"]	["unknown","correct","wrong"]	["wrong","correct"]
ckpt thr	missing	missing	0.0543 ✓

Test metrics

AP	F1	P	R	accuracy	thr	TP/FP/FN/TN
0.9308	0.8567	0.8344	0.8803	0.8821	0.054	927/184/126/1393
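
As a cross-check, the P, R, F1 and accuracy values in the row above follow directly from the TP/FP/FN/TN counts, with correct as the positive class. The function name `binary_metrics` is just for illustration.

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Precision, recall, F1 and accuracy from raw confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


test_metrics = binary_metrics(tp=927, fp=184, fn=126, tn=1393)
print({name: round(value, 4) for name, value in test_metrics.items()})
# {'precision': 0.8344, 'recall': 0.8803, 'f1': 0.8567, 'accuracy': 0.8821}
```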

Per-class metrics

class	P	R	F1	support
wrong	0.9171	0.8833	0.8999	1577
correct	0.8344	0.8803	0.8567	1053

Confusion matrix (rows=true, cols=pred)

	wrong	correct	row total
wrong	1393	184	1577
correct	126	927	1053
col total	1519	1111	2630
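
The per-class table can be recomputed from this confusion matrix. A minimal sketch, assuming rows are truth and columns are predictions in the order wrong, correct, as stated in the heading; `per_class_metrics` is an illustrative name.

```python
def per_class_metrics(cm: list[list[int]], labels: list[str]) -> dict:
    """Per-class precision/recall/F1/support from a square confusion matrix."""
    result = {}
    for i, name in enumerate(labels):
        true_pos = cm[i][i]
        support = sum(cm[i])                    # row total: samples whose truth is this class
        predicted = sum(row[i] for row in cm)   # column total: samples predicted as this class
        precision = true_pos / predicted if predicted else 0.0
        recall = true_pos / support if support else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        result[name] = {"precision": precision, "recall": recall, "f1": f1, "support": support}
    return result


confusion = [[1393, 184], [126, 927]]
per_class = per_class_metrics(confusion, ["wrong", "correct"])
```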

Training history

ep	train_loss	val_AP	val_F1	val_acc	val_thr
1	0.4856	0.8960	0.8073	0.8414	0.255
2	0.1859	0.9318	0.8523	0.8707	0.114
3	0.1265	0.9245	0.8371	0.8474	0.118
4	0.1284	0.8976	0.8134	0.8330	0.239
5	0.0923	0.9298	0.8498	0.8642	0.350
6	0.0811	0.9200	0.8316	0.8510	0.325
7	0.0680	0.9196	0.8439	0.8630	0.258
8	0.0660	0.9004	0.8431	0.8665	0.245
9	0.0529	0.9302	0.8585	0.8707	0.342
10	0.0427	0.9202	0.8477	0.8594	0.140
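
The history stops at epoch 10 although the Config lists epochs=20. That is what early stopping with patience=8 on val_AP would produce, as the sketch below shows. It assumes val_AP is the monitored metric and that only a strict improvement resets the counter; the actual training loop is not shown in this report.

```python
def early_stop_epoch(val_ap: list[float], patience: int) -> tuple[int, int]:
    """Return (best_epoch, last_epoch_run) for patience-based early stopping."""
    best_value = float("-inf")
    best_epoch = 0
    epochs_without_gain = 0
    for epoch, value in enumerate(val_ap, start=1):
        if value > best_value:
            best_value, best_epoch, epochs_without_gain = value, epoch, 0
        else:
            epochs_without_gain += 1
            if epochs_without_gain >= patience:
                return best_epoch, epoch
    return best_epoch, len(val_ap)


VAL_AP = [0.8960, 0.9318, 0.9245, 0.8976, 0.9298, 0.9200, 0.9196, 0.9004, 0.9302, 0.9202]
best_epoch, epochs_run = early_stop_epoch(VAL_AP, patience=8)
print(best_epoch, epochs_run)  # 2 10
```

This matches best_epoch=2 and epochs_run=10 in the Config.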

Sample inference (test, 4 per class, box color = truth)

truth=wrong pred=wrong (P(correct)=0.6%, thr=5.4%)

truth=wrong pred=wrong (P(correct)=0.0%, thr=5.4%)

truth=wrong pred=wrong (P(correct)=1.7%, thr=5.4%)

truth=wrong pred=wrong (P(correct)=0.1%, thr=5.4%)

truth=correct pred=correct (P(correct)=98.9%, thr=5.4%)

truth=correct pred=correct (P(correct)=95.5%, thr=5.4%)

truth=correct pred=correct (P(correct)=99.1%, thr=5.4%)

truth=correct pred=correct (P(correct)=99.6%, thr=5.4%)
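
The samples above are consistent with a simple rule: predict correct when P(correct) reaches thr, otherwise wrong. A minimal sketch of that rule follows; whether the handler uses `>=` or `>` at the boundary is an assumption, and `predict_label` is a hypothetical name.

```python
THR = 0.054290771484375  # thr stored in the v20260512 ckpt


def predict_label(p_correct: float, thr: float = THR) -> str:
    """Map P(correct) to a label using the deployed threshold."""
    return "correct" if p_correct >= thr else "wrong"


# P(correct) values taken from the sample lines above.
for p in (0.006, 0.017, 0.955, 0.989):
    print(f"P(correct)={p:.1%} -> {predict_label(p)}")
```

With thr this low, a sample is flagged wrong only when the model is nearly certain, which trades recall on wrong for fewer false alarms.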

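FP audit: top-8 most confident misclassifications (sampled 300 test imgs)
All listed cases are truth=correct predicted as wrong, i.e. false alarms for the wrong class. With correct as the positive class (as in the test metrics), they count as FN.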

truth=correct → pred=wrong (conf=99.9%)

truth=correct → pred=wrong (conf=99.7%)

truth=correct → pred=wrong (conf=99.6%)

truth=correct → pred=wrong (conf=99.1%)

truth=correct → pred=wrong (conf=98.9%)

truth=correct → pred=wrong (conf=98.7%)

truth=correct → pred=wrong (conf=97.7%)

Config

{
  "version": "v20260512",
  "backbone_name": "vit_small_patch16_dinov3",
  "arch": "DINOv3-S + RoIAlign + MLP 2-cls (wrong/correct) + photometric + random_erase + camaug — v9 binary復刻",
  "params_M": 22.47245,
  "img_size": [
    1280,
    720
  ],
  "feat_ch": 384,
  "expand": {
    "x": 1.0,
    "y_top": 0.2,
    "y_bot": 1.5
  },
  "jitter": {
    "center": 0.2,
    "size": [
      0.7,
      1.4
    ],
    "ex_x": [
      0.5,
      1.5
    ],
    "ex_yt": [
      0.5,
      2.0
    ],
    "ex_yb": [
      0.7,
      1.3
    ]
  },
  "class_weights": {
    "wrong": 1.5,
    "correct": 1.0
  },
  "labels": [
    "wrong",
    "correct"
  ],
  "thr": 0.054290771484375,
  "best_val_AP": 0.9318118478968604,
  "best_epoch": 2,
  "epochs_run": 10,
  "total_train_time_s": 1008.9956252574921,
  "test_metrics": {
    "ap": 0.9307922369815115,
    "acc": 0.8821292775665399,
    "p": 0.8343834383438344,
    "r": 0.8803418803418803,
    "f1": 0.8567467652495379,
    "thr": 0.054290771484375,
    "tp": 927,
    "fp": 184,
    "fn": 126,
    "tn": 1393,
    "n_pos": 1053,
    "n_total": 2630,
    "per_class": {
      "wrong": {
        "precision": 0.9170506912442397,
        "recall": 0.8833227647431833,
        "f1": 0.8998708010335917,
        "support": 1577
      },
      "correct": {
        "precision": 0.8343834383438344,
        "recall": 0.8803418803418803,
        "f1": 0.8567467652495379,
        "support": 1053
      }
    },
    "confusion_matrix": [
      [
        1393,
        184
      ],
      [
        126,
        927
      ]
    ]
  },
  "hyperparams": {
    "batch": 8,
    "epochs": 20,
    "lr": 5e-05,
    "wd": 0.01,
    "patience": 8
  }
}
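
To illustrate what the class_weights entry does, here is a pure-Python weighted cross-entropy. It assumes the training loss is a class-weighted cross-entropy over the two logits, normalized by the sum of the sample weights (the convention PyTorch uses for a weighted mean reduction). The report only gives the weights, so the loss form is an assumption.

```python
import math

LABELS = ["wrong", "correct"]
CLASS_WEIGHTS = {"wrong": 1.5, "correct": 1.0}


def weighted_cross_entropy(logits: list[list[float]], targets: list[int]) -> float:
    """Class-weighted cross-entropy, averaged over the total sample weight."""
    total_loss = 0.0
    total_weight = 0.0
    for row, target in zip(logits, targets):
        # Numerically stable log-softmax for the target class.
        row_max = max(row)
        log_norm = row_max + math.log(sum(math.exp(v - row_max) for v in row))
        log_prob = row[target] - log_norm
        weight = CLASS_WEIGHTS[LABELS[target]]
        total_loss += -weight * log_prob
        total_weight += weight
    return total_loss / total_weight


# Same logits, different truth: the mistake on a "wrong" sample is weighted 1.5x.
loss_on_wrong = 1.5 * weighted_cross_entropy([[0.0, 2.0]], [0])
loss_on_correct = 1.0 * weighted_cross_entropy([[2.0, 0.0]], [1])
```

Under this assumption, missing a wrong sample costs 1.5 times as much as missing a correct one before normalization.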