🔥 fire_smoke: v20260507 Final (8hr autonomous research)

image cls 2-binary (smoke/fire) | cvat2 project 23/36 (496K frames) | 18 ablations + Best7 ensemble | Trained: 2026-05-07 | 5090-2 dual-GPU agent

🎯 Results (vs baseline v20260410)

P23 test mAP (Best7): 0.9896 (+8.3pp vs baseline 0.907)
Smoke FP: 694 (-74.8% vs baseline 2752)
Fire FP: 491 (-74.4% vs baseline 1919)
E_ema single-model smoke FP: 819 (-70.2%, 1× inference cost)
Smoke FP @ thr=0.30: 213 (-92.3%; recall drops to 0.82)
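The relative changes quoted above follow directly from the raw counts. A minimal sketch that recomputes them from the figures in this section; the helper names `relative_change_pct` and `delta_pp` are illustrative and not part of the training code:

```python
def relative_change_pct(new: float, baseline: float) -> float:
    """Percentage change of `new` relative to `baseline` (negative means a reduction)."""
    return (new - baseline) / baseline * 100.0


def delta_pp(new: float, baseline: float) -> float:
    """Absolute difference between two ratios, expressed in percentage points."""
    return (new - baseline) * 100.0


# Figures copied from the results in this report.
BASELINE_SMOKE_FP = 2752
BASELINE_FIRE_FP = 1919

print(f"mAP delta:            {delta_pp(0.9896, 0.907):+.1f}pp")
print(f"smoke FP (694):       {relative_change_pct(694, BASELINE_SMOKE_FP):+.1f}%")
print(f"fire FP (491):        {relative_change_pct(491, BASELINE_FIRE_FP):+.1f}%")
print(f"E_ema smoke FP (819): {relative_change_pct(819, BASELINE_SMOKE_FP):+.1f}%")
print(f"smoke FP @ thr=0.30:  {relative_change_pct(213, BASELINE_SMOKE_FP):+.1f}%")
```

Note that the mAP change is an absolute difference in percentage points, while the FP changes are relative to the baseline count.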
⭐ E_ema single-model deployment: mnv3-l + camaug + EMA (decay 0.999), 4.2M params. Smoke FP is 70% lower than baseline at the same inference cost as baseline. EMA is a free lunch. Deployed to ppe-demo.
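This report does not show how the EMA was implemented, so the following is only a sketch of the general technique under stated assumptions: plain Python floats stand in for model tensors, and `WeightEMA` is a hypothetical name. It illustrates why EMA adds no inference cost: the averaged ("shadow") weights have the same shape as the trained weights and simply replace them at export time.

```python
from typing import Dict


class WeightEMA:
    """Exponential moving average over named scalar weights (toy stand-in for tensors)."""

    def __init__(self, weights: Dict[str, float], decay: float = 0.999) -> None:
        self.decay = decay
        self.shadow = dict(weights)  # start from a copy of the current weights

    def update(self, weights: Dict[str, float]) -> None:
        # shadow <- decay * shadow + (1 - decay) * current
        for name, value in weights.items():
            self.shadow[name] = self.decay * self.shadow[name] + (1.0 - self.decay) * value


# Simulate a weight that oscillates around 1.0 from step to step.
weights = {"w": 0.0}
ema = WeightEMA(weights, decay=0.999)
for step in range(1, 5001):
    weights["w"] = 1.0 + (0.5 if step % 2 else -0.5)
    ema.update(weights)  # called once after each optimizer step

print("raw weight:", weights["w"])
print("EMA weight:", ema.shadow["w"])
```

With decay 0.999 the average spans roughly the last 1/(1 - 0.999) = 1000 updates, which smooths out step-to-step noise in the raw weights.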
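⚠ The user's target of mAP 0.995+ was not met: P23 test reached 0.9896, capped by label noise in the dataset (the agent's analysis found inaccurate annotations in 14 tasks). Reaching 0.995+ requires adding data or re-verifying the annotations first.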

📦 Model downloads

⭐ E_ema single model (recommended for production)
MobileNetV3-L, 4.2M params, 224×224 input
P23 test mAP 0.986 / smoke FP 819 (-70%)
camaug + EMA, 1× inference cost
16 MB
⬇ best_ema.pt

v20260410 baseline (for comparison)
Previous baseline (mAP 0.907 on P23)
16 MB
⬇ legacy
📋 Loading example (Python)
import torch
import timm
from PIL import Image
import torchvision.transforms as T

# weights_only=False unpickles arbitrary objects; only load checkpoints you trust.
# map_location="cpu" lets the file load on machines without a GPU.
ckpt = torch.load("fire_smoke_v20260507E_ema.pt", map_location="cpu", weights_only=False)
# ckpt['model_name'] = "mobilenetv3_large_100.ra_in1k"
# ckpt['classes'] = ['smoke', 'fire'], img_size = 224
model = timm.create_model(ckpt["model_name"], pretrained=False,
                          num_classes=len(ckpt["classes"])).eval()
model.load_state_dict(ckpt["model_state"])

# ImageNet normalization statistics
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]
tf = T.Compose([T.Resize((224, 224)), T.ToTensor(), T.Normalize(mean, std)])
x = tf(Image.open("frame.jpg").convert("RGB")).unsqueeze(0)  # shape: (1, 3, 224, 224)
with torch.no_grad():
    # Independent sigmoid per class: smoke and fire are two binary outputs
    probs = torch.sigmoid(model(x))[0]
# probs[0] = smoke prob, probs[1] = fire prob
# Recommended thresholds: smoke=0.30 (lowest FP), fire=0.50
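The per-class thresholds recommended in the loading example can be applied as follows. This is a minimal sketch, not the deployed logic: the names `decide` and `fp_and_recall` are hypothetical, and treating `prob >= threshold` as positive is an assumption, since the report does not say whether the comparison is inclusive. The second helper shows how a false-positive count and recall, like the figures reported at thr=0.30, can be measured for one class on labeled data.

```python
from typing import Dict, Sequence, Tuple

# Thresholds taken from the comment in the loading example.
THRESHOLDS: Dict[str, float] = {"smoke": 0.30, "fire": 0.50}
CLASSES = ["smoke", "fire"]  # same order as ckpt['classes']


def decide(probs: Sequence[float],
           thresholds: Dict[str, float] = THRESHOLDS) -> Dict[str, bool]:
    """Map per-class sigmoid probabilities to independent yes/no decisions."""
    # Assumption: a score equal to the threshold counts as positive.
    return {name: float(p) >= thresholds[name] for name, p in zip(CLASSES, probs)}


def fp_and_recall(scores: Sequence[float], labels: Sequence[int],
                  thr: float) -> Tuple[int, float]:
    """Count false positives and compute recall for one class at threshold `thr`."""
    fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
    tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
    positives = sum(labels)
    recall = tp / positives if positives else 0.0
    return fp, recall


print(decide([0.42, 0.10]))  # {'smoke': True, 'fire': False}
```

`decide` also accepts the `probs` tensor from the loading example directly, because each element is converted with `float()`.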

🚀 Deployment to ppe-demo

Live at https://ppe-demo.intemotech.com/. In the dropdown, select "🔥 火煙 | v20260507E camaug+EMA ⭐" (火煙 means fire/smoke). The v20260410 baseline is kept alongside it as fire_smoke_baseline.

📄 Full research log


Generated 2026-05-07 | rai-vision-training | 8hr autonomous research on 5090-2 | kaggle-reports.pages.dev