YOLO11n det | cascade pipeline | forklift operator audit | Training date: 2026-04-28 | Source: cvat2 project 9 (7 datasets merged)
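This model is not meant to be used on its own. It is the first stage of a cascade pipeline: forklift detection (this model), then whole-frame person detection, a person/forklift overlap filter, and a PPE classifier run on each operator crop.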
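The training data comes from cvat2 project 9 (forklift_detection), 21 tasks in total (7 sources × Train/Val/Test). Images are deduplicated across sources by MD5 hash, which removed 1,294 exact-duplicate mirror images.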
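
The cross-source MD5 dedup can be pictured roughly as follows. This is a minimal sketch, not the project's actual script: the helper names `md5_of_file` and `dedup_by_md5` are hypothetical, and it assumes the images are plain files on disk.

```python
import hashlib
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def dedup_by_md5(paths):
    """Split paths into (kept, dropped); the first file seen per digest is kept."""
    seen = set()
    kept, dropped = [], []
    for path in paths:
        path = Path(path)
        key = md5_of_file(path)
        if key in seen:
            dropped.append(path)  # byte-identical to an earlier file
        else:
            seen.add(key)
            kept.append(path)
    return kept, dropped
```

Hashing file contents only catches byte-identical copies. A re-encoded or resized copy of the same picture has a different digest and would survive this step.
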

| Source | Images | Bboxes | Notes |
|---|---|---|---|
| rf_forklift_team | 4,147 | 4,272 | Roboflow Universe, largest source |
| rf_hitsz | 1,933 | 2,673 | HITSZ Forklift-and-Human (class is named `cart`, needs a class alias) |
| kg_walidguirat | 1,445 | 1,408 | Kaggle, YOLO format |
| rf_phantom | 1,202 | 119 | Mostly negatives, used as hard background |
| rf_csv2tfrecord | 1,000 | 1,096 | Roboflow Universe |
| loco | 449 | 598 | TUM-FML LOCO (5 classes, filtered to forklift) |
| kg_mahmoudbelooo | 421 | 459 | Kaggle, YOLOv8 format |
| Total | 10,597 | 10,625 | Train 8,483 / Val 1,057 / Test 1,057 |
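
Two rows in the table imply label rewriting: rf_hitsz needs its `cart` class aliased, and LOCO has to be filtered down to forklift boxes. The sketch below shows one way to do both on YOLO-format label lines. It is an illustration under assumptions: `CLASS_ALIAS`, `TARGET_ID`, and `remap_label_lines` are hypothetical names, and mapping `cart` to `forklift` is inferred from the single-class setup rather than stated in the source.

```python
# Hypothetical alias table: source class name -> unified class name.
CLASS_ALIAS = {"forklift": "forklift", "cart": "forklift"}
TARGET_ID = {"forklift": 0}  # single-class detector


def remap_label_lines(lines, source_names):
    """Rewrite YOLO label lines ("cls cx cy w h") to the unified class ids.

    source_names maps the source dataset's class ids to class names.
    Boxes whose class has no alias are dropped.
    """
    out = []
    for line in lines:
        parts = line.split()
        if len(parts) != 5:
            continue  # skip blank or malformed rows
        name = source_names[int(parts[0])]
        unified = CLASS_ALIAS.get(name)
        if unified is None:
            continue  # class is not part of the forklift detector
        out.append(" ".join([str(TARGET_ID[unified])] + parts[1:]))
    return out


# A two-class source where id 1 is the forklift: only that box survives.
print(remap_label_lines(
    ["1 0.5 0.5 0.2 0.3", "0 0.1 0.1 0.05 0.05"],
    ["pallet", "forklift"],
))  # ['0 0.5 0.5 0.2 0.3']
```

An image whose boxes are all dropped ends up with an empty label file, which ultralytics treats as a background image during training.
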

    from ultralytics import YOLO

    model = YOLO("forklift_yolo11n_v20260502.pt")
    results = model("image.jpg", conf=0.35)
    for r in results:
        for box in r.boxes:
            x1, y1, x2, y2 = box.xyxy[0].tolist()
            conf = float(box.conf[0])
            print(f"forklift {conf:.2f} bbox=({x1:.0f},{y1:.0f},{x2:.0f},{y2:.0f})")


    from ultralytics import YOLO
    import torch
    import torch.nn as nn
    import timm
    from PIL import Image

    # Three models
    forklift_yolo = YOLO("forklift_yolo11n_v20260502.pt")
    person_yolo = YOLO("person_seg_yolo11n.pt")
    # weights_only=False unpickles arbitrary objects, so only load trusted checkpoints
    ppe_ckpt = torch.load("factory_ppe_v20260501.pt", weights_only=False)

    # PPE classifier
    class GenericClassifier(nn.Module):
        def __init__(self, backbone, n_attr, feat_dim):
            super().__init__()
            self.backbone = timm.create_model(
                backbone, pretrained=False, num_classes=0, global_pool="avg"
            )
            self.dropout = nn.Dropout(0.3)
            self.cls = nn.Linear(feat_dim, n_attr)

        def forward(self, x):
            return self.cls(self.dropout(self.backbone(x)))

    ppe_model = GenericClassifier(
        ppe_ckpt["backbone_name"], len(ppe_ckpt["attrs"]), ppe_ckpt["feat_dim"]
    ).cuda().eval()
    ppe_model.load_state_dict(ppe_ckpt["model_state"])
    attrs = ppe_ckpt["attrs"]

    # Cascade inference
    def overlap_ratio(p, f):
        """Overlap area of the person bbox and the forklift bbox, divided by the person area."""
        ix1, iy1 = max(p[0], f[0]), max(p[1], f[1])
        ix2, iy2 = min(p[2], f[2]), min(p[3], f[3])
        if ix2 <= ix1 or iy2 <= iy1:
            return 0.0
        inter = (ix2 - ix1) * (iy2 - iy1)
        return inter / max((p[2] - p[0]) * (p[3] - p[1]), 1)

    def cascade_predict(img_pil: Image.Image):
        # 1) forklift
        forklifts = forklift_yolo(img_pil, conf=0.35)[0].boxes.xyxy.cpu().numpy().tolist()
        if not forklifts:
            return []
        # 2) person (whole frame)
        persons = person_yolo(img_pil, conf=0.35, classes=[0])[0].boxes.xyxy.cpu().numpy().tolist()
        # 3) filter: keep persons whose overlap with some forklift is >= 0.15
        operators = [p for p in persons if max(overlap_ratio(p, f) for f in forklifts) >= 0.15]
        # 4) PPE for each operator
        results = []
        for x1, y1, x2, y2 in operators:
            crop = img_pil.crop((x1, y1, x2, y2)).resize((192, 384))
            # ... run ppe_model ...
            results.append({"bbox": (x1, y1, x2, y2), "ppe": "..."})
        return results

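
To make the 0.15 threshold in step 3 concrete, the sketch below restates `overlap_ratio` so it runs without any model and applies it to made-up boxes. The coordinates and the names `seated` and `bystander` are illustrative only.

```python
def overlap_ratio(p, f):
    """Intersection area of person box p and forklift box f, divided by the area of p."""
    ix1, iy1 = max(p[0], f[0]), max(p[1], f[1])
    ix2, iy2 = min(p[2], f[2]), min(p[3], f[3])
    if ix2 <= ix1 or iy2 <= iy1:
        return 0.0
    inter = (ix2 - ix1) * (iy2 - iy1)
    return inter / max((p[2] - p[0]) * (p[3] - p[1]), 1)


forklift = (150, 50, 400, 350)
seated = (100, 100, 200, 300)     # right half of the person lies inside the forklift box
bystander = (0, 100, 160, 300)    # only a 10 px strip overlaps

for name, person in [("seated", seated), ("bystander", bystander)]:
    ratio = overlap_ratio(person, forklift)
    print(name, round(ratio, 4), "operator" if ratio >= 0.15 else "ignored")
# seated 0.5 operator
# bystander 0.0625 ignored
```

Dividing by the person area instead of using IoU means a small person box sitting inside a large forklift box still scores close to 1.0, whereas its IoU would be small because the union is dominated by the forklift.
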
The pipeline is integrated into rai-model-viewer, deployed at http://192.168.53.21:7860/. To use it there, select model = forklift_ppe_v20260502 (forklift operator PPE).

    yolo detect train \
        data=/mnt/ssd/cvat2/external/forklift_yolo/data.yaml \
        model=yolo11n.pt \
        epochs=100 imgsz=640 batch=32 device=1 \
        project=/home/ubuntu/forklift_runs name=v20260502 \
        patience=15

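
The same run can also be launched from Python, since the ultralytics `YOLO.train` method accepts the CLI arguments as keyword arguments. The following is a sketch that assumes the `ultralytics` package is installed and the paths above exist. `TRAIN_ARGS` and the `train` wrapper are hypothetical names introduced here.

```python
# Settings copied from the CLI command above.
TRAIN_ARGS = {
    "data": "/mnt/ssd/cvat2/external/forklift_yolo/data.yaml",
    "epochs": 100,
    "imgsz": 640,
    "batch": 32,
    "device": 1,
    "project": "/home/ubuntu/forklift_runs",
    "name": "v20260502",
    "patience": 15,
}


def train(weights="yolo11n.pt", **overrides):
    """Start a detection training run; keyword overrides replace entries in TRAIN_ARGS."""
    # Imported lazily so the settings can be inspected without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)
    return model.train(**{**TRAIN_ARGS, **overrides})
```

Calling `train()` reproduces the command above, and something like `train(epochs=1, name="smoke_test")` gives a quick check that the dataset config loads before committing to a full run.
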

| Setting | Value |
|---|---|
| Backbone | YOLO11n (pretrained on COCO) |
| Optimizer | SGD (ultralytics default) |
| Batch | 32 |
| Image size | 640 × 640 |
| Epochs run | 100 / 100 (early stopping with patience=15 was not triggered) |
| Best epoch | 98 (mAP50-95 = 0.851) |
| Training time | ~39 min (1× RTX 5090) |
| AMP | auto (fp16) |
| Loss | box + cls + dfl (standard YOLO) |
| Augmentation | HSV, flip, mosaic, blur, CLAHE (ultralytics default) |

R2 bucket: rai-models / forklift_yolo11n_v20260502 / best.pt
Public URL (upload pending): https://pub-478929a98a5c440cb22c2241c0bde314.r2.dev/forklift_yolo11n_v20260502/best.pt
Generated 2026-04-28