YOLO (You Only Look Once) is a popular, state-of-the-art, real-time object detection model family known for its high speed and accuracy. Unlike older, two-stage detectors (like R-CNN) that first find regions of interest and then classify them, YOLO uses a single convolutional neural network (CNN) to predict bounding boxes and class probabilities in a single pass over the image.
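To make the single-pass idea concrete, here is a minimal inference sketch using the Ultralytics package (a rough example assuming `pip install ultralytics`; the checkpoint name and image path are placeholders):

```python
# Minimal single-pass detection sketch with the Ultralytics package.
# Assumes `pip install ultralytics`; checkpoint and image path are placeholders.
from ultralytics import YOLO

model = YOLO("yolov8n.pt")        # small pretrained detection checkpoint
results = model("image.jpg")      # one forward pass yields boxes + class scores

for r in results:
    for box in r.boxes:
        # each box carries xyxy coordinates, a confidence score, and a class id
        print(box.xyxy[0].tolist(), float(box.conf), int(box.cls))
```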
There are several strong open/free alternatives to Ultralytics YOLO for image segmentation:
Direct YOLO Variants
- YOLOv5/v7/v8 from other sources — The underlying YOLO architectures are published openly; Ultralytics' own implementation has licensing nuances (AGPL-3.0), but you can use the base models or community forks (see the torch.hub sketch after this list)
- YOLO-NAS by Deci AI — Apache 2.0 code via super-gradients (the pretrained weights carry their own license), claims better accuracy/speed tradeoffs than YOLOv8
- RT-DETR — Baidu's real-time detection transformer, now included in some Ultralytics builds but also available independently
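As a rough illustration of pulling a base YOLO model directly rather than through the ultralytics package, the sketch below loads a YOLOv5 checkpoint via torch.hub; note the upstream repo is itself AGPL-3.0, so this only shows the mechanics, and the image URL is illustrative:

```python
# Hedged sketch: load a YOLOv5 checkpoint straight from torch.hub.
# Assumes torch is installed and network access; repo/tag and image are illustrative.
import torch

model = torch.hub.load("ultralytics/yolov5", "yolov5s", pretrained=True, trust_repo=True)
results = model("https://ultralytics.com/images/zidane.jpg")  # URL or local path
results.print()              # per-class detection summary
boxes = results.xyxy[0]      # tensor rows of [x1, y1, x2, y2, confidence, class]
print(boxes.shape)
```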
Segment Anything Family
- SAM (Segment Anything Model) by Meta — Apache 2.0, excellent zero-shot segmentation (see the prompting sketch after this list)
- FastSAM — Faster alternative using a CNN backbone instead of ViT
- MobileSAM — Lightweight version for edge deployment
- SAM 2 — Meta's updated version with video support
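For the Segment Anything family, here is a hedged sketch of prompt-based segmentation with Meta's segment-anything package (assumes the package is installed and the ViT-B checkpoint has been downloaded; file names and the click coordinate are illustrative):

```python
# Hedged sketch: single-click prompting with Meta's segment-anything package.
# Assumes `pip install segment-anything` and a downloaded ViT-B checkpoint;
# file names and the click coordinate are illustrative.
import numpy as np
from PIL import Image
from segment_anything import sam_model_registry, SamPredictor

image = np.array(Image.open("image.jpg").convert("RGB"))

sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
predictor = SamPredictor(sam)
predictor.set_image(image)

# Prompt with one foreground click; SAM returns candidate masks with quality scores.
masks, scores, _ = predictor.predict(
    point_coords=np.array([[500, 375]]),
    point_labels=np.array([1]),
    multimask_output=True,
)
print(masks.shape, scores)   # (3, H, W) boolean masks and their predicted quality scores
```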
General Segmentation Models
- Detectron2 (Meta) — Apache 2.0, includes Mask R-CNN and other architectures
- MMDetection/MMSegmentation (OpenMMLab) — Apache 2.0, comprehensive toolbox with dozens of models
- DeepLabV3/DeepLabV3+ — Strong semantic segmentation; DeepLabV3 ships in torchvision (see the sketch after this list)
- SegFormer — Transformer-based, good accuracy/efficiency balance
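As a quick example from this group, the sketch below runs torchvision's DeepLabV3 on a single image (assumes torchvision >= 0.13 for the weights API; the image path is a placeholder):

```python
# Hedged sketch: semantic segmentation with torchvision's DeepLabV3.
# Assumes torchvision >= 0.13 for the weights API; the image path is a placeholder.
import torch
from PIL import Image
from torchvision.models.segmentation import deeplabv3_resnet50, DeepLabV3_ResNet50_Weights

weights = DeepLabV3_ResNet50_Weights.DEFAULT
model = deeplabv3_resnet50(weights=weights).eval()
preprocess = weights.transforms()

image = Image.open("image.jpg").convert("RGB")
batch = preprocess(image).unsqueeze(0)

with torch.no_grad():
    logits = model(batch)["out"]        # (1, 21, H, W) class logits
labels = logits.argmax(dim=1)           # per-pixel class indices (VOC label set)
print(labels.shape, labels.unique())
```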
Lightweight/Edge Options
- EfficientDet — Google's efficient detection family
- PaddleDetection (Baidu) — Apache 2.0, includes PP-YOLO variants
- NanoDet — Extremely lightweight, good for mobile