Monday, February 16, 2026

IMG AI: YOLO models: image segmentation

YOLO (You Only Look Once) is a popular family of real-time object detection models known for combining high speed with good accuracy. Unlike older two-stage detectors (such as the R-CNN family), which first propose regions of interest and then classify each one, YOLO uses a single convolutional neural network (CNN) to predict bounding boxes and class probabilities in one pass over the image. Segmentation variants add a per-object mask to those predictions.
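To make the "single pass" idea concrete, here is a minimal sketch of how one grid of raw predictions could be turned into boxes. It loosely follows the grid formulation of the first YOLO paper (one box per cell) and is not how any particular Ultralytics release decodes its output. The names `Detection`, `decode_grid`, `iou` and `nms`, the cell layout, and the thresholds are all assumptions made for illustration. The non-maximum suppression step is not mentioned above; it is included because most YOLO pipelines need some way to drop duplicate boxes.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    box: tuple      # (x1, y1, x2, y2) in pixels
    score: float    # objectness * best class probability
    class_id: int


def decode_grid(grid, img_w, img_h, score_threshold=0.5):
    """Convert an S x S grid of cell predictions into pixel-space detections.

    Assumed (hypothetical) cell layout: (cx, cy, w, h, objectness, class_probs),
    where cx, cy are offsets inside the cell in [0, 1] and w, h are fractions
    of the whole image.
    """
    s = len(grid)
    detections = []
    for row in range(s):
        for col in range(s):
            cx, cy, w, h, objectness, class_probs = grid[row][col]
            class_id = max(range(len(class_probs)), key=class_probs.__getitem__)
            score = objectness * class_probs[class_id]
            if score < score_threshold:
                continue
            # Cell-relative center -> absolute pixel center.
            center_x = (col + cx) / s * img_w
            center_y = (row + cy) / s * img_h
            half_w, half_h = w * img_w / 2, h * img_h / 2
            box = (center_x - half_w, center_y - half_h,
                   center_x + half_w, center_y + half_h)
            detections.append(Detection(box, score, class_id))
    return detections


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(detections, iou_threshold=0.5):
    """Greedy per-class non-maximum suppression: keep the best box, drop overlaps."""
    kept = []
    for det in sorted(detections, key=lambda d: d.score, reverse=True):
        if all(det.class_id != k.class_id or iou(det.box, k.box) <= iou_threshold
               for k in kept):
            kept.append(det)
    return kept
```

Calling `nms(decode_grid(grid, img_w, img_h))` on one network output yields the final boxes. Every cell is decoded from the same forward pass, with no separate region-proposal stage.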

ultralytics/ultralytics: Ultralytics YOLO 🚀

Discover Ultralytics YOLO models | State-of-the-Art Computer Vision

There are several strong open/free alternatives to Ultralytics YOLO for image segmentation:

Direct YOLO Variants

  • YOLOv5/v7/v8 from other sources — The YOLO architectures are published and can be reimplemented, but licensing needs care: YOLOv5 and YOLOv8 are themselves Ultralytics releases under AGPL-3.0 (forks inherit that license), while YOLOv7 is a separate project under GPL-3.0
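  • YOLO-NAS by Deci AI — Distributed through the Apache 2.0 SuperGradients library, though the pretrained weights carry their own, more restrictive license; claims better accuracy/speed tradeoffs than YOLOv8 for detection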
  • RT-DETR — Baidu's real-time detection transformer, included in the Ultralytics package but also available independently; it outputs boxes only, not masks

Segment Anything Family

  • SAM (Segment Anything Model) by Meta — Apache 2.0, excellent zero-shot segmentation
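  • FastSAM — Faster alternative that uses a CNN instead of a ViT backbone; it is built on YOLOv8-seg, so check its license if the goal is to avoid Ultralytics' AGPL-3.0 terms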
  • MobileSAM — Lightweight version for edge deployment
  • SAM 2 — Meta's updated version with video support

General Segmentation Models

  • Detectron2 (Meta) — Apache 2.0, includes Mask R-CNN and other architectures
  • MMDetection/MMSegmentation (OpenMMLab) — Apache 2.0, comprehensive toolbox with dozens of models
  • DeepLabV3+ — Strong semantic segmentation; torchvision ships the closely related DeepLabV3, while DeepLabV3+ itself is available from third-party libraries
  • SegFormer — Transformer-based, good accuracy/efficiency balance
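The list above mixes instance segmentation (Mask R-CNN produces one mask per object) with semantic segmentation (DeepLabV3+ and SegFormer produce one class label per pixel). As a rough sketch of the semantic case, the snippet below turns per-class score maps into a label map and scores it with per-class IoU. The function names and the nested-list `logits[class][y][x]` layout are assumptions for illustration; real libraries return tensors and have their own metric utilities.

```python
def logits_to_label_map(logits):
    """Pick the highest-scoring class for every pixel.

    logits[c][y][x] is the (hypothetical) raw score of class c at pixel (y, x).
    """
    num_classes = len(logits)
    height, width = len(logits[0]), len(logits[0][0])
    return [
        [max(range(num_classes), key=lambda c: logits[c][y][x]) for x in range(width)]
        for y in range(height)
    ]


def class_iou(pred, target, class_id):
    """Intersection over union for one class between two label maps."""
    inter = union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            p_hit, t_hit = p == class_id, t == class_id
            inter += p_hit and t_hit
            union += p_hit or t_hit
    # NaN signals that the class appears in neither map.
    return inter / union if union else float("nan")
```

An instance segmentation model would instead return a separate binary mask for each detected object, so two adjacent objects of the same class stay distinguishable.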

Lightweight/Edge Options

  • EfficientDet — Google's efficient detection family
  • PaddleDetection (Baidu) — Apache 2.0, includes PP-YOLO variants
  • NanoDet — Extremely lightweight, good for mobile

