Guide To Next-Generation Object Detection

Object detection is a demanding task, and if you have ever tried to build a custom object detector for your research, you know how many architectural choices there are to consider. We have to pick a model architecture such as an FPN (feature pyramid network) with a region proposal network. If we opt for region proposal methods we have Faster R-CNN, or we can use single-shot techniques such as SSD (single shot detector) and YOLO (you only look once).

Amid this continuous competition to make object detection models better and more efficient, the Facebook AI team has launched many cutting-edge detectors, models, frameworks, and datasets over the years. Still, it has not escaped controversy: tweets and images critical of Facebook's AI systems continue to circulate on the internet.

In 2018, Facebook AI Research (FAIR) released an object detection platform called Detectron. It was a capable library that implemented state-of-the-art object detection algorithms, including Mask R-CNN. It was written in Python on top of the Caffe2 deep learning framework.

Detectron supported many FAIR research projects, such as Feature Pyramid Networks (FPN), Data Distillation: Towards Omni-Supervised Learning, and Mask R-CNN. Detectron's backbone networks were based on:

  • ResNet(50, 101, 152)
  • ResNeXt(50, 101, 152)
  • FPN(Feature Pyramid Networks) with ResNet/ResNeXt
  • VGG16

The goal of Detectron was simple: to provide a high-performance codebase for object detection. But there were many difficulties. It was hard to use because it depended on Caffe2, which by then was built as part of PyTorch, and it became increasingly difficult to install.

And that’s why FAIR came up with the new version of Detectron.

Detectron2

“Detectron2 is Facebook AI Research’s next-generation software system that implements state-of-the-art object detection algorithms”

– Github Detectron2

Detectron2 is built on PyTorch, which has a very active community and receives continuous upgrades and bug fixes. This time the Facebook AI Research team really listened to reported issues and provided very easy setup instructions for installation. They also provided a simple API for extracting scoring results. Other frameworks such as YOLO return their scoring results in a less transparent format, as multidimensional array objects, so it takes more effort to parse the results and interpret them correctly.

Detectron2 has attracted a great deal of attention online since its release.

Detectron2 originates from the maskrcnn-benchmark project. Some of the new features that Detectron2 comes with are as follows:

  • It is powered by the PyTorch deep learning framework.
  • Includes panoptic segmentation.
  • Includes DensePose.
  • Provides a wide set of baseline results and trained models for download in the Detectron2 Model Zoo.
  • Includes projects such as DeepLab, TensorMask, PointRend, and more.
  • Can be used as a library on top of which other projects are built.
  • Models can be exported to deployment formats such as Caffe2 and TorchScript.
  • Flexible and fast training on single or multiple GPU servers.

Alongside Detectron2 there is also Detectron2go, an additional software layer that makes it easier to deploy advanced new models to production. Some of the other features of Detectron2go are:

  • Standard training workflows with in-house datasets
  • Network quantization
  • Model conversion to optimized formats for deployment to mobile devices and cloud.

Installation

We are going to use Google Colab for this tutorial. You can find the installation guide here. Also, there is a Dockerfile available for easier installation.

Requirements

  • Operating System: Linux or macOS
  • Python: 3.6+
  • PyTorch: 1.5+ and a torchvision version that matches the PyTorch installation. You can install both together from pytorch.org
  • OpenCV for visualization
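
Before installing, it can help to check these minimums programmatically. The snippet below is a minimal sketch and not part of the Detectron2 tooling: version_tuple and meets_minimum are hypothetical helper names, and the version strings in the last line are only examples. It compares versions numerically, because a plain string comparison would rank "1.10" below "1.5".

```python
import re
import sys


def version_tuple(version):
    """Return (major, minor) as integers from strings like '1.7.0+cu101'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(installed, minimum):
    """True if the installed version is at least the minimum (major.minor only)."""
    return version_tuple(installed) >= version_tuple(minimum)


# Python itself can be checked without any third-party import.
python_ok = sys.version_info >= (3, 6)

# Example strings; in a live session, pass torch.__version__ instead.
print(python_ok, meets_minimum("1.7.0+cu101", "1.5"), meets_minimum("1.10.0", "1.5"))
```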

Getting Started:

We are going to use the official Google Colab tutorial from Detectron2.

  1. Installing dependencies (pyyaml)
!pip install pyyaml==5.1
import torch, torchvision
  2. Install Detectron2, then restart your runtime after executing the command below:
import torch
assert torch.__version__.startswith("1.7")
!pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu101/torch1.7/index.html
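
The wheel index in the install command encodes both the CUDA build (cu101) and the PyTorch release (torch1.7). As a hedged sketch, the helper below rebuilds that URL from the two values. The name detectron2_wheel_index is hypothetical, and the code assumes the same URL pattern holds for other combinations, which does not guarantee that a prebuilt wheel exists for every pair.

```python
def detectron2_wheel_index(cuda_tag, torch_version):
    """Build the wheel index URL following the pattern of the install command.

    cuda_tag is e.g. 'cu101'; torch_version is e.g. '1.7.0+cu101'
    (the value of torch.__version__). Only the major.minor part of the
    PyTorch version appears in the URL.
    """
    major, minor = torch_version.split(".")[:2]
    return (
        "https://dl.fbaipublicfiles.com/detectron2/wheels/"
        f"{cuda_tag}/torch{major}.{minor}/index.html"
    )


print(detectron2_wheel_index("cu101", "1.7.0+cu101"))
```

If the installed PyTorch version differs from 1.7, the assert in the step above fails early, which is a cue to pick a matching index rather than install a mismatched wheel.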
  3. Set up the Detectron2 logger
import detectron2
from detectron2.utils.logger import setup_logger
setup_logger()
  4. Import additional libraries
import numpy as np
import os, json, cv2, random
from google.colab.patches import cv2_imshow
  5. Import Detectron2 utilities for easy execution
from detectron2 import model_zoo
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
from detectron2.utils.visualizer import Visualizer
from detectron2.data import MetadataCatalog, DatasetCatalog
  6. Load an input image for a Detectron2 model trained on the COCO dataset
!wget http://images.cocodataset.org/test-stuff2017/000000017581.jpg -q -O input.jpg
im = cv2.imread("./input.jpg")
cv2_imshow(im)

 

  7. Create a Detectron2 configuration and a DefaultPredictor to run inference on the input image

cfg = get_cfg()
# add project-specific config (e.g., TensorMask) here if you're not running a model in detectron2's core library
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5 # set threshold for this model
# Find a model from detectron2's model zoo. You can use the https://dl.fbaipublicfiles... url as well
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
predictor = DefaultPredictor(cfg)
outputs = predictor(im)
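
The SCORE_THRESH_TEST setting in the configuration controls which detections survive at inference time. To make that concrete, here is a plain-Python sketch of score thresholding. It is an illustration only, not Detectron2's internal code: the filter_by_score helper and the sample detections are hypothetical, and the strict ">" comparison is an assumption.

```python
def filter_by_score(detections, threshold=0.5):
    """Keep only detections whose confidence score exceeds the threshold."""
    return [det for det in detections if det["score"] > threshold]


# Hypothetical raw detections, as label/score records
candidates = [
    {"label": "horse", "score": 0.98},
    {"label": "person", "score": 0.61},
    {"label": "dog", "score": 0.32},
]

kept = filter_by_score(candidates, threshold=0.5)
print([det["label"] for det in kept])  # ['horse', 'person']
```

Raising the threshold trades recall for precision: fewer boxes are reported, but the ones that remain are more confident.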

  8. Print the predicted output

print(outputs["instances"].pred_classes)

print(outputs["instances"].pred_boxes)
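
The printed values are tensors of class indices and box coordinates, which are not very readable on their own. The sketch below pairs each detection with a class name and score. describe_detections is a hypothetical helper that works on plain Python lists so it can be tried without a GPU, and the sample values are made up. With a real result, the lists could be taken from outputs["instances"].to("cpu") via pred_classes.tolist(), scores.tolist(), and pred_boxes.tensor.tolist(), and the names from MetadataCatalog.get(cfg.DATASETS.TRAIN[0]).thing_classes.

```python
def describe_detections(class_ids, scores, boxes, class_names):
    """Return one readable line per detection.

    boxes are (x1, y1, x2, y2) pixel coordinates, the corner format
    assumed here for the predicted boxes.
    """
    lines = []
    for class_id, score, (x1, y1, x2, y2) in zip(class_ids, scores, boxes):
        name = class_names[class_id]
        lines.append(
            f"{name}: score={score:.2f}, box=({x1:.0f}, {y1:.0f}, {x2:.0f}, {y2:.0f})"
        )
    return lines


# Made-up sample values standing in for a real prediction
names = ["person", "bicycle", "car"]
for line in describe_detections(
    class_ids=[0, 2],
    scores=[0.97, 0.58],
    boxes=[(12.4, 30.0, 110.9, 240.2), (150.0, 80.0, 320.0, 200.0)],
    class_names=names,
):
    print(line)
```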

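  9. Visualize the predicted output using the Visualizer utility from Detectron2

# OpenCV loads images as BGR while Visualizer expects RGB, so [:, :, ::-1] reverses the channel order
v = Visualizer(im[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=1.2)
out = v.draw_instance_predictions(outputs["instances"].to("cpu"))
# Convert back to BGR before displaying with cv2_imshow
cv2_imshow(out.get_image()[:, :, ::-1])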

For common installation errors, refer here.

Conclusion

The FAIR team has been very open this time by open-sourcing everything. This was a good move, as the team believes that state-of-the-art algorithms and techniques cannot be achieved in isolation, so open source is the path to a better AI era. FAIR has also run many other interesting projects, such as the Hateful Memes Challenge on multimodal hate speech.

This article has been published from a wire agency feed without modifications to the text. Only the headline has been changed.
