According to Reuters, the Segment Anything Model (SAM), a new AI model from Meta, can identify specific objects in images and videos, even ones it did not see during training.
SAM is an image segmentation model that can respond to user clicks or text prompts to isolate particular objects within an image, according to a blog post from Meta. Image segmentation is a computer vision process that divides an image into multiple segments or regions, each representing a different object or area of interest.
Image segmentation is used to simplify the processing or analysis of a picture. According to Meta, the technology is also helpful for image editing, augmented reality applications, understanding web content, and assisting scientific research by automatically localizing animals or other objects to track in video.
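To make the concept concrete, a segmentation "mask" is just a per-object map of which pixels belong to that object. The following minimal sketch uses a hand-written label grid purely for illustration (in a real pipeline the labels would come from a model like SAM):

```python
import numpy as np

# A toy "image" whose pixels have already been assigned to regions.
# The labels here are hand-written for illustration only:
# 0 = background, 1 = first object, 2 = second object.
labels = np.array([
    [0, 0, 1, 1],
    [0, 1, 1, 0],
    [2, 2, 0, 0],
    [2, 2, 0, 0],
])

# A segmentation mask is a boolean array per object: True where
# the object's pixels are, False everywhere else.
object1_mask = labels == 1
object2_mask = labels == 2

# Masks let you isolate one object, e.g. measure its pixel area.
print(int(object1_mask.sum()))  # 4 pixels belong to object 1
```

Each mask can then be used to crop, edit, or analyze its object independently of the rest of the image.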
A precise segmentation model requires highly specialized work by technical experts with access to AI training infrastructure and large volumes of carefully annotated in-domain data, according to Meta. By removing the requirement for specialized training and knowledge, Meta intends to “democratize” this process through SAM, which it expects will encourage more computer vision research.
In addition to SAM, Meta has assembled a dataset it calls “SA-1B,” which includes 1.1 billion segmentation masks produced by its segmentation method and 11 million images obtained under license from “a large photo company,” among other data. Meta will make SAM and the dataset available for research under an Apache 2.0 license.
The code is currently available on GitHub without the weights, and Meta has published a free interactive demo of its segmentation technology. Visitors to the demo can upload a photo and select objects by hovering over them with the mouse, drawing a selection box around them, or clicking “Everything” (which attempts to automatically identify every object in the image).
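The click-to-select interaction can be sketched in miniature. The toy function below is not Meta's model; it is a trivial flood fill over exact pixel values, written only to illustrate the prompt-in, mask-out interface the demo exposes (click a point, get back a mask for the object under it):

```python
import numpy as np

def segment_from_click(image, click_rc):
    """Toy stand-in for click-prompted segmentation: return a boolean
    mask of the connected region of identical pixel values around the
    clicked (row, col) point. Illustrative only, NOT Meta's SAM."""
    h, w = image.shape
    seed_value = image[click_rc]
    mask = np.zeros((h, w), dtype=bool)
    stack = [click_rc]
    while stack:  # simple 4-connected flood fill
        r, c = stack.pop()
        if 0 <= r < h and 0 <= c < w and not mask[r, c] and image[r, c] == seed_value:
            mask[r, c] = True
            stack += [(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)]
    return mask

img = np.array([
    [5, 5, 0],
    [5, 0, 0],
    [0, 0, 7],
])
mask = segment_from_click(img, (0, 0))  # "click" the top-left object
print(int(mask.sum()))  # 3 pixels in the clicked region
```

The real model, of course, uses a learned image encoder and prompt encoder rather than raw pixel equality, which is what lets it segment objects it never saw during training.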
Although image segmentation technology is not new, SAM stands out for its ability to recognize objects absent from its training dataset, as well as for its partially open methodology. The release of the SA-1B dataset might also inspire a new wave of computer vision applications, much as Meta’s LLaMA language model has already sparked related projects.
Incorporating generative AI into the company’s apps this year is a priority, Meta CEO Mark Zuckerberg has said, according to Reuters. Although Meta has not yet shipped a commercial product using this kind of AI, it has previously used technology similar to SAM internally to tag photos, moderate content, and determine which posts to recommend on Facebook and Instagram.
Meta’s announcement comes amid intense competition among Big Tech companies to dominate the AI industry. The ChatGPT language model from Microsoft-backed OpenAI attracted considerable attention in the fall of 2022, igniting a surge of investment that could define the next major business trend in technology beyond social media and smartphones.