This week Microsoft introduced Seeing AI, a research project that uses an app running on a smartphone or Pivothead smart glasses to describe what’s happening in the world. It’s designed to help a blind person read signs or menus, identify the emotional state of people in a room, and more.

While Seeing AI isn’t yet available to the public, Microsoft has released a tool that uses some of the same technology. CaptionBot is a service that attempts to determine the contents of a photo and create text-based descriptions.

Microsoft CaptionBot

CaptionBot is still very much a work in progress, but it uses Microsoft’s Computer Vision, Emotion, and Bing Image APIs to determine what’s going on in a photograph and describe it in natural language.
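To give a rough idea of what the last step might look like, here’s a small Python sketch that turns an image-description result into a sentence. It’s only an illustration under stated assumptions: the response layout is assumed to follow the JSON that the Computer Vision API returns when asked to describe an image (a list of candidate captions, each with a confidence score), and the function names, sample data, 0.5 threshold, and wording are all hypothetical. This is not how CaptionBot itself is implemented.

```python
from typing import Any, Dict, Optional

# Hypothetical sample data, shaped like the JSON that the Computer Vision
# API is assumed to return when asked to describe an image.
SAMPLE_RESPONSE: Dict[str, Any] = {
    "description": {
        "tags": ["person", "outdoor", "dog"],
        "captions": [
            {"text": "a dog sitting on a bench", "confidence": 0.42},
            {"text": "a man walking a dog in a park", "confidence": 0.87},
        ],
    }
}


def best_caption(response: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the candidate caption with the highest confidence, if any."""
    captions = response.get("description", {}).get("captions", [])
    if not captions:
        return None
    return max(captions, key=lambda c: c.get("confidence", 0.0))


def describe(response: Dict[str, Any], threshold: float = 0.5) -> str:
    """Build a natural-language sentence from the best caption.

    The threshold and the phrasing are illustrative choices, not values
    taken from any Microsoft service.
    """
    caption = best_caption(response)
    if caption is None:
        return "I can't describe this picture."
    text = caption["text"]
    if caption.get("confidence", 0.0) >= threshold:
        return f"I think it's {text}."
    return f"I'm not sure, but it might be {text}."


if __name__ == "__main__":
    print(describe(SAMPLE_RESPONSE))
```

Hedging the sentence when confidence is low is one simple way a tool like this could signal that its guess might be off, which matters when the description is all a user has to go on.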

It also uses machine learning, which means that the more photos it analyzes, the better it should get, particularly if people provide ratings for its captions (and don’t abuse the system).

Over the past few years, we’ve seen Microsoft introduce similar tools that attempt to guess your age from photographs, decide if you look like a celebrity, and more. While these things might seem gimmicky on their own, they’re helping Microsoft’s software get better at identifying and describing visual imagery. That has applications ranging from improving Bing’s image search tools to improving the Seeing AI technology that could help blind people navigate the world in new ways.

via Microsoft Blogs
