Microsoft has confirmed that it has developed a powerful image captioning system. The AI-powered system can quickly and autonomously generate relevant captions for images. It might not be as fluent or creative as a human writing captions, but it has reportedly been trained on huge datasets to improve relevance and accuracy. Microsoft says the system is twice as good as the one currently used in the company's services.
Microsoft has a new auto-captioning system for images. The system will launch first in Azure Cognitive Services. However, Microsoft has indicated that it will later come to Microsoft Word, Outlook, and PowerPoint.
How Does The New AI-Driven Image Captioning System Work?
Any AI-driven system first has to be trained on relevant datasets. The algorithms learn from the data points and gain the ability to mimic the expected behavioral patterns. Microsoft’s new image captioning system was reportedly pre-trained on a huge dataset of images paired with word tags, with each tag mapped to a distinct object in an image.
After this initial training, researchers fine-tuned the pre-trained model for captioning on a dataset of already captioned images. The pre-training gave the model a visual vocabulary, and the fine-tuning taught it how to compose an understandable sentence. The new AI model then draws on the visual vocabulary to generate accurate captions for images containing novel or distinct objects. The emphasis appears to be on the object that is specific or unique in the image.
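The two stages described above can be illustrated with a deliberately tiny Python sketch. This is not Microsoft's model or code: the nearest-centroid lookup and the fill-in-the-blank caption template are simple stand-ins for the neural networks a real system would use, and every function name, feature vector, and caption here is a hypothetical example.

```python
from collections import defaultdict


def learn_visual_vocabulary(tagged_regions):
    """Stage 1 (toy): average the feature vectors seen for each word tag."""
    sums = {}
    counts = defaultdict(int)
    for features, tag in tagged_regions:
        if tag not in sums:
            sums[tag] = [0.0] * len(features)
        sums[tag] = [s + f for s, f in zip(sums[tag], features)]
        counts[tag] += 1
    return {tag: [s / counts[tag] for s in total] for tag, total in sums.items()}


def nearest_tag(vocabulary, features):
    """Return the tag whose averaged features are closest to the given features."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(vocabulary, key=lambda tag: squared_distance(vocabulary[tag], features))


def learn_templates(captioned_examples):
    """Stage 2 (toy): turn each (tag, caption) pair into a sentence template."""
    return [caption.replace(tag, "{obj}") for tag, caption in captioned_examples]


def caption_image(vocabulary, templates, region_features):
    """Recognize the object with the vocabulary, then fill the first template."""
    tag = nearest_tag(vocabulary, region_features)
    return templates[0].format(obj=tag)


# Hypothetical image-tag pairs: made-up 2-number "features" per image region.
tagged_regions = [
    ([1.0, 0.0], "dog"),
    ([0.9, 0.1], "dog"),
    ([0.0, 1.0], "kite"),
    ([0.1, 0.9], "kite"),
]
# Hypothetical captioned data that only ever mentions a dog.
captioned_examples = [("dog", "a dog on a beach")]

vocabulary = learn_visual_vocabulary(tagged_regions)
templates = learn_templates(captioned_examples)

# "kite" never appears in a caption, yet it can still be named in one.
caption = caption_image(vocabulary, templates, [0.05, 0.95])
print(caption)  # a kite on a beach
```

The point of the sketch is the division of labor: the tag data teaches the system what objects look like, while the much smaller captioned set only has to teach sentence structure, which is why an object absent from the captions can still show up in a generated caption.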
As with all AI models, Microsoft’s image captioning system isn’t 100 percent accurate or perfect. However, Microsoft says the new AI model is twice as good as the image captioning model currently used in the company’s products and services. Internal testing indicates the new model can create captions that are more descriptive and accurate than captions written manually by humans, according to Xuedong Huang, a Microsoft technical fellow and the chief technology officer of Azure AI Cognitive Services in Redmond, Washington. Huang said:
“We’re taking this AI breakthrough to Azure as a platform to serve a broader set of customers. It is not just a breakthrough in the research; the time it took to turn that breakthrough into production on Azure is also a breakthrough.”
Sharing a major milestone in Microsoft's journey to make our products and services more accessible for everyone. Researchers have created an #AI system that generates image captions that are on par with, or better than, what people write! #MSFTAdvocate https://t.co/W83cGFiCI8
— Ken Ross (@hotkrossbits) October 14, 2020
What Huang indicated was that Microsoft has been able to significantly accelerate the development, refinement, and deployment of AI models that can compete with human-generated content. However, it is important to note that these models usually follow a specific set of guidelines and rely heavily on the datasets they are trained on.
Microsoft has been working hard for the last few years to infuse the power of AI across several of its products and services. AI holds the power to boost productivity while freeing humans to do more creative tasks. With the new automatic image captioning system, Microsoft aims to help all users, including people with vision impairment, access the vital content in any image.