Chatbot-Like Siri Patent Includes Intelligent Image, Video, and Audio Recognition Within Messages

A patent application published by the United States Patent and Trademark Office today details a new Apple service in which users could make inquiries of, and converse with, the company’s AI assistant Siri through Messages (via AppleInsider). The new application is similar to a filing the USPTO published late last year, but now includes deeper integration with audio, video, and image files.

Similar to chatbots in Facebook Messenger and other texting services, Apple’s patent describes a Siri that could perform her current duties without the user having to speak aloud, which could be helpful in certain public situations.

The “Intelligent Automated Assistant in a Messaging Environment” could respond to text, audio, images, and video when sent to it by the user, which Apple said would result in “a richer interactive experience between a user and a digital assistant.” The patent gives a few examples of a conversation held between Siri and a user in Messages, with the user asking questions regarding calorie content in food, upcoming meetings, and even asking Siri to text a friend.


Interesting applications include a thread where a user texts Siri a picture of a car or a bottle of wine, and Siri sees the images and can intelligently respond to the user’s inquiries about them. For the car, the user asks Siri for details on pricing for a specific model using only an image, and Siri searches the internet and returns the relevant MSRP information.

The bottle of wine image is used as an example of Siri’s memory functions: a user asks Siri to remember their favorite wine, which she can resurface at a later date. Siri sees the wine image, reads the label, and can then respond in text to a user’s questions about the brand and even the year it was made.


Other image-related inquiries include “Where is this place?” and “What insect is this?”, to which Siri would respond “This is the country Algeria” and “This is an earwig,” respectively. Audio and video could also be recognized by Siri, including simple Shazam-like questions related to songs and the content of shared videos.

Apple points out in its patent that, thanks to the chronological format of texting, users would be able to “review previous interactions” with Siri, whereas current Siri conversations disappear as soon as they conclude. That contextual history could then be used to “define a wider range of tasks.”

“The messaging platform can enable multiple modes of input (e.g., text, audio, images, video, etc.) to be sent and received. As described herein, this can increase the functionality and capabilities of the digital assistant, thereby providing a richer interactive experience between a user and a digital assistant.

“A digital assistant in a message environment can thus enable greater accessibility to the digital assistant. In particular, the digital assistant can be accessible in noisy environments or in environments where audio output is not desired (e.g., the library). Moreover, the chronological format enables a user to conveniently review previous interactions with the digital assistant and utilize the contextual history associated with the previous interactions to define a wider range of tasks.”

The patent also describes Siri as “a participant in a multi-party conversation,” allowing everyone in a group chat to use Apple’s AI in the same thread. Apple gives an example where one user asks Siri to list nearby Chinese restaurants to begin making the group’s dinner plans, and then another user responds by asking Siri to whittle down the list to only the cheapest places. One user’s personal Siri can even be asked to remind other participants of the upcoming dinner.


Apple is believed to be working on an “enhanced Siri” that might launch in iOS 11 this fall, but exactly what would make the new Siri “enhanced” has never been divulged. A questionable rumor in March stated that deep Siri integration is coming to Messages in iOS 11, but the source of the news, The Verifier, has no track record of reporting accurate rumors.

Chatbots are certainly growing in popularity, so it wouldn’t be too surprising if Apple introduced some kind of text-based Siri interface, particularly considering the multiple patent applications the company has filed on the topic. Still, as with all patents, it’s best to look at Apple’s new filing as an intriguing insight into what the company might be working on for the future, rather than proof of an impending launch.

