AI Supports Human Communication the Way Humans Do
In an age where technology advances at an ever faster pace, we now generate vast amounts of data every day. To store it, we have driven progress in storage to realize exabyte-level data centers (an exabyte is a 1 followed by 18 zeros, counted in bytes), and to mine and use this information we are promoting constant innovation in image and speech recognition technologies. One outcome: advances in AI that allow computers to interpret and understand data and make judgments on their own.
Toshiba has long been in the vanguard of recognition technologies, and has now integrated its image recognition, speech recognition and language and knowledge technologies into AI services that support interpersonal communications. For example, one technology can transform speech to text and automatically use it to create a report, while another can understand situations and predict the future based on the attributes and actions shown in a video.
Mr. Hideo Umeki, who leads media intelligence technology development at Toshiba Industrial ICT Solutions Company, says, “To realize a society where people and AI cohabit safely and comfortably, AI-based control and judgment need to take into consideration human senses and human knowledge acquired from experience. Even when we cannot pick up every word in a conversation, we can still communicate with the help of context. We can talk about multiple topics by preparing a corresponding scenario for each topic in our heads. Furthermore, we can accumulate knowledge and revise it if necessary. Toshiba’s communication AI understands situations and human intentions comprehensively in the same way we do, from human facial expressions, tone of voice and actions, and provides an appropriate response and treatment.”
Smooth and Customizable Speech Interpretation
One of Toshiba’s communication AI services interprets speech from one language into another. While there are many interpretation apps, Toshiba’s stands out for its dictionary registration function and its real-time translation function. The dictionary registration function lets users add words and terms to a configurable dictionary, which instructs the AI software to translate those words and terms in a specified way, preventing mistranslation. With most speech interpretation apps, users cannot speak naturally and must input speech carefully, sentence by sentence. Toshiba’s communication AI does not need users to pause after every sentence. It automatically divides a naturally spoken conversation into appropriate blocks of sentences. Interpretation without pausing makes for smoother, less stressful conversation.
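To make the idea of dictionary registration concrete, here is a minimal sketch in Python. It is not Toshiba’s implementation, which the article does not describe. It assumes one common approach: registered terms are swapped for placeholders before translation and replaced with the user’s preferred wording afterwards. The function name `translate_with_user_dictionary`, the `[[n]]` placeholder format and the stand-in translator are all hypothetical.

```python
import re
from typing import Callable, Dict, List


def translate_with_user_dictionary(
    text: str,
    user_dictionary: Dict[str, str],
    translate: Callable[[str], str],
) -> str:
    """Translate `text`, forcing registered terms to their registered wording.

    `translate` is any text-to-text translator (hypothetical here); it is
    assumed to pass placeholders such as [[0]] through unchanged.
    """
    if not user_dictionary:
        return translate(text)

    # Longest terms first, so "flash memory" is preferred over "flash".
    terms = sorted(user_dictionary, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(t) for t in terms), re.IGNORECASE)
    by_lowercase = {term.lower(): target for term, target in user_dictionary.items()}

    slots: List[str] = []

    def protect(match: "re.Match[str]") -> str:
        # Remember the registered translation and leave a numbered placeholder.
        slots.append(by_lowercase[match.group(0).lower()])
        return f"[[{len(slots) - 1}]]"

    protected = pattern.sub(protect, text)
    translated = translate(protected)

    def restore(match: "re.Match[str]") -> str:
        index = int(match.group(1))
        # Leave anything that is not one of our placeholders untouched.
        return slots[index] if index < len(slots) else match.group(0)

    return re.sub(r"\[\[(\d+)\]\]", restore, translated)


# Usage with a stand-in "translator" that merely upper-cases its input.
result = translate_with_user_dictionary(
    "Please check the flash memory",
    {"flash memory": "memoria flash"},
    str.upper,
)
print(result)
```

The stand-in translator changes every word except the registered term, which comes back exactly as the user registered it. A real system would also need to handle inflection and word order, which this sketch ignores.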
Listen, Interpret. See, Interpret?
Toshiba is also advancing development of technology that can extract and translate language information from an image. Imagine you are on holiday overseas, famished, and you find yourself in a restaurant where the food looks really good. Your expectations grow with your appetite, but then you open the menu and…everything is in a language you don’t understand, with no pictures to help.
What if you could take a picture of the menu with your smartphone and have it show you the menu in your own language? Toshiba’s Camera Image Recognition Technology does just that, by recognizing text information. The technology needs to be able to recognize text in an image where text and non-text information coexist, or where the text is distorted or even upside down. Drawing on its long-term research into facial recognition and human detection technologies, Toshiba has developed image-extraction technology and successfully applied it to distinguishing embedded text. When the technology is applied to smartphone cameras, text in an image can be distinguished, recognized as a language and translated. Combined with wearable devices, such as smartglasses, it could automatically translate foreign-language text and project it in front of your eyes in your language. In the near future, there might come a day when fears about ordering the wrong thing from a foreign-language menu are a thing of the past.
Developing Technologies that Care for People
In this age of IoT (Internet of Things), where all sorts of devices are connected to the internet, Mr. Umeki offers a comment that underlines how Toshiba sees its position and the future of IoT: “We believe what lies ahead for IoT is caring for people: ICT that supports people’s lives and business activities safely, securely and comfortably through a ‘Things×ICT×Humans’ interlink.” Toshiba is determined to contribute to a safer, more secure and more comfortable society by advancing its development of AI that gets smarter as it is used.