Our previous three articles introduced how Toshiba applies AI at Yokkaichi Operations, in an image recognition LSI, and in communications. These technologies all originated at Toshiba’s Corporate Research and Development Center. Osamu Hori, the Director of the Center, has been closely involved with image recognition research throughout much of his career, which is interwoven with the progress of Toshiba’s AI. We asked him to tell us something of this history—and the future.
In the forefront of AI research
-You joined Toshiba in 1986 during what we now call the second AI boom. Did you recognize that your field of research was in AI?
Hori: Yes, of course. We already had the term “AI”, and Marvin Minsky, co-founder of the AI Lab at the Massachusetts Institute of Technology, was very well known. It was in this context that there was a surging interest in how to get computers to do intelligent tasks, and since the time I was at university, I, personally, have been interested in getting machines to do what people could do.
At that time, Toshiba already had top-level image recognition technology using AI, and that was what attracted me to the company. My first project at Toshiba was to develop a system for converting hand-drawn diagrams of underground power lines into CAD data, using image recognition technology. We needed to get the system to recognize that “this is a straight line” or “this is a character string” from images of drawings. Toshiba had already produced a system for converting diagrams of power plant systems to CAD data and we had received a request from our client, an electric power company, to develop something similar for power lines.
(Marvin Minsky became Toshiba Professor of Media Arts and Sciences at MIT in 1990)
Based on a figure originally published by the Ministry of Internal Affairs and Communications
The hurdle of manually loading machines with knowledge
-So Toshiba already had a system for power plant drawings. Did the mapping of power transmission lines require new techniques?
Hori: Yes. The mainstream work of the second AI boom was about putting knowledge into machines. For example, let’s say you want a machine that can recognize roosters. What you do is you make a rule like: “Roosters are birds with crests, that crow and that can’t fly,” and put that into the machine’s memory; and you need different rules for getting the machine to recognize that a rooster is a rooster, a cat is a cat, and so forth.
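The hand-loaded knowledge rules described above can be sketched as a small lookup of attribute rules. This is a hypothetical illustration following the rooster example; the attribute names and animals are illustrative assumptions, not any actual system of the era.

```python
# Hypothetical sketch of second-AI-boom style hand-written knowledge rules.
# Each rule lists the attributes a person decided are distinguishing.
rules = {
    "rooster": {"has_crest": True, "crows": True, "can_fly": False},
    "cat": {"has_crest": False, "crows": False, "can_fly": False},
}

def classify(observation):
    """Return the first label whose rule matches the observation exactly."""
    for label, rule in rules.items():
        if all(observation.get(k) == v for k, v in rule.items()):
            return label
    return "unknown"

print(classify({"has_crest": True, "crows": True, "can_fly": False}))  # rooster
```

The brittleness is visible immediately: a rabbit (no crest, does not crow, cannot fly) would match the "cat" rule, and every new object requires a person to write and maintain yet another rule.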
In my case, the marks to be identified and the signal processing used when reading the images differed between the power line drawings and the plant system diagrams. We could reuse some aspects of the existing system in our project, but we had to rebuild much of it.
Looking for a breakthrough with machine learning research
-So the need for people to write the rules for machines to recognize an object became the stumbling block for the second AI boom. Then the long “AI winter” began. Did Toshiba continue its AI research?
Hori: Well, people stopped using the word “AI,” but the research continued without a break, although the focus of interest shifted. For instance, in image recognition technology, people who were deeply involved with robotic vision in the 1980s moved to vehicle-mounted image recognition systems. They said, “Well, we swapped something that walked on two legs for something on four wheels. But, never mind.” However, that research actually led to Toshiba’s present support technology for driverless vehicles.
-Did you also shift your area of work?
Hori: I went on to video. Around 1995, the World Wide Web was spreading everywhere, and we all expected to see much more text, image and video data. We had reasonable success in extracting target text from massive amounts of data, so I decided to work on visual search of moving images, something people do easily but machines find difficult, and I started developing technology for finding faces in an image, an area where there was big demand.
The main problem was the design of feature selection rules. When you try to get a machine to recognize a face, you first need to design features that represent the shapes of eyes and ears. From experience, people know what shapes eyes have and can recognize them on sight. But machines cannot. Various methods for designing feature selection rules were proposed, even in the depths of the AI winter. What we came up with was the “co-occurrence features” technique.
When we look at a human face, we simultaneously see that there are eyes and ears. The “co-occurrence features” technique increases the number of feature types that can be selected by looking at such combinations of features that appear together; by using these combinations to capture the distinguishing features of a face, a machine can recognize objects more accurately. CoHOG*, which Toshiba’s automotive image recognition LSI uses to recognize the presence of pedestrians, is an example of the “co-occurrence features” technique.
(* CoHOG: Toshiba’s Co-occurrence Histograms of Oriented Gradients technology)
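As a rough illustration of the idea behind co-occurrence features, the sketch below quantizes image gradient orientations into bins and counts how often pairs of orientations co-occur at a fixed pixel offset. This is a minimal assumption-laden sketch, not Toshiba’s CoHOG implementation: real CoHOG aggregates histograms over many offsets and sub-regions, and the function and parameter names here are illustrative.

```python
import numpy as np

def cohog_features(image, n_bins=8, offset=(1, 1)):
    """Minimal sketch of a co-occurrence histogram of oriented gradients.

    Quantizes gradient orientations into n_bins, then counts how often each
    pair of orientations co-occurs at pixels separated by `offset`.
    """
    gy, gx = np.gradient(image.astype(float))
    angles = np.arctan2(gy, gx)  # orientation in (-pi, pi]
    bins = ((angles + np.pi) / (2 * np.pi) * n_bins).astype(int) % n_bins

    dy, dx = offset
    h, w = bins.shape
    a = bins[:h - dy, :w - dx]   # orientation at pixel p
    b = bins[dy:, dx:]           # orientation at p + offset
    hist = np.zeros((n_bins, n_bins), dtype=int)
    np.add.at(hist, (a.ravel(), b.ravel()), 1)  # count orientation pairs
    return hist

# A simple vertical edge: orientation pairs along the edge co-occur consistently.
img = np.zeros((8, 8))
img[:, 4:] = 1.0
features = cohog_features(img)
print(features.shape)  # (8, 8) matrix of orientation-pair counts
```

The pair counts, rather than single-orientation counts as in plain HOG, are what make the feature set richer: the same two orientations appearing in a fixed spatial relationship is a much more specific pattern than either orientation alone.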
-It is surprising to hear that there was progress in image recognition technology even in the AI winter.
Hori: Even then there was a lot we could do, such as extracting feature details from data by statistical methods and, as in the “co-occurrence features” technique, using statistical machine learning to classify objects.
In those days AI research was heading in two directions. One was to develop neural networks that imitate the brain’s neural functions, but that was hindered by limitations in processing power. So I chose the other path, statistical machine learning, for my research and looked for more ways to put AI to practical use.
Acquiring feature selection rules autonomously with deep learning
-So while you were researching statistical machine learning, when did you start to notice the importance of deep learning? I understand that deep learning is said to be AI with the ability to learn things for itself.
Hori: I became aware of it in 2013. In 2012, the University of Toronto decisively won a world-class image recognition competition with “Super Vision,” which used deep learning. When I heard about this, I immediately recognized it was something different, because it had succeeded where competing machines had failed completely.
Actually, with statistical machine learning a machine can make classifications automatically based on feature selection rules designed by a person, so in that sense deep learning is no different from statistical machine learning. However, with statistical machine learning a machine cannot design feature selection rules on its own. For example, if you tell a machine that rabbits have long ears, it will classify objects with long ears as “rabbits,” but it cannot come up with the distinguishing feature of long ears by itself. That is why we had to work so hard on the co-occurrence technique and design the feature selection rules ourselves.
The thing about deep learning, however, is that it enables the machine to design the feature selection rules by itself. If we input a massive amount of image data into a machine and tell it that they are images of rabbits, the machine will decide by itself what the feature selection rules are for recognizing rabbits. This autonomous designing of feature selection rules was a big labor-saving breakthrough.
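To make that contrast concrete in a toy setting, the hypothetical sketch below trains a single logistic unit on synthetic images in which the distinguishing pattern (bright top rows, a stand-in for the “long ears” of the rabbit example) is never described to the model, yet the learned weights come to concentrate on exactly that region. This single-layer stand-in, with made-up data, only illustrates the principle of learning features from data; deep learning stacks many such layers.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: class-1 images have bright top rows (a hypothetical stand-in for
# "long ears"). The model is never told which pixels matter.
def make_image(has_pattern):
    img = rng.normal(0.0, 0.1, (8, 8))
    if has_pattern:
        img[:3, :] += 1.0  # the distinguishing pattern, unknown to the model
    return img.ravel()

X = np.array([make_image(i % 2 == 0) for i in range(200)])
y = np.array([1.0 if i % 2 == 0 else 0.0 for i in range(200)])

# One logistic unit trained by gradient descent: its learned weights act as a
# feature detector discovered from the data alone.
w = np.zeros(64)
b = 0.0
for _ in range(500):
    z = np.clip(X @ w + b, -30, 30)
    p = 1.0 / (1.0 + np.exp(-z))
    w -= 0.5 * (X.T @ (p - y)) / len(y)
    b -= 0.5 * np.mean(p - y)

# The weight magnitudes concentrate on the top rows without being told to.
print(np.abs(w.reshape(8, 8)).mean(axis=1).round(2))
```

Hand-designed rules would have required someone to notice and encode “top rows are bright” in advance; here that knowledge emerges from the labeled examples, which is the labor-saving step Hori describes.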
Supporting society by putting AI into infrastructure
-Since the appearance of deep learning, what are Toshiba’s advantages in AI?
Hori: I believe that Toshiba has two advantages. One is experienced researchers in what is now called AI, but was known in the past as machine learning or pattern recognition, a traditional field of research for Toshiba. Actually, when deep learning appeared in 2012, many researchers in Japan responded coldly, due to their bitter experience with neural networks during the AI winter. But there were many researchers at Toshiba, whether pro or anti, who looked closely at deep learning; so we had a base of AI research that allowed us to respond to advances in leading-edge technology.
Our other advantage is that we have big data and domain-specific knowledge in infrastructure. Vast amounts of data are needed for deep learning to enable machines to learn feature selection rules. Toshiba has such vast amounts of data on infrastructure, such as production data at Yokkaichi Operations, which manufactures flash memories. We also have frontline people with domain-specific knowledge.
With these two strengths Toshiba can, if it wishes, carry out AI research entirely on its own. I believe that is why the project at Yokkaichi produced results very quickly.
-What problems is Toshiba trying to tackle by combining AI with its well-established infrastructure technologies?
Hori: I am sure you know the term “IoT.” It is commonly seen as combining data on the Web and things, but I think the future will be more about extracting data from things. Toshiba can introduce sensors on infrastructure systems, such as roads, railroads and factories, and extract data. However, people cannot analyze such huge volumes of data manually. So we will use AI to do the analysis to pick up the early warning signs, so that we can implement preventive maintenance and optimize operations. Toshiba will contribute to society by making highly reliable infrastructure systems and optimizing the operation of the increasingly complex systems on which society relies.
-Can you tell me how our lives will change specifically?
Hori: Well, as one example, with the liberalization of the electricity market in Japan the number of companies responsible for supplying electricity will increase, and this is leading to a complex supply and demand structure. To help prevent power outages resulting from imbalances in supply and demand, we can get AI to decide in real time which power plants to bring online or stand down by analyzing sales data on the amount of power bought and sold, among other factors. This will lead to the efficient generation of electricity.
AI as a means for promoting well-being
-There is a lot of discussion about the ethical and societal impacts of AI. How do you see the future of AI?
Hori: I see the broad definition of AI as “a machine that acts in a way that seems intelligent to humans”. That is why I think AI is seen as taking over jobs that appear at first sight to be “intellectual”. Again, since work includes responsibilities, there are discussions on who bears responsibility if AI fails in the workplace. It can be said that AI development faces a mountain of issues. Also, machines can only deal with data on past events, so it is difficult for AI to make decisions in totally new circumstances or about new concepts. Therefore, I think that people will continue to be needed to make creative decisions.
Toshiba is said to be a technology company. I, too, see it as a company that values technology. I believe that technology should promote people’s well-being and that AI is a powerful tool for this purpose. I think it is wonderful that Toshiba is involved in this endeavor.