Can the "eyes" of a machine see through society and look into the future? – Only a human can create a machine that surpasses human capabilities –

Blog post   •   Jul 19, 2016 09:00 UTC

Machine vision that surpasses human vision

Wind noise becomes louder and louder, indicating that something is approaching.
The sound suddenly peaks, and the next moment you see a passenger vehicle speed past. As the sound fades into the distance, a loaded cargo truck rumbles by, followed by a speeding sports car.

There is a technology that detects different types of vehicles passing in succession at different speeds and captures images of their license plates to distinguish one vehicle from another. No matter how good a person's dynamic vision might be, no human could match the vision this technology provides. Masaki Suwa, a leading engineer specializing in vision sensing, developed this technology, which recognizes objects faster than human vision can by perceiving space three-dimensionally, as humans do.

Masaki Suwa, chief technologist at OMRON's Technology and Intellectual Property HQ

Vision sensing deals with the "eyes" of a machine.

No matter how high a machine's capability might be, a machine without sense organs or "five senses" can do nothing but move around blindly. Vision sensing technology, when incorporated into machines or systems, serves as "eyes" that visually capture a target object.

More than 80% of information that the human brain receives comes from the eyes. A machine also cannot control itself in relation to the surroundings unless it has "eyes" to see environmental conditions changing moment by moment.

For more than 40 years, OMRON has been committed to developing and refining vision sensing technology that can substitute for human eyes, supporting the development of society across a wide range of fields spanning industry, public infrastructure, and daily life.

The job Suwa first engaged in at OMRON was the development of a 2D character recognition sensor. This technology is now employed for a wide range of applications. For example, character/barcode readers use this technology to recognize "best before" dates and manufacturing lot numbers printed on food products at factories. It is also used for vehicle license plate recognition systems and business card reader apps for smartphones.

Vision sensing technology capable of reading manufacturing dates, etc. printed on products

Vision sensing technology for reading vehicle license plates

The landmark event that determined Suwa's later career as a developer was the development of a road traffic sensor built with technology that was a "first" in the industry at that time.

Road traffic management systems control traffic signals and display traffic data on information boards by correctly recognizing the speed and type of each vehicle on the road in order to optimally manage traffic flow and avoid congestion. But 2D vision sensing technology was not always able to accurately detect moment-by-moment changes in road conditions. This was because of interference such as shadows of a building or a car running in parallel, a line of cars caught in a traffic jam, or headlight reflections on the road.

To solve this problem, Suwa reasoned that with two cameras it would be possible to measure distance through depth perception, just as a pair of human eyes does. The members of the development project team discussed this idea repeatedly and finally arrived at a concept that they felt held tremendous future potential. The team then persistently sought a way to mount two cameras on a system based on the same principles as human eyes.

The system had to be durable enough to continue working even in harsh environments with heavy rain and winds, while correctly performing the functions of human eyes. Also important was the high precision that enables the system to detect a vehicle 60 meters away.

What challenged Suwa the most was how to achieve such a high level of performance while keeping the system as small as possible. Humans understand their surroundings and easily recognize objects three-dimensionally by instantly processing a huge amount of information coming through their eyes. Even a top-level supercomputer is unable to process this much information.

The system had to handle complex sensing within a compact unit. To meet both requirements at once without compromise, Suwa kept refining the technology to improve the precision of depth sensing. He did this by keeping the information extracted from the images recorded by the two cameras to a minimum. According to Suwa, the challenge was worth taking on precisely because the task was so difficult. His spirit as a developer pushed him to move this demanding project forward.

The development of this technology made it possible to detect vehicles with a recognition rate of more than 97%, day or night and regardless of weather conditions. The result was a road traffic sensor that receives light through two cameras and calculates distance from the difference between the paths of the light entering each camera. This technology was unique to OMRON; no other company was able to imitate it.
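The two-camera distance calculation described above can be illustrated with the textbook stereo relation, in which distance equals focal length times camera spacing divided by the shift (disparity) of the object between the two images. The sketch below is only that general relation, not OMRON's implementation: the function name and all numeric values are assumptions chosen for illustration, and it presumes two identical, aligned (rectified) cameras.

```python
def depth_from_disparity(focal_length_px: float, baseline_m: float, disparity_px: float) -> float:
    """Distance Z = f * B / d for an idealized, rectified stereo camera pair.

    focal_length_px: focal length expressed in pixels (assumed value)
    baseline_m:      spacing between the two cameras in meters (assumed value)
    disparity_px:    horizontal shift of the same point between the two images
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive; zero corresponds to a point at infinity")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical numbers for illustration only; they are not taken from the actual sensor.
distance_m = depth_from_disparity(focal_length_px=1200.0, baseline_m=0.5, disparity_px=10.0)
print(distance_m)  # 60.0
```

With these assumed values, a 10-pixel disparity corresponds to 60 meters, and a one-pixel error in the disparity would shift the estimate by several meters. This suggests why precise depth sensing matters so much at long range.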

Stereo camera capable of recognizing an object by converting 2D images into 3D images

Extracting a vehicle's features

Determining the core element and refining it to maximum precision

Recognizing electronic components three-dimensionally and indicating height differences through color-coding

At OMRON, Suwa is the leading engineer in the field of vision sensing. The development project that he spent the longest time on was the 3D vision sensing technology used to three-dimensionally recognize and inspect the shape and quality of solder joints when mounting electronic components on a circuit board. A printed circuit board inspection system must inspect solder joints on a printed circuit board with hundreds of mounted electronic components, each measuring less than 1 square millimeter.

Unlike a road traffic sensor that uses two cameras for 3D vision sensing, the printed circuit board inspection system was designed to use just one camera, and replace the other camera with a projector to project a specific pattern on a circuit board. His idea was to enable three-dimensional recognition of an object based on a minute distortion of the projected pattern. But the development project didn't progress as smoothly as he initially expected.

This type of system requires great precision and speed. So Suwa's team worked on developing a technology that enables 3D vision sensing by switching the projected pattern at an extremely high speed of 100 times per second, far too fast for human eyes to detect. This technology made it possible to manage solder joint shapes and conditions numerically. Because these are key factors that determine the quality of electronic components, it contributes to the high-speed production of high-quality products.
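The article does not say which pattern or algorithm the inspection system uses. As one possible illustration of how a rapidly switched pattern can encode shape, the sketch below uses four-step phase shifting, a widely used structured-light technique that is assumed here and is not confirmed as OMRON's method. A striped pattern is projected four times, shifted by a quarter period each time, and the phase recovered at each pixel reflects how far the surface has displaced the stripes. The function names and intensity values are hypothetical.

```python
import math


def wrapped_phase(i0: float, i1: float, i2: float, i3: float) -> float:
    """Recover the stripe phase at one pixel from four intensity samples.

    Assumes sample k was captured with the pattern shifted by k * 90 degrees,
    so that i_k = offset + contrast * cos(phase + k * pi / 2).
    The result lies in the range (-pi, pi].
    """
    return math.atan2(i3 - i1, i0 - i2)


def simulate_samples(phase: float, offset: float = 100.0, contrast: float = 50.0) -> list[float]:
    """Generate four ideal, noise-free intensities for a given phase (illustrative values)."""
    return [offset + contrast * math.cos(phase + k * math.pi / 2) for k in range(4)]


samples = simulate_samples(phase=0.7)
recovered = wrapped_phase(*samples)
print(round(recovered, 6))
```

Because the calculation uses only differences between samples, the unknown brightness offset and contrast of each pixel cancel out. Under this assumed scheme, four patterns at 100 pattern changes per second would yield one complete measurement roughly every 40 milliseconds.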

Recognizing a 3D shape by projecting a specific pattern (right) on an electronic component (left)

From the start of development, it took a full 18 years for 3D vision sensing to mature into a market-ready technology. Over that time, the scope of its applications has expanded steadily, and 3D vision sensing has now become one of OMRON's core technologies.

Over all these years, OMRON has been consistently committed to refining technology that delivers superb performance in an extremely small component. The 3D vision sensing technology is built on this accumulated expertise and therefore cannot be reproduced by any other company.

Suwa explains the technology in further detail. "Actually, it is not necessary to process all picture elements that make up a camera-recorded video image. Only the core information needs to be polished—with high speed and high precision. This is the essence of OMRON's vision sensing and it is where our skills as engineers are tested," he says.

Today, in the age of cloud computing, a supercomputer can compute or analyze the immense amount of information obtained from collected big data. But if the individual data items that make up the big data are themselves of poor quality, even a supercomputer cannot derive a correct answer. Also, if you rely completely on the supercomputer to calculate all the information, you may miss the right time to make an important decision.

The key to IoT is how to extract what's happening right now in the form of data.

In pursuit of a system that operates intelligently even when subjected to such limitations as the size of a CPU or memory to be incorporated in a machine, OMRON has been working on the development of compact and high-performance systems embedded with functions that work like human eyes and the brain, without requiring large-capacity hardware.

This endeavor led to the creation of a palm-sized device that serves as the "eyes" of machines. It is now in operation everywhere—on streets, within a building or factory, and in any corner of society, providing machines with the information necessary for them to think.

"Human Vision Components" palm-sized image sensor

Can the "eyes" of the machine look into your mind and into the future?

There are many more issues that can be solved by the "eyes" of machines. In what way can we make invisible places visible? At present, Suwa is challenging himself to develop sensing technology that makes it possible to see "invisible" things in three areas. These are physically invisible places, people's minds or feelings, and even the future along the time axis.

These efforts bring challenges. For example, it is still difficult for sensing technology to detect things that are physically hidden from view, such as what lies around a corner or a car approaching through fog.

On the other hand, sensing technology for seeing into people's minds or detecting how they feel is steadily advancing. OMRON has already developed face recognition technology that automatically detects people's faces. This technology holds great promise as a technology that supports automated driving. Suwa shares his vision. "In addition to recognizing faces and behavior, I want to work on technology that can see more deeply into a person's feelings, such as determining whether that person is tired or sleepy based on facial expressions," he says.

Automotive sensor estimates a driver's concentration based on behavior and facial expression

Future sensing, or technology to predict the future, is no longer a fantasy. The possibility of predicting the near future, a matter of seconds ahead, has already come into view. Once this is realized, it will become possible to predict that a car seen in the mirror will approach quickly, and that a collision can be avoided by increasing speed. Or you may be able to avoid a car coming from the opposite direction along a narrow road by predicting its behavior.

"The development process consists of persistent difficulties. But if we overcome all the hurdles and make the products we develop available in the world, it will amaze and satisfy many people. The technology we develop may positively change our society. Having a realistic image of this serves as the driving force for me to concentrate on my everyday development work," Suwa concluded confidently.

To allow the future to bring a better society, OMRON will strive to develop technology that can predict further and further into the future.