While there are many things artificial intelligence can’t yet do – science is one of them – neural networks are proving increasingly adept at a wide variety of pattern recognition tasks. These tasks range from recognizing specific faces in photographs to identifying telltale patterns of particle decay in physics.
At present, neural networks are mostly run on ordinary computers. Unfortunately, those computers are a poor architectural match: neurons combine memory and computation in a single unit, while our computers keep those functions separate. For this reason, some companies are investigating dedicated neural network chips. But an American-Canadian team is now proposing an alternative: optical computers. While not as compact or complex as competing options, optical computing is incredibly fast and energy efficient.
Optical computing works because static optical elements perform transformations on light that are equivalent to mathematical transformations. For example, the authors note that a plain old lens, such as the one in a magnifying glass, effectively performs a Fourier transform without using any power. It is also possible to perform operations like matrix multiplication using optical elements. Speed comes from the fact that our light sources and detectors are fast, operating at speeds of up to 100 GHz.
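To make the lens analogy concrete, here is a toy numerical sketch (not the authors' setup) of the Fourier transform that a lens produces optically at its focal plane. A computer has to calculate this explicitly; a lens does it passively as light propagates:

```python
import numpy as np

# A simple one-dimensional "aperture": light passes through a narrow slit.
field = np.zeros(256)
field[120:136] = 1.0

# What a computer must compute explicitly, in O(N log N) operations...
spectrum = np.fft.fft(field)

# ...a lens produces for free as an intensity pattern at its focal plane.
intensity = np.abs(spectrum) ** 2

# The transform of a slit peaks at zero spatial frequency (the bright
# central spot of the familiar diffraction pattern).
print(int(intensity.argmax()))  # → 0
```

The point of the comparison is the energy cost: the digital version burns arithmetic operations, while the optical version consumes no power beyond the light itself.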
Of course, a completely static optical system can’t do one crucial aspect of neural networks: training, in which each node adjusts its behavior based on the accuracy of the system’s output. So the team instead worked with a tunable silicon-based photonic chip.
To perform calculations, a traditional computer first encodes information into several beams of light that are sent into one side of the chip. As the light travels through the chip, it passes through a series of nodes, each containing an optical element called a Mach-Zehnder interferometer, which causes two light inputs to interfere with each other, altering the properties of the light exiting on the other side. This operation is the equivalent of matrix multiplication. After several interference nodes, the light passes through a series of attenuators, which slightly reduce its intensity.
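A minimal sketch of how a single Mach-Zehnder interferometer acts as a 2×2 matrix multiply on two light inputs. Conventions for the beam-splitter and phase-shifter matrices vary, and this is an illustration, not the authors' device model:

```python
import numpy as np

def beam_splitter():
    # Standard 50:50 beam splitter: mixes the two waveguides.
    return np.array([[1, 1j], [1j, 1]]) / np.sqrt(2)

def phase_shifter(theta):
    # Tunable phase delay on one arm -- this is the "trainable" knob.
    return np.array([[np.exp(1j * theta), 0], [0, 1]])

def mzi(theta):
    # Splitter -> phase shift -> splitter: a tunable 2x2 unitary matrix.
    return beam_splitter() @ phase_shifter(theta) @ beam_splitter()

U = mzi(np.pi / 3)
inputs = np.array([1.0, 0.0])   # all the light enters port 0
outputs = U @ inputs            # interference = matrix multiplication

# A lossless optical element conserves total power (the matrix is unitary):
print(np.isclose((np.abs(outputs) ** 2).sum(), 1.0))  # → True
```

A mesh of many such interferometers can implement larger matrix multiplications, which is why this element maps so naturally onto a neural network layer.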
The behavior of these individual nodes is adjustable, allowing the optical neural network to undergo training. Once trained, however, the optical chip can be maintained in its trained state with very little energy input. The authors indicate that, with some hardware modifications, the chip could maintain its state without consuming energy at all. If successful, the only power consumption would come from the laser producing the input light and the computer encoding the information into it.
To show that their network works, the authors got 90 people to record one of four different vowel sounds, then used half of this set to train a neural network to recognize vowels. The researchers then tested the network with the remaining half of the set. The full neural network needed more nodes than the photonic chip provided, so they read the light after it passed through the chip once, re-encoded it, and sent it through a second time.
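The experimental protocol — 90 recordings, half for training and half for testing — can be sketched as follows. The features and the classifier here are hypothetical stand-ins for illustration only, not the authors' optical pipeline:

```python
import numpy as np

rng = np.random.default_rng(0)

# 90 recordings covering 4 vowel sounds, as synthetic "spectral" features.
n_recordings, n_vowels, dim = 90, 4, 8
y = np.repeat(np.arange(n_vowels), [23, 23, 22, 22])  # vowel labels
means = rng.normal(0, 3, size=(n_vowels, dim))        # per-vowel cluster centers
X = means[y] + rng.normal(0, 1, (n_recordings, dim))  # noisy recordings

# Split the set in half: train on one half, test on the other.
train, test = np.arange(0, n_recordings, 2), np.arange(1, n_recordings, 2)

# Nearest-centroid classifier -- a stand-in for the trained optical network.
centroids = np.array([X[train][y[train] == v].mean(axis=0)
                      for v in range(n_vowels)])
dists = ((X[test][:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
pred = dists.argmin(axis=1)

accuracy = (pred == y[test]).mean()
print(round(float(accuracy), 2))
```

The held-out half is what makes the reported accuracy meaningful: the network is scored on recordings it never saw during training.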
Overall performance was not great. Training a traditional neural network on the same data set yielded vowel recognition accuracy of just over 90 percent; for the optical neural network, accuracy was just over 75 percent. The authors attribute most of the problems to a combination of their sensing hardware and thermal crosstalk between the photonic devices. The latter, at least, could be addressed by adding thermal insulation to the chip.
Good and bad
As a proof of principle, the work is impressive. If the researchers can lower power consumption and raise accuracy — and they think they already know how to do both — then the system could run a trained neural network using 100,000 times less power than a traditional GPU. And the high speed and low latency of optical equipment mean that performance should be outstanding.
But there are currently some important limitations. For starters, the size of existing optical chips meant that the entire neural network could not be implemented all at once; calculations involved reading the output from a first pass and sending light back through a second time. And that’s for a relatively simple neural network. The authors calculate that a neural network just five layers deep would require a one-centimeter chip to host it. Since today’s “deep learning” neural networks contain 20 or more layers, a fairly large chip would be required to implement one in its entirety.
The alternative is to keep sending light through the chip multiple times. But that requires reading the results and calculating how to generate light with the right properties for each additional pass, at which point much of the speed and latency benefit evaporates. Likewise, it’s hard to see how you could take full advantage of the optical hardware’s speed if you have to calculate the input light’s properties in advance. Switching among multiple light sources to keep the neural network busy should be possible, but that would increase complexity and power consumption.
Perhaps the biggest positive here is that companies like IBM are working to integrate more optical capabilities onto standard silicon chips. The technology needed to make optical neural networks more effective may end up being developed for entirely different purposes.
Nature Photonics, 2017. DOI: 10.1038/nphoton.2017.93 (About DOIs).