Doing Neural Nets with Photons

[Image: RedCube Inc., and courtesy of the researchers/MIT]

Ideas for all-optical computers stretch back half a century, but they’ve generally run aground on the superiority of electronics for digital calculation and computation. Now, a team at the Massachusetts Institute of Technology (MIT), USA, has proposed that there may still be a place for computing with light instead of electrons—in new deep-learning computer systems built on decentralized “neural nets” inspired by the human brain (Nat. Photon., doi: 10.1038/nphoton.2017.93).

To demonstrate the concept, the research team, led by OSA Life Member Dirk Englund and MIT colleague Marin Soljačić, built a programmable silicon nanophotonic processor tricked out with waveguides and interferometers. The team then trained the simple neural net implemented on the chip to recognize four vowel sounds, at an accuracy level of 77 percent.

The nanophotonic chip can, according to one of the paper’s two lead authors, Yichen Shen, carry out the computationally intensive matrix multiplications at the heart of neural-net computing, using “in principle … less than one-thousandth as much energy per operation” as electronic chips. And while the initial demo has a limited scope, the researchers believe that it can be scaled up to much bigger neural nets through techniques already in the silicon-photonics toolkit, as well as via emerging approaches such as 3-D photonic integration.

From bulk optics to integrated photonics

Analog optical processors, built out of bulk optical components, originally came on the scene in the 1960s, in niche applications such as creating images from synthetic aperture radar data. But as electronic-processor density began to benefit from Moore’s law, and as computing architectures shifted to leverage those developments, the advantages of electronics in compactness and equipment cost eclipsed the potential pluses, especially in power consumption, of bulkier all-optical platforms.

The Soljačić-Englund team argues that several new factors may have changed the playing field and once again opened up an opportunity for optics. One such development is progress in integrated photonics and the increasing ability to pack an array of optical components on small-footprint silicon chips, which could steal back some of the advantages in compactness and cost previously ceded to electronics.

Power-hungry computations

Still more interesting, according to the researchers, is the increasing role of deep-learning, neural-network computer architectures in handling problems involving vast volumes of data, such as on-the-fly language translation, decision making, and image recognition. For these problems—growing ever more visible in both consumer and industry applications—the power-saving characteristics of optical computing could once again come to the fore.

That’s because neural-net computing relies heavily on repeated matrix multiplications that, in electronic-CPU architectures, are computationally intensive and power hungry. The MIT team estimates that the power consumption of an optical neural network could scale approximately linearly with the number of neurons in the net—versus quadratic scaling for a state-of-the-art electronic platform. And, the researchers say, once the optical neural net is trained for a given task, it becomes a passive architecture that can perform computation on optical signals with no additional energy input.
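The scaling argument can be made concrete with a little counting. The sketch below is only an illustration, not the team's energy model: it assumes a dense layer of n neurons needs an n-by-n matrix-vector product (n squared multiply-accumulate operations), charges a fixed cost per operation on the electronic side, and takes the linear optical scaling directly from the team's estimate. The function names and the normalized unit costs are hypothetical.

```python
def dense_layer_macs(n):
    """Multiply-accumulate count for one n-by-n matrix-vector product."""
    return n * n


def relative_cost(n, electronic_unit=1.0, optical_unit=1.0):
    """Return (electronic, optical, ratio) in arbitrary normalized units.

    The unit costs are placeholders, not measured energies.
    """
    electronic = electronic_unit * dense_layer_macs(n)  # quadratic in n
    optical = optical_unit * n  # linear in n, as the team estimates
    return electronic, optical, electronic / optical


if __name__ == "__main__":
    for n in (4, 64, 1024):
        electronic, optical, ratio = relative_cost(n)
        print(n, electronic, optical, ratio)
```

With equal unit costs, the ratio between the two grows in proportion to n, which is why the potential advantage would matter most for large networks.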

Multiplication through interference

To test out these ideas, co-lead-author Nicholas Harris and others in Englund’s lab developed a photonic nanoprocessor roughly 1 cm in length. The chip consists of 56 programmable Mach-Zehnder interferometers (each including two phase shifters connected by two directional couplers), and is tied to a coherent light source carrying the digital signal. Under the setup, the result of each iterative matrix multiplication operation on the photonic chip is detected as an interference pattern on a photodiode array.
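To see how a single interferometer can act as a programmable matrix, here is a minimal sketch of one Mach-Zehnder interferometer as a 2-by-2 transfer matrix acting on two optical field amplitudes. It assumes ideal, lossless 50:50 directional couplers and a particular ordering of the parts (outer phase shifter, coupler, inner phase shifter, coupler); the conventions, the function names, and the parameters theta and phi are illustrative assumptions, not the layout of the MIT chip.

```python
import cmath
import math


def matmul(a, b):
    """Multiply two small matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def coupler():
    """Ideal 50:50 directional coupler (a common textbook convention)."""
    s = 1 / math.sqrt(2)
    return [[s, 1j * s], [1j * s, s]]


def phase_shifter(angle):
    """Phase shift applied to the upper waveguide only."""
    return [[cmath.exp(1j * angle), 0], [0, 1]]


def mzi(theta, phi):
    """Transfer matrix: phase phi, coupler, phase theta, coupler."""
    first_stage = matmul(coupler(), phase_shifter(phi))
    second_stage = matmul(coupler(), phase_shifter(theta))
    return matmul(second_stage, first_stage)


def output_powers(theta, phi, fields):
    """Optical power at each output port, as a photodiode would detect it."""
    matrix = mzi(theta, phi)
    outputs = [sum(m * f for m, f in zip(row, fields)) for row in matrix]
    return [abs(amplitude) ** 2 for amplitude in outputs]


if __name__ == "__main__":
    for theta in (0.0, math.pi / 2, math.pi):
        print(theta, output_powers(theta, 0.0, [1, 0]))
```

In this model the inner phase theta sets how the light is split between the two outputs, while phi sets the relative phase, and total power is conserved. Meshes of such 2-by-2 blocks can be composed into larger matrices, which is the general idea behind programming a matrix multiplication into an interferometer array.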

That result is then passed to a computer, which simulates the nonlinear response of a saturable absorber to provide the neural net’s nonlinear activation function. (In future instances of the chip, according to the team, this “optical nonlinearity unit” could be implemented by integrating saturable-absorber materials directly in the optical chip itself.) A matrix weighting based on the output of the activation function is then fed back to the optical interference unit on the chip for another run until the neural net has been appropriately “wired” to do its work.
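As a rough illustration of what simulating a saturable absorber on a computer might look like, the sketch below uses a simplified textbook model for a thin absorber, in which absorption falls off as the incoming intensity approaches a saturation intensity. This is an assumed stand-in, not the response function used in the paper; the function names and the parameters a0 (unsaturated absorbance) and i_sat (saturation intensity) are hypothetical.

```python
import math


def saturable_absorber(intensity, a0=1.0, i_sat=1.0):
    """Transmitted intensity through a thin saturable absorber.

    Simplified model: absorbance = a0 / (1 + intensity / i_sat).
    """
    if intensity < 0:
        raise ValueError("intensity must be non-negative")
    absorbance = a0 / (1.0 + intensity / i_sat)
    return intensity * math.exp(-absorbance)


def apply_activation(intensities, a0=1.0, i_sat=1.0):
    """Apply the nonlinearity element-wise to detected intensities."""
    return [saturable_absorber(i, a0, i_sat) for i in intensities]


if __name__ == "__main__":
    for value in (0.01, 1.0, 100.0):
        print(value, saturable_absorber(value) / value)
```

Weak signals are attenuated strongly while strong signals pass almost unchanged, which gives the intensity-dependent response that a neural net's activation function requires.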

The chip’s design was set up to implement a simple two-layer, 16-neuron network that could be trained to interpret four vowel sounds. The team used a small, 180-data-point training set derived from Fourier transforms of voice signals from 90 individuals, each speaking four different vowel phonemes. Once the chip had been trained to recognize the vowel sounds, the team evaluated it with a test set of similar data, and found that it could correctly identify the vowel sounds 76.7 percent of the time. That compares with 91.7 percent accuracy for a conventional 64-bit electronic computer.

Scaling up

The researchers attribute that accuracy difference to the very limited resolution of the rudimentary system they were testing—and Soljačić, in a press release, said he envisioned “no significant obstacles” to scaling the system up to greater accuracy. In the paper, the researchers note that current integrated-photonics techniques “should be capable” of realizing thousand-neuron-plus optical neural nets, given previous demonstrations of photonic integrated circuits containing more than four thousand optical devices. And other techniques, including 3-D photonic integration and tricks with input-signal handling, could allow much bigger effective neural nets to be realized on a small footprint, according to the scientists.

The team acknowledges that it still has a great deal of work to do to make the system practical. But the researchers believe that their experiment demonstrates the potential of all-optical computing to increase speed and reduce power consumption in the increasingly important big-data application spaces where computationally intensive matrix multiplications do a lot of the heavy lifting. Indeed, co-lead-author Harris suggested in a press release that the system could find use in emerging Internet of Things applications such as self-driving cars and drones—“whenever,” he said, “you need to do a lot of computation but you don’t have a lot of power or time.”
