Ask a smart home device for the weather forecast and it takes a few seconds for the device to respond. One reason for this latency is that connected devices don’t have enough memory or power to store and run the massive machine learning models needed for the device to understand what a user is asking of it. The model is stored in a data center, which can be hundreds of miles away, where the answer is calculated and sent to the device.
MIT researchers have developed a new method for computing directly on these devices that drastically reduces this latency. Their technique moves the memory-intensive steps of running a machine learning model to a central server, where components of the model are encoded onto light waves.
The waves are transmitted via fiber optics to a connected device, allowing tons of data to be sent across a network at lightning speed. The receiver then uses a simple optical device that quickly performs calculations using the parts of a model carried by these light waves.
This technique results in more than a hundredfold improvement in energy efficiency compared to other methods. It could also improve security since a user’s data doesn’t have to be transmitted to a central location for computation.
This method could allow a self-driving car to make real-time decisions while using only a tiny percentage of the energy currently required by power-hungry computers. It could also give a user zero-latency entertainment on their smart home device, support live video processing over cellular networks, or even enable high-speed image classification on a spacecraft millions of miles from Earth.
“Anytime you want to run a neural network, you have to run the program, and how fast you can run the program depends on how fast you can pipe the program in from memory. Our pipe is huge: it’s equivalent to sending a full-length film over the internet about every millisecond. That’s how fast data comes into our system. And that’s how fast it can calculate,” says senior author Dirk Englund, an associate professor in the Department of Electrical Engineering and Computer Science (EECS) and a member of the MIT Research Laboratory of Electronics.
Joining Englund on the paper are lead author and EECS student Alexander Sludds; EECS student Saumil Bandyopadhyay; research scientist Ryan Hamerly; and others from MIT, MIT Lincoln Laboratory, and Nokia Corporation. The research will be published in Science.
Neural networks are machine learning models that use layers of connected nodes, or neurons, to recognize patterns in data sets and perform tasks like classifying images or recognizing speech. However, these models can contain billions of weight parameters, which are numeric values that transform input data when processed. These weights must be stored. At the same time, the data transformation process involves billions of algebraic calculations that require a lot of power to perform.
The process of retrieving data (in this case, the neural network weights) from memory and transferring it to the parts of a computer that do the actual computation is one of the biggest limiting factors for speed and power efficiency, says Sludds.
“So we thought, why don’t we take all that heavy lifting, the process of pulling billions of weights out of memory, move it away from the edge device, and put it somewhere where we have ample access to power and storage, which gives us the ability to fetch those weights quickly?” he says.
The neural network architecture they developed, called Netcast, involves storing weights on a central server connected to a novel piece of hardware called a smart transceiver. This smart transceiver, a thumb-sized chip that can receive and send data, uses a technology known as silicon photonics to pull trillions of weights from memory every second.
The transceiver receives weights as electrical signals and imprints them onto light waves. Since the weight data is encoded as bits (1s and 0s), the transceiver converts it by switching lasers: a laser is turned on for a 1 and off for a 0. It combines these light waves and then periodically transmits them over a fiber optic network, so a client device doesn’t have to query the server to receive them.
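As a rough illustration of the on-off encoding described above, the following Python sketch turns a few weights into bits and then into a laser on/off sequence. The unsigned fixed-point quantization, the default 8-bit width, and all function names are assumptions made for illustration; the article does not say which number format or bit width Netcast uses.

```python
def weights_to_bits(weights, bits_per_weight=8):
    """Quantize weights in [0, 1] to unsigned integers and flatten them to bits (MSB first)."""
    levels = (1 << bits_per_weight) - 1
    bits = []
    for w in weights:
        if not 0.0 <= w <= 1.0:
            raise ValueError("this sketch assumes weights scaled to [0, 1]")
        q = round(w * levels)
        bits.extend((q >> shift) & 1 for shift in range(bits_per_weight - 1, -1, -1))
    return bits


def bits_to_laser_states(bits):
    """On-off keying as described in the article: 1 means laser on, 0 means laser off."""
    return ["on" if b else "off" for b in bits]


def bits_to_weights(bits, bits_per_weight=8):
    """Receiver side of the sketch: regroup bits and rescale to [0, 1]."""
    if len(bits) % bits_per_weight != 0:
        raise ValueError("bit stream length must be a multiple of bits_per_weight")
    levels = (1 << bits_per_weight) - 1
    weights = []
    for start in range(0, len(bits), bits_per_weight):
        q = 0
        for b in bits[start:start + bits_per_weight]:
            q = (q << 1) | b
        weights.append(q / levels)
    return weights


if __name__ == "__main__":
    stream = weights_to_bits([0.0, 1.0, 0.25], bits_per_weight=4)
    print(bits_to_laser_states(stream))
```

Decoding the stream recovers each weight only up to the quantization step, which is the usual cost of sending numeric values as a fixed number of bits.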
“Optics are great because there are many ways to transport data within optics. For example, you can stream data onto different colors of light, and that allows for much higher data throughput and bandwidth than with electronics,” explains Bandyopadhyay.
Trillions per second
Once the light waves arrive at the client device, a simple optical component known as a broadband “Mach-Zehnder” modulator uses them to perform super-fast analog computations. The modulator encodes input data from the device, such as sensor information, onto the weights. It then sends each individual wavelength to a receiver, which detects the light and measures the result of the calculation.
The researchers devised a way to use this modulator to perform trillions of multiplications per second, greatly increasing the computational speed on the device while consuming only a tiny amount of power.
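One way to picture this computation is as a matrix-vector product. The Python sketch below is a simplified numerical model, assuming each wavelength carries one row of the weight matrix across successive time slots, the modulator scales every wavelength in a slot by the current input value, and one detector per wavelength sums the light it receives over time. The function names (`modulate`, `detect`, `optical_matvec`) are hypothetical, and the model ignores noise, signed values, and calibration; it is not the researchers' implementation.

```python
def modulate(weight_slots, inputs):
    """Scale every wavelength in time slot t by inputs[t] (idealized modulator)."""
    if len(weight_slots) != len(inputs):
        raise ValueError("need one input value per time slot")
    return [[w * x for w in slot] for slot, x in zip(weight_slots, inputs)]


def detect(modulated_slots):
    """Sum each wavelength's signal over all time slots (one detector per wavelength)."""
    n_wavelengths = len(modulated_slots[0])
    return [sum(slot[k] for slot in modulated_slots) for k in range(n_wavelengths)]


def optical_matvec(weight_matrix, inputs):
    # Assumption: row k of weight_matrix rides on wavelength k,
    # and column t is transmitted in time slot t.
    weight_slots = [list(column) for column in zip(*weight_matrix)]
    return detect(modulate(weight_slots, inputs))


weights = [[0.5, 0.25, 0.0],
           [1.0, 0.5, 0.25]]
x = [1.0, 0.5, 0.25]
print(optical_matvec(weights, x))  # prints [0.625, 1.3125]
```

In this picture the client never stores the weights: it only multiplies whatever arrives by its own input and accumulates the result, which is why the memory-heavy work can stay on the server.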
“To make something faster, you have to make it more energy efficient. But there is a trade-off. We’ve built a system that can operate on about a milliwatt of power, but still perform trillions of multiplications per second. In terms of both speed and energy efficiency, that is a gain of orders of magnitude,” says Sludds.
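The figures in the quote imply an energy budget per multiplication that can be estimated with simple arithmetic. The sketch below assumes exactly 1 milliwatt and exactly 1 trillion multiplications per second, which are round-number readings of "about a milliwatt" and "trillions"; the precise values are not given in the article.

```python
def energy_per_multiply_joules(power_watts, multiplies_per_second):
    """Energy per operation = power / operation rate."""
    if multiplies_per_second <= 0:
        raise ValueError("multiplies_per_second must be positive")
    return power_watts / multiplies_per_second


# Assumed round numbers: 1 mW of power, 1e12 multiplications per second.
energy = energy_per_multiply_joules(1e-3, 1e12)
print(f"{energy * 1e15:.1f} femtojoules per multiplication")  # prints 1.0 femtojoules per multiplication
```

Because "trillions" means more than one trillion, the real per-multiplication energy under these assumptions would be at most about a femtojoule.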
They tested this architecture by sending weights over an 86-kilometer fiber that connects their lab to MIT Lincoln Laboratory. Netcast enabled machine learning with high accuracy (98.7 percent for image classification and 98.8 percent for digit recognition) at blistering speeds.
“We had to do some calibration, but I was surprised at how little work we had to do to get such high accuracy out of the box. We were able to achieve commercially relevant accuracy,” adds Hamerly.
In the future, the researchers plan to iterate on the smart transceiver chip to achieve even better performance. They also want to shrink the receiver, which is currently the size of a shoebox, down to the size of a single chip so it can fit on a smart device like a cellphone.
The research is funded in part by NTT Research, the National Science Foundation, the Air Force Office of Scientific Research, the Air Force Research Laboratory, and the Army Research Office.