New Training Technique for Highly Efficient AI Methods
AI applications like ChatGPT are based on artificial neural networks that, in many respects, imitate the nerve cells in our brains. They are trained with vast quantities of data on high-performance computers, gobbling up massive amounts of energy in the process. Spiking neurons, which are much less energy-intensive, could be one solution to this problem. In the past, however, the normal techniques used to train them only worked with significant limitations. A recent study by the University of Bonn has now presented a possible new answer to this dilemma, potentially paving the way for new AI methods that are much more energy-efficient.
The findings have been published in Physical Review Letters.
Our brain is an extraordinary organ. It consumes as much energy as three LED light bulbs and weighs less than a laptop. And yet it can compose pieces of music, devise something as complex as quantum theory and philosophize about the afterlife.
Although AI applications such as ChatGPT are also astonishingly powerful, they devour huge quantities of energy while wrestling with a problem. Like the human brain, they are based on a neural network in which many billions of “nerve cells” exchange information. Standard artificial neurons, however, do this without any interruptions, like a wire-mesh fence through which electricity never stops flowing.
“Biological neurons do things differently,” explains Professor Raoul-Martin Memmesheimer from the Institute of Genetics at the University of Bonn. “They communicate with the help of short voltage pulses, known as action potentials or spikes. These occur fairly rarely, so the networks get by on much less energy.” Developing artificial neural networks that also “spike” in this way is thus an important field in AI research.
Spiking networks—efficient but hard to train
Neural networks must be trained if they are to be capable of completing certain tasks. Imagine you have an AI and want it to learn the difference between a chair and a table. So you show it photographs of furniture and see whether it gets the answer right or wrong. Some connections in the neural network will be strengthened and others weakened depending on the results, with the effect that the error rate decreases from one training round to the next.
After each round, this training modifies which neurons influence which other ones and to what extent. “In conventional neural networks, the output signals change gradually,” says Memmesheimer, who is also a member of the Life and Health Transdisciplinary Research Area. “For example, the output signal might drop from 0.9 to 0.8. With spiking neurons, however, it’s different: spikes are either there or they’re not. You can’t have half a spike.”
You could perhaps say that every connection in a neural network comes with a control dial that enables the output signal from a neuron to be turned up or down slightly. The settings on all the dials are then optimized until the network can distinguish chairs from tables accurately. In spiking networks, however, the dials are unable to modify the strength of the output signals gradually. “This means it’s not so easy to fine-tune the weightings of the connections either,” points out Dr. Christian Klos, Memmesheimer’s colleague and first author of the study.
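The contrast between a gradual output and an all-or-nothing spike can be made concrete with a small sketch. The two functions below are illustrative stand-ins, not models from the study: `smooth_output` is a standard sigmoid unit, and `spike_output` is a hypothetical threshold unit that either emits a spike (1.0) or does not (0.0). Nudging the connection strength `w` changes the smooth output slightly, but leaves the spike output unchanged unless the threshold happens to be crossed.

```python
import math

def smooth_output(w: float, x: float = 1.0) -> float:
    """Conventional (non-spiking) unit: the output changes gradually with w."""
    return 1.0 / (1.0 + math.exp(-w * x))

def spike_output(w: float, x: float = 1.0, threshold: float = 0.5) -> float:
    """Hypothetical all-or-nothing unit: a spike is either there (1.0) or not (0.0)."""
    return 1.0 if w * x >= threshold else 0.0

def finite_difference(f, w: float, eps: float = 1e-3) -> float:
    """Estimate how strongly the output reacts to a small change in w."""
    return (f(w + eps) - f(w - eps)) / (2 * eps)

smooth_slope = finite_difference(smooth_output, 2.0)  # small but nonzero
spike_slope = finite_difference(spike_output, 2.0)    # zero away from the threshold
```

A slope of zero gives the training procedure no hint about which way to turn the dial, which is the difficulty described above.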
Whereas it was previously assumed that the usual training method (which researchers call “gradient descent learning”) would prove highly problematic for spiking networks, the latest study has now shown this not to be the case. “We found that, in some standard neuron models, the spikes can’t simply appear or disappear just like that. Instead, all they can essentially do is be brought forward or pushed back in time,” Klos explains. The times at which the spikes appear can then be adjusted—continuously, as it turns out—using the strengths of the connections.
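As a rough illustration of why continuously adjustable spike times make gradient descent possible, consider the toy example below. It is a minimal sketch under simplifying assumptions and not the neuron model or algorithm from the study: a single non-leaky integrator driven by a constant input of strength `w` reaches its threshold `THETA` at time `THETA / w`, so the spike time varies smoothly with `w`. All names and values are hypothetical.

```python
# Toy model (an assumption for illustration, not the model used in the study):
# the potential grows as w * t and the neuron spikes when it reaches THETA.
THETA = 1.0

def spike_time(w: float) -> float:
    """Time at which the potential w * t reaches THETA (requires w > 0)."""
    return THETA / w

def spike_time_grad(w: float) -> float:
    """Derivative of the spike time with respect to w: -THETA / w**2."""
    return -THETA / w**2

def train(w: float, target: float, lr: float = 0.05, steps: int = 200) -> float:
    """Gradient descent on the loss 0.5 * (spike_time(w) - target)**2."""
    for _ in range(steps):
        error = spike_time(w) - target
        w -= lr * error * spike_time_grad(w)  # chain rule: dLoss/dw
    return w

# Start with a spike at t = 1.0 and push it back to t = 2.0.
w_trained = train(w=1.0, target=2.0)
final_spike_time = spike_time(w_trained)
```

Because the spike is only shifted in time and never vanishes in this toy setting, the gradient is defined at every step and the weight settles near 0.5.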
Fine-tuning the weightings of connections in spiking networks
The different temporal patterns of the spikes influence the response behavior of the neurons at which they are directed. Put simply, the more “simultaneously” a biological or a spiking artificial neuron receives signals from several other neurons, the greater the probability that it will generate a spike of its own. In other words, the influence exerted by one neuron on another can be adjusted via both the strengths of the connections and the timings of the spikes. “And we can use the same highly efficient, conventional training method for both in the spiking neural networks that we’ve studied,” Klos says.
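The effect of input timing can be sketched with a simple leaky neuron. This is a deterministic toy version of the idea, with illustrative parameter values that are not taken from the study: each incoming spike raises the potential by `WEIGHT`, the potential then decays with time constant `TAU`, and the neuron fires if the potential ever reaches `THRESHOLD`. No reset after firing is modeled.

```python
import math

TAU = 10.0        # decay time constant in ms (illustrative value)
WEIGHT = 0.6      # jump in potential per input spike (illustrative value)
THRESHOLD = 1.0   # potential needed to trigger an output spike

def peak_potential(input_times: list[float]) -> float:
    """Highest potential reached; peaks occur right after an input arrives."""
    peak = 0.0
    for t in input_times:
        # Sum the decayed contributions of all inputs that arrived up to time t.
        v = sum(WEIGHT * math.exp(-(t - s) / TAU) for s in input_times if s <= t)
        peak = max(peak, v)
    return peak

def fires(input_times: list[float]) -> bool:
    """True if the summed inputs push the potential to the threshold."""
    return peak_potential(input_times) >= THRESHOLD

near_simultaneous = fires([0.0, 1.0])   # inputs 1 ms apart
far_apart = fires([0.0, 20.0])          # inputs 20 ms apart
```

With the same two connection strengths, the neuron fires when the inputs arrive 1 ms apart but stays silent when they are 20 ms apart, so shifting spike times changes one neuron's influence on another.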
The researchers have already been able to demonstrate that their technique works in practice, successfully training a spiking neural network to distinguish handwritten numbers accurately from one another. For the next step, they want to give it a much more complex job, namely understanding speech, says Memmesheimer: “Although we don’t yet know what role our method will play in training spiking networks in the future, we believe it has a great deal of potential, simply because it’s exact and it mirrors precisely the method that works supremely well with non-spiking neural networks.”
Funding:
The study was funded by the Federal Ministry of Education and Research (BMBF) via the Bernstein Award 2014.
Scientific contact:
Prof. Dr. Raoul-Martin Memmesheimer
Neural Network Dynamics and Computation
University of Bonn
Phone: +49 228 73-9824
Email: rm.memmesheimer@uni-bonn.de
Original publication:
Christian Klos and Raoul-Martin Memmesheimer: “Smooth Exact Gradient Descent Learning in Spiking Neural Networks,” in: Physical Review Letters; DOI: 10.1103/PhysRevLett.134.027301
https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.134.027301
Further information:
https://www.youtube.com/watch?v=FkGLJf73KVM&t=2s, Learning of a spike pattern that shows the Elbphilharmonie concert hall in Hamburg. Each image shows the pattern at a single learning step. The short lines are the spikes of the network neurons. They are arranged by generating neuron on the Y-axis and by time on the X-axis. The spike times for each neuron are adjusted during the learning process. Video: Raoul-Martin Memmesheimer’s working group/University of Bonn