A spiking network model of decision making employing rewarded STDP

Steven Skorheim; Peter Lonjers; Maxim Bazhenov

doi:10.1371/journal.pone.0090821

PLoS One. 2014 Mar 14;9(3):e90821. doi: 10.1371/journal.pone.0090821. eCollection 2014.

Authors

Steven Skorheim¹, Peter Lonjers¹, Maxim Bazhenov¹

Affiliation

¹ Department of Cell Biology and Neuroscience, University of California Riverside, Riverside, California, United States of America.

Abstract

Reward-modulated spike-timing-dependent plasticity (STDP) combines unsupervised STDP with a reinforcement signal that modulates synaptic changes. It was proposed as a learning rule capable of solving the distal reward problem in reinforcement learning. Nonetheless, the performance and limitations of this learning mechanism have yet to be tested on biological problems. In our work, rewarded STDP was implemented to model foraging behavior in a simulated environment. Over the course of training, the network of spiking neurons developed the capability of producing highly successful decision making. The network performance remained stable even after significant perturbations of synaptic structure. Rewarded STDP alone was insufficient to learn effective decision making because of the difficulty of maintaining homeostatic equilibrium of synaptic weights and the development of local performance maxima. Our study predicts that successful learning requires synaptic noise as well as stabilizing mechanisms that allow neurons to balance their input and output synapses.
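
To make the first sentence of the abstract concrete, the following is a minimal sketch of one common way to formulate reward-modulated STDP: each spike pairing is stored in a decaying eligibility trace, and the weight changes only when a reward signal arrives. This is an illustration under stated assumptions, not the model used in the paper. The function names, the exponential STDP window, the single-synapse setup, and all parameter values (`a_plus`, `a_minus`, `tau`, `tau_e`, `lr`) are hypothetical and chosen only for readability.

```python
import math


def stdp_window(dt, a_plus=1.0, a_minus=1.0, tau=20.0):
    """Unsupervised STDP kernel (assumed exponential form).

    dt is t_post - t_pre in ms: positive (pre before post) gives
    potentiation, negative (post before pre) gives depression.
    """
    if dt > 0:
        return a_plus * math.exp(-dt / tau)
    if dt < 0:
        return -a_minus * math.exp(dt / tau)
    return 0.0


def run_rewarded_stdp(weight, spike_pairs, rewards, n_steps,
                      tau_e=200.0, lr=0.1):
    """Simulate one synapse in 1 ms steps (illustrative only).

    spike_pairs: list of (t_pre, t_post) integer spike times in ms.
    rewards: dict mapping a time step to a reward value
             (positive = reward, negative = punishment).
    """
    trace = 0.0                        # eligibility trace of recent pairings
    decay = math.exp(-1.0 / tau_e)     # per-step decay of the trace
    for t in range(n_steps):
        trace *= decay
        # A pairing is registered when its later spike occurs.
        for t_pre, t_post in spike_pairs:
            if max(t_pre, t_post) == t:
                trace += stdp_window(t_post - t_pre)
        # The weight changes only when a reward signal is present.
        weight += lr * rewards.get(t, 0.0) * trace
    return weight


if __name__ == "__main__":
    causal = [(10, 15)]                # pre fires 5 ms before post
    print(run_rewarded_stdp(0.5, causal, {}, 200))            # no reward
    print(run_rewarded_stdp(0.5, causal, {100: 1.0}, 200))    # delayed reward
    print(run_rewarded_stdp(0.5, causal, {100: -1.0}, 200))   # delayed punishment
```

In this sketch the reward arrives 85 ms after the pairing, yet it still changes the weight because the trace has not fully decayed. That delay tolerance is how a rule of this kind can address the distal reward problem mentioned above. Without a reward the weight is left unchanged, and nothing here bounds or balances the weights, which is the kind of stabilizing mechanism the abstract says rewarded STDP alone lacks.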

Publication types

Research Support, N.I.H., Extramural
Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

Decision Making / physiology*
Humans
Learning / physiology
Models, Neurological
Neuronal Plasticity / physiology
Neurons / physiology
Synapses / physiology*

Grants and funding

R01 MH087631/MH/NIMH NIH HHS/United States