Short-Term Memory Mechanisms in the Goal-Directed Behavior of the Neural Network Agents

K. V. Lakhman; M. S. Burtsev

Matematičeskaâ biologiâ i bioinformatika, Vol. 8 (2013) no. 2, pp. 419-431

This article was harvested from the source Math-Net.Ru

Abstract

Modern machine learning methods cannot yet achieve a level of adaptability comparable to that observed in animal behavior in complex environments with numerous goals. This necessitates investigating the general principles by which complex control systems capable of effective goal-directed behavior are formed. We have developed an original neuroevolutionary model for agents situated in stochastic environments with a hierarchy of goals. The paper analyzes the evolutionary dynamics of the agents’ behavioral strategies. The results demonstrate that evolution produces neural network controllers that allow agents to store information in short-term memory via several neurodynamical mechanisms and to use it for behavior based on alternative actions. While studying the neuronal basis of the agents’ behavior, we found that groups of neurons could be responsible for different stages of behavior.

How to cite

@article{MBB_2013_8_2_a8,
     author = {K. V. Lakhman and M. S. Burtsev},
     title = {Short-Term {Memory} {Mechanisms} in the {Goal-Directed} {Behavior} of the {Neural} {Network} {Agents}},
     journal = {Matemati\v{c}eska\^a biologi\^a i bioinformatika},
     pages = {419--431},
     year = {2013},
     volume = {8},
     number = {2},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/MBB_2013_8_2_a8/}
}

TY  - JOUR
AU  - K. V. Lakhman
AU  - M. S. Burtsev
TI  - Short-Term Memory Mechanisms in the Goal-Directed Behavior of the Neural Network Agents
JO  - Matematičeskaâ biologiâ i bioinformatika
PY  - 2013
SP  - 419
EP  - 431
VL  - 8
IS  - 2
UR  - http://geodesic.mathdoc.fr/item/MBB_2013_8_2_a8/
LA  - ru
ID  - MBB_2013_8_2_a8
ER  -

%0 Journal Article
%A K. V. Lakhman
%A M. S. Burtsev
%T Short-Term Memory Mechanisms in the Goal-Directed Behavior of the Neural Network Agents
%J Matematičeskaâ biologiâ i bioinformatika
%D 2013
%P 419-431
%V 8
%N 2
%U http://geodesic.mathdoc.fr/item/MBB_2013_8_2_a8/
%G ru
%F MBB_2013_8_2_a8

K. V. Lakhman; M. S. Burtsev. Short-Term Memory Mechanisms in the Goal-Directed Behavior of the Neural Network Agents. Matematičeskaâ biologiâ i bioinformatika, Vol. 8 (2013) no. 2, pp. 419-431. http://geodesic.mathdoc.fr/item/MBB_2013_8_2_a8/
