Multimodal Learning of Actions with Deep Neural Network Self-Organization

Parisi, German Ignacio

Titel:	Multimodal Learning of Actions with Deep Neural Network Self-Organization
Sonstige Titel:	Multimodales Lernen von Aktionen mit tiefer Selbstorganisation neuronaler Netzwerke
Sprache:	Englisch
Autor*in:	Parisi, German Ignacio
Erscheinungsdatum:	2016
Tag der mündlichen Prüfung:	2017-03-10
Zusammenfassung:	Perceiving the actions of other people is one of the most important social skills of human beings. We are able to reliably discern a variety of socially relevant information from people’s body motion such as intentions, identity, gender, and affective states. This ability is supported by highly developed visual skills and the integration of additional modalities that in concert contribute to providing a robust perceptual experience. Multimodal integration is a fundamental feature of the brain that together with widely studied biological mechanisms for action perception has served as inspiration for the development of artificial systems. However, computational mechanisms for processing and integrating knowledge reliably from multiple perceptual modalities are still to be fully investigated. The goal of this thesis is to study and develop artificial learning architectures for action perception. In light of a wide understanding of the brain areas and underlying neural mechanisms for processing biological motion patterns, we propose a series of neural network models for learning multimodal action representations. Consistent with neurophysiological studies evidencing a hierarchy of cortical layers driven by the distribution of the input, we demonstrate how computational models of input-driven self-organization can account for the learning of action features with increasing complexity of representation. For this purpose, we introduce a novel model of recurrent self-organization for learning action features with increasingly large spatiotemporal receptive fields. Visual representations obtained through unsupervised learning are incrementally associated to symbolic action labels for the purpose of action classification. From a multimodal perspective, we propose a model in which multimodal action representations can develop from neural network organization in terms of associative connectivity patterns between unimodal representations. We report a set of experiments showing that deep self-organizing hierarchies allow to learn statistically significant features of actions, with multimodal representations emerging from co-occurring audiovisual stimuli. We evaluated our neural network architectures on the tasks of human action recognition, body motion assessment, and the detection of abnormal behavior. Finally, we conducted two robot experiments that provide quantitative evidence for the advantages of multimodal integration for triggering sensory-driven motor behavior. The first scenario consists of an assistive task for the detection of falls, whereas in the second experiment we propose audiovisual integration in an interactive reinforcement learning scenario. Together, our results demonstrate that deep neural self-organization can account for robust action perception, yielding state-of-the-art performance also in the presence of sensory uncertainty and conflict. The research presented in this thesis comprises interdisciplinary aspects of action perception and multimodal integration for the development of efficient neurocognitive architectures. While the brain mechanisms for multimodal perception are still to be fully understood, the proposed neural network architectures may be seen as a basis for modeling higher-level cognitive functions.
URL:	https://ediss.sub.uni-hamburg.de/handle/ediss/7165
URN:	urn:nbn:de:gbv:18-84626
Dokumenttyp:	Dissertation
Betreuer*in:	Wermter, Stefan (Prof. Dr.)
Enthalten in den Sammlungen:	Elektronische Dissertationen und Habilitationen

Dateien zu dieser Ressource:

Datei	Beschreibung	Prüfsumme	Größe	Format
Dissertation.pdf		62601409c22d15e6414ed1375c280634	11.73 MB	Adobe PDF	Öffnen/Anzeigen

Zur Langanzeige

Diese Publikation steht in elektronischer Form im Internet bereit und kann gelesen werden. Über den freien Zugang hinaus wurden durch die Urheberin / den Urheber keine weiteren Rechte eingeräumt. Nutzungshandlungen (wie zum Beispiel der Download, das Bearbeiten, das Weiterverbreiten) sind daher nur im Rahmen der gesetzlichen Erlaubnisse des Urheberrechtsgesetzes (UrhG) erlaubt. Dies gilt für die Publikation sowie für ihre einzelnen Bestandteile, soweit nichts Anderes ausgewiesen ist.

Info

Seitenansichten

1.432

Letzte Woche

Letzten Monat

geprüft am 02.07.2025

Download(s)

144

Letzte Woche

Letzten Monat

geprüft am 02.07.2025

Werkzeuge

Google Scholar^TM

Prüfe

Dateien zu dieser Ressource:

Seitenansichten

Download(s)

Google ScholarTM

Google Scholar^TM