Full text file(s) available
DC Element | Value | Language
dc.contributor.advisor | Wermter, Stefan (Prof. Dr.)
dc.contributor.author | Parisi, German Ignacio
dc.date.accessioned | 2020-10-19T13:17:25Z
dc.date.available | 2020-10-19T13:17:25Z
dc.date.issued | 2016
dc.identifier.uri | https://ediss.sub.uni-hamburg.de/handle/ediss/7165
dc.description.abstract | Perceiving the actions of other people is one of the most important social skills of human beings. We can reliably discern a variety of socially relevant information from people’s body motion, such as intentions, identity, gender, and affective states. This ability is supported by highly developed visual skills and by the integration of additional modalities that together provide a robust perceptual experience. Multimodal integration is a fundamental feature of the brain that, together with widely studied biological mechanisms for action perception, has served as inspiration for the development of artificial systems. However, computational mechanisms for reliably processing and integrating knowledge from multiple perceptual modalities have yet to be fully investigated. The goal of this thesis is to study and develop artificial learning architectures for action perception. Informed by a broad understanding of the brain areas and underlying neural mechanisms for processing biological motion patterns, we propose a series of neural network models for learning multimodal action representations. Consistent with neurophysiological studies evidencing a hierarchy of cortical layers driven by the distribution of the input, we demonstrate how computational models of input-driven self-organization can account for the learning of action features of increasing representational complexity. For this purpose, we introduce a novel model of recurrent self-organization for learning action features with increasingly large spatiotemporal receptive fields. Visual representations obtained through unsupervised learning are incrementally associated with symbolic action labels for the purpose of action classification. From a multimodal perspective, we propose a model in which multimodal action representations develop from neural network organization in the form of associative connectivity patterns between unimodal representations. We report a set of experiments showing that deep self-organizing hierarchies can learn statistically significant action features, with multimodal representations emerging from co-occurring audiovisual stimuli. We evaluated our neural network architectures on the tasks of human action recognition, body motion assessment, and the detection of abnormal behavior. Finally, we conducted two robot experiments that provide quantitative evidence for the advantages of multimodal integration in triggering sensory-driven motor behavior. The first scenario is an assistive task for the detection of falls, whereas in the second experiment we propose audiovisual integration in an interactive reinforcement learning scenario. Together, our results demonstrate that deep neural self-organization can account for robust action perception, yielding state-of-the-art performance even in the presence of sensory uncertainty and conflict. The research presented in this thesis comprises interdisciplinary aspects of action perception and multimodal integration for the development of efficient neurocognitive architectures. While the brain mechanisms for multimodal perception are not yet fully understood, the proposed neural network architectures may serve as a basis for modeling higher-level cognitive functions. | en
dc.language.iso | en | en
dc.publisher | Staats- und Universitätsbibliothek Hamburg Carl von Ossietzky
dc.rights | http://purl.org/coar/access_right/c_abf2
dc.subject.ddc | 004 Informatik
dc.title | Multimodal Learning of Actions with Deep Neural Network Self-Organization | en
dc.title.alternative | Multimodales Lernen von Aktionen mit tiefer Selbstorganisation neuronaler Netzwerke | de
dc.type | doctoralThesis
dcterms.dateAccepted | 2017-03-10
dc.rights.cc | No license
dc.rights.rs | http://rightsstatements.org/vocab/InC/1.0/
dc.subject.bcl | 54.74 Maschinelles Sehen
dc.type.casrai | Dissertation
dc.type.dini | doctoralThesis
dc.type.driver | doctoralThesis
dc.type.status | info:eu-repo/semantics/publishedVersion
dc.type.thesis | doctoralThesis
tuhh.opus.id | 8462
tuhh.opus.datecreation | 2017-04-18
tuhh.type.opus | Dissertation
thesis.grantor.department | Informatik
thesis.grantor.place | Hamburg
thesis.grantor.universityOrInstitution | Universität Hamburg
dcterms.DCMIType | Text
tuhh.gvk.ppn | 885641582
dc.identifier.urn | urn:nbn:de:gbv:18-84626
item.advisorGND | Wermter, Stefan (Prof. Dr.)
item.grantfulltext | open
item.languageiso639-1 | other
item.fulltext | With Fulltext
item.creatorOrcid | Parisi, German Ignacio
item.creatorGND | Parisi, German Ignacio
Appears in collections: Elektronische Dissertationen und Habilitationen
Files in this item:
File | Description | Checksum | Size | Format
Dissertation.pdf | | 62601409c22d15e6414ed1375c280634 | 11.73 MB | Adobe PDF

This publication is available online in electronic form and may be read. Beyond this free access, the author has granted no further rights. Acts of use (such as downloading, editing, or redistribution) are therefore permitted only within the scope of the statutory permissions of the German Copyright Act (UrhG). This applies to the publication as well as to its individual components, unless otherwise indicated.
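The abstract above centers on input-driven recurrent self-organization for learning spatiotemporal action features. As a purely illustrative aside, the following minimal sketch shows the general shape of such an update rule, assuming a merge-SOM-style network in which each unit holds a weight vector and a context vector. The class name `RecurrentSOMSketch` and all parameters (`alpha`, `beta`, `lr_w`, `lr_c`) are assumptions introduced here for illustration; the abstract only summarizes the thesis's actual models, which differ.

```python
# Illustrative sketch only: a merge-SOM-style recurrent self-organizing
# network, NOT the thesis's actual architecture. All names and parameter
# values here are assumptions.
import numpy as np

class RecurrentSOMSketch:
    def __init__(self, n_units, dim, alpha=0.7, beta=0.3,
                 lr_w=0.1, lr_c=0.1, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.normal(size=(n_units, dim))  # feed-forward weights
        self.c = np.zeros((n_units, dim))         # recurrent context weights
        self.alpha, self.beta = alpha, beta       # input vs. context weighting
        self.lr_w, self.lr_c = lr_w, lr_c         # learning rates
        self.ctx = np.zeros(dim)                  # global temporal context

    def step(self, x):
        # The winner minimizes a blend of input distance and context
        # distance, making the map sensitive to temporal order rather
        # than to single frames alone.
        d = (self.alpha * np.sum((self.w - x) ** 2, axis=1)
             + self.beta * np.sum((self.c - self.ctx) ** 2, axis=1))
        b = int(np.argmin(d))
        # Move the winner toward the current input and the current context.
        self.w[b] += self.lr_w * (x - self.w[b])
        self.c[b] += self.lr_c * (self.ctx - self.c[b])
        # New context: a merge of the winner's weight and context vectors.
        self.ctx = 0.5 * self.w[b] + 0.5 * self.c[b]
        return b  # winner index; a higher layer could consume the winner trace

# Usage: feed a sequence of feature frames; the resulting sequence of
# winners forms a temporally ordered representation of the input stream.
net = RecurrentSOMSketch(n_units=20, dim=3)
for t in range(100):
    frame = np.array([np.sin(0.1 * t), np.cos(0.1 * t), 0.0])
    winner = net.step(frame)
```

Stacking several such maps, each reading the winner trajectories of the layer below, is one plausible way to obtain the "increasingly large spatiotemporal receptive fields" the abstract mentions; since the thesis's hierarchy is only summarized here, this stacking note is likewise an assumption.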
