| DC Element | Value | Language |
|---|---|---|
| dc.contributor.advisor | Biemann, Chris | - |
| dc.contributor.author | Wang, Xintong | - |
| dc.date.accessioned | 2026-06-29T10:11:08Z | - |
| dc.date.available | 2026-06-29T10:11:08Z | - |
| dc.date.issued | 2026 | - |
| dc.identifier.uri | https://ediss.sub.uni-hamburg.de/handle/ediss/12466 | - |
| dc.description.abstract | Foundation models have reshaped the development of artificial intelligence (AI) by introducing a paradigm in which a single pretrained system can generalize across a wide spectrum of language and multimodal tasks. Built upon large-scale data and representation learning, these models reduce the need for task-specific engineering and enable reusable semantic and perceptual knowledge. This shift from specialized models to general-purpose learning architectures has significantly expanded the scope and applicability of AI technologies. As foundation models transition from controlled benchmarks to real-world deployment, new demands arise that extend beyond raw capability. Systems are increasingly expected to behave in ways that are reliable, interpretable, and aligned with human expectations, particularly in settings involving multimodal reasoning and interaction. These requirements expose limitations that are not visible in traditional performance evaluations and motivate a shift in research focus from scaling performance to ensuring trustworthy behavior. In this dissertation, trustworthiness is analyzed as a property that emerges from the structural coupling of groundedness, alignment stability, faithfulness, and controllability. These dimensions correspond to interdependent stages of the trustworthiness pipeline, encompassing how multimodal signals anchor meaning, how pretrained representations are adapted, how generative processes reconcile internal knowledge with external evidence, and how model behavior can be guided in transparent ways. When these stages are treated in isolation, characteristic failure modes arise, including context-insensitive grounding, instability under adaptation, hallucinated outputs, and safety interventions that disrupt communicative intent. The dissertation therefore investigates trustworthiness through coordinated interventions at different interfaces of the modeling pipeline. It develops methods that strengthen context-sensitive multimodal grounding while preserving representational structure, examines inference-time mechanisms that regulate the interaction between prior knowledge and conditioning signals, and introduces cognitively informed analyses to identify interpretable loci for efficient behavioral steering. Rather than addressing isolated symptoms, these contributions target complementary sources of unreliability across data construction, representation maintenance, and generation dynamics. Overall, the findings indicate that trustworthy foundation modeling must be engineered as a lifecycle property rather than achieved through post hoc alignment alone. Reliability arises from the deliberate coordination of grounding, adaptation, inference, and control, suggesting a pathway toward foundation models whose general capabilities are matched by predictability, transparency, and human-centered usability. | en |
| dc.language.iso | en | de_DE |
| dc.publisher | Staats- und Universitätsbibliothek Hamburg Carl von Ossietzky | de |
| dc.rights | http://purl.org/coar/access_right/c_abf2 | de_DE |
| dc.subject.ddc | 004: Informatik | de_DE |
| dc.title | Bridging Vision, Language, and Gaze for Trustworthy Foundation Models | en |
| dc.type | doctoralThesis | en |
| dcterms.dateAccepted | 2026-06-23 | - |
| dc.rights.cc | https://creativecommons.org/licenses/by/4.0/ | de_DE |
| dc.rights.rs | http://rightsstatements.org/vocab/InC/1.0/ | - |
| dc.type.casrai | Dissertation | - |
| dc.type.dini | doctoralThesis | - |
| dc.type.driver | doctoralThesis | - |
| dc.type.status | info:eu-repo/semantics/publishedVersion | de_DE |
| dc.type.thesis | doctoralThesis | de_DE |
| tuhh.type.opus | Dissertation | - |
| thesis.grantor.department | Informatik | de_DE |
| thesis.grantor.place | Hamburg | - |
| thesis.grantor.universityOrInstitution | Universität Hamburg | de_DE |
| dcterms.DCMIType | Text | - |
| dc.identifier.urn | urn:nbn:de:gbv:18-ediss-138779 | - |
| item.grantfulltext | open | - |
| item.languageiso639-1 | other | - |
| item.creatorOrcid | Wang, Xintong | - |
| item.advisorGND | Biemann, Chris | - |
| item.creatorGND | Wang, Xintong | - |
| item.fulltext | With Fulltext | - |
| Appears in collections: | Elektronische Dissertationen und Habilitationen | |
Files in this item:
| File | Description | Checksum | Size | Format |
|---|---|---|---|---|
| 2026-wang-dissertation.pdf | | 4e69adf381b5ee523d548521bf160750 | 25.46 MB | Adobe PDF |
