Title: Mining and Understanding Issue Links Towards a Better Issue Management
Language: English
Author: Lüders, Clara Marie
Keywords: Issue Tracking System; Issue Management
GND keywords: Machine learning; Deep learning; Requirements engineering
Publication date: 2023
Date of oral examination: 2023-04-28
Software projects use Issue-Tracking Systems (ITS) like JIRA to track and organize issues and their workflows. Issues are often interconnected via different links, such as Duplicate, Relate, Block, or Subtask. These links are essential for improving the findability of issue reports, but managing them can be challenging as the number of issues created and links edited each day grows during a product’s life. In this work, we studied 942,736 links and 2,659,077 issues in 1,276 projects of 16 public JIRA repositories to understand the prevalence and characteristics of links and their types. We found that common link types are used differently across the repositories. The results confirm the complexity and heterogeneity of link types in practice. We present an initial framework to assess link quality and found that the quality of Epic and Subtask links is high, while the quality of Duplicate links is lacking. More than a quarter of all issues have a comment mentioning another issue without an explicit link. We introduce a list of smells presenting potential risks to linking and link management.
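The "comment mentions another issue without an explicit link" finding can be illustrated with a minimal sketch. It is not the method used in the dissertation: the key pattern (`PROJ-123` style), the function name `unlinked_mentions`, and the `DEMO-*` data are all assumptions made for illustration, and a pattern this simple would also match strings such as `UTF-8`.

```python
import re

# Assumption: JIRA-style issue keys look like "PROJ-123"
# (uppercase project key, dash, number). Real trackers may differ.
ISSUE_KEY = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")


def unlinked_mentions(issue_key, comments, linked_keys):
    """Return keys mentioned in comments that have no explicit link (hypothetical helper)."""
    mentioned = set()
    for text in comments:
        mentioned.update(ISSUE_KEY.findall(text))
    # An issue mentioning itself is not a missing link.
    mentioned.discard(issue_key)
    return sorted(mentioned - set(linked_keys))


# Invented example data, not taken from the studied repositories.
example = unlinked_mentions(
    "DEMO-1",
    ["Looks like the same crash as DEMO-7.", "Blocked by DEMO-3, see also DEMO-1."],
    linked_keys={"DEMO-3"},
)
print(example)
```

Here `DEMO-3` is already linked and `DEMO-1` is the issue itself, so only `DEMO-7` is reported as a candidate for a missing link.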
Motivated by the differences between the link types and their popularity, we evaluated the robustness of the rather mature duplicate detection approaches from the literature. We found that current deep-learning approaches confuse Duplicate and other links in almost all repositories in the JIRA dataset. On average, the classification accuracy dropped by 7% for one approach and 10% for the other. Extending the training sets with other link types partly solves this issue.
We also examined how well state-of-the-art machine learning models automatically detect common link types. We found that a BERT model trained on the titles and descriptions of linked issues significantly outperformed other deep learning models, achieving an average macro F1-score of 0.64 for detecting nine popular link types across all repositories (weighted F1-score of 0.73). The model performed exceptionally well on Subtask and Epic links, achieving F1-scores of 0.89 and 0.97, respectively. We found that quality markers, such as missing Duplicate links, negatively impact the results, whereas link coverage improves them. We also observed that Relate links were often confused with other links, which suggests that they are used as default links in unclear cases. We then proposed and evaluated three improvement strategies and found that predicting only the existence of a link, without its type, increases the average F1-score to 0.95.
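The gap between the macro and the weighted F1-score reported above follows from how the two averages treat class frequency: the macro score weights every link type equally, while the weighted score weights each type by its number of true instances. The sketch below shows this on invented counts; the labels, numbers, and function names are assumptions for illustration and are not results from the dissertation.

```python
def f1(tp, fp, fn):
    """F1 from true positives, false positives, and false negatives."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_and_weighted_f1(counts):
    """counts maps label -> (tp, fp, fn); support of a label is tp + fn."""
    scores = {label: f1(*c) for label, c in counts.items()}
    support = {label: tp + fn for label, (tp, fp, fn) in counts.items()}
    total = sum(support.values())
    # Macro: plain mean over labels, regardless of how frequent each one is.
    macro = sum(scores.values()) / len(scores)
    # Weighted: mean over labels, weighted by each label's support.
    weighted = sum(scores[label] * support[label] for label in scores) / total
    return macro, weighted


# Invented toy counts: a frequent, easy class and a rare, hard class.
toy_counts = {
    "Subtask": (90, 10, 10),
    "Duplicate": (5, 5, 5),
}
macro, weighted = macro_and_weighted_f1(toy_counts)
print(f"macro={macro:.2f} weighted={weighted:.2f}")
```

When frequent link types are classified well and rare ones poorly, as in this toy setup, the weighted score ends up above the macro score.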
To better understand issue comments and how they can be used to improve issue management in general and issue linking in particular, we studied the comments of two large public issue trackers (Eclipse’s Bugzilla and Qt’s Jira). By manually analyzing a random sample of 2000 comments, we identified six common content types distributed similarly across both ITS, most notably Relationship to other issues as well as Progress and status information. We also identified several consistent trends concerning the comment structure, length, sentiments, and contextual properties of the commented issues. In a series of 118 experiments, we studied the automated classification of the comments using comment text embeddings, sentiment scores, and contextual and structural features, achieving a macro F1-score of 0.64. The comments can help fix incorrect and outdated information in the issue reports.
Lastly, we propose an integrative software framework to enable tool support for managing and detecting links. The main features are an overview of issue tracker health, a visualization of the issue link network, a recommendation of relevant links, and a smell detector. Finally, we conducted 13 interviews with practitioners to evaluate our research results and the framework. We discuss our findings and their implications.
URL: https://ediss.sub.uni-hamburg.de/handle/ediss/10310
URN: urn:nbn:de:gbv:18-ediss-109875
Document type: Dissertation
Supervisor: Maalej, Walid
Appears in collections: Electronic Dissertations and Habilitations

Files in this resource:
File: 2023_diss_lueders.pdf
Description: Dissertation
Checksum: afe5c94aef8e2e74c8151d1becb34965
Size: 4.75 MB
Format: Adobe PDF