TY - JOUR
T1 - AttSum
T2 - A Deep Attention-Based Summarization Model for Bug Report Title Generation
AU - Ma, Xiaoxue
AU - Keung, Jacky Wai
AU - Yu, Xiao
AU - Zou, Huiqi
AU - Zhang, Jingyu
AU - Li, Yishu
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023/12/1
Y1 - 2023/12/1
N2 - Concise and precise bug report titles help software developers capture the highlights of a bug report quickly. Unfortunately, bug reporters commonly fail to create high-quality titles. Recent long short-term memory (LSTM)-based sequence-to-sequence models such as iTAPE were proposed to generate bug report titles automatically, but the text representation method and the LSTM employed in such models struggle to capture accurate semantic information and to draw global dependencies among tokens effectively. This article proposes a deep attention-based summarization model (i.e., AttSum) to generate high-quality bug report titles. Specifically, AttSum employs an encoder-decoder framework, which utilizes the robustly optimized bidirectional-encoder-representations-from-transformers (RoBERTa) approach to encode bug report bodies and better capture contextual semantic information, a stacked transformer decoder to generate titles automatically, and a copy mechanism to handle the rare-token problem. To validate the effectiveness of AttSum, we conduct automatic and manual evaluations on 333,563 <body, title> pairs of bug reports and perform a practical analysis of its ability to improve low-quality titles. The results show that AttSum outperforms the state-of-the-art baselines by a substantial margin, both on automatic evaluation metrics (e.g., by 3.4%-58.8% and 7.7%-42.3% in terms of recall-oriented understudy for gisting evaluation (ROUGE) F1 and bilingual evaluation understudy (BLEU), respectively) and on three human evaluation modalities (e.g., by 1.9%-57.5%). Moreover, we analyze the impact of training data size on AttSum, and the results imply that our approach is robust enough to generate much better titles.
AB - Concise and precise bug report titles help software developers capture the highlights of a bug report quickly. Unfortunately, bug reporters commonly fail to create high-quality titles. Recent long short-term memory (LSTM)-based sequence-to-sequence models such as iTAPE were proposed to generate bug report titles automatically, but the text representation method and the LSTM employed in such models struggle to capture accurate semantic information and to draw global dependencies among tokens effectively. This article proposes a deep attention-based summarization model (i.e., AttSum) to generate high-quality bug report titles. Specifically, AttSum employs an encoder-decoder framework, which utilizes the robustly optimized bidirectional-encoder-representations-from-transformers (RoBERTa) approach to encode bug report bodies and better capture contextual semantic information, a stacked transformer decoder to generate titles automatically, and a copy mechanism to handle the rare-token problem. To validate the effectiveness of AttSum, we conduct automatic and manual evaluations on 333,563 <body, title> pairs of bug reports and perform a practical analysis of its ability to improve low-quality titles. The results show that AttSum outperforms the state-of-the-art baselines by a substantial margin, both on automatic evaluation metrics (e.g., by 3.4%-58.8% and 7.7%-42.3% in terms of recall-oriented understudy for gisting evaluation (ROUGE) F1 and bilingual evaluation understudy (BLEU), respectively) and on three human evaluation modalities (e.g., by 1.9%-57.5%). Moreover, we analyze the impact of training data size on AttSum, and the results imply that our approach is robust enough to generate much better titles.
KW - Bug reports
KW - deep learning
KW - text summarization
KW - title generation
KW - transformers
UR - http://www.scopus.com/inward/record.url?scp=85147278864&partnerID=8YFLogxK
U2 - 10.1109/TR.2023.3236404
DO - 10.1109/TR.2023.3236404
M3 - Article
AN - SCOPUS:85147278864
SN - 0018-9529
VL - 72
SP - 1663
EP - 1677
JO - IEEE Transactions on Reliability
JF - IEEE Transactions on Reliability
IS - 4
ER -