    What makes bugs get fixed in open source software?: A Replication study and a text-based approach

    Date
    2016-07
    Author
    Wang, Haoren
    Advisor
    Kagdi, Huzefa Hatimbhai
    Abstract
    Bugs are the central force driving many of the corrective maintenance and evolutionary changes in large-scale software systems. Over the years, many facets of bugs and their resolution have been studied extensively. As elementary as it may appear, the existential question of all, "whether a reported bug will be fixed or not," has not received much attention. The paper presents an empirical study on four open source projects to examine the factors that influence the likelihood of a bug getting fixed. Overall, our study can be contextualized as a conceptual replication and extension of a previous study on Microsoft systems from a commercial domain. The similarities and differences in terms of the design, execution, and results between the two studies are discussed. A number of features available from the bug tracking systems were used to build a descriptive model using logistic regression. It was observed from these systems that the reputations of the reporter and the assigned developer, and the number of comments on a bug, have the most substantial impact on its probability of getting fixed. Moreover, using the observations from the descriptive study as a guide, we formulated a predictive model from the substantial features available as soon as a bug is reported. Logistic Regression (LR) and Support Vector Machine (SVM) were compared in forming this predictor. Intra-project and inter-project (cross-project) validation was conducted. Precision and recall metrics were used to assess the predictive model. Their values are typically in the 60% to 70% range for both LR and SVM. Although we observed statistically significant differences for individual metrics across the subject systems, there is no clear winner across the board between the two. We also present a technique based on the textual analysis of bug reports. We considered textual features such as the title, description, and comments of bug reports. Latent Semantic Indexing (LSI) was used to index the corpus formed from these features. Our prediction results show that this textual model outperforms the previous one in terms of recall and F-measure.
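
The abstract assesses its predictors with precision, recall, and F-measure. As a minimal sketch of how those metrics are computed for a binary "fixed / not fixed" predictor, the Python below uses hypothetical names (`precision_recall_f1`, the `FIXED` and `NOT_FIXED` labels) and made-up illustrative labels. None of it is taken from the thesis, and it assumes the balanced F1 form of the F-measure.

```python
def precision_recall_f1(actual, predicted, positive="FIXED"):
    """Compute precision, recall, and F1 for one positive class.

    `actual` and `predicted` are equal-length sequences of labels.
    The label names are hypothetical, chosen only for this example.
    """
    pairs = list(zip(actual, predicted))
    # True positives: bugs predicted as fixed that really were fixed.
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    # False positives: bugs predicted as fixed that were not fixed.
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    # False negatives: fixed bugs the predictor missed.
    fn = sum(1 for a, p in pairs if a == positive and p != positive)

    # Guard against division by zero when a denominator is empty.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Illustrative labels only; not data from the study.
actual = ["FIXED", "FIXED", "NOT_FIXED", "FIXED", "NOT_FIXED", "NOT_FIXED"]
predicted = ["FIXED", "NOT_FIXED", "FIXED", "FIXED", "NOT_FIXED", "NOT_FIXED"]

precision, recall, f1 = precision_recall_f1(actual, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Precision penalizes bugs wrongly predicted as fixed, while recall penalizes fixed bugs that the predictor misses. The F-measure combines the two, so a model can improve on recall and F-measure without necessarily improving precision.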
    Description
    Thesis (M.S.)--Wichita State University, College of Engineering, Dept. of Electrical Engineering and Computer Science
    URI
    http://hdl.handle.net/10057/12877
    Collections
    • CE Theses and Dissertations
    • EECS Theses and Dissertations
    • Master's Theses
