    Improving imbalanced machine learning with neighborhood-informed synthetic sample placement

    Date
    2022-10-02
    Author
    Nasir, Murtaza
    Dag, Ali
    Simsek, Serhat
    Ivanov, Anton
    Oztekin, Asil
    Citation
    Murtaza Nasir, Ali Dag, Serhat Simsek, Anton Ivanov & Asil Oztekin (2022) Improving Imbalanced Machine Learning with Neighborhood-Informed Synthetic Sample Placement, Journal of Management Information Systems, 39:4, 1116-1145, DOI: 10.1080/07421222.2022.2127453
    Abstract
    Machine learning is widely used in information systems design. Yet, training algorithms on imbalanced datasets may severely affect performance on unseen data. For example, in some cases in healthcare, fintech, or cybersecurity contexts, certain subclasses are difficult to learn because they are underrepresented in training data. Our study offers a flexible and efficient solution based on a new synthetic average neighborhood sampling algorithm (SANSA), which, in contrast to other solutions, introduces a novel "placement" parameter that can be tuned to adapt to each dataset's unique manifestation of the imbalance. This package can be downloaded for R. We tested SANSA against seven existing sampling methods used in conjunction with the four most frequently used machine learning models trained on 14 benchmark datasets. Our results provide suggestive evidence that SANSA offers a feasible solution to the imbalance problem for most datasets. Our findings provide practical recommendations for how SANSA can be effectively implemented while reducing the complexity level of an imbalanced learning pipeline.
    Description
    Click on the DOI to access this article (may not be free).
    URI
    https://doi.org/10.1080/07421222.2022.2127453
    https://soar.wichita.edu/handle/10057/25001
    Collections
    • FREDS Research Publications
