    Treating biological sequences as natural language, a case study on sub-cellular protein localization

    Date
    2016-05
    Author
    Ross, Jarret
    Advisor
    Sinha, Kaushik
    Abstract
    Extracting meaning from biological sequences such as DNA, RNA, and strings of amino acids is a task that traditionally requires a large amount of expert knowledge. Breakthroughs and advances in these areas are slow due to the computational intractability inherent in biological sequences. If the high level of expertise needed to solve important problems in biology could be lowered or removed, the pace of biological breakthroughs might increase. As a small step in this direction, this thesis focuses on the challenge of sub-cellular protein localization. By viewing sub-cellular protein localization as a Natural Language Processing task, it is possible to remove the need for any biological understanding entirely. This method requires no hand-engineered features and operates at character-level granularity. Modifications are made to an existing deep convolutional network that was designed to perform a range of Natural Language Processing tasks such as sentiment analysis and topic classification. While this model does not achieve state-of-the-art performance, it is competitive with the other models evaluated in this thesis. These findings are encouraging for two reasons. First, they show that a biologically naive method performs competitively with hand-engineered methods. Second, it is hoped that the current intense research focus on Natural Language Processing in deep learning will greatly increase the viability of the method presented in this thesis in the coming years.
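
    To make "character-level granularity" concrete, the sketch below shows one common way a protein string could be turned into network input without any hand-engineered features: each residue becomes a one-hot row. This is an illustration only, not the thesis's actual preprocessing. The alphabet (the 20 standard one-letter amino acid codes), the fixed length `max_len`, and the function name `one_hot_encode` are all assumptions made for the example.

```python
# Assumed alphabet: the 20 standard amino acids, one-letter codes.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY"
CHAR_TO_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def one_hot_encode(sequence, max_len=8):
    """Encode a protein string as a max_len x len(ALPHABET) matrix of 0/1.

    Sequences longer than max_len are truncated; shorter ones are
    padded with all-zero rows. max_len is an illustrative value.
    """
    matrix = [[0] * len(ALPHABET) for _ in range(max_len)]
    for pos, ch in enumerate(sequence.upper()[:max_len]):
        idx = CHAR_TO_INDEX.get(ch)
        if idx is not None:  # characters outside the alphabet stay all-zero
            matrix[pos][idx] = 1
    return matrix


# Hypothetical fragment; "X" is not in the assumed alphabet.
encoded = one_hot_encode("MKTX")
```

    A matrix of this shape is the kind of input a one-dimensional convolutional network can consume directly, which is why no biological feature extraction is needed in such a setup.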
    Description
    Thesis (M.S.)--Wichita State University, College of Engineering, Dept. of Computer Science
    URI
    http://hdl.handle.net/10057/12673
    Collections
    • CE Theses and Dissertations
    • EECS Theses and Dissertations
    • Master's Theses
