
dc.contributor.advisor: Sinha, Kaushik
dc.contributor.author: Keivani, Omid
dc.date.accessioned: 2019-06-26T18:45:41Z
dc.date.available: 2019-06-26T18:45:41Z
dc.date.issued: 2019-05
dc.identifier.other: d19007
dc.identifier.uri: http://hdl.handle.net/10057/16380
dc.description: Thesis (Ph.D.) -- Wichita State University, College of Engineering, Dept. of Electrical Engineering & Computer Science
dc.description.abstract: Nearest neighbor search (NNS) is one of the most well-known problems in the field of computer science. It has been widely used in many different areas such as recommender systems, classification, and clustering. Given a database S of n objects, a query q, and a measure of similarity, the naive way to solve an NNS problem is to perform a linear search over the objects in database S and return the object from S that, based on the similarity measure, is most similar to q. However, due to the growth of data in recent years, a solution better than linear time complexity is desirable. Locality sensitive hashing (LSH) and random projection trees (RPT) are two popular methods for solving the NNS problem in sublinear time. Earlier works have demonstrated that RPT has superior performance compared to LSH. However, RPT has two major drawbacks: i) its high space complexity, and ii) if it makes a mistake at any internal node of a single tree, it cannot recover from this mistake and the rest of the search in that tree becomes useless. One of the main contributions of this thesis is to propose new methods to address these two drawbacks. To address the first issue, we design a sparse version of RPT which reduces the space complexity overhead without significantly affecting nearest neighbor search performance. To address the second issue, we develop various strategies that use auxiliary information and priority functions to improve the nearest neighbor search performance of the original RPT. We support our claims both theoretically and experimentally on many real-world datasets. A second contribution of the thesis is to use the RPT data structure to solve related search problems, such as maximum inner product search (MIPS) and nearest neighbor to query hyperplane (NNQH) search. Both of these problems can be reduced to an equivalent NNS problem by applying appropriate transformations.
For the MIPS problem, we establish which of the many different transformations that reduce a MIPS problem to an equivalent NNS problem is preferable when used in conjunction with RPT. For the NNQH problem, the transformation that reduces NNQH to an equivalent NNS problem increases the data dimensionality tremendously, and hence the space complexity requirement of the original RPT. In the latter case, we show that our sparse RPT version comes to the rescue. Our NNQH solution, which uses space-efficient versions of RPT, is applied to the active learning problem. We perform extensive empirical evaluations for both of these applications on many real-world datasets to show the superior performance of our proposed methods compared to state-of-the-art algorithms.
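The abstract describes the naive linear-scan NNS baseline and the idea of reducing MIPS to an equivalent NNS problem via a transformation. A minimal sketch of both, assuming Euclidean distance as the similarity measure and using the standard norm-augmentation reduction (one of the known MIPS-to-NNS transformations; the thesis compares several, and this sketch does not claim to be the one it recommends). Function and variable names are illustrative, not from the thesis.

```python
import numpy as np

def linear_nns(S, q):
    """Naive O(n) nearest neighbor search: scan every object in the
    database S and return the index of the point closest to query q."""
    dists = np.linalg.norm(S - q, axis=1)
    return int(np.argmin(dists))

def mips_via_nns(S, q):
    """Reduce MIPS to NNS with the standard augmentation:
    each x becomes (x, sqrt(M^2 - ||x||^2)) where M = max ||x||,
    and q becomes (q, 0). Then ||x_aug - q_aug||^2 =
    M^2 + ||q||^2 - 2<x, q>, so the Euclidean nearest neighbor of
    the transformed query maximizes the inner product <x, q>."""
    norms = np.linalg.norm(S, axis=1)
    M = norms.max()
    S_aug = np.hstack([S, np.sqrt(M**2 - norms**2).reshape(-1, 1)])
    q_aug = np.append(q, 0.0)
    return linear_nns(S_aug, q_aug)
```

After this reduction, any sublinear NNS index (such as an RPT) can be built over the augmented points in place of the linear scan used here.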
dc.format.extent: xi, 91 pages
dc.language.iso: en_US
dc.publisher: Wichita State University
dc.rights: Copyright 2019 by Omid Keivani. All Rights Reserved.
dc.subject.lcsh: Electronic dissertation
dc.title: Efficient random projection trees for nearest neighbor search and related problems
dc.type: Dissertation


This item appears in the following Collection(s)

  • CE Theses and Dissertations
    Doctoral and Master's theses authored by the College of Engineering graduate students
  • Dissertations
    This collection includes Ph.D. dissertations completed at the Wichita State University Graduate School (Fall 2005 --)
  • EECS Theses and Dissertations
    Collection of Master's theses and Ph.D. dissertations completed at the Dept. of Electrical Engineering and Computer Science
