Improved maximum inner product search with better theoretical guarantee using randomized partition trees
MetadataShow full item record
Keivani, O., Sinha, K. & Ram, P. Mach Learn (2018) 107: 1069
Recent interest in the problem of maximum inner product search (MIPS) has sparked the development of new solutions. The solutions (usually) reduce MIPS to the well-studied problem of nearest-neighbour search (NNS). To escape the curse of dimensionality, the problem is relaxed to accept approximate solutions (that is, accept anything that is approximately maximum), and locality sensitive hashing is the approximate NNS algorithm of choice. While being extremely resourceful, these existing solutions have a couple of aspects that can be improved upon-(i) MIPS can be reduced to NNS in multiple ways and there is lack of understanding (mostly theoretical but also empirical) when to choose which reduction for best accuracy or efficiency, and (ii) when MIPS is solved via approximate NNS, translating this approximation to the MIPS solution is not straightforward. To overcome these usability issues, we propose the use of randomized partition trees (RPTs) for solving MIPS. We still reduce MIPS to NNS but utilize RPTs to solve the NNS problem. RPTs find the exact NNS solution, hence the exact MIPS solution (with high probability), avoiding the need for any translation of approximation. The theoretical properties of RPTs allow us to definitively choose the best MIPS-to-NNS reduction. The empirical properties of RPTs allow us to significantly outperform the state-of-the-art while providing unique fine-grained control over the accuracy-efficiency trade-off. For example, at 80% accuracy, RPTs are 2-5 more efficient than the state-of-the-art. Superiority of RPT comes at the cost of high space complexity overhead which can be a severe limitation of our proposed method. To address this limitation we introduce two space efficient versions of RPTs which enjoy the same superior performance of RPT while requiring a significantly reduced space complexity overhead.
Click on the DOI link to access the article (may not be free).