Using analytics to automate the evaluation process of university applications
Abstract
We focus on using analytics to automate the manual evaluation of a large number of applications to a graduate program at a midwestern university in the USA. The university recently launched a new Master of Science in Business Analytics (MSBA) that receives a large number of applications, the majority of which are international. The prospective student completes an online "Graduate School Application" (GSA) that consists of a University Form (UF) and several attachments such as transcripts, a CV, and a statement. The data is gathered into one application file that becomes available on an online portal to the program director, who reviews the applicant information and decides whether to admit or deny the candidate. The application is usually scanned by the candidate and is messy: it may include duplicate pages, items out of order, different layouts, unclear scans, and incomplete pages. In addition, the portal only allows viewing the application one page at a time or downloading the full application to the desktop, and only one application can be downloaded at a time. The current manual process is therefore cumbersome, inefficient, and slow. We developed a multi-stage analytics algorithm in Python to automate this process. First, we cleaned each application to remove duplicate pages; then we used Optical Character Recognition (OCR) to convert the applications into text. We designed an Excel template that the algorithm populates with the key fields it reads from the UF, creating a summary of all applicants. The extracted fields include first name, last name, country, degree title, and undergraduate university. This part of the algorithm alone takes a small fraction of the time the manual process requires. We are testing a more advanced version of the algorithm that can read the rest of the application items, such as the transcript, and extract the cumulative GPA. This is harder because the scanned copies of transcripts are of much lower quality. We are using image processing tools to read such items, and the accuracy is not as high as for the text of the UF. The algorithm and the results so far will be shared. Future work will continue enhancing the algorithm to extract more information from the application and improve accuracy. Once data extraction is optimized, we can fill in the remaining data manually and build a predictive model to fully automate the admission evaluation and decision process.
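The first stages described in the abstract (removing duplicate pages, reading key fields from the form text, and building a per-applicant summary) might be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it starts from text that has already been produced by OCR, it detects duplicates by hashing normalized page text rather than comparing scanned images, it writes CSV as a stand-in for the Excel template, and the field labels in `FIELD_PATTERNS` are hypothetical because the abstract does not give the UF layout.

```python
import csv
import hashlib
import io
import re

# Hypothetical field labels; the real University Form layout is not given.
FIELD_PATTERNS = {
    "first_name": re.compile(r"First Name:[ \t]*(.*)", re.IGNORECASE),
    "last_name": re.compile(r"Last Name:[ \t]*(.*)", re.IGNORECASE),
    "country": re.compile(r"Country:[ \t]*(.*)", re.IGNORECASE),
}


def normalize(text):
    """Collapse whitespace and case so near-identical copies compare equal."""
    return " ".join(text.split()).lower()


def drop_duplicate_pages(pages):
    """Keep the first occurrence of each page, comparing hashes of normalized text."""
    seen = set()
    unique = []
    for page in pages:
        digest = hashlib.sha256(normalize(page).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(page)
    return unique


def extract_fields(pages):
    """Return the first match for each field; fields that are not found stay empty."""
    text = "\n".join(pages)
    record = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        record[name] = match.group(1).strip() if match else ""
    return record


def summarize(applications):
    """Build one CSV row per applicant (CSV stands in for the Excel template)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(FIELD_PATTERNS))
    writer.writeheader()
    for pages in applications:
        writer.writerow(extract_fields(drop_duplicate_pages(pages)))
    return buffer.getvalue()
```

Calling `summarize` with a list of applications, each given as a list of page texts, returns the summary as a CSV string. Exact text hashing only catches pages whose OCR output is identical after normalization; noisy rescans of the same page would need a fuzzier comparison.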
Description
Research completed in the Department of Business Analytics, Barton School of Business.
Series
v. 20