This is a part of the CSCE 670 (Information Storage and Retrieval) project.
Here is our introductory video: Gradvisor
Our Github repository: Github
As graduate students, we have all struggled to select suitable universities for our profiles. Applicants have to research several universities, check their cutoffs and acceptance rates, and gauge their profile against each set of requirements, all while limiting the number of applications because the application process is itself expensive. To reduce this friction and make applying easier, less expensive and more confident, Gradvisor recommends universities to students.
Any aspiring graduate student can sign up and log in to our website and fill out a short form with details such as CGPA, GRE scores (Quantitative and Verbal), research and industry experience, and publications. From these details we generate an academic profile of the user. This profile is matched against our database, which contains candidates with varied profiles and their respective admits, and we then predict the top 5 universities for the current user. We also understand that, as a next step, applicants are eager to hear first-hand about the universities they apply to and about the admission process as a whole. So we also recommend the three most similar users, ranked by the similarity score between the profile of the aspirant (the current user) and the profiles of users in our database. We also provide a feature to connect with them over LinkedIn. (This functionality has been added, but because gathering LinkedIn profiles was difficult, the website currently shows an “Arriving Soon” message when a user tries to contact someone.) By connecting users over LinkedIn rather than through personal contact details, we aim to protect the privacy and safety of our users.
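The similar-user lookup described above can be sketched as a nearest-neighbour search over profile features. This is only an illustration, not the project's actual notebook code: the feature order, the five sample profiles, the usernames and the `most_similar_users` helper are all made-up assumptions, and Euclidean distance on standardized features is used as a stand-in for the similarity score.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler

# Hypothetical profiles (assumed feature order):
# [CGPA, GRE quant, GRE verbal, research months, industry months, publications]
profiles = np.array([
    [8.5, 165, 155, 12, 0, 1],
    [7.9, 160, 150, 0, 24, 0],
    [9.1, 168, 160, 18, 6, 2],
    [8.0, 162, 152, 6, 12, 0],
    [8.7, 166, 158, 10, 0, 1],
])
usernames = ["user_a", "user_b", "user_c", "user_d", "user_e"]  # made-up names


def most_similar_users(new_profile, k=3):
    """Return the k stored users closest to new_profile as (username, distance)."""
    # Standardize features so GRE points do not dominate CGPA differences.
    scaler = StandardScaler().fit(profiles)
    knn = NearestNeighbors(n_neighbors=k).fit(scaler.transform(profiles))
    distances, indices = knn.kneighbors(scaler.transform([new_profile]))
    return [(usernames[i], float(d)) for i, d in zip(indices[0], distances[0])]


similar = most_similar_users([8.6, 165, 156, 11, 0, 1])
for name, distance in similar:
    print(f"{name}: distance {distance:.2f}")
```

A smaller distance means a closer profile, so the first entry is the best match for the current user.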
Implementing the project involved several challenges, from obtaining a dataset to choosing the right model. The dataset was gathered by scraping similar websites. It covered a total of 53 unique universities; we then added noise and performed oversampling and undersampling so that the model had enough examples to learn from for each university. We experimented with multiple machine learning models for the top 5 university predictions, including Random Forest, an artificial neural network (ANN) and AdaBoost, and chose AdaBoost because it gave the best metrics. We also experimented with KNN and the Pearson correlation coefficient for finding the closest (most similar) users. As KNN was faster and had better metrics, we implemented user-user collaborative filtering with KNN to recommend the most similar user profiles that can be contacted.
Overall, we built this project to make choosing universities and sending applications an easier, more informed and less tedious process.
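One way to turn a multi-class classifier into a top 5 recommendation is to rank universities by predicted class probability. The sketch below assumes scikit-learn's `AdaBoostClassifier` and uses synthetic data. The eight placeholder universities, the six numeric features and the `top_k_universities` helper are illustrative assumptions, not the contents of AdaBoost.ipynb.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in data: 8 hypothetical universities, 6 numeric profile features.
n_universities = 8
university_names = np.array([f"University {i}" for i in range(n_universities)])
y = np.arange(400) % n_universities            # every class is represented
centers = rng.normal(scale=3.0, size=(n_universities, 6))
X = centers[y] + rng.normal(size=(400, 6))     # admits cluster around a class centre

model = AdaBoostClassifier(n_estimators=50, random_state=0).fit(X, y)


def top_k_universities(profile, k=5):
    """Rank universities by predicted probability and keep the k highest."""
    probabilities = model.predict_proba([profile])[0]
    top = np.argsort(probabilities)[::-1][:k]
    return [
        (str(university_names[model.classes_[i]]), float(probabilities[i]))
        for i in top
    ]


recommendations = top_k_universities(X[0])
for name, probability in recommendations:
    print(f"{name}: {probability:.3f}")
```

Ranking by `predict_proba` rather than calling `predict` is what yields several candidates per applicant instead of a single label.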
You can download the dataset here.
This preprocessed dataset contains information about applicants to the 54 universities that we considered.
Each row represents an individual applicant and includes various attributes such as their username, research experience, industry experience, internship experience, GRE scores (Verbal and Quantitative), publications in journals and conferences, CGPA, the name of the university they applied to, their admission status (admitted or not), and their GRE score.
For a more detailed overview of the Gradvisor project, you can view our presentation slides here.
djangoApp/: Contains the Django web application code for Gradvisor.
User-User-K-nearest-neighbour.ipynb: Jupyter Notebook for K-Nearest Neighbors algorithm.
AdaBoost.ipynb: Jupyter Notebook for AdaBoost algorithm.
updated_preprocessed.csv: Preprocessed dataset.
requirements.txt: Python dependencies.
Gradvisor.pptx: PowerPoint presentation.
Clone the Repository: Open a terminal or command prompt and use the git clone command to clone the repository to your local machine.
git clone https://github.com/mohitsarin-tamu/Gradvisor.git
Navigate to the Project Directory: Use the cd command to navigate into the directory of the cloned repository:
cd Gradvisor
Install Dependencies
pip install -r requirements.txt
Navigate to the djangoApp directory
cd djangoApp
Now to run the server locally:
python manage.py makemigrations
python manage.py migrate
python manage.py runserver
The server usually runs at this URL: http://127.0.0.1:8000
Home Page:
Login/Signup:
Form:
Recommendation and Similar Users: