Scholarship Spy is a web-based platform designed to help users find scholarships that match their profiles. It provides personalized recommendations based on the user's information and preferences, offering a centralized repository of scholarships from various sources.
- Objective: To create a single repository for scholarships and provide personalized scholarship recommendations.
- Languages/Technologies Used:
- HTML, CSS, JavaScript
- Python (Django)
- SQLite
- Operating System: Windows 10
- Personalized Recommendations: Scholarships are recommended based on user profiles and personal statements.
- Search Functionality: Users can search for scholarships by country, field of study, and other criteria.
The recommendation system in Scholarship Spy employs a content-based filtering approach to provide personalized scholarship recommendations based on users' personal statements. This process involves several key steps to ensure accurate and relevant suggestions.
- Text Cleaning:
- The input personal statement undergoes preprocessing to enhance data quality. This involves:
- Converting all text to lowercase to maintain uniformity.
- Removing non-alphanumeric characters and punctuation to focus on the content.
- Tokenization, which breaks the text into individual words for analysis.
- Filtering out common stopwords (e.g., "and," "the," "is") using the NLTK library to reduce noise in the data.
- Word Embeddings:
- The system uses pre-trained GloVe (Global Vectors for Word Representation) embeddings (glove.6B.50d.txt), which convert words into numerical vector representations. This allows the model to capture semantic meanings and relationships between words.
- Each word in the user's cleaned personal statement is converted into a vector, and the average of these vectors creates a single vector representation for the entire statement.
- Clustering:
- The dataset is processed to form clusters of scholarships based on their textual features. The centroids of these clusters are stored in a file named cluster_centers.npy.
- Each centroid represents a distinct group of scholarships, allowing the model to categorize scholarships based on similarities in their descriptions and titles.
- Recommendation:
- When a user inputs their personal statement, the system generates a vector representation of the statement using the techniques mentioned above.
- It calculates the Euclidean distance between the user's statement vector and the centroids of the scholarship clusters.
- The system identifies the n closest centroids (scholarships) to the user's vector and retrieves the corresponding scholarship titles, universities, and links from the dataset.
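
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names, the short stopword set (a stand-in for NLTK's English list), the 3-dimensional toy embeddings (a stand-in for glove.6B.50d.txt), and the two centroids (a stand-in for cluster_centers.npy) are all assumptions made so the example runs with the standard library alone.

```python
import math
import re

# Stand-in for NLTK's English stopword list (assumption: the project uses
# nltk.corpus.stopwords, which requires a one-time download).
STOPWORDS = {"and", "the", "is", "a", "an", "in", "of", "to", "i", "my"}

# Toy 3-dimensional vectors standing in for glove.6B.50d.txt (made-up values).
TOY_EMBEDDINGS = {
    "biology": [0.9, 0.1, 0.0],
    "medicine": [0.8, 0.2, 0.1],
    "software": [0.0, 0.9, 0.2],
    "engineering": [0.1, 0.8, 0.3],
}


def parse_glove_lines(lines):
    """Parse GloVe text format: a word followed by space-separated floats."""
    embeddings = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        embeddings[parts[0]] = [float(x) for x in parts[1:]]
    return embeddings


def load_glove(path):
    """Read a GloVe text file such as glove.6B.50d.txt into a dict."""
    with open(path, encoding="utf-8") as handle:
        return parse_glove_lines(handle)


def clean_text(text):
    """Lowercase, replace non-alphanumeric characters, tokenize, drop stopwords."""
    text = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    return [tok for tok in text.split() if tok not in STOPWORDS]


def embed(tokens, embeddings, dim):
    """Average the vectors of known tokens; zero vector if none are known."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def closest_centroids(vector, centroids, n):
    """Return the indices of the n centroids nearest to vector (Euclidean)."""
    distances = [math.dist(vector, c) for c in centroids]
    return sorted(range(len(centroids)), key=distances.__getitem__)[:n]


# Hypothetical centroids; the project loads its own from cluster_centers.npy.
centroids = [[0.85, 0.15, 0.05], [0.05, 0.85, 0.25]]
tokens = clean_text("I love Biology, and the study of medicine!")
statement_vector = embed(tokens, TOY_EMBEDDINGS, dim=3)
print(closest_centroids(statement_vector, centroids, n=1))
```

Words missing from the embedding table are skipped here before averaging; the project may handle out-of-vocabulary words differently.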
The code outputs the following information for each recommended scholarship:
- University: The name of the university offering the scholarship.
- Scholarship: The title of the scholarship.
- Link: A URL linking directly to the scholarship application page.
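
As a rough sketch of how the three output fields might be assembled from the selected indices: the row layout, the key names, and the example.com entries below are hypothetical placeholders, and the real dataset's column names may differ.

```python
# Hypothetical dataset rows; key names and values are placeholders.
SCHOLARSHIPS = [
    {
        "university": "Example University",
        "scholarship": "Example Merit Award",
        "link": "https://example.com/merit",
    },
    {
        "university": "Sample Institute",
        "scholarship": "Sample Research Grant",
        "link": "https://example.com/grant",
    },
]


def format_recommendations(indices, rows):
    """Build the University / Scholarship / Link lines for each selected row."""
    blocks = []
    for index in indices:
        row = rows[index]
        blocks.append(
            f"University: {row['university']}\n"
            f"Scholarship: {row['scholarship']}\n"
            f"Link: {row['link']}"
        )
    # Separate recommendations with a blank line.
    return "\n\n".join(blocks)


print(format_recommendations([1], SCHOLARSHIPS))
```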
This recommendation system enhances the user experience by providing tailored scholarship opportunities that align closely with individual aspirations and qualifications.
- Clone the repository:
git clone https://github.com/Sherryyy00/Scholarship-Spy.git
cd Scholarship-Spy
- Install dependencies:
pip install -r requirements.txt
- Set up the database:
python manage.py makemigrations
python manage.py migrate
- Run the development server:
python manage.py runserver
- User: Register or log in to the platform to view and apply for scholarships.
- Admin: Log in to manage scholarships and initiate the recommendation crawler.
- Adding an online career counseling feature.
- Expanding the database to include more scholarship opportunities from various platforms.