- Python 3
- cohere
- qdrant-client
- PyYAML (imported as `yaml`)
- json (Python standard library)
- logging (Python standard library)
- Install the required packages using pip.
- Ensure you have a valid API key for the Cohere API and Qdrant.
- Update the configuration file `scripts/similarity/config.yaml` with your Cohere and Qdrant API keys.
- Provide the paths to the JSON files containing keywords extracted from resumes and job descriptions in the `READ_RESUME_FROM` and `READ_JOB_DESCRIPTION_FROM` variables, respectively.
- Run the `app_similarity_score.py` file.
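
The configuration and input-loading steps above can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the helper names `load_config` and `read_keywords`, the logger name, and the JSON key `extracted_keywords` are all hypothetical, since the layout of the processed JSON files is not described here. `yaml.safe_load` comes from the PyYAML package.

```python
import json
import logging

# Logger name is an assumption, chosen to match the log file named in this README.
logger = logging.getLogger("app_similarity_score")


def load_config(path):
    """Parse a YAML configuration file (e.g. config.yaml) into a dict."""
    import yaml  # provided by PyYAML; imported lazily so the JSON helper works without it

    with open(path, encoding="utf-8") as fh:
        return yaml.safe_load(fh)


def read_keywords(path, key="extracted_keywords"):
    """Return the keyword list stored under `key` in a JSON file.

    The key name is hypothetical; adjust it to match the processed files.
    A missing key yields an empty list.
    """
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)
    keywords = data.get(key, [])
    logger.info("Read %d keywords from %s", len(keywords), path)
    return keywords
```

In this sketch, the paths assigned to `READ_RESUME_FROM` and `READ_JOB_DESCRIPTION_FROM` would each be passed to `read_keywords`, and `load_config` would be called with `scripts/similarity/config.yaml`.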
- The Python script that calculates the similarity score between a job description and a resume using embeddings generated by the Cohere API and the Qdrant vector database.
- `config.yaml`: Configuration file containing API keys for the Cohere API and Qdrant.
- `Data/Processed/Resumes`: Directory containing JSON files with keywords extracted from resumes.
- `Data/Processed/JobDescription`: Directory containing JSON files with keywords extracted from job descriptions.
- `app_similarity_score.log`: Log file containing information about the execution of the program.
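
To make the idea of a similarity score between two embeddings concrete, here is a small pure-Python sketch of cosine similarity, a common measure for comparing embedding vectors. It is an illustration only: in the project the embeddings come from the Cohere API and the comparison is delegated to Qdrant, and this README does not state which distance metric the script configures. The vectors below are made-up toy values.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch when a vector is all zeros
    return dot / (norm_a * norm_b)


# Toy 3-dimensional vectors standing in for real embeddings (made-up values).
job_description_vec = [1.0, 0.0, 1.0]
resume_vec = [1.0, 0.0, 0.0]
score = cosine_similarity(job_description_vec, resume_vec)
```

A score near 1 means the two vectors point in almost the same direction, while a score near 0 means they share little.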
This project is a simple Python program that ranks resumes against a job description provided as a PDF file. It uses the `sentence-transformers` library for text encoding and similarity calculation, and `PyPDF2` for reading PDF documents.
- Python 3
- sentence-transformers
- PyPDF2
Install the required packages using pip.
- Place the resume PDF files in the directory specified by the `org_docs` list.
- Provide the path to the target job description PDF file in the `target_document` variable.
- Run the script.
- The Python script that extracts text from PDF resumes and a job description file, encodes them into embeddings using Sentence Transformers, calculates the cosine similarity between the job description and each resume, and ranks the resumes based on their similarity scores.
- `Data/Resumes`: Directory containing PDF files of resumes.
- `Data/JobDescription`: Directory containing the target job description PDF file.
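
The extract, encode, compare, and rank flow described above might look roughly like the sketch below. It is not the script's actual code: the function names are hypothetical, and the model name `all-MiniLM-L6-v2` is an assumption, since the README does not say which Sentence Transformers model is used. The third-party imports are placed inside the functions so that the ranking helper can be used on its own.

```python
def extract_pdf_text(path):
    """Concatenate the text of every page in a PDF file."""
    from PyPDF2 import PdfReader  # PyPDF2 2.0 or later

    reader = PdfReader(path)
    # extract_text() may return None or an empty string for image-only pages.
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def score_resumes(job_text, resume_texts, model_name="all-MiniLM-L6-v2"):
    """Map each resume name to its cosine similarity with the job description.

    `resume_texts` is a dict of {name: text}; the default model name is an assumption.
    """
    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer(model_name)
    names = list(resume_texts)
    job_emb = model.encode(job_text, convert_to_tensor=True)
    resume_embs = model.encode([resume_texts[n] for n in names], convert_to_tensor=True)
    sims = util.cos_sim(job_emb, resume_embs)[0]  # one score per resume
    return {name: float(sim) for name, sim in zip(names, sims)}


def rank_resumes(scores):
    """Sort (name, score) pairs from most to least similar."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

With these helpers, each path in `org_docs` and the path in `target_document` would be passed through `extract_pdf_text`, the resulting texts scored with `score_resumes`, and the scores ordered with `rank_resumes`.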
This project is licensed under the MIT License - see the LICENSE file for details.