The structure of this project is as follows:
- app.py : main application
- config.ini : configuration of the project (model, etc.)
- global_var.py : initialization of the global variables used throughout the project
- text_preprocessing.py : converts PDF files into trainable text
- model (folder) : the model used for feature extraction
- sample_database_embeddings : sample database of features extracted from various PDF files
- vector_comparison.py : compares the features extracted from the input file with the extracted features stored in the database
- utils.py : helper functions
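
To give a feel for the comparison step that vector_comparison.py is described as performing, here is a minimal sketch. It is not taken from the project code: the use of cosine similarity as the metric, the function names `cosine_similarity` and `rank_matches`, and the toy three-dimensional vectors are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors (assumed metric)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction, so treat it as matching nothing.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_matches(query, database):
    """Return (name, score) pairs sorted from most to least similar to the query."""
    scores = [(name, cosine_similarity(query, vec)) for name, vec in database.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical stand-in for the sample database of embeddings.
sample_db = {
    "doc_a.pdf": [1.0, 0.0, 0.0],
    "doc_b.pdf": [0.0, 1.0, 0.0],
    "doc_c.pdf": [1.0, 1.0, 0.0],
}

# Hypothetical embedding of an input file.
ranked = rank_matches([1.0, 0.1, 0.0], sample_db)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

In this toy data the query points almost entirely along the first axis, so `doc_a.pdf` ranks first. Real embeddings produced by the model would have many more dimensions, and the project may use a different similarity measure.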