HUBLA Paper (Indonesian Lang.) Plagiarism Detector using BERT

The structure of this project is as follows:

app.py : main application
config.ini : project configuration (model, etc.)
global_var.py : initializes the global variables used throughout the project
text_preprocessing.py : converts PDF files into trainable text
model (folder) : the model used for feature extraction
sample_database_embeddings : sample database of features extracted from various PDF files
vector_comparison.py : compares the features extracted from the input file with the features stored in the database
utils.py : helper functions
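
The comparison step described for vector_comparison.py can be illustrated with a minimal sketch. It is not the project's actual code: the similarity metric (cosine similarity), the function names `cosine_similarity` and `rank_matches`, the 0.8 threshold, and the toy three-dimensional vectors are all assumptions made for illustration. Real BERT embeddings would typically have several hundred dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # A zero vector has no direction, so treat it as dissimilar to everything.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_matches(query, database, threshold=0.8):
    """Return (doc_id, score) pairs scoring at or above threshold, best first.

    `database` is a hypothetical dict mapping a document name to its
    stored feature vector; `threshold` is an illustrative cut-off.
    """
    scores = [(doc_id, cosine_similarity(query, vec))
              for doc_id, vec in database.items()]
    hits = [(doc_id, score) for doc_id, score in scores if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Toy stand-ins for stored embeddings and for the input file's embedding.
database = {
    "paper_a.pdf": [1.0, 0.0, 0.0],
    "paper_b.pdf": [0.9, 0.1, 0.0],
    "paper_c.pdf": [0.0, 1.0, 0.0],
}
query = [1.0, 0.0, 0.0]

matches = rank_matches(query, database)
for doc_id, score in matches:
    print(f"{doc_id}: {score:.4f}")
```

In this sketch, documents whose stored vector points in nearly the same direction as the input vector are reported as candidate matches, while unrelated documents fall below the threshold and are dropped. The right threshold depends on the model and the data and would need to be tuned.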
