https://drive.google.com/file/d/1SWa2rYGu3FfgTMk6JKMv2dTPT1vpxWRc/view?usp=sharing
This repo contains the course project for the Software Development course, IIIT S22.
The dataset for this project is taken from:
https://www.kaggle.com/datasets/Cornell-University/arxiv
Contributors:
- Lokesh Sharma
- Piyush Singh
- Mayush Kumar
- Sumeet Agrawal
The motivation for this project is to create a fast information retrieval system for research papers using the MERN stack (UI), Elasticsearch (retrieval), and Transformers (NLP).
Elasticsearch
1.1 Installation
a. Download the Elasticsearch MSI installer for Windows.
b. Start the Elasticsearch server: .\bin\elasticsearch.bat
c. Reset the Elasticsearch password: .\bin\elasticsearch-reset-password.bat -u elastic
1.2 Test the Elasticsearch API:
https://localhost:9200/ssd-search/_mapping
https://localhost:9200/ssd-search/_search
https://localhost:9200/ssd-search/_search?pretty=true&q=*:*
https://localhost:9200/ssd-search/_count
1.3 Search using Postman
Sample query to search (not the vectorised embeddings):
nyc-restaurants/_search
{
  "query": {
    "match": {
      "_id": "50127304"
    }
  }
}
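The same kind of query can also be sent from a script instead of Postman. The following is a minimal Python sketch using only the standard library; it is not part of this repo. The function names, the ssd-search index in the usage note, and the password and CA file path are assumptions for illustration, so adjust them to your own setup.

```python
import base64
import json
import ssl
import urllib.request


def build_search_request(base_url, index, doc_id, user, password):
    """Build (but do not send) a POST <index>/_search request matching one _id."""
    # Same query body as the Postman sample: match a single document by _id.
    body = {"query": {"match": {"_id": doc_id}}}
    # HTTP basic auth header for the given user (for example "elastic").
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


def run_search(request, ca_file):
    """Send the request over HTTPS, verifying TLS against ca_file, and parse the JSON reply."""
    # ca_file is assumed to be the CA certificate of your local Elasticsearch install.
    context = ssl.create_default_context(cafile=ca_file)
    with urllib.request.urlopen(request, context=context) as response:
        return json.loads(response.read().decode("utf-8"))
```

Assuming the server is running locally, a call could look like run_search(build_search_request("https://localhost:9200", "ssd-search", "50127304", "elastic", "<your password>"), "<path to CA certificate>"). Building and sending are kept separate so the request can be inspected without a running server.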
Using Faiss
Run main.py
Testing using Postman
POST http://localhost:8000/predict
{
"data": "Polymer Quantum Mechanics and its Continuum Limit A rather non-standard quantum representation of the canonicale"
}
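The /predict request above can also be issued from Python rather than Postman. This is a rough sketch using only the standard library, assuming the service started by main.py is listening on localhost:8000 and accepts a JSON body with a "data" field as shown. The function names are hypothetical, and because the response format is not documented here, the reply is returned as raw text.

```python
import json
import urllib.request


def build_predict_request(text, url="http://localhost:8000/predict"):
    """Build (but do not send) a POST request carrying the query text."""
    # The payload mirrors the Postman example: a single "data" field.
    payload = {"data": text}
    return urllib.request.Request(
        url=url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_predict_request(request, timeout=30):
    """Send the request and return the raw response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the server running, send_predict_request(build_predict_request("Polymer Quantum Mechanics and its Continuum Limit")) would return whatever main.py responds with for that query text.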