Web Search Engine

Information Retrieval Course Project

Sharif University of Technology

Methods

This project searches web pages using the following methods, with support for query expansion:

  • Boolean
  • TF-IDF
  • A transformer-based model
  • fastText
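
To make the TF-IDF option concrete, here is a minimal, dependency-free sketch of TF-IDF ranking with cosine similarity. It is an illustration only and not necessarily how this project implements it: the function names (`tokenize`, `build_index`, `search`), the whitespace tokenizer, and the smoothed IDF formula are all assumptions made for the example.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split; a real engine would normalize more.
    return text.lower().split()


def build_index(docs):
    """Return (per-document TF-IDF vectors, idf table) for a list of strings."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))  # document frequency: count each term once per doc
    # Assumed IDF variant: log(N / df) + 1, so terms in every doc keep some weight.
    idf = {t: math.log(n / count) + 1.0 for t, count in df.items()}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: (c / len(tokens)) * idf[t] for t, c in tf.items()})
    return vectors, idf


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, docs, vectors, idf, k=3):
    """Return up to k (document, score) pairs with non-zero similarity, best first."""
    tf = Counter(t for t in tokenize(query) if t in idf)  # ignore unseen terms
    total = sum(tf.values())
    if total == 0:
        return []
    qvec = {t: (c / total) * idf[t] for t, c in tf.items()}
    scored = [(cosine(qvec, v), i) for i, v in enumerate(vectors)]
    scored = [(s, i) for s, i in scored if s > 0.0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(docs[i], s) for s, i in scored[:k]]


if __name__ == "__main__":
    pages = [
        "sharif university of technology",
        "information retrieval course project",
        "web search engine for web pages",
    ]
    vectors, idf = build_index(pages)
    for doc, score in search("web search", pages, vectors, idf):
        print(f"{score:.3f}  {doc}")
```

Documents that share no term with the query are dropped rather than returned with a zero score, which keeps the result list short for the link-analysis step.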

After a search, you can apply link analysis to the results:

  • PageRank
  • HITS
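
As an illustration of what link analysis over a result set can look like, the following is a small PageRank sketch using power iteration. It is a generic textbook formulation, not taken from this project's code: the `pagerank` function, the `links` dictionary format (page mapped to the pages it links to), and the default damping factor of 0.85 are assumptions for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Compute PageRank scores for a graph given as {page: [linked pages]}."""
    # Collect every page, including ones that only appear as link targets.
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    nodes = sorted(nodes)
    n = len(nodes)
    if n == 0:
        return {}

    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        # Pages without outgoing links spread their rank evenly over all pages.
        dangling = sum(rank[u] for u in nodes if not links.get(u))
        new_rank = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}
        for u in nodes:
            targets = links.get(u, [])
            if targets:
                share = damping * rank[u] / len(targets)
                for v in targets:
                    new_rank[v] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Hypothetical link graph between four result pages.
    graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
    for page, score in sorted(pagerank(graph).items(), key=lambda p: -p[1]):
        print(page, round(score, 4))
```

The scores sum to 1, so they can be read as a probability distribution over the retrieved pages and used to reorder them.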

Classification and clustering are also available.

You can use different embeddings for different methods.


How to Start

  1. Clone the repo.
git clone https://github.com/IR1401-Spring-Final-Projects/Web1401-10_38.git
  2. Install the requirements.
pip install -r requirements.txt
  3. Get the available models from this link and extract them in the root of the project (next to manage.py).
  4. Run the server.
python manage.py runserver
  5. Search, cluster, or classify!

The notebook for working with the services is available here.