AI & NLP stend bot

This project is a Telegram bot that answers questions based on information from provided sources. It uses an artificial neural network and NLP techniques to find the most relevant answer.

Installation

Requirements

Python 3.6+
pip
virtualenv
git
Telegram bot token
model file (you could use the provided one)
navec_news_v1_1B_250K_300d_100q.tar
ru_small.bin (download from here and unpack)

Setup

Clone the repository

git clone <repo_url>

Create a virtual environment and activate it

virtualenv venv
source venv/bin/activate

Install the requirements

pip install -r requirements.txt

Create a file named .env in the root directory of the project and fill it with the following data:

API_TOKEN=your_telegram_bot_token

Obtain the token from @BotFather
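How bot.py reads this file is not shown here. As a minimal sketch, assuming the token is stored under the API_TOKEN key exactly as above, the file could be parsed with the standard library alone. The helper names load_env_file and get_api_token are hypothetical and are not taken from the project.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse KEY=VALUE lines from a .env file into a dict."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_api_token(path=".env"):
    """Return API_TOKEN from the environment, falling back to the .env file."""
    token = os.environ.get("API_TOKEN") or load_env_file(path).get("API_TOKEN")
    if not token:
        raise RuntimeError("API_TOKEN is not set; add it to .env")
    return token
```

A real environment variable takes precedence over the file in this sketch, which makes it easy to override the token in deployment without editing .env.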

Run the bot

python bot.py

Usage

The bot is able to answer questions about the following topics:

SSTU (information for applicants to help them get into the university, and information about the university itself)

Machine learning

The bot uses a neural network to find the most relevant answer to a question. The model is trained on the dataset.json dataset using Dmitry Korobchenko's algorithm. You can train the model yourself with the training.ipynb notebook. Remember to provide your own dataset containing the questions most relevant to your use case.
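The network's architecture and output format are not described here. As an illustration only, assuming the model produces one raw score per candidate answer, the most relevant answer could be selected with a softmax followed by an argmax. The function names, the confidence threshold, and the fallback message below are all hypothetical.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    top = max(scores)
    # Subtract the maximum first so math.exp cannot overflow.
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pick_answer(scores, answers, threshold=0.5,
                fallback="Sorry, I don't know the answer to that."):
    """Return the answer with the highest probability, or a fallback."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    # Assumed behaviour: refuse to answer when the model is not confident.
    if probs[best] < threshold:
        return fallback
    return answers[best]
```

The threshold is a design choice: without it, the bot would return some answer even for questions that are unrelated to the training dataset.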

NLP

The bot uses Navec, along with Natasha itself, to get the vector representation of each word in the question, then sums them to get the vector representation of the whole question. You can download the Navec model from here or train it yourself to increase accuracy (see this for more information).
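The summation step described above can be sketched as follows. This is not the project's code: a small dictionary with made-up 3-dimensional vectors stands in for the loaded Navec model, and the function name embed_question is hypothetical. Skipping out-of-vocabulary words is also an assumption.

```python
def embed_question(tokens, embeddings, dim):
    """Sum the vectors of all known tokens into one question vector."""
    total = [0.0] * dim
    for token in tokens:
        vector = embeddings.get(token.lower())
        if vector is None:
            continue  # assumed: out-of-vocabulary words contribute nothing
        for i, value in enumerate(vector):
            total[i] += value
    return total


# Toy stand-in for real embeddings; the numbers are invented.
toy_embeddings = {
    "как": [0.1, 0.0, 0.2],
    "поступить": [0.5, 0.3, 0.0],
    "вуз": [0.2, 0.4, 0.1],
}

question_vector = embed_question(
    ["Как", "поступить", "в", "вуз"], toy_embeddings, dim=3
)
```

Because the result is a plain sum, longer questions produce vectors with larger magnitude; averaging or normalizing would be an alternative if that turns out to matter for the model.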

The bot uses JamSpell to correct the spelling of the question. However, the model is trained on a limited sample of Russian text, so it may not work well with some words. You can download the model from here or train it yourself (refer to this).
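A minimal sketch of the spelling-correction step, assuming the jamspell Python package and the ru_small.bin model from the requirements. The wrapper names load_corrector and fix_question are hypothetical, and the way bot.py actually calls JamSpell may differ.

```python
def load_corrector(model_path="ru_small.bin"):
    """Load a JamSpell corrector; requires the jamspell package."""
    # Imported lazily so the rest of the sketch runs without jamspell installed.
    import jamspell

    corrector = jamspell.TSpellCorrector()
    if not corrector.LoadLangModel(model_path):
        raise FileNotFoundError(f"could not load JamSpell model: {model_path}")
    return corrector


def fix_question(text, corrector=None):
    """Return the corrected question, or the text unchanged without a corrector."""
    if corrector is None:
        return text
    return corrector.FixFragment(text)
```

Typical usage would be to call load_corrector() once at startup and pass the result to fix_question() for every incoming message, since loading the language model is comparatively slow.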

Restrictions

The bot is able to answer questions only in Russian because of the NLP models and tokenizer used in Natasha.

License

MIT
