This project is a Telegram bot that helps users prepare for IT interviews. Using a modular Retrieval-Augmented Generation (RAG) pipeline, the bot generates relevant questions, assembles full tests, and produces detailed reports based on user responses. For robustness, it retries failed operations and caches generated questions to keep the user experience smooth.
Nazgul Salikhova, B22-AAI-02
- Generate Questions: Users can request questions on specific IT themes.
- Full Test Creation: The bot assembles a complete test based on selected tracks (e.g., Machine Learning, Frontend Development).
- Detailed Reports: Generates performance reports based on user answers, including scores and recommendations.
- Error Handling with Retries: Ensures stability by retrying operations up to 3 times if an error occurs.
- Caching: Speeds up responses by storing generated questions.
- Automated Benchmark Testing: Enhances reliability through continuous testing.
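The retry behaviour listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the helper name `with_retries`, its parameters, and the reading of "up to 3 times" as three attempts in total are all assumptions.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    operation: Callable[[], T],
    max_attempts: int = 3,      # assumption: 3 attempts in total
    delay_seconds: float = 0.0, # assumption: optional pause between attempts
) -> T:
    """Call `operation`, retrying on any exception up to `max_attempts` times."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception as error:
            last_error = error
            # Only wait if another attempt is still coming.
            if attempt < max_attempts:
                time.sleep(delay_seconds)
    # All attempts failed: surface the last error as the cause.
    raise RuntimeError(
        f"operation failed after {max_attempts} attempts"
    ) from last_error
```

A call such as `with_retries(lambda: generate_question(theme))` would wrap a hypothetical `generate_question` function; any LLM or database call could be wrapped the same way.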
The architecture employs modular RAG for handling data retrieval and response generation. The workflow includes:
- ChromaDB: Stores and retrieves semantic chunks of educational content.
- SQLite: Maintains a database of questions.
- Modules:
  - Questions Retriever: Gathers relevant questions from the cache or database.
  - Test Generator: Constructs comprehensive tests.
  - Report Generator: Produces detailed feedback reports.
  - RAG Manager: Manages interactions between components and handles error retries.
  - Chat Bot Interface: Facilitates user interaction.
See the system diagram for the detailed workflow.
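As a rough illustration of how a Questions Retriever might check the cache before falling back to SQLite, consider the sketch below. The class layout, the `questions` table schema (`id`, `theme`, `text`), and the in-memory dictionary cache are assumptions made for this example; the project's real schema and cache are not described here.

```python
import sqlite3


class QuestionsRetriever:
    """Hypothetical retriever: look in the cache first, then in SQLite."""

    def __init__(self, connection: sqlite3.Connection):
        self.connection = connection
        self.cache = {}  # maps (theme, limit) -> list of question texts

    def get_questions(self, theme: str, limit: int = 5) -> list:
        key = (theme, limit)
        if key in self.cache:
            return self.cache[key]  # cache hit: no database query
        rows = self.connection.execute(
            "SELECT text FROM questions WHERE theme = ? ORDER BY id LIMIT ?",
            (theme, limit),
        ).fetchall()
        questions = [row[0] for row in rows]
        self.cache[key] = questions  # remember the result for next time
        return questions


def create_demo_database() -> sqlite3.Connection:
    """Build a small in-memory database with an assumed schema."""
    connection = sqlite3.connect(":memory:")
    connection.execute(
        "CREATE TABLE questions (id INTEGER PRIMARY KEY, theme TEXT, text TEXT)"
    )
    connection.executemany(
        "INSERT INTO questions (theme, text) VALUES (?, ?)",
        [
            ("Machine Learning", "What is overfitting?"),
            ("Machine Learning", "Explain gradient descent."),
            ("Frontend Development", "What is the DOM?"),
        ],
    )
    connection.commit()
    return connection
```

With this sketch, `QuestionsRetriever(create_demo_database()).get_questions("Machine Learning")` queries SQLite once, and repeated calls with the same arguments are served from the dictionary.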
- Programming Language: Python
- Databases: ChromaDB, SQLite
- LLM Models: gemini-1.5-flash-latest, llama-3.1-70b-versatile
- Embedding Model: all-MiniLM-L6-v2
- Bot Framework: python-telegram-bot
- Prompt Engineering: Customized prompts for generating educational content.
- Modular RAG: Ensures efficient data retrieval and content generation.
- Semantic Chunking: Splits and indexes PDF content for better context.
- Error Handling with Retries: Retries failed operations up to 3 times for reliability.
- Caching: Reduces redundant processing and accelerates responses.
- Automated Benchmark Testing: Validates the bot’s functionality continuously.
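To make the idea of semantic chunking more concrete, here is a small sketch that starts a new chunk whenever two neighbouring sentences are dissimilar. It is not the project's pipeline: the bag-of-words `toy_embed` function is only a stand-in for a real embedding model such as all-MiniLM-L6-v2, and the function names and the `threshold` value are assumptions.

```python
import math
import re
from collections import Counter


def toy_embed(sentence: str) -> Counter:
    """Stand-in embedding: word counts (NOT a real sentence-embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def semantic_chunks(text: str, threshold: float = 0.2) -> list:
    """Group consecutive sentences; split where similarity drops below threshold."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks = []
    current = []
    for sentence in sentences:
        # Compare the new sentence with the last sentence of the open chunk.
        if current and cosine_similarity(
            toy_embed(current[-1]), toy_embed(sentence)
        ) < threshold:
            chunks.append(" ".join(current))
            current = []
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks
```

In a real setup, the text would come from the extracted PDF content and the resulting chunks would be indexed in ChromaDB; swapping `toy_embed` for dense model embeddings would also require a different threshold.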
- Clone the repository:
git clone https://gitlab.pg.innopolis.university/llm-course/n.salikhova.git
- Go into the project directory:
cd final-project
- Set up a virtual environment (recommended):
python -m venv venv
source venv/bin/activate # On Windows use `venv\Scripts\activate`
- Install the required Python packages:
pip install -r requirements.txt
- Create a .env file in the root directory of your project with the following content:
TELEGRAM_TOKEN=your_telegram_bot_token
GEMINI_API_KEY=your_gemini_api_key
GROQ_API_KEY=your_groq_api_key
• For the TELEGRAM_TOKEN, contact @BotFather on Telegram to create a new bot and obtain the token.
• For the GEMINI_API_KEY, visit Google AI Studio to generate an API key.
• For the GROQ_API_KEY, visit GroqCloud to generate an API key.
- Start the local ChromaDB server in a terminal:
chroma run --path ~/chromadb_data
- Start the bot by running the following command in another terminal:
python main.py
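If the bot fails at startup, a missing key is a likely cause. The sketch below shows one way to check the three variables from the .env step before launching. It assumes the variables have already been loaded into the process environment (for example by a dotenv loader, which is an assumption about how the project reads .env); the function name `load_settings` is hypothetical.

```python
import os

# The three keys named in the .env step of the setup instructions.
REQUIRED_KEYS = ("TELEGRAM_TOKEN", "GEMINI_API_KEY", "GROQ_API_KEY")


def load_settings(environ=None) -> dict:
    """Return the required settings, or raise if any are missing or empty."""
    source = os.environ if environ is None else environ
    missing = [key for key in REQUIRED_KEYS if not source.get(key)]
    if missing:
        raise RuntimeError("Missing required settings: " + ", ".join(missing))
    return {key: source[key] for key in REQUIRED_KEYS}
```

Calling `load_settings()` early reports every missing key in a single error message instead of failing later on the first API call.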
This project is licensed under the MIT License.