Search functions in document readers are a boon for finding relevant text, but in large documents, combing through thousands of matches is impractical. With LLMs and vector databases, we can speed up this process while improving accuracy. This project explores Retrieval Augmented Generation (RAG), which turns information retrieval from documents into a single question-and-answer step.
The implementation follows the basic principles of a RAG model (a toy end-to-end sketch follows the list):
- Chunking - Splitting the text of a document into smaller parts (chunks).
- Embedding - Each chunk is converted to a numeric vector using an embedding model and is stored in a vector database.
- Query - The user query is also converted into a vector, which is used to search the vector database for similar vectors.
- Response - The chunks behind these similar vectors are then passed to an LLM as context alongside the question. The LLM generates a coherent response in natural language.
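To make the four steps concrete, here is a deliberately minimal, self-contained sketch. The hash-based `embed()` is a toy stand-in for a real embedding model and a plain list stands in for the vector database; all names here are illustrative, not the project's code.

```python
# Toy end-to-end illustration of the four RAG steps (illustrative only).
import math

def chunk(text: str, size: int = 80) -> list[str]:
    """Chunking: split the document text into fixed-size pieces."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(text: str, dim: int = 8) -> list[float]:
    """Embedding: map text to a normalized numeric vector (toy version)."""
    vec = [0.0] * dim
    for i, ch in enumerate(text):
        vec[i % dim] += ord(ch)
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def similarity(a: list[float], b: list[float]) -> float:
    """Dot product; equals cosine similarity for normalized vectors."""
    return sum(x * y for x, y in zip(a, b))

document = "Replace this with the full text of the document being queried."
store = [(c, embed(c)) for c in chunk(document)]   # the "vector database"

question = "What is this document about?"
qvec = embed(question)                             # Query: embed the question
top = sorted(store, key=lambda e: similarity(e[1], qvec), reverse=True)[:3]
context = "\n".join(c for c, _ in top)
# Response: a real app sends context + question to an LLM.
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
print(prompt)
```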
- Programming Language - Python
- Integration Framework - LangChain
- Embedding Model - OpenAI Text Embedding 3 Small
- LLMs - OpenAI GPT-3.5-turbo-0125, Google Gemini Pro
- Vector Database - Pinecone
- Front End - Streamlit
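A hedged sketch of how this stack might be wired together with LangChain follows; the index name, input file path, and chunking parameters are assumptions, and the project's rag_chat.py is the authoritative implementation.

```python
# Sketch of wiring the stack with LangChain; index name, file path, and
# chunking parameters are assumptions, not the project's exact values.
from dotenv import load_dotenv
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_pinecone import PineconeVectorStore
from langchain.chains import RetrievalQA

load_dotenv()  # reads OPENAI_API_KEY and PINECONE_API_KEY from .env

# Chunking
with open("document.txt", encoding="utf-8") as f:   # hypothetical input file
    text = f.read()
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_text(text)

# Embedding: vectors are stored in a Pinecone index
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = PineconeVectorStore.from_texts(
    texts=chunks, embedding=embeddings, index_name="rag-chat"  # assumed name
)

# Query + Response: retrieve similar chunks and answer with the LLM
llm = ChatOpenAI(model="gpt-3.5-turbo-0125")
qa = RetrievalQA.from_chain_type(llm=llm, retriever=vectorstore.as_retriever())
print(qa.invoke({"query": "Summarize the document."})["result"])
```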
This project uses Streamlit to serve the Python program as a web app (a minimal front-end sketch follows). The packages required by the various libraries are listed in the requirements.txt file; install them with:

```
pip install -r ./requirements.txt
```
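For orientation, a hypothetical minimal version of such a Streamlit page might look like the sketch below; the widget labels and the placeholder `answer_question()` are illustrative, and the real rag_chat.py calls the RAG chain instead.

```python
# Minimal Streamlit front-end sketch; widget labels and the placeholder
# answer_question() are illustrative, not the project's actual code.
import streamlit as st

def answer_question(question: str) -> str:
    # Placeholder: the real app calls the RAG chain built with LangChain.
    return f"(RAG answer for: {question})"

st.title("Chat with your document")
question = st.text_input("Ask a question about the document")
if question:
    st.write(answer_question(question))
```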
- Download the Python file and the dotenv template into your project directory.
- Fill in your API keys in the dotenv file using a text editor (such as Notepad); an example layout is shown after these steps.
- If using Google Gemini Pro as the LLM, edit the Python file as per the instructions in its comments.
- Open CMD in the project directory and run:

```
streamlit run rag_chat.py
```
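For reference, a plausible layout for the filled-in dotenv file is shown below. The key names are the standard environment variables these libraries read, but the template shipped with the repo is authoritative.

```
# .env — assumed key names; defer to the repo's dotenv template
OPENAI_API_KEY=sk-...
PINECONE_API_KEY=...
GOOGLE_API_KEY=...   # only needed when using Google Gemini Pro
```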
The course 'LangChain Mastery: Develop LLM Apps with LangChain & Pinecone' by Andrei Dumitrescu taught me the foundations of building GenAI applications with LangChain, and helped me understand RAG models and how to build apps by combining LLMs and vector databases with chains.