Refer to the Prerequisites section to set up the environment before starting this experiment.
- Clone this repository.
- Add the relevant API keys and configuration to the `.env` file.
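A `.env` file along these lines should work; the exact variable names below are assumptions (check what the scripts actually read), and the values are placeholders:

```env
# Hypothetical variable names -- verify against the scripts in this repo
NEO4J_URI=neo4j+s://<your-instance>.databases.neo4j.io
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=<your-password>
PINECONE_API_KEY=<your-pinecone-key>
OPENAI_API_KEY=<your-openai-key>
HUGGINGFACEHUB_API_TOKEN=<your-hf-token>
TOGETHER_API_KEY=<your-togetherai-key>
```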
Tech stack: Neo4j | Pinecone | LangChain | TogetherAI | OpenAI | Hugging Face
Purpose: Conduct a comprehensive analysis of a RAG (Retrieval-Augmented Generation) pipeline using Neo4j and Pinecone to test:
- Performance of graph vs vector databases
- Effects of splitting methods
- Retrieval methods and strategies
- Quality of responses through various RAG metrics
- Reranking and multi-query on failed responses
Significance:
- Insights on capabilities and limitations of a RAG pipeline
- Understanding the available technology stack
- Exploring non-traditional uses of databases
- Implementation of practical projects
- Suitability for specific use-cases
Experiment Design:
DATA
All experiments in this project are run on a PDF of the Constitution of Pakistan. To evaluate the RAG pipeline, questions with ground truth can be found here: questions.json
Select your desired experiment by setting the following options in RAG_with_Pinecone.py:
```python
retrieval_method = 'cosine'   # the metric you chose when creating the Pinecone index
chunker = 'recursive'         # recursive, semantic, sentence, character, paragraph
embeddingtype = 'openai'      # openai, HF, langchain, spacy; empty string invokes gpt4all
llmtype = 'gpt4'              # llama2, llama3, Qwen; empty string invokes Mixtral
embedding_dimension = 1536    # change to 384 for gpt4all embeddings
index_name = pinecone_index
```
This script loads the input file's embeddings into the Pinecone index, generates responses to the questions, and writes them to a JSON file in the output folder.
You may run into errors when running this on your local system. Suggestion: run it on Google Colab.
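The `chunker = 'recursive'` option refers to recursive character splitting. The repo's chunkers come from libraries, but the idea can be sketched in plain Python (a toy illustration, not the actual implementation): try coarse separators first, recurse into oversized pieces with finer separators, then greedily merge small fragments back up to the chunk size.

```python
def recursive_split(text, seps=("\n\n", "\n", " "), size=100):
    """Toy recursive splitter: coarse separators first, recurse on oversize pieces."""
    if len(text) <= size or not seps:
        return [text]
    # Split on the coarsest separator; recurse where a piece is still too big.
    flat = []
    for piece in text.split(seps[0]):
        if len(piece) > size:
            flat.extend(recursive_split(piece, seps[1:], size))
        else:
            flat.append(piece)
    # Greedily merge small neighbours back together, staying under `size`.
    merged, cur = [], ""
    for piece in flat:
        if cur and len(cur) + 1 + len(piece) <= size:
            cur += " " + piece
        else:
            if cur:
                merged.append(cur)
            cur = piece
    if cur:
        merged.append(cur)
    return merged

chunks = recursive_split("word " * 100, size=50)
# every chunk stays within the 50-character budget
```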
Setup: Traditional method, where the exact data indexed is the data retrieved. Two approaches are used:
- Similarity search: retrieves only the matches with the top similarity score (cosine or MMR)
- Hybrid search: takes prominent keywords into account in addition to the similarity score
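Both approaches can be sketched in miniature. This is a toy illustration only: real hybrid search (e.g. Pinecone's sparse-dense vectors) is more involved, and the `alpha` weighting below is a hypothetical choice, not the repo's.

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

def keyword_overlap(query, doc):
    """Crude keyword signal: fraction of query tokens present in the document."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0

def hybrid_score(similarity, keyword, alpha=0.7):
    # Hypothetical linear blend of dense similarity and keyword signal.
    return alpha * similarity + (1 - alpha) * keyword
```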
Select your desired experiment by setting the following options in SimpleNeo4J_RAG.ipynb:
```python
retrieval_method = 'cosine'     # euclidean, mmr, cosine (mmr was running into an error)
chunker = 'recursive'           # recursive, semantic, sentence, character, paragraph
embeddingtype = 'langchain'     # openai, HF, langchain, spacy; empty string invokes gpt4all
llmtype = 'gpt4'                # llama2, llama3, Qwen; empty string invokes Mixtral
embedding_dimension = 3072      # change to 384 for gpt4all embeddings
index_name = "vector"           # default index name
```
This script loads the input file's embeddings into the Neo4j instance, generates responses to the questions, and writes them to a JSON file in the output folder.
Advanced Neo4j RAG strategies were inspired by: Implementing-Advanced-Retrieval-RAG-Strategies-With-Neo4j. You may run into errors when running this on your local system. Suggestion: run it on Google Colab.
Setup: The following advanced retrieval strategies were implemented to balance precise embeddings against context retention:
- Parent retriever: Instead of indexing entire documents, data is divided into smaller chunks called Parent and Child documents. Child documents are indexed to better represent specific concepts, while parent documents are retrieved to ensure context retention.
- Hypothetical Questions: Documents are processed to determine potential questions they might answer. These questions are then indexed for better representation of specific concepts, while parent documents are retrieved to ensure context retention.
- Summaries: Instead of indexing the entire document, a summary of the document is created and indexed. Similarly, the parent document is retrieved in a RAG application.
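The common thread in these strategies is indexing one representation (child chunk, hypothetical question, or summary) while returning its parent document. A toy sketch of the parent-retriever idea, with made-up data and a crude keyword scorer standing in for vector search:

```python
def overlap(query, text):
    """Crude relevance score standing in for embedding similarity."""
    q = set(query.lower().split())
    return len(q & set(text.lower().split())) / len(q) if q else 0.0

# Parent documents hold the full context.
parents = {
    "p1": "Article 1. Pakistan shall be a Federal Republic to be known as ...",
    "p2": "Article 25. All citizens are equal before law and are entitled ...",
}

# Small child chunks are what gets indexed; each points back to its parent.
children = [
    ("Pakistan shall be a Federal Republic", "p1"),
    ("citizens are equal before law", "p2"),
]

def retrieve_parent(query):
    """Match the query against child chunks, but return the full parent."""
    _, best_parent_id = max(children, key=lambda c: overlap(query, c[0]))
    return parents[best_parent_id]
```

The same skeleton covers the hypothetical-questions and summaries strategies: only the indexed text in `children` changes.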
Select your desired experiment by setting the following options in AdvancedNeo4J_RAG_with_strategies.ipynb:
```python
retrieval_method = 'cosine'     # take it from LoadingDatatoNeo4j
chunker = 'semantic'            # take it from LoadingDatatoNeo4j: recursive, semantic, sentence, character, paragraph
embeddingtype = 'langchain'     # openai, HF, langchain, spacy; empty string invokes gpt4all
llmtype = 'gpt4'                # llama2, llama3, Qwen; empty string invokes Mixtral
embedding_dimension = 1536      # change to 384 for gpt4all embeddings
```
Description: Ragas provides several metrics for evaluating different aspects of a RAG system. We use six of these metrics to evaluate the RAG pipeline results.
Run Evaluation_Metrics.py to generate an evaluation score for each question. Here is what the evaluation output looks like: scores.xlsx
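The Ragas metrics themselves require the library and an LLM judge. As a rough intuition for what per-question scoring looks like, here is a toy token-recall score (not one of the Ragas metrics): the fraction of ground-truth tokens that appear in the generated answer.

```python
def answer_recall(ground_truth, answer):
    """Toy stand-in for a RAG metric: share of ground-truth tokens found in the answer."""
    gt = set(ground_truth.lower().split())
    ans = set(answer.lower().split())
    return len(gt & ans) / len(gt) if gt else 0.0

score = answer_recall("federal republic", "pakistan is a federal republic")
# a score of 1.0 means every ground-truth token appears in the answer
```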
- Set up accounts on Neo4j, Pinecone, OpenAI, Hugging Face, and TogetherAI.
- Add the API keys and configuration to the `.env` file.
- Create an instance in Neo4j.
- Create indexes in Pinecone. The free tier allows up to 5 indexes; create a new index for each embedding dimension.
TogetherAI is used to facilitate inference from several LLMs from a local machine.