Add full YAML at vectorDB integration docs #994

Merged (1 commit) on Nov 25, 2024

Changes from all commits
docs/source/integration/vectordb/couchbase.md (42 additions, 1 deletion)

@@ -38,7 +38,7 @@ This should correspond to the `dimension` of the embeddings generated by the specified embedding model.
### Example YAML file

```yaml
-- name: openai_embed_3_large
+- name: openai_couchbase
  db_type: couchbase
  embedding_model: openai_embed_3_large
  bucket_name: autorag # replace your bucket name
@@ -50,6 +50,47 @@
  password: ${COUCHBASE_PASSWORD}
```

Here is a simple example of a YAML configuration file that uses the Couchbase vector database with an OpenAI embedding model:

```yaml
vectordb:
  - name: openai_couchbase
    db_type: couchbase
    embedding_model: openai_embed_3_large
    bucket_name: autorag # replace with your bucket name
    scope_name: autorag # replace with your scope name
    collection_name: autorag # replace with your collection name
    index_name: autorag_search # replace with your index name
    connection_string: ${COUCHBASE_CONNECTION_STRING}
    username: ${COUCHBASE_USERNAME}
    password: ${COUCHBASE_PASSWORD}
node_lines:
  - node_line_name: retrieve_node_line # Arbitrary node line name
    nodes:
      - node_type: retrieval
        strategy:
          metrics: [retrieval_f1, retrieval_recall, retrieval_precision]
          top_k: 3
        modules:
          - module_type: vectordb
            vectordb: openai_couchbase
  - node_line_name: post_retrieve_node_line # Arbitrary node line name
    nodes:
      - node_type: prompt_maker
        strategy:
          metrics: [bleu, meteor, rouge]
        modules:
          - module_type: fstring
            prompt: "Read the passages and answer the given question. \n Question: {query} \n Passage: {retrieved_contents} \n Answer : "
      - node_type: generator
        strategy:
          metrics: [bleu, rouge]
        modules:
          - module_type: llama_index_llm
            llm: openai
            model: [ gpt-4o-mini ]
```
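
To run a trial with this configuration, the YAML file is passed to AutoRAG's `Evaluator`. The sketch below is a minimal illustration, not part of the docs being changed here; the parquet paths and credential values are placeholders you would replace with your own.

```python
import os

from autorag.evaluator import Evaluator

# The ${COUCHBASE_*} placeholders in the YAML are resolved from environment
# variables, so they must be set before the trial starts.
os.environ["COUCHBASE_CONNECTION_STRING"] = "couchbase://localhost"  # placeholder
os.environ["COUCHBASE_USERNAME"] = "Administrator"                   # placeholder
os.environ["COUCHBASE_PASSWORD"] = "password"                        # placeholder

evaluator = Evaluator(
    qa_data_path="qa.parquet",          # your evaluation QA dataset
    corpus_data_path="corpus.parquet",  # your corpus dataset
)
evaluator.start_trial("couchbase_config.yaml")  # the YAML shown above
```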

### Parameters

1. `embedding_model: str`
docs/source/integration/vectordb/milvus.md (40 additions, 1 deletion)

@@ -7,7 +7,7 @@ The `Milvus` class is a vector database implementation that allows you to store…
To use the Milvus vector database, you need to configure it in your YAML configuration file. Here's an example configuration:

```yaml
-- name: openai_embed_3_large
+- name: openai_milvus
  db_type: milvus
  embedding_model: openai_embed_3_large
  collection_name: openai_embed_3_large
@@ -17,6 +17,45 @@
  similarity_metric: cosine
```

Here is a simple example of a YAML configuration file that uses the Milvus vector database with an OpenAI embedding model:

```yaml
vectordb:
  - name: openai_milvus
    db_type: milvus
    embedding_model: openai_embed_3_large
    collection_name: openai_embed_3_large
    uri: ${MILVUS_URI}
    token: ${MILVUS_TOKEN}
    embedding_batch: 50
    similarity_metric: cosine
node_lines:
  - node_line_name: retrieve_node_line # Arbitrary node line name
    nodes:
      - node_type: retrieval
        strategy:
          metrics: [retrieval_f1, retrieval_recall, retrieval_precision]
          top_k: 3
        modules:
          - module_type: vectordb
            vectordb: openai_milvus
  - node_line_name: post_retrieve_node_line # Arbitrary node line name
    nodes:
      - node_type: prompt_maker
        strategy:
          metrics: [bleu, meteor, rouge]
        modules:
          - module_type: fstring
            prompt: "Read the passages and answer the given question. \n Question: {query} \n Passage: {retrieved_contents} \n Answer : "
      - node_type: generator
        strategy:
          metrics: [bleu, rouge]
        modules:
          - module_type: llama_index_llm
            llm: openai
            model: [ gpt-4o-mini ]
```
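
The `${MILVUS_URI}` and `${MILVUS_TOKEN}` placeholders above are resolved from environment variables when the configuration is loaded, so set them before starting a trial. A minimal sketch, with placeholder values for illustration:

```python
import os

# Placeholder values: point these at your own deployment, e.g. a Zilliz Cloud
# endpoint and API key, or a self-hosted Milvus instance.
os.environ["MILVUS_URI"] = "http://localhost:19530"
os.environ["MILVUS_TOKEN"] = "username:password"
```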

1. `embedding_model: str`
- Purpose: Specifies the name or identifier of the embedding model to be used.
- Example: "openai_embed_3_large"
docs/source/integration/vectordb/pinecone.md (39 additions, 1 deletion)

@@ -12,7 +12,7 @@ You can get the API key from [here](https://app.pinecone.io/organizations/-/keys).
### Example YAML file

```yaml
-- name: openai_embed_3_large
+- name: openai_pinecone
  db_type: pinecone
  embedding_model: openai_embed_3_large
  index_name: openai_embed_3_large
@@ -21,6 +21,44 @@
  dimension: 1536
```

Here is a simple example of a YAML configuration file that uses the Pinecone vector database with an OpenAI embedding model:

```yaml
vectordb:
  - name: openai_pinecone
    db_type: pinecone
    embedding_model: openai_embed_3_large
    index_name: openai_embed_3_large
    api_key: ${PINECONE_API_KEY}
    similarity_metric: cosine
    dimension: 1536
node_lines:
  - node_line_name: retrieve_node_line # Arbitrary node line name
    nodes:
      - node_type: retrieval
        strategy:
          metrics: [retrieval_f1, retrieval_recall, retrieval_precision]
          top_k: 3
        modules:
          - module_type: vectordb
            vectordb: openai_pinecone
  - node_line_name: post_retrieve_node_line # Arbitrary node line name
    nodes:
      - node_type: prompt_maker
        strategy:
          metrics: [bleu, meteor, rouge]
        modules:
          - module_type: fstring
            prompt: "Read the passages and answer the given question. \n Question: {query} \n Passage: {retrieved_contents} \n Answer : "
      - node_type: generator
        strategy:
          metrics: [bleu, rouge]
        modules:
          - module_type: llama_index_llm
            llm: openai
            model: [ gpt-4o-mini ]
```
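
Because the Pinecone index must already exist with a dimension matching the `dimension` field, a quick pre-flight check can save a failed trial. A hedged sketch using the official `pinecone` Python client (the index name mirrors the YAML above; this is an illustration, not part of the docs):

```python
import os

from pinecone import Pinecone  # official `pinecone` client package

pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])

# The index name and dimension must match the YAML (`index_name`, `dimension`).
description = pc.describe_index("openai_embed_3_large")
assert description.dimension == 1536, "index dimension must match the YAML"
```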

### Parameters

1. `embedding_model: str`
docs/source/integration/vectordb/qdrant.md (40 additions, 1 deletion)

@@ -11,7 +11,7 @@ Its capabilities are particularly beneficial for developing applications…
To use the Qdrant vector database, you need to configure it in your YAML configuration file. Here's an example configuration:

```yaml
-- name: openai_embed_3_large
+- name: openai_qdrant
  db_type: qdrant
  embedding_model: openai_embed_3_large
  collection_name: openai_embed_3_large
@@ -21,6 +21,45 @@
  dimension: 1536
```

Here is a simple example of a YAML configuration file that uses the Qdrant vector database with an OpenAI embedding model:

```yaml
vectordb:
  - name: openai_qdrant
    db_type: qdrant
    embedding_model: openai_embed_3_large
    collection_name: openai_embed_3_large
    client_type: docker
    embedding_batch: 50
    similarity_metric: cosine
    dimension: 1536
node_lines:
  - node_line_name: retrieve_node_line # Arbitrary node line name
    nodes:
      - node_type: retrieval
        strategy:
          metrics: [retrieval_f1, retrieval_recall, retrieval_precision]
          top_k: 3
        modules:
          - module_type: vectordb
            vectordb: openai_qdrant
  - node_line_name: post_retrieve_node_line # Arbitrary node line name
    nodes:
      - node_type: prompt_maker
        strategy:
          metrics: [bleu, meteor, rouge]
        modules:
          - module_type: fstring
            prompt: "Read the passages and answer the given question. \n Question: {query} \n Passage: {retrieved_contents} \n Answer : "
      - node_type: generator
        strategy:
          metrics: [bleu, rouge]
        modules:
          - module_type: llama_index_llm
            llm: openai
            model: [ gpt-4o-mini ]
```
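
With `client_type: docker`, the pipeline expects a Qdrant server reachable locally. A small connectivity check, assuming Qdrant's standard Docker setup (port 6333 is an assumption, not part of the YAML above):

```python
from qdrant_client import QdrantClient  # official `qdrant-client` package

# Assumes Qdrant is running locally via Docker on its default HTTP port.
client = QdrantClient(host="localhost", port=6333)

# Listing collections is a cheap way to confirm the server is reachable.
print(client.get_collections())
```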

1. `embedding_model: str`
- Purpose: Specifies the name or identifier of the embedding model to be used.
- Example: "openai_embed_3_large"
docs/source/integration/vectordb/weaviate.md (84 additions, 1 deletion)

@@ -24,7 +24,7 @@ If you already have content stored in weaviate, you'll need to set that key value…
#### Example YAML file

```yaml
-- name: openai_embed_3_large
+- name: openai_weaviate
  db_type: weaviate
  embedding_model: openai_embed_3_large
  collection_name: openai_embed_3_large
@@ -37,6 +37,48 @@
  text_key: content
```

Here is a simple example of a YAML configuration file that uses the Weaviate vector database with an OpenAI embedding model:

```yaml
vectordb:
  - name: openai_weaviate
    db_type: weaviate
    embedding_model: openai_embed_3_large
    collection_name: openai_embed_3_large
    client_type: docker
    host: localhost
    port: 8080
    grpc_port: 50051
    embedding_batch: 50
    similarity_metric: cosine
    text_key: content
node_lines:
  - node_line_name: retrieve_node_line # Arbitrary node line name
    nodes:
      - node_type: retrieval
        strategy:
          metrics: [retrieval_f1, retrieval_recall, retrieval_precision]
          top_k: 3
        modules:
          - module_type: vectordb
            vectordb: openai_weaviate
  - node_line_name: post_retrieve_node_line # Arbitrary node line name
    nodes:
      - node_type: prompt_maker
        strategy:
          metrics: [bleu, meteor, rouge]
        modules:
          - module_type: fstring
            prompt: "Read the passages and answer the given question. \n Question: {query} \n Passage: {retrieved_contents} \n Answer : "
      - node_type: generator
        strategy:
          metrics: [bleu, rouge]
        modules:
          - module_type: llama_index_llm
            llm: openai
            model: [ gpt-4o-mini ]
```
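
For the Docker setup above, a quick connectivity check helps before launching a pipeline. A minimal sketch using the v4 `weaviate-client` package (host and ports mirror the YAML; this is an illustration, not part of the docs):

```python
import weaviate  # v4 `weaviate-client` package

# Host and ports mirror the YAML above (`host`, `port`, `grpc_port`).
client = weaviate.connect_to_local(host="localhost", port=8080, grpc_port=50051)
print(client.is_ready())  # True once the Docker instance is up
client.close()
```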

### 2. Weaviate Cloud

You can see the full installation guide [here](https://weaviate.io/developers/weaviate/installation/weaviate-cloud-services)
@@ -56,6 +98,47 @@

```yaml
  # ... (earlier keys collapsed in this diff view)
  text_key: content
```

Here is a simple example of a YAML configuration file that uses Weaviate Cloud with an OpenAI embedding model:

```yaml
vectordb:
  - name: openai_weaviate
    db_type: weaviate
    embedding_model: openai_embed_3_large
    collection_name: openai_embed_3_large
    url: ${WEAVIATE_URL}
    api_key: ${WEAVIATE_API_KEY}
    grpc_port: 50051
    embedding_batch: 50
    similarity_metric: cosine
    text_key: content
node_lines:
  - node_line_name: retrieve_node_line # Arbitrary node line name
    nodes:
      - node_type: retrieval
        strategy:
          metrics: [retrieval_f1, retrieval_recall, retrieval_precision]
          top_k: 3
        modules:
          - module_type: vectordb
            vectordb: openai_weaviate
  - node_line_name: post_retrieve_node_line # Arbitrary node line name
    nodes:
      - node_type: prompt_maker
        strategy:
          metrics: [bleu, meteor, rouge]
        modules:
          - module_type: fstring
            prompt: "Read the passages and answer the given question. \n Question: {query} \n Passage: {retrieved_contents} \n Answer : "
      - node_type: generator
        strategy:
          metrics: [bleu, rouge]
        modules:
          - module_type: llama_index_llm
            llm: openai
            model: [ gpt-4o-mini ]
```
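
The `${WEAVIATE_URL}` and `${WEAVIATE_API_KEY}` placeholders are resolved from environment variables. A hedged sketch that connects to the same cluster with the v4 client to confirm the credentials work (an illustration, not part of the docs):

```python
import os

import weaviate
from weaviate.auth import AuthApiKey  # v4 `weaviate-client` package

client = weaviate.connect_to_weaviate_cloud(
    cluster_url=os.environ["WEAVIATE_URL"],  # same variables the YAML reads
    auth_credentials=AuthApiKey(os.environ["WEAVIATE_API_KEY"]),
)
print(client.is_ready())
client.close()
```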

### Parameters

1. `embedding_model: str`