add chromem-go
teilomillet committed Nov 20, 2024
1 parent ef5c3d0 commit aa91aac
Showing 15 changed files with 781 additions and 20 deletions.
1 change: 1 addition & 0 deletions data/leaves.txt
@@ -0,0 +1 @@
Leaves are green because chlorophyll absorbs red and blue light.
1 change: 1 addition & 0 deletions data/sky.txt
@@ -0,0 +1 @@
The sky is blue because of Rayleigh scattering.
53 changes: 53 additions & 0 deletions examples/chromem/README.md
@@ -0,0 +1,53 @@
# Chromem Example

This example demonstrates how to use the Chromem vector database with Raggo's SimpleRAG interface.

## Prerequisites

1. Go 1.16 or later
2. OpenAI API key (set as environment variable `OPENAI_API_KEY`)

## Running the Example

1. Set your OpenAI API key:
```bash
export OPENAI_API_KEY='your-api-key'
```

2. Run the example:
```bash
go run main.go
```

## What it Does

1. Creates a new SimpleRAG instance with Chromem as the vector database
2. Creates sample documents about natural phenomena
3. Adds the documents to the database
4. Performs a semantic search using the query "Why is the sky blue?"
5. Prints the response based on the relevant documents found

## Expected Output

```
Question: Why is the sky blue?
Answer: The sky appears blue because of a phenomenon called Rayleigh scattering. When sunlight travels through Earth's atmosphere, it collides with gas molecules. These molecules scatter blue wavelengths of light more strongly than red wavelengths, which is why we see the sky as blue.
```

Note that `main.go` prints the result with a `Response:` prefix rather than the `Question:`/`Answer:` labels shown here.

## Configuration

The example uses the following configuration:
- Vector Database: Chromem (persistent mode)
- Collection Name: knowledge-base
- Embedding Model: text-embedding-3-small
- Chunk Size: 200 characters
- Chunk Overlap: 50 characters
- Top K Results: 1
- Minimum Score: 0.1

## Notes

- The database is stored in `./data/chromem.db`
- Sample documents are created in the `./data` directory
- The example uses persistent storage mode for Chromem
77 changes: 77 additions & 0 deletions examples/chromem/main.go
@@ -0,0 +1,77 @@
package main

import (
	"context"
	"fmt"
	"os"
	"path/filepath"

	"github.com/teilomillet/raggo"
)

func main() {
	// Enable debug logging
	raggo.SetLogLevel(raggo.LogLevelDebug)

	// Create the data directory for the sample documents.
	// It is not temporary: it persists under ./data after the run.
	dataDir := "./data"
	err := os.MkdirAll(dataDir, 0755)
	if err != nil {
		fmt.Printf("Error creating data directory: %v\n", err)
		os.Exit(1)
	}

	// Create sample documents
	docs := map[string]string{
		"sky.txt":    "The sky is blue because of Rayleigh scattering.",
		"leaves.txt": "Leaves are green because chlorophyll absorbs red and blue light.",
	}

	for filename, content := range docs {
		err := os.WriteFile(filepath.Join(dataDir, filename), []byte(content), 0644)
		if err != nil {
			fmt.Printf("Error writing file %s: %v\n", filename, err)
			os.Exit(1)
		}
	}

	// Initialize RAG with Chromem
	config := raggo.SimpleRAGConfig{
		Collection: "knowledge-base",
		DBType:     "chromem",
		DBAddress:  "./data/chromem.db",
		Model:      "text-embedding-3-small", // OpenAI embedding model
		APIKey:     os.Getenv("OPENAI_API_KEY"),
		Dimension:  1536, // text-embedding-3-small dimension
		// TopK is determined dynamically by the number of documents
	}

	raggo.Debug("Creating SimpleRAG with config", "config", config)

	rag, err := raggo.NewSimpleRAG(config)
	if err != nil {
		fmt.Printf("Error creating SimpleRAG: %v\n", err)
		os.Exit(1)
	}
	defer rag.Close()

	ctx := context.Background()

	// Add documents from the directory
	raggo.Debug("Adding documents from directory", "dir", dataDir)
	err = rag.AddDocuments(ctx, dataDir)
	if err != nil {
		fmt.Printf("Error adding documents: %v\n", err)
		os.Exit(1)
	}

	// Search for documents
	raggo.Debug("Searching for documents", "query", "Why is the sky blue?")
	response, err := rag.Search(ctx, "Why is the sky blue?")
	if err != nil {
		fmt.Printf("Error searching: %v\n", err)
		os.Exit(1)
	}

	fmt.Printf("Response: %s\n", response)
}
1 change: 1 addition & 0 deletions go.mod
@@ -35,6 +35,7 @@ require (
github.com/leodido/go-urn v1.4.0 // indirect
github.com/mailru/easyjson v0.7.7 // indirect
github.com/milvus-io/milvus-proto/go-api/v2 v2.4.6 // indirect
github.com/philippgille/chromem-go v0.7.0 // indirect
github.com/pkg/errors v0.9.1 // indirect
github.com/pmezard/go-difflib v1.0.0 // indirect
github.com/rogpeppe/go-internal v1.12.0 // indirect
2 changes: 2 additions & 0 deletions go.sum
@@ -84,6 +84,8 @@ github.com/milvus-io/milvus-proto/go-api/v2 v2.4.6/go.mod h1:1OIl0v5PQeNxIJhCvY+
github.com/milvus-io/milvus-sdk-go/v2 v2.4.1 h1:KhqjmaJE4mSxj1a88XtkGaqgH4duGiHs1sjnvSXkwE0=
github.com/milvus-io/milvus-sdk-go/v2 v2.4.1/go.mod h1:7SJxshlnVhNLksS73tLPtHYY9DiX7lyL43Rv41HCPCw=
github.com/opentracing/opentracing-go v1.1.0/go.mod h1:UkNAQd3GIcIGf0SeVgPpRdFStlNbqXla1AfSYxPUl2o=
github.com/philippgille/chromem-go v0.7.0 h1:4jfvfyKymjKNfGxBUhHUcj1kp7B17NL/I1P+vGh1RvY=
github.com/philippgille/chromem-go v0.7.0/go.mod h1:hTd+wGEm/fFPQl7ilfCwQXkgEUxceYh86iIdoKMolPo=
github.com/pingcap/errors v0.11.4 h1:lFuQV/oaUMGcD2tqt+01ROSmJs75VG1ToEOkZIZ4nE4=
github.com/pingcap/errors v0.11.4/go.mod h1:Oi8TUi2kEtXXLMJk9l1cGmz20kV3TaQ0usTwv5KuLY8=
github.com/pkg/diff v0.0.0-20210226163009-20ebb0f2a09e/go.mod h1:pJLUxLENpZxwdsKMEsNbx1VGcRFpLqf3715MtcvvzbA=