
Support for OpenAI cache #6

Open · wants to merge 2 commits into main

@bixia commented Dec 24, 2024

Improve OpenAI API Response Caching System

Overview

This PR enhances the caching system for OpenAI API calls to improve performance and reduce API costs. The changes provide consistent caching behavior across different API endpoints (OpenAI, Azure, Claude) and request types (chat completions, embeddings).

Key Changes

Cache Implementation

  • Unified caching approach for all API endpoints (OpenAI, Azure, Claude)
  • Consistent cache key generation using request data
  • JSON serialization for embedding responses
  • Zero-vector fallback for embedding errors
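The consistent cache-key generation described above could look roughly like the sketch below. This is a minimal illustration, not the PR's actual code: the function name `make_cache_key` and the choice to hash the endpoint name together with the JSON-serialized request payload are assumptions.

```python
import hashlib
import json

def make_cache_key(endpoint: str, request_data: dict) -> str:
    """Derive a deterministic cache key from the endpoint and request payload.

    sort_keys=True makes logically identical requests hash to the same key
    regardless of dict insertion order.
    """
    payload = json.dumps(request_data, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(f"{endpoint}:{payload}".encode("utf-8")).hexdigest()
```

With a scheme like this, the same prompt sent to OpenAI and to Azure produces different keys, while two requests that differ only in dict ordering hit the same cache entry.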

Supported Endpoints

  • OpenAI chat completions
  • Azure OpenAI completions
  • Claude chat completions
  • OpenAI embeddings
  • Custom embeddings service

Error Handling

  • Improved error handling with fallback responses
  • Zero-vector returns for embedding failures
  • Cache miss handling with proper error messages
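The zero-vector fallback for embedding failures might be sketched as follows. Both the wrapper name `safe_get_embedding` and the fixed dimension of 1536 are illustrative assumptions; the real dimension depends on the embedding model in use.

```python
EMBEDDING_DIM = 1536  # assumed dimension; the real value depends on the model

def safe_get_embedding(text: str, fetch_embedding) -> list:
    """Return the embedding for `text`, or a zero vector if the API call fails.

    `fetch_embedding` stands in for whatever client call the caching layer wraps.
    """
    try:
        return fetch_embedding(text)
    except Exception:
        # Fall back to a zero vector so downstream code never sees None.
        return [0.0] * EMBEDDING_DIM
```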

Benefits

  • Reduced API costs through efficient response caching
  • Improved performance for repeated queries
  • Consistent caching behavior across all endpoints
  • Better error recovery and fallback mechanisms

Testing

The changes have been tested with:

  • Standard OpenAI endpoints
  • Azure OpenAI endpoints
  • Claude API endpoints
  • Custom embedding services
  • Error scenarios and fallbacks

Usage Example

```python
# The cache is automatically used for all API calls
response = common_ask(prompt)  # Will use cache if available

# Embeddings are also cached
embedding = common_get_embedding(text)  # Cached with proper JSON serialization
```

Notes

  • No database schema changes required
  • Backwards compatible with existing cache entries
  • Thread-safe implementation
  • Proper JSON serialization for embedding vectors
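The thread-safety note above presumably amounts to serializing cache reads and writes behind a lock; a minimal sketch (the class name and its get/set API are assumptions, not the PR's code):

```python
import threading

class ThreadSafeCache:
    """A dict-backed cache whose get/set operations are guarded by a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._store = {}

    def get(self, key, default=None):
        with self._lock:
            return self._store.get(key, default)

    def set(self, key, value):
        with self._lock:
            self._store[key] = value
```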

Related Issues

  • Reduces API costs through caching
  • Improves response times for repeated queries
  • Provides consistent behavior across different API endpoints

Future Improvements

  • Add cache expiration policies
  • Implement cache size limits
  • Add cache statistics tracking
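A cache-expiration policy like the one proposed could be sketched as a TTL check at read time. This is illustrative only; the class name and storage layout are assumptions.

```python
import time

class TTLCache:
    """Cache whose entries expire `ttl` seconds after they are written."""

    def __init__(self, ttl: float):
        self.ttl = ttl
        self._store = {}  # key -> (value, timestamp)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        value, ts = entry
        if time.monotonic() - ts > self.ttl:
            # Entry expired: drop it and report a miss.
            del self._store[key]
            return default
        return value

    def set(self, key, value):
        self._store[key] = (value, time.monotonic())
```

Checking expiry lazily at read time keeps writes cheap; a size limit or statistics counter could hook into the same get/set paths.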
