Releases: zilliztech/GPTCache

v0.1.35

07 Jul 12:17
e3e97c4

🎉 Introduction to new functions of GPTCache

  1. Support Redis as the cache store; usage example: redis+onnx
  2. Add a report table for easy analysis of cache data

What's Changed

  • [add] support for redis cache storage by @a9raag in #465
  • Improve the position of lint comment by @SimFG in #466
  • Add redis integration test case by @SimFG in #467
  • Upgrade the actions/setup-python to v4 by @SimFG in #471
  • Add the report table by @SimFG in #472
  • Update the version to 0.1.35 by @SimFG in #473

Full Changelog: 0.1.34...0.1.35

v0.1.34

30 Jun 12:09

🎉 Introduction to new functions of GPTCache

  1. Add support for the Qdrant vector store
  2. Add support for the MongoDB cache store
  3. Fix bugs in the Redis vector store and the ONNX similarity evaluation

What's Changed

  • Correct the wrong search return value in the Redis vector store. by @SimFG in #452
  • [Feature] Cache consistency check for Chroma & Milvus by @wybryan in #448
  • Fix the pylint error and add the chromadb test by @SimFG in #457
  • [add] support for mongodb storage by @a9raag in #454
  • Fix the wrong return value of onnx similarity evaluation by @SimFG in #460

Full Changelog: 0.1.33...0.1.34

v0.1.33

27 Jun 14:46

🎉 Introduction to new functions of GPTCache

  1. Improve the code by fixing a few bugs. For further information, please refer to the pull request list for this release.
  2. Add the "How to better configure your cache" document

Full Changelog: 0.1.32...0.1.33

v0.1.32

15 Jun 14:49

🎉 Introduction to new functions of GPTCache

  1. Support Redis as the vector store
from gptcache.manager import VectorBase

vector_base = VectorBase("redis", dimension=10)
  2. Fix the context length config bug

Full Changelog: 0.1.31...0.1.32

v0.1.31

14 Jun 13:27
65a890e

🎉 Introduction to new functions of GPTCache

  1. To improve the precision of cache hits, four similarity evaluation methods were added:
     • SBERT CrossEncoder Evaluation
     • Cohere rerank API (free accounts can make up to 100 calls per minute)
     • Multi-round dialog similarity weight matching
     • Time Evaluation. Before a cached answer is used, its age is checked first, for example so that only cache entries generated within the past day are used
  2. Fix some bugs
     • OpenAI exceptions type #416
     • LangChainChat does not work with the _agenerate function #400

more details: https://github.com/zilliztech/GPTCache/blob/main/docs/release_note.md
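The time check described for Time Evaluation can be illustrated with a minimal, library-independent sketch. It does not use the GPTCache API: the helper names `is_fresh` and `pick_fresh_answers`, the `(answer, created_at)` tuple shape, and the one-day default are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def is_fresh(created_at, now, max_age=timedelta(days=1)):
    """Return True if the cached entry is at most `max_age` old (hypothetical helper)."""
    age = now - created_at
    # Entries stamped in the future are rejected as well.
    return timedelta(0) <= age <= max_age


def pick_fresh_answers(entries, now, max_age=timedelta(days=1)):
    """Keep only the answers whose cache entry passes the time check.

    `entries` is a list of (answer, created_at) tuples; this shape is an
    assumption for the sketch, not the GPTCache storage format.
    """
    return [answer for answer, created_at in entries if is_fresh(created_at, now, max_age)]


now = datetime(2023, 6, 14, 12, 0, 0)
entries = [
    ("answer cached 2 hours ago", now - timedelta(hours=2)),
    ("answer cached 3 days ago", now - timedelta(days=3)),
]
fresh = pick_fresh_answers(entries, now)
print(fresh)
# ['answer cached 2 hours ago']
```

In a real cache the time check would typically run before any more expensive similarity scoring, so stale entries are discarded early.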

What's Changed

  • Raise the same type's error for the openai by @SimFG in #421
  • Add sequence match evaluation. by @wxywb in #420
  • Add the Time Evaluation by @SimFG in #423
  • Improve SequenceMatchEvaluation for several cases. by @wxywb in #424
  • Change the evaluation score of sequence evaluation to be larger as th… by @wxywb in #425
  • LangchainChat support _agenerate function by @SimFG in #426
  • Add SBERT CrossEncoder evaluation. by @wxywb in #428
  • Update the version to 0.1.31 by @SimFG in #429

Full Changelog: 0.1.30...0.1.31

v0.1.30

07 Jun 14:11

🎉 Introduction to new functions of GPTCache

  1. Support using the Cohere rerank API to evaluate similarity
from gptcache.similarity_evaluation import CohereRerankEvaluation

evaluation = CohereRerankEvaluation()
score = evaluation.evaluation(
    {
        'question': 'What is the color of sky?'
    },
    {
        'answer': 'the color of sky is blue'
    }
)
  2. Improve the GPTCache server API; refer to the "/docs" path after starting the server
  3. Fix the LangChain token usage tracking bug

What's Changed

  • Add input summarization. by @wxywb in #404
  • Langchain track token usage by @SimFG in #409
  • Support to download the cache files by @SimFG in #410
  • Support to use the cohere rerank api to evaluate the similarity by @SimFG in #412

Full Changelog: 0.1.29...0.1.30

v0.1.29

02 Jun 08:51
fd7e303

🎉 Introduction to new functions of GPTCache

  1. Improve the GPTCache server by using FastAPI

NOTE: The API structure has been optimized; for details, see "Use GPTCache server".

  2. Add the USearch vector store
from gptcache.manager import manager_factory

data_manager = manager_factory("sqlite,usearch", vector_params={"dimension": 10})

What's Changed

  • Improve the unit test flow by @SimFG in #397
  • Add: USearch vector search engine by @VoVoR in #399
  • Add the saved token report, auto flush data by @SimFG in #401
  • Use the fastapi to improve the GPTCache server by @SimFG in #405
  • Update the version to 0.1.29 by @SimFG in #406

Full Changelog: 0.1.28...0.1.29

v0.1.28

29 May 16:07
7db6237

🎉 Introduction to new functions of GPTCache

To handle a large prompt, there are currently two options available:

  1. Increase the column size of CacheStorage.
from gptcache.manager import manager_factory

data_manager = manager_factory(
    "sqlite,faiss", scalar_params={"table_len_config": {"question_question": 5000}}
)

More Details:

  • 'question_question': the question column size in the question table, default to 3000.
  • 'answer_answer': the answer column size in the answer table, default to 3000.
  • 'session_id': the session id column size in the session table, default to 1000.
  • 'dep_name': the name column size in the dep table, default to 1000.
  • 'dep_data': the data column size in the dep table, default to 3000.
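To see how an override interacts with the defaults listed above, here is a small self-contained sketch. It does not call GPTCache: the helpers `effective_table_len_config` and `fits_question_column` are hypothetical, and measuring the prompt with `len()` (characters rather than bytes) is an assumption about how the column limit applies.

```python
# Default column sizes, copied from the list above.
DEFAULT_TABLE_LEN_CONFIG = {
    "question_question": 3000,
    "answer_answer": 3000,
    "session_id": 1000,
    "dep_name": 1000,
    "dep_data": 3000,
}


def effective_table_len_config(overrides=None):
    """Merge user overrides on top of the defaults (hypothetical helper)."""
    config = dict(DEFAULT_TABLE_LEN_CONFIG)
    config.update(overrides or {})
    return config


def fits_question_column(prompt, config):
    """Check whether a prompt fits in the question column (hypothetical helper)."""
    return len(prompt) <= config["question_question"]


long_prompt = "x" * 4000
default_config = effective_table_len_config()
larger_config = effective_table_len_config({"question_question": 5000})

print(fits_question_column(long_prompt, default_config))  # False
print(fits_question_column(long_prompt, larger_config))   # True
```

Only the overridden key changes; the other columns keep their default sizes.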
  2. When using a template, use the dynamic value in the template as the cache key instead of using the entire template as the key.
  • str template
from gptcache import Config
from gptcache.processor.pre import last_content_without_template

template_obj = "tell me a joke about {subject}"
prompt = template_obj.format(subject="animal")
value = last_content_without_template(
    data={"messages": [{"content": prompt}]}, cache_config=Config(template=template_obj)
)
print(value)
# ['animal']
  • langchain prompt template
from langchain import PromptTemplate

from gptcache import Config
from gptcache.processor.pre import last_content_without_template

template_obj = PromptTemplate.from_template("tell me a joke about {subject}")
prompt = template_obj.format(subject="animal")

value = last_content_without_template(
    data={"messages": [{"content": prompt}]},
    cache_config=Config(template=template_obj.template),
)
print(value)
# ['animal']
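The idea behind using only the dynamic value as the cache key can be sketched without GPTCache. The function `extract_template_values` below is a hypothetical stand-in written for illustration, not the implementation of `last_content_without_template`; it assumes simple `{name}` placeholders and falls back to the whole prompt when the prompt does not match the template.

```python
import re


def extract_template_values(template, prompt):
    """Return the values that fill the {placeholders} of `template` in `prompt`.

    Hypothetical helper: handles plain {name} placeholders only and returns
    [prompt] when the prompt was not produced from the template.
    """
    # Split the template into its literal parts around each {placeholder}.
    literal_parts = re.split(r"\{\w+\}", template)
    # Rebuild a regex: escaped literals joined by one capture group per placeholder.
    pattern = "(.+?)".join(re.escape(part) for part in literal_parts)
    match = re.fullmatch(pattern, prompt, flags=re.DOTALL)
    if match is None:
        return [prompt]
    return list(match.groups())


template = "tell me a joke about {subject}"
print(extract_template_values(template, template.format(subject="animal")))
# ['animal']
print(extract_template_values("translate {text} to {lang}", "translate hello world to French"))
# ['hello world', 'French']
```

Two prompts built from the same template then differ only in the extracted values, which keeps the cache key short even when the template itself is long.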
In addition, the openai object can now be wrapped; reference: BaseCacheLLM
import random

from gptcache import Cache
from gptcache.adapter import openai
from gptcache.adapter.api import init_similar_cache
from gptcache.processor.pre import last_content

cache_obj = Cache()
init_similar_cache(
    data_dir=str(random.random()), pre_func=last_content, cache_obj=cache_obj
)


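# Set to True once the proxy function has been called.
is_proxy = False


def proxy_openai_chat_complete(*args, **kwargs):
    # `global` rather than `nonlocal`: is_proxy is defined at module level here.
    global is_proxy
    is_proxy = True
    import openai as real_openai

    return real_openai.ChatCompletion.create(*args, **kwargs)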


openai.ChatCompletion.llm = proxy_openai_chat_complete
openai.ChatCompletion.cache_args = {"cache_obj": cache_obj}

openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What's GitHub"},
    ],
)

What's Changed

  • Add the BaseCacheLLM abstract class to wrap the llm by @SimFG in #394
  • Add the pre-function of handling long prompt and Update context doc by @SimFG in #395
  • Support to config the context pre-process by the yaml file by @SimFG in #396

Full Changelog: 0.1.27...0.1.28

v0.1.27

25 May 16:01
d0e27c9

🎉 Introduction to new functions of GPTCache

  1. Support the UForm embedding, which can be used for bilingual (English + Chinese) text

Thanks to @ashvardanian for the contribution.

from gptcache.embedding import UForm

test_sentence = 'Hello, world.'
encoder = UForm(model='unum-cloud/uform-vl-english')
embed = encoder.to_embeddings(test_sentence)

test_sentence = '什么是Github'
encoder = UForm(model='unum-cloud/uform-vl-multilingual')
embed = encoder.to_embeddings(test_sentence)

What's Changed

  • Fix the wrong LangChainChat comment by @SimFG in #381
  • Add UForm multi-modal embedding by @SimFG in #382
  • Support to config the cache storage data size by @SimFG in #383
  • Update the protobuf version in the doc by @SimFG in #387
  • Update the version to 0.1.27 by @SimFG in #389

Full Changelog: 0.1.26...0.1.27

v0.1.26

23 May 13:34

🎉 Introduction to new functions of GPTCache

  1. Support the PaddleNLP embedding (@vax521)
from gptcache.embedding import PaddleNLP

test_sentence = 'Hello, world.'
encoder = PaddleNLP(model='ernie-3.0-medium-zh')
embed = encoder.to_embeddings(test_sentence)
  2. Support the OpenAI Moderation API
from gptcache.adapter import openai
from gptcache.adapter.api import init_similar_cache
from gptcache.processor.pre import get_openai_moderation_input

init_similar_cache(pre_func=get_openai_moderation_input)
openai.Moderation.create(
    input="hello, world",
)
  3. Add the llama_index bootcamp, which shows how GPTCache works with llama index

Details: WebPage QA

What's Changed

  • Replace summarization test model. by @wxywb in #368
  • Add the llama index bootcamp by @SimFG in #371
  • Update the llama index example url by @SimFG in #372
  • Support the openai moderation adapter by @SimFG in #376
  • Paddlenlp embedding support by @SimFG in #377
  • Update the cache config template file and example directory by @SimFG in #380

Full Changelog: 0.1.25...0.1.26