Features

  • Support both normal and streaming OpenAI chat completion requests
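
For a streaming request, a minimal sketch through the GPTCache openai adapter; the chunk handling below assumes the adapter mirrors OpenAI's delta chunk format:

from gptcache.adapter import openai

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello"}],
    stream=True,
)
for chunk in response:
    # each chunk is assumed to carry an OpenAI-style delta payload
    print(chunk["choices"][0].get("delta", {}).get("content", ""), end="")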

  • Get the top_k most similar search results; top_k can be set when creating the data manager, as shown below
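
For illustration, a data manager wired with a custom top_k; treating top_k as a keyword of the faiss VectorBase is an assumption based on common GPTCache usage:

from gptcache.manager import CacheBase, VectorBase, get_data_manager

data_manager = get_data_manager(
    CacheBase("sqlite"),
    # dimension must match the encoder's output; 128 is a placeholder
    VectorBase("faiss", dimension=128, top_k=5),
)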

  • Support the cache chain, see: Cache#next_cache

from gptcache import cache, Cache

bak_cache = Cache()  # fallback cache, consulted when the primary cache misses
bak_cache.init()
cache.init(next_cache=bak_cache)
  • Completely skip the current cache, i.e., neither search the cache nor save the ChatGPT results, see: Cache#cache_enable_func
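
A minimal sketch of a custom gate function; the length-based policy below is purely illustrative:

from gptcache import cache

def enable_cache_for_short_prompts(*args, **kwargs):
    # hypothetical policy: bypass the cache for long prompts
    messages = kwargs.get("messages") or []
    return sum(len(m.get("content", "")) for m in messages) < 500

cache.init(cache_enable_func=enable_cache_for_short_prompts)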

  • During the cache initialization (warm-up) phase, skip the cache search but still save the result returned by ChatGPT to the cache, see: cache_skip=True in the create request

from gptcache.adapter import openai

mock_messages = [{"role": "user", "content": "Hello, GPTCache"}]  # example payload

openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=mock_messages,
    cache_skip=True,  # skip the cache search, but still save the answer
)
  • Like Lego bricks, all modules can be custom-assembled (see the sketch after this list), including:

    • Adapter: The user interface that adapts different LLM requests to the GPTCache protocol

    • Pre-processor: Extracts the key information from the request and preprocesses it

    • Context Buffer: Maintains session context

    • Encoder: Embeds the text into a dense vector for similarity search

    • Cache manager: Searches, saves, and evicts data

    • Ranker: Evaluates similarity by judging the quality of cached answers

    • Post-processor: Determines which cached answers to return to the user and generates the response
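
Putting the bricks together, a sketch patterned on GPTCache's similar-search setup; the Onnx encoder, sqlite + faiss data manager, and distance-based ranker are one possible combination:

from gptcache import cache
from gptcache.embedding import Onnx
from gptcache.manager import CacheBase, VectorBase, get_data_manager
from gptcache.similarity_evaluation.distance import SearchDistanceEvaluation

onnx = Onnx()  # Encoder: embeds text into a dense vector
data_manager = get_data_manager(
    CacheBase("sqlite"),                            # scalar storage
    VectorBase("faiss", dimension=onnx.dimension),  # vector index
)
cache.init(
    embedding_func=onnx.to_embeddings,
    data_manager=data_manager,
    similarity_evaluation=SearchDistanceEvaluation(),  # Ranker
)
cache.set_openai_key()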