😍 Contributing to GPTCache#
Before contributing to GPTCache, it is recommended to read the usage doc and the example doc. These two articles introduce how to use GPTCache and explain the parameters of the related functions.
While contributing, pay close attention to parameter types, because no type restrictions are currently enforced.
Note that development MUST be based on the dev branch.
First, check which part you want to contribute to:
Add a method to pre-process the llm request
Add a scalar store type
Add a vector store type
Add a new data manager
Add an embedding function
Add a similarity evaluation function
Add a method to post-process the cache answer list
Add a new process in handling chatgpt requests
Lazy import and automatic installation#
For newly added third-party dependencies, lazy import and automatic installation are required. Implementation consists of the following steps:
Lazy import
# In the __init__.py file of the directory that contains the new file
__all__ = ['Milvus']

from gptcache.utils.lazy_import import LazyImport

milvus = LazyImport('milvus', globals(), 'gptcache.cache.vector_data.milvus')


def Milvus(**kwargs):
    return milvus.Milvus(**kwargs)
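To illustrate the idea behind lazy import, here is a minimal sketch of a proxy that defers the real import until the first attribute access. The LazyModule class is hypothetical and is not GPTCache's LazyImport; the standard-library json module stands in for a heavy third-party dependency.

```python
import importlib


class LazyModule:
    """Hypothetical proxy that imports the wrapped module on first attribute access."""

    def __init__(self, module_name):
        self._module_name = module_name
        self._module = None  # nothing has been imported yet

    def __getattr__(self, attr):
        # __getattr__ is only called for names not found on the proxy itself,
        # i.e. for members of the wrapped module.
        if self._module is None:
            self._module = importlib.import_module(self._module_name)
        return getattr(self._module, attr)


# 'json' stands in for an optional third-party package.
lazy_json = LazyModule("json")
encoded = lazy_json.dumps({"cached": True})  # the import happens here
```

With this pattern, importing the package that defines the proxy does not fail when the optional dependency is missing; the error (or the installation prompt) only appears when the feature is actually used.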
Automatic installation
# 2.1 Add the import method
# Add a new method to utils/__init__.py
__all__ = ['import_pymilvus']

from gptcache.utils.dependency_control import prompt_install


def import_pymilvus():
    try:
        # pylint: disable=unused-import
        import pymilvus
    except ModuleNotFoundError as e:  # pragma: no cover
        prompt_install('pymilvus')
        import pymilvus  # pylint: disable=ungrouped-imports


# 2.2 Use the import method in your file
from gptcache.utils import import_pymilvus

import_pymilvus()
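The try/install/retry pattern above can be sketched generically as follows. This is only an illustration under stated assumptions: import_or_install and its install callback are hypothetical names, and the callback stands in for whatever prompt_install does in GPTCache.

```python
import importlib


def import_or_install(module_name, install):
    """Import module_name; if it is missing, call install(module_name) and retry once.

    `install` is a hypothetical callback, for example a function that asks the
    user for confirmation and then runs pip.
    """
    try:
        return importlib.import_module(module_name)
    except ModuleNotFoundError:
        install(module_name)
        # Retry after installation; a second failure propagates to the caller.
        return importlib.import_module(module_name)
```

Passing the installer in as a parameter keeps the helper easy to test, because a test can supply a fake installer instead of touching the real environment.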
Add a method to pre-process the llm request#
Refer to the implementation of Pre.
Make sure of the input params: data represents the original request dictionary object
Implement the pre-process method
Add a usage example to the example directory and add the corresponding content to example.md and README.md
# The original OpenAI request
import openai

openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who won the world series in 2020?"},
        {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
        {"role": "user", "content": "Where was it played?"}
    ]
)


# This is the pre-process function for the OpenAI request; it returns the content of the last message
def last_content(data, **_):
    return data.get("messages")[-1]["content"]
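As a further illustration, a new pre-process function could follow the same signature as last_content but build the cache key from every user message instead of only the last one. The function name all_user_content is hypothetical; the sketch assumes data has the same shape as the request above.

```python
def all_user_content(data, **_):
    """Hypothetical pre-processor: join the content of all user messages, oldest first."""
    messages = data.get("messages", [])
    return "\n".join(m["content"] for m in messages if m.get("role") == "user")


request = {
    "model": "gpt-3.5-turbo",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who won the world series in 2020?"},
        {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
        {"role": "user", "content": "Where was it played?"},
    ],
}
cache_key_text = all_user_content(request)
```

Including earlier user turns makes the key more specific, which may reduce false cache hits for follow-up questions such as "Where was it played?" at the cost of fewer hits overall.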
Add a cache storage type#
Refer to the implementation of SQLDataBase.
Implement the CacheStorage interface
Make sure newly added third-party libraries are lazily imported and automatically installed
Add the new store to the CacheBase method
Add a usage example to the example directory and add the corresponding content to example.md and README.md
Add a vector store type#
Refer to the implementation of milvus.
Implement the VectorBase interface
Make sure newly added third-party libraries are lazily imported and automatically installed
Add the new store to the VectorBase method
Add a usage example to the example directory and add the corresponding content to example.md and README.md
Add a new data manager#
Refer to the implementation of MapDataManager and SSDataManager.
Implement the DataManager interface
Add the new store to the get_data_manager method
Add a usage example to the example directory and add the corresponding content to example.md and README.md
Add an embedding function#
Refer to the implementation of cohere or openai.
Add a new Python file to the embedding directory
Make sure newly added third-party libraries are lazily imported and automatically installed
Implement the embedding function and make sure of its output dimension
Add a usage example to the example directory and add the corresponding content to example.md and README.md
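The point about the output dimension can be sketched with a toy embedding that always returns a vector of a fixed, declared length. Everything here is an assumption for illustration: the HashEmbedding class, the to_embeddings method name, and the dimension property are hypothetical, so check the existing cohere or openai implementations for the actual interface.

```python
import hashlib


class HashEmbedding:
    """Toy embedding (hypothetical): maps text to a fixed-length list of floats in [0, 1]."""

    def __init__(self, dimension=8):
        # A SHA-256 digest has 32 bytes, so this sketch supports at most 32 dimensions.
        if not 1 <= dimension <= 32:
            raise ValueError("dimension must be between 1 and 32")
        self._dimension = dimension

    @property
    def dimension(self):
        # The vector store needs this value to create a matching index.
        return self._dimension

    def to_embeddings(self, data, **_):
        digest = hashlib.sha256(data.encode("utf-8")).digest()
        return [byte / 255 for byte in digest[: self._dimension]]


encoder = HashEmbedding(dimension=8)
vector = encoder.to_embeddings("Where was it played?")
```

A hash carries no semantic meaning, so this is not a usable embedding; it only shows that the declared dimension and the length of every returned vector must agree.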
Add a similarity evaluation function#
Refer to the implementation of SearchDistanceEvaluation or OnnxModelEvaluation.
Implement the SimilarityEvaluation interface
Make sure of the range of the return value: the range method returns the min and max values
Make sure of the input params of evaluation; you can learn more about them in the user view model
rank = chat_cache.evaluation_func({
    "question": pre_embedding_data,
    "embedding": embedding_data,
}, {
    "question": cache_question,
    "answer": cache_answer,
    "search_result": cache_data,
}, extra_param=context.get('evaluation', None))
Make sure newly added third-party libraries are lazily imported and automatically installed
Implement the similarity evaluation function
Add a usage example to the example directory and add the corresponding content to example.md and README.md
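As a rough sketch of the two methods named above, the class below scores an exact question match. It is a standalone illustration, not a subclass of the real SimilarityEvaluation interface; the class name ExactMatchEvaluation is hypothetical, and the dictionary keys are taken from the call shown above.

```python
class ExactMatchEvaluation:
    """Hypothetical evaluation: 1.0 when the questions are identical, otherwise 0.0."""

    def evaluation(self, src_dict, cache_dict, **_):
        # src_dict describes the incoming request, cache_dict the cached candidate.
        return 1.0 if src_dict["question"] == cache_dict["question"] else 0.0

    def range(self):
        # (min, max) of the values that evaluation() can return.
        return 0.0, 1.0


evaluator = ExactMatchEvaluation()
score = evaluator.evaluation(
    {"question": "Where was it played?", "embedding": None},
    {"question": "Where was it played?", "answer": "Arlington, Texas", "search_result": None},
)
```

Whatever the scoring logic, every value returned by evaluation should fall inside the interval reported by range, since the caller presumably uses that interval to interpret the score.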
Add a method to post-process the cache answer list#
Refer to the implementation of first or random_one.
Make sure of the input params; you can learn more about them in the adapter
Make sure newly added third-party libraries are lazily imported and automatically installed
Implement the post method
Add a usage example to the example directory and add the corresponding content to example.md and README.md
import random


# Get the most similar one from multiple results
def first(messages):
    return messages[0]


# Randomly fetch one of many results
def random_one(messages):
    return random.choice(messages)
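A new post-process function would take the same list of cached answers and return one of them. The sketch below, with the hypothetical name shortest, assumes messages is a non-empty list of strings, as in the two functions above.

```python
def shortest(messages):
    """Hypothetical post-processor: return the shortest cached answer."""
    # min() keeps the first of several equally short answers, so ties
    # resolve in favor of the more similar result.
    return min(messages, key=len)


answers = ["The 2020 World Series was played in Arlington, Texas.", "Arlington, Texas."]
picked = shortest(answers)
```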
Add a new process in handling chatgpt requests#
You need a clear understanding of the current process; refer to the adapter
Add the new process
Make sure all examples still work properly