Language Translation#
This example will show you how to translate English into other languages. The original example is from the OpenAI examples; the difference is that we will teach you how to cache the response for exact and similar matches with GPTCache. It is very simple: you just need to add an extra step to initialize the cache.
Before running the example, make sure the `OPENAI_API_KEY` environment variable is set by executing `echo $OPENAI_API_KEY`. If it is not already set, it can be set with `export OPENAI_API_KEY=YOUR_API_KEY` on Unix/Linux/macOS systems or `set OPENAI_API_KEY=YOUR_API_KEY` on Windows systems.
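As a quick sanity check before making any requests, you can also verify the variable from Python. This is a small helper sketch of our own; the function name is not part of OpenAI or GPTCache:

```python
import os


def check_openai_key(env=os.environ):
    """Return the API key, or fail fast with a clear error before any request is made."""
    key = env.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; export it before running the examples.")
    return key
```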
Then we can see the usage and the acceleration effect of GPTCache through the following code, which consists of three parts: the original OpenAI way, the exact-match cache, and the similar-match cache.
OpenAI API original usage#
```python
import time

import openai


def response_text(openai_resp):
    return openai_resp["choices"][0]["text"]


start_time = time.time()
response = openai.Completion.create(
    model="text-davinci-003",
    prompt="Translate this into 1. French, 2. Spanish and 3. Japanese:\n\nWhat rooms do you have available?\n\n1.",
    temperature=0.3,
    max_tokens=100,
    top_p=1.0,
    frequency_penalty=0.0,
    presence_penalty=0.0,
)
print(f"\nAnswer: 1.{response_text(response)}")
print("Time consuming: {:.2f}s".format(time.time() - start_time))
```

```
Answer: 1. Quels salles avez-vous disponibles?
2. ¿Qué habitaciones tienen disponibles?
3. どの部屋が利用可能ですか?
Time consuming: 2.06s
```
OpenAI API + GPTCache, exact match cache#
Initialize the cache to run GPTCache and import `openai` from `gptcache.adapter`, which will automatically set the map data manager to match the exact cache. For more details, refer to build your cache.
```python
import time


def response_text(openai_resp):
    return openai_resp["choices"][0]["text"]


print("Cache loading.....")

# To use GPTCache, that's all you need
# -------------------------------------------------
from gptcache import cache
from gptcache.adapter import openai
from gptcache.processor.pre import get_prompt

cache.init(pre_embedding_func=get_prompt)
cache.set_openai_key()
# -------------------------------------------------

questions = [
    "Translate this into 1. French, 2. Spanish and 3. Japanese:\n\nWhat rooms do you have available?\n\n1.",
    "Translate this into 1. French, 2. Spanish and 3. Japanese:\n\nWhich rooms do you have available?\n\n1.",
    "Translate this into 1. French, 2. Spanish and 3. Japanese:\n\nWhat kind of rooms do you have available?\n\n1.",
]

for question in questions:
    start_time = time.time()
    response = openai.Completion.create(
        model="text-davinci-003",
        prompt=question,
        temperature=0.3,
        max_tokens=100,
        top_p=1.0,
        frequency_penalty=0.0,
        presence_penalty=0.0,
    )
    print(f"\nAnswer: 1.{response_text(response)}")
    print("Time consuming: {:.2f}s".format(time.time() - start_time))
```
```
Cache loading.....

Answer: 1. Quels sont les chambres que vous avez disponibles ?
2. ¿Qué habitaciones tienes disponibles?
3. どの部屋が利用可能ですか?
Time consuming: 1.81s

Answer: 1. Quelles pièces avez-vous disponibles?
2. ¿Qué habitaciones tienen disponibles?
3. どの部屋が利用可能ですか?
Time consuming: 4.47s

Answer: 1. Quels types de chambres avez-vous disponibles ?
2. ¿Qué tipos de habitaciones tienen disponibles?
3. どんな部屋が利用可能ですか?
Time consuming: 1.40s
```
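Note that all three questions above still took seconds: with exact matching, only a byte-identical prompt is a cache hit. Conceptually, the exact-match cache is just a lookup table keyed by the prompt string that `get_prompt` pulls out of the request. The toy sketch below illustrates the idea only; it is not GPTCache's actual implementation, and the names are ours:

```python
def extract_prompt(request, **_):
    # Plays the role of gptcache.processor.pre.get_prompt in this sketch:
    # for exact matching, the cache key is the prompt string itself.
    return request.get("prompt")


class ExactMatchCache:
    """Toy stand-in for the map data manager: prompt -> cached answer."""

    def __init__(self):
        self._store = {}

    def get(self, **request):
        # Miss (None) unless the prompt string was seen verbatim before.
        return self._store.get(extract_prompt(request))

    def put(self, answer, **request):
        self._store[extract_prompt(request)] = answer
```

Under this model, rephrasing the question ("What rooms" vs. "Which rooms") produces a different key, so each variant misses the cache and costs a full API call. Rerunning an identical prompt is served from the cache.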
OpenAI API + GPTCache, similar search cache#
Set up the cache with `pre_embedding_func` to preprocess the input data, `embedding_func` to generate embeddings for the text, `data_manager` to manage the cache data, and `similarity_evaluation` to evaluate the similarity between queries. For more details, refer to build your cache.
```python
import time

from gptcache import cache
from gptcache.adapter import openai
from gptcache.embedding import Onnx
from gptcache.processor.pre import get_prompt
from gptcache.manager import CacheBase, VectorBase, get_data_manager
from gptcache.similarity_evaluation.distance import SearchDistanceEvaluation


def response_text(openai_resp):
    return openai_resp["choices"][0]["text"]


print("Cache loading.....")

onnx = Onnx()
data_manager = get_data_manager(CacheBase("sqlite"), VectorBase("faiss", dimension=onnx.dimension))
cache.init(
    pre_embedding_func=get_prompt,
    embedding_func=onnx.to_embeddings,
    data_manager=data_manager,
    similarity_evaluation=SearchDistanceEvaluation(),
)
cache.set_openai_key()

questions = [
    "Translate this into 1. French, 2. Spanish and 3. Japanese:\n\nWhat rooms do you have available?\n\n1.",
    "Translate this into 1. French, 2. Spanish and 3. Japanese:\n\nWhich rooms do you have available?\n\n1.",
    "Translate this into 1. French, 2. Spanish and 3. Japanese:\n\nWhat kind of rooms do you have available?\n\n1.",
]

for question in questions:
    start_time = time.time()
    response = openai.Completion.create(
        model="text-davinci-003",
        prompt=question,
        temperature=0.3,
        max_tokens=100,
        top_p=1.0,
        frequency_penalty=0.0,
        presence_penalty=0.0,
    )
    print(f"\nAnswer: 1.{response_text(response)}")
    print("Time consuming: {:.2f}s".format(time.time() - start_time))
```
```
Cache loading.....

Answer: 1. Quels salles avez-vous disponibles?
2. ¿Qué habitaciones tienes disponibles?
3. どの部屋が利用可能ですか?
Time consuming: 4.40s

Answer: 1. Quels salles avez-vous disponibles?
2. ¿Qué habitaciones tienes disponibles?
3. どの部屋が利用可能ですか?
Time consuming: 0.19s

Answer: 1. Quels salles avez-vous disponibles?
2. ¿Qué habitaciones tienes disponibles?
3. どの部屋が利用可能ですか?
Time consuming: 0.21s
```
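The similar-search flow can be sketched as: embed the prompt, find the nearest cached embedding, and return the cached answer only if the distance passes a threshold. The toy below uses a crude character-frequency embedding and plain L2 distance in place of Onnx and faiss/`SearchDistanceEvaluation`; the class, function names, and threshold are our own illustrative choices, not GPTCache's:

```python
import math


def toy_embedding(text):
    # Stand-in for Onnx().to_embeddings: a normalized character-frequency vector.
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def l2_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class SimilarCache:
    """Toy sketch of the embed -> nearest-neighbour -> threshold pipeline."""

    def __init__(self, max_distance=0.5):
        self.entries = []  # list of (embedding, cached answer)
        self.max_distance = max_distance

    def get(self, prompt):
        # Return the nearest cached answer if it is close enough, else miss.
        if not self.entries:
            return None
        emb = toy_embedding(prompt)
        dist, answer = min((l2_distance(emb, e), a) for e, a in self.entries)
        return answer if dist <= self.max_distance else None

    def put(self, prompt, answer):
        self.entries.append((toy_embedding(prompt), answer))
```

Because rephrased prompts such as "What rooms..." and "Which rooms..." embed to nearby vectors, the second and third questions above hit the cache even though the strings differ, which is why their timings drop to around 0.2s.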
We can see the performance improvement when searching for similar queries: the three prompts are similar, so they hit the cache in GPTCache and the cached result is returned directly instead of a new request being made. You can also try running the exact-match query again, which will be sped up in the same way.