Tweet Classifier#
This example shows how to determine the sentiment of tweets. The original version is in the OpenAI Examples; the difference here is that we will teach you how to cache responses for exact and similar matches with gptcache. It is very simple: you only need to add an extra step to initialize the cache.
Before running the example, make sure the `OPENAI_API_KEY` environment variable is set by executing `echo $OPENAI_API_KEY`. If it is not already set, it can be set with `export OPENAI_API_KEY=YOUR_API_KEY` on Unix/Linux/macOS systems or `set OPENAI_API_KEY=YOUR_API_KEY` on Windows systems.
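If you prefer to fail fast with a clear message instead of getting an authentication error mid-request, you can check the variable in code. The helper below is a small sketch (the function name `require_api_key` is our own, not part of gptcache or openai):

```python
import os


def require_api_key(env=None):
    """Return the OpenAI API key from the environment, or raise a clear error."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before running the example"
        )
    return key
```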
Then we can see the usage and acceleration effect of gptcache through the following code, which consists of three parts: the original OpenAI way, exact-match search, and similar-match search.
OpenAI API original usage#
```python
import time

import openai


def response_text(openai_resp):
    return openai_resp['choices'][0]['message']['content']


tweet = "I loved the new Batman movie!"

# OpenAI API original usage
start_time = time.time()
response = openai.ChatCompletion.create(
    model='gpt-3.5-turbo',
    messages=[
        {
            'role': 'user',
            'content': f"Decide whether a Tweet's sentiment is positive, neutral, or negative.\n\nTweet: \"{tweet}\"\nSentiment:",
        }
    ],
)
print(f'Tweet: {tweet}')
print("Time consuming: {:.2f}s".format(time.time() - start_time))
print(f'Sentiment: {response_text(response)}\n')
```
```
Tweet: I loved the new Batman movie!
Time consuming: 0.81s
Sentiment: Positive
```
OpenAI API + GPTCache, exact match cache#
Initialize the cache to run GPTCache and import `openai` from `gptcache.adapter`, which will automatically set the map data manager to match the exact cache; for more details, refer to build your cache.
And if you send the exact same tweet twice, the answer to the second one will be obtained from the cache without requesting ChatGPT again.
```python
import time


def response_text(openai_resp):
    return openai_resp['choices'][0]['message']['content']


print("Cache loading.....")

# To use GPTCache, that's all you need
# -------------------------------------------------
from gptcache import cache
from gptcache.adapter import openai

cache.init()
cache.set_openai_key()
# -------------------------------------------------

tweet = "The weather today is neither good nor bad"

for _ in range(2):
    start_time = time.time()
    response = openai.ChatCompletion.create(
        model='gpt-3.5-turbo',
        messages=[
            {
                'role': 'user',
                'content': f"Decide whether a Tweet's sentiment is positive, neutral, or negative.\n\nTweet: \"{tweet}\"\nSentiment:",
            }
        ],
    )
    print(f'Tweet: {tweet}')
    print("Time consuming: {:.2f}s".format(time.time() - start_time))
    print(f'Sentiment: {response_text(response)}\n')
```
```
Cache loading.....
Tweet: The weather today is neither good nor bad
Time consuming: 0.62s
Sentiment: neutral

Tweet: The weather today is neither good nor bad
Time consuming: 0.00s
Sentiment: neutral
```
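Conceptually, an exact-match cache behaves like a dictionary keyed by the full request, which is why the second identical request takes essentially zero time. The sketch below is a toy illustration of that idea (not GPTCache internals; `cached_call` is a hypothetical helper):

```python
cache_store = {}
calls = {"count": 0}


def cached_call(prompt):
    """Return the cached answer for an identical prompt; otherwise run the 'expensive' path."""
    if prompt in cache_store:
        return cache_store[prompt]
    calls["count"] += 1  # count how often the expensive path actually runs
    result = f"sentiment for: {prompt}"  # stand-in for a real ChatGPT call
    cache_store[prompt] = result
    return result


first = cached_call("The weather today is neither good nor bad")
second = cached_call("The weather today is neither good nor bad")
# The two answers are identical, and the expensive path ran only once.
```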
OpenAI API + GPTCache, similar search cache#
We are going to use DocArray's in-memory index to perform the similarity search.
Set the cache with `embedding_func` to generate embeddings for the text, `data_manager` to manage the cache data, and `similarity_evaluation` to evaluate the similarities; for more details, refer to build your cache.
After obtaining an answer from ChatGPT for one of several similar tweets, the answers to the subsequent tweets can be retrieved from the cache without requesting ChatGPT again.
```python
import time


def response_text(openai_resp):
    return openai_resp['choices'][0]['message']['content']


from gptcache import cache
from gptcache.adapter import openai
from gptcache.embedding import Onnx
from gptcache.manager import CacheBase, VectorBase, get_data_manager
from gptcache.similarity_evaluation.distance import SearchDistanceEvaluation

print("Cache loading.....")

onnx = Onnx()
data_manager = get_data_manager(CacheBase("sqlite"), VectorBase("docarray"))
cache.init(
    embedding_func=onnx.to_embeddings,
    data_manager=data_manager,
    similarity_evaluation=SearchDistanceEvaluation(),
)
cache.set_openai_key()

tweets = [
    "The new restaurant in town exceeded my expectations with its delectable cuisine and impeccable service",
    "New restaurant in town exceeded my expectations with its delectable cuisine and impeccable service",
    "The new restaurant exceeded my expectations with its delectable cuisine and impeccable service",
]

for tweet in tweets:
    start_time = time.time()
    response = openai.ChatCompletion.create(
        model='gpt-3.5-turbo',
        messages=[
            {
                'role': 'user',
                'content': f"Decide whether a Tweet's sentiment is positive, neutral, or negative.\n\nTweet: \"{tweet}\"\nSentiment:",
            }
        ],
    )
    print(f'Tweet: {tweet}')
    print("Time consuming: {:.2f}s".format(time.time() - start_time))
    print(f'Sentiment: {response_text(response)}\n')
```
```
Cache loading.....
Tweet: The new restaurant in town exceeded my expectations with its delectable cuisine and impeccable service
Time consuming: 0.70s
Sentiment: Positive

Tweet: New restaurant in town exceeded my expectations with its delectable cuisine and impeccable service
Time consuming: 0.59s
Sentiment: Positive

Tweet: The new restaurant exceeded my expectations with its delectable cuisine and impeccable service
Time consuming: 0.74s
Sentiment: Positive
```
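The lookup logic behind the similar-match cache can be summed up as: embed the new prompt, find the nearest cached prompt, and reuse its answer if the similarity clears a threshold. GPTCache does this with ONNX embeddings, a vector index, and `SearchDistanceEvaluation`; as a stand-in, the toy sketch below uses `difflib` string similarity with an assumed 0.8 cutoff (the names `similar_lookup` and `similar_cached_call` are ours, purely for illustration):

```python
from difflib import SequenceMatcher

sim_cache = []   # list of (prompt, answer) pairs
THRESHOLD = 0.8  # assumed cutoff for "similar enough"


def similar_lookup(prompt):
    """Return a cached answer whose prompt is similar enough, else None."""
    for cached_prompt, answer in sim_cache:
        if SequenceMatcher(None, prompt, cached_prompt).ratio() >= THRESHOLD:
            return answer
    return None


def similar_cached_call(prompt):
    hit = similar_lookup(prompt)
    if hit is not None:
        return hit  # cache hit: no model call needed
    answer = "Positive"  # stand-in for a real ChatGPT call
    sim_cache.append((prompt, answer))
    return answer


tweets = [
    "The new restaurant in town exceeded my expectations with its delectable cuisine and impeccable service",
    "New restaurant in town exceeded my expectations with its delectable cuisine and impeccable service",
]
results = [similar_cached_call(t) for t in tweets]
# Only the first tweet populates the cache; the second is close enough to hit it.
```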