Caching
Improve performance and reduce costs by caching previously generated responses.
When caching is enabled, our service stores the responses generated for each unique request. If an identical request is made in the future, instead of processing the request again, the cached response is instantly returned. This eliminates the need for redundant computations, resulting in faster response times and reduced API usage costs.
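To illustrate the idea, here is a toy in-memory version of this lookup. It is a simplified sketch only: the service's actual cache keying and storage are not specified here, and `cached_completion` and `generate` are hypothetical names.

```python
import hashlib
import json

cache = {}

def cached_completion(request, generate):
    # Key the cache on the exact request contents, so only byte-identical
    # requests hit the same entry (a simplification of deduplication).
    key = hashlib.sha256(json.dumps(request, sort_keys=True).encode()).hexdigest()
    if key in cache:
        return cache[key]          # identical request: return the stored response
    response = generate(request)   # otherwise do the expensive work once
    cache[key] = response
    return response
```

A second call with an identical request returns instantly from `cache` without invoking `generate` again, which is the source of the latency and cost savings described above.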
Caching is currently in a free beta preview.
Enabling Caching
To enable caching for your requests, set the cache property of the openpipe object to true. If you are making requests through our proxy, add the op-cache header to your requests:
curl --request POST \
  --url https://app.openpipe.ai/api/v1/chat/completions \
  --header 'Authorization: Bearer YOUR_OPENPIPE_API_KEY' \
  --header 'Content-Type: application/json' \
  --header 'op-cache: true' \
  --data '{
    "model": "openpipe:your-fine-tuned-model-id",
    "messages": [
      {
        "role": "system",
        "content": "count to 5"
      }
    ]
  }'
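The same proxy request can be built from any HTTP client. Below is a Python sketch that assembles the identical headers and body; the endpoint, header name, and payload come from the curl example above, while the API key and model id are placeholders you must replace.

```python
import json

API_KEY = "YOUR_OPENPIPE_API_KEY"  # placeholder: substitute your real key
URL = "https://app.openpipe.ai/api/v1/chat/completions"

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
    "op-cache": "true",  # enables caching for this request via the proxy
}

payload = {
    "model": "openpipe:your-fine-tuned-model-id",  # placeholder model id
    "messages": [{"role": "system", "content": "count to 5"}],
}
body = json.dumps(payload)

# To actually send it (requires a valid API key):
# import urllib.request
# req = urllib.request.Request(URL, data=body.encode(), headers=headers, method="POST")
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode())
```

Because the cached response is keyed to the request, sending this exact payload twice should return the stored response on the second call.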