Goal of this notebook
This notebook shows a simple example of how to deploy an OpenAI chain into production. You can extend it to deploy your own self-hosted models, where you can easily define the amount of hardware resources (GPUs and CPUs) needed to run your model in production efficiently. Read more about the available options, including autoscaling, in the Ray Serve documentation.

Setup Ray Serve
Install Ray with pip install ray[serve].
General Skeleton
The general skeleton for deploying a service is the following:

Example of deploying an OpenAI chain with custom prompts
Get an OpenAI API key from here. By running the following code, you will be asked to provide your API key.

Once the chain is deployed and serving at localhost:8282, we can send a POST request to get the results back.
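A sketch of the full flow under stated assumptions: the chain uses LangChain's classic `OpenAI` LLM wrapper with a custom `PromptTemplate`, and the service listens on port 8282 as mentioned above. The names `DeployLLM`, `query_chain`, and `main`, as well as the example prompt, are illustrative:

```python
from urllib.parse import urlencode
from urllib.request import Request as HttpRequest, urlopen


def query_chain(question: str, url: str = "http://localhost:8282/") -> str:
    """Send a question to the deployed chain via HTTP POST and return the answer."""
    full_url = f"{url}?{urlencode({'text': question})}"
    with urlopen(HttpRequest(full_url, method="POST")) as resp:
        return resp.read().decode()


def main() -> None:
    # Imports are local so this sketch can be read without ray/langchain installed.
    import os
    from getpass import getpass

    from ray import serve
    from starlette.requests import Request
    from langchain.chains import LLMChain
    from langchain.llms import OpenAI
    from langchain.prompts import PromptTemplate

    # Prompt for the API key; the OpenAI wrapper reads it from the environment.
    os.environ["OPENAI_API_KEY"] = getpass("Enter your OpenAI API key: ")

    # A custom prompt for the chain.
    template = "Question: {question}\n\nAnswer: Let's think step by step."
    prompt = PromptTemplate(template=template, input_variables=["question"])

    @serve.deployment
    class DeployLLM:
        def __init__(self) -> None:
            self.chain = LLMChain(llm=OpenAI(), prompt=prompt)

        async def __call__(self, request: Request) -> str:
            # Read the question from the query string and run the chain.
            return self.chain.run(request.query_params["text"])

    # Serve on port 8282, matching the text above (Serve defaults to 8000).
    serve.start(http_options={"host": "0.0.0.0", "port": 8282})
    serve.run(DeployLLM.bind())


# To launch: call main() from a script, then query the running service, e.g.
# print(query_chain("What is the capital of France?"))
```

Keeping the client in plain `urllib` means querying the service needs no extra dependencies.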