Anyscale
Running your own GPU instance can be a pretty expensive affair. Services like Anyscale or Replicate allow you to be charged for the exact duration of your API calls and let you use open source LLMs on their infrastructure.
Getting started with Anyscale
The Anyscale service you need to run LLMs is called Anyscale Endpoints. To get started, you can do the following to acquire an API key before heading over to Unstract to connect your Anyscale account.
- Register an Anyscale Endpoints account
- Generate an API key
- From the list of supported models they feature in the Anyscale documentation, copy the name of the model you want to use. For instance:
- google/gemma-7b-it
- meta-llama/Llama-2-7b-chat-hf
- meta-llama/Llama-2-13b-chat-hf
- meta-llama/Llama-2-70b-chat-hf
- codellama/CodeLlama-70b-Instruct-hf
- mistralai/Mistral-7B-Instruct-v0.1
- mistralai/Mixtral-8x7B-Instruct-v0.1
- mlabonne/NeuralHermes-2.5-Mistral-7B
Setting up the Anyscale connector in Unstract
Head over to the Unstract console, From the side navigation, select LLMs
🞂 New LLM Profile
🞂 Anyscale
.
- For
Name
, enter a name for this connector - For
Model
, enter one of model names as exactly mentioned in the Anyscale documentation. Some examples are mentioned above, though more might have been added or removed recently. Please refer to the Anyscale documentation for the latest list. Please note that some names contain a/
. The names have to be entered as is along with the slash. - You can leave
Additional kwargs
empty - You can leave
Max retries
at the default value of5
- You can leave
API Base
at the default value present - For
API Key
, paste in the value copied in the earlier section from the Anyscale console