Anyscale

Running your own GPU instance can be a pretty expensive affair. Services like Anyscale or Replicate allow you to be charged for the exact duration of your API calls and let you use open source LLMs on their infrastructure.

Getting started with Anyscale

The Anyscale service you need to run LLMs is called Anyscale Endpoints. To get started, you can do the following to acquire an API key before heading over to Unstract to connect your Anyscale account.

Register an Anyscale Endpoints account
Generate an API key
From the list of supported models they feature in the Anyscale documentation, copy the name of the model you want to use. For instance:
- google/gemma-7b-it
- meta-llama/Llama-2-7b-chat-hf
- meta-llama/Llama-2-13b-chat-hf
- meta-llama/Llama-2-70b-chat-hf
- codellama/CodeLlama-70b-Instruct-hf
- mistralai/Mistral-7B-Instruct-v0.1
- mistralai/Mixtral-8x7B-Instruct-v0.1
- mlabonne/NeuralHermes-2.5-Mistral-7B

Setting up the Anyscale connector in Unstract

Head over to the Unstract console, From the side navigation, select LLMs 🞂 New LLM Profile 🞂 Anyscale.

img Anyscale Configuration

For Name, enter a name for this connector
For Model, enter one of model names as exactly mentioned in the Anyscale documentation. Some examples are mentioned above, though more might have been added or removed recently. Please refer to the Anyscale documentation for the latest list. Please note that some names contain a /. The names have to be entered as is along with the slash.
You can leave Additional kwargs empty
You can leave Max retries at the default value of 5
You can leave API Base at the default value present
For API Key, paste in the value copied in the earlier section from the Anyscale console

Getting started with Anyscale​

Setting up the Anyscale connector in Unstract​

Getting started with Anyscale

Setting up the Anyscale connector in Unstract