Anyscale

Running your own GPU instance can be expensive. Services like Anyscale and Replicate host open source LLMs on their infrastructure and charge you only for the duration of your API calls.

Getting started with Anyscale

The Anyscale service you need to run LLMs is called Anyscale Endpoints. Before heading over to Unstract to connect your Anyscale account, complete the following steps to get an API key and a model name.

  1. Register an Anyscale Endpoints account
  2. Generate an API key
  3. From the list of supported models in the Anyscale documentation, copy the name of the model you want to use. For instance:
    • google/gemma-7b-it
    • meta-llama/Llama-2-7b-chat-hf
    • meta-llama/Llama-2-13b-chat-hf
    • meta-llama/Llama-2-70b-chat-hf
    • codellama/CodeLlama-70b-Instruct-hf
    • mistralai/Mistral-7B-Instruct-v0.1
    • mistralai/Mixtral-8x7B-Instruct-v0.1
    • mlabonne/NeuralHermes-2.5-Mistral-7B
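Because the model name must be copied exactly, it can help to check it before pasting it into Unstract. The following is a minimal sketch, not part of Anyscale or Unstract: `EXAMPLE_MODELS` and `resolve_model_name` are hypothetical names, and the list is only a snapshot of the examples above, so a name missing from it may still be valid on Anyscale.

```python
from difflib import get_close_matches

# Example identifiers copied from the list above. The live list in the
# Anyscale documentation may differ, so treat this as a snapshot.
EXAMPLE_MODELS = [
    "google/gemma-7b-it",
    "meta-llama/Llama-2-7b-chat-hf",
    "meta-llama/Llama-2-13b-chat-hf",
    "meta-llama/Llama-2-70b-chat-hf",
    "codellama/CodeLlama-70b-Instruct-hf",
    "mistralai/Mistral-7B-Instruct-v0.1",
    "mistralai/Mixtral-8x7B-Instruct-v0.1",
    "mlabonne/NeuralHermes-2.5-Mistral-7B",
]


def resolve_model_name(name, known=EXAMPLE_MODELS):
    """Return `name` if it exactly matches a known identifier (case-sensitive).

    Otherwise raise ValueError, listing the closest known identifiers so a
    dropped prefix or a typo is easy to spot.
    """
    if name in known:
        return name
    suggestions = get_close_matches(name, known, n=3, cutoff=0.6)
    hint = f" Did you mean: {', '.join(suggestions)}?" if suggestions else ""
    raise ValueError(f"Unknown model name {name!r}.{hint}")
```

For example, `resolve_model_name("Llama-2-7b-chat-hf")` raises an error because the `meta-llama/` prefix is missing, while the full identifier is returned unchanged.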

Setting up the Anyscale connector in Unstract

Head over to the Unstract console. From the side navigation, select LLMs 🞂 New LLM Profile 🞂 Anyscale.

Figure: Anyscale Configuration

  • For Name, enter a name for this connector
  • For Model, enter one of the model names exactly as it appears in the Anyscale documentation. The examples listed earlier may be out of date, since models are added and removed over time, so refer to the Anyscale documentation for the latest list. Note that some names contain a /. Enter such names as is, including the slash.
  • You can leave Additional kwargs empty
  • You can leave Max retries at the default value of 5
  • You can leave API Base at the default value present
  • For API Key, paste the API key you generated in the Anyscale console in the earlier section
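The form fields and their defaults can be summarised as a small data structure. This is an illustrative sketch only: `AnyscaleProfile` and its field names are hypothetical and do not reflect Unstract's actual schema or API. Because the default API Base value is not stated here, the sketch uses `None` to mean "keep the default shown in the form".

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class AnyscaleProfile:
    """Hypothetical mirror of the form fields; not Unstract's real schema."""

    name: str                        # Name: any label for this connector
    model: str                       # Model: exact identifier, slash included
    api_key: str                     # API Key: generated in the Anyscale console
    api_base: Optional[str] = None   # None = keep the default shown in the form
    max_retries: int = 5             # Max retries: default value of 5
    additional_kwargs: Dict[str, Any] = field(default_factory=dict)  # may stay empty

    def __post_init__(self):
        # Required fields must be non-empty and free of stray whitespace,
        # which is easy to pick up when copying and pasting.
        required = (("name", self.name), ("model", self.model), ("api_key", self.api_key))
        for label, value in required:
            if not value or value != value.strip():
                raise ValueError(f"{label} must be non-empty with no surrounding whitespace")
        if self.max_retries < 0:
            raise ValueError("max_retries must not be negative")


# Placeholder values for illustration; substitute your own key.
profile = AnyscaleProfile(
    name="my-anyscale-llm",
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",
    api_key="YOUR_ANYSCALE_API_KEY",
)
```

Only the name, model, and API key need to be supplied; the remaining fields fall back to the defaults described above.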