Deploy Portkey Gateway to Koyeb to Streamline Requests to 200+ LLMs
Introduction
Since their debut in the ML/AI landscape, large language models (LLMs) have seen widespread adoption and deliver significant value across diverse fields. A variety of LLMs are available today, each with its own capabilities and strengths. Because those strengths differ, integrating multiple LLMs into a product lets it adapt to diverse requirements and workloads with greater reliability and robustness, improving the overall user experience.
Portkey, a control panel for AI apps, offers a suite of development tools to help with this. Among them is the AI Gateway, which connects, load balances, and manages multiple LLMs through a single, consistent API. Portkey's AI Gateway supports over 100 AI models, offers access to vision, audio, and image generation capabilities, and keeps applications running by allowing model switching during failures.
In this tutorial, you will create a simple LLM querying application that can submit questions to two different LLMs, Llama 3.1 (hosted by Together AI) and Mixtral (hosted by Groq), using Portkey's AI Gateway.
Prerequisites
To successfully follow this tutorial, you'll need:
- Node.js and npm installed. The demo app in this tutorial uses version 20 of Node.js.
- A Together AI account.
- A Groq account.
- A Koyeb account.
Get LLM API Keys
The two LLMs used in this tutorial require valid API keys for access. In this section, you'll obtain the API keys for both.
First, log into your Together AI account. Click the profile icon in the top right corner and go to the settings page. Then, navigate to the API KEYS tab, copy your API key, and store it securely for future use.
Next, log into your Groq account. In the left sidebar, click the API Keys link and click the Create API Key button to create an API key. Copy your API key and store it securely for future use.
In the next section, you will set up Portkey's AI Gateway using Docker.
Deploy the AI Gateway
Portkey provides, amongst other options, a Docker image for deploying the AI Gateway. This ready-to-use service provides an authenticated API on port 8787, with endpoints for chat and image features from supported LLMs.
To access the AI Gateway API, you must first deploy the Docker image and start the service. Begin by logging into your Koyeb control panel and following these steps:
- Click the Create Service button in the sidebar.
- Choose the Docker web service option.
- Type portkeyai/gateway:latest into the Docker image field.
- Select your preferred instance and region.
- In the Exposed ports section, change the Port value to 8787.
- Choose a name for your service in the Service name section.
- Click Deploy.
Koyeb handles pulling and running the AI Gateway Docker image. Once the deployment is finished, copy the service's public URL and save it for later use.
In the next section, you'll create an npm project for the demo application.
Create a demo project
In this section, you'll set up an npm project and install the essential packages for the demo application. To get started, run the following command in your terminal:
mkdir example-portkey
The command creates an example-portkey directory on your development machine, which will be the application's root directory. Next, run the commands below to initialize a Git repository within the example-portkey directory:
cd example-portkey
git init
The first command switches your terminal to the example-portkey directory, and the second command initializes a Git repository within the directory.
Next, initialize an npm project in the root directory by running this command in your terminal:
npm init -y
The command above creates an npm project with the default configuration in the example-portkey directory, creating a package.json file in the process. Next, install the required packages by executing the commands below:
npm install axios body-parser ejs express
npm install -D dotenv nodemon
These commands install the specified JavaScript packages from the npm registry. The -D flag in the second command marks its packages as development-only. The installed packages include:
- axios: A promise-based HTTP client for the browser and Node.js.
- body-parser: A body-parsing middleware for Node.js.
- ejs: A JavaScript templating engine.
- express: A web framework for Node.js.
The development-only packages include:
- dotenv: A package for handling environment variables during development.
- nodemon: A package that automatically restarts development servers whenever code changes are detected.
With the packages installed, you've set up an npm project for the demo application. Next, you'll configure an Express server for the application.
Set up the Express server
In this section, you'll configure an Express web server for the demo application.
First, create a file named index.js in the root directory. Then, add the following code to that file:
require('dotenv').config()
const express = require('express')
const path = require('path')
const bodyParser = require('body-parser')

const app = express()
const port = process.env.PORT || 3000

// Middleware
app.use(express.json())
app.use(bodyParser.urlencoded({ extended: true }))

// Set up EJS as the view engine
app.set('view engine', 'ejs')
app.set('views', path.join(__dirname, 'views'))

app.get('/', (_req, res) => {
  res.render('index')
})

app.listen(port, () => {
  console.log(`Server is running on http://localhost:${port}`)
})
The code begins by importing the following packages:
- dotenv: to manage environment variables.
- express: to create and manage a web server.
- path: to handle file and directory paths.
- body-parser: to parse the body of incoming requests.
It then creates an instance of an Express application and sets the server to listen on the port defined by the PORT environment variable, using port 3000 if the variable is not set. The server is configured to parse JSON and URL-encoded data, uses ejs as the view engine, and looks for EJS templates in the views directory.
A route handler is defined for the root path (/), which renders the index view when accessed. Finally, the server starts listening for requests on the specified port and logs a confirmation message that it is running.
Now that the Express server is set up, the next section will walk you through creating a page to query the LLMs.
Set up query page
The LLM query page will include a form with an input field for questions, a dropdown menu to select the LLM, and a submit button. Upon submission, the LLM's response will be displayed on the page.
To begin, create a views directory in the root of your project:
mkdir views
Inside this new views directory, create an index.ejs file and add the following code to it:
<!doctype html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <title>Portkey Gateway Questionnaire</title>
    <link
      href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.3/dist/css/bootstrap.min.css"
      rel="stylesheet"
      integrity="sha384-QWTKZyjpPEjISv5WaRU9OFeRpok6YctnYmDr5pNlyT2bRjXh0JMhjY6hW+ALEwIH"
      crossorigin="anonymous"
    />
  </head>
  <body>
    <div class="container py-4">
      <div class="bg-light rounded-3 mb-4 p-5">
        <div class="container-fluid">
          <h1 class="display-5 fw-bold">Ask a Question</h1>
          <div class="col-md-8 fs-6">
            <form id="questionForm" method="POST" action="/ask">
              <div class="mb-3">
                <label for="question">Question</label>
                <input
                  type="text"
                  class="form-control col-6"
                  id="question"
                  name="question"
                  placeholder="Type your question here"
                  required
                />
              </div>
              <div class="mb-3">
                <label for="model">Model</label>
                <select class="form-control col-6" id="model" name="model" required>
                  <option value="together">Together AI</option>
                  <option value="groq">Groq</option>
                </select>
              </div>
              <button type="submit" class="btn btn-primary">Ask</button>
            </form>
            <% if (typeof response !== 'undefined') { %>
            <h2 class="display-7 fw-bold mt-5">Answer:</h2>
            <p id="answer" class="h-100 text-bg-dark rounded-3 px-3 py-3"><%= response %></p>
            <% } %>
          </div>
        </div>
      </div>
    </div>
  </body>
</html>
The code added in the file above provides the HTML structure for the index view, which is rendered by the root route handler. It contains:
- Bootstrap for styling.
- An HTML form with an input field and a select dropdown.
- A submit button.
- A section to display the LLM response.
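The template prints the LLM response with the escaping output tag, so any HTML in an answer is shown as text rather than interpreted by the browser. The sketch below approximates that escaping with a hypothetical escapeHtml helper; it is an illustration of the idea, not EJS's own code.

```javascript
// Hypothetical helper that approximates the escaping applied by the <%= %> tag.
const HTML_ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&#34;',
  "'": '&#39;',
}

function escapeHtml(text) {
  // Replace each special character with its HTML entity.
  return String(text).replace(/[&<>"']/g, (char) => HTML_ESCAPES[char])
}

// An answer containing markup is rendered harmlessly as text.
console.log(escapeHtml('<script>alert("hi")</script>'))
```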
To view the page, modify the scripts section of the package.json file with the following code:
...
"scripts": {
  "dev": "nodemon index.js",
  "test": "echo \"Error: no test specified\" && exit 1"
}
...
The code adds a dev script for starting the development server. It executes the index.js file using nodemon.
To run the demo application on your local machine, enter the following command in your terminal:
npm run dev
Running the command starts the Express server and shows a message confirming that it's running on the specified port. To view the page, open your web browser and go to http://localhost:<YOUR_PORT>. You should see the query form displayed.
In the next section, you'll set up the logic to query the LLMs through the AI Gateway.
Add LLM querying functionality
The AI Gateway provides a chat endpoint at /v1/chat/completions where you can send POST requests to generate LLM responses for chat conversations. In this section, you'll add a route handler to process form data, call the chat endpoint to get a response, and return the response to the page.
First, create a .env file in your root directory and add the code below to the file, substituting your own API keys and gateway URL:
TOGETHER_API_KEY="<YOUR TOGETHER API KEY>"
GROQ_API_KEY="<YOUR GROQ API KEY>"
GATEWAY_URL="<YOUR DEPLOYED AI GATEWAY URL>" # URL without the trailing slash (/)
Since the environment variables entered above are sensitive, make sure they aren't committed to your Git history. To prevent this, run the following command in your terminal:
printf "%s\n" ".env" "node_modules" > .gitignore
The command creates a .gitignore file and adds the .env file and node_modules directory to it, excluding them from the Git history.
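Because the application depends on all three variables, it can help to check them at startup. The following is a minimal sketch, assuming a hypothetical loadConfig helper that is not part of the tutorial's code; it also strips any trailing slash from the gateway URL.

```javascript
// Hypothetical helper: validates the variables from .env and normalizes the gateway URL.
const REQUIRED_VARS = ['TOGETHER_API_KEY', 'GROQ_API_KEY', 'GATEWAY_URL']

function loadConfig(env) {
  // Collect every missing or empty variable so the error names all of them at once.
  const missing = REQUIRED_VARS.filter((name) => !env[name])
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`)
  }
  return {
    togetherApiKey: env.TOGETHER_API_KEY,
    groqApiKey: env.GROQ_API_KEY,
    // Drop trailing slashes so appending /v1/chat/completions does not produce "//".
    gatewayUrl: env.GATEWAY_URL.replace(/\/+$/, ''),
  }
}

// Example with placeholder values; in the app you would pass process.env instead.
const config = loadConfig({
  TOGETHER_API_KEY: 'together-key',
  GROQ_API_KEY: 'groq-key',
  GATEWAY_URL: 'https://gateway.example.com/',
})
console.log(config.gatewayUrl)
```

Failing early with a clear message is usually easier to debug than a request that fails later with an undefined API key.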
Next, update your index.js file so that it matches the following code:
require('dotenv').config()
const express = require('express')
const path = require('path')
const bodyParser = require('body-parser')
const axios = require('axios')

const app = express()
const port = process.env.PORT || 3000

// Middleware
app.use(express.json())
app.use(bodyParser.urlencoded({ extended: true }))

// Set up EJS as the view engine
app.set('view engine', 'ejs')
app.set('views', path.join(__dirname, 'views'))

// Provider, model, and API key for each option in the form
const MODEL_MAP = {
  groq: {
    providerSlug: 'groq',
    model: 'mixtral-8x7b-32768',
    apiKey: process.env.GROQ_API_KEY,
  },
  together: {
    providerSlug: 'together-ai',
    model: 'meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo',
    apiKey: process.env.TOGETHER_API_KEY,
  },
}

app.get('/', (_req, res) => {
  res.render('index')
})

app.post('/ask', async (req, res) => {
  const { question, model } = req.body
  const modelInfo = MODEL_MAP[model]

  if (!modelInfo) {
    return res.status(400).json({ error: 'Model not found' })
  }

  const { providerSlug: provider, apiKey, model: modelName } = modelInfo
  const data = {
    model: modelName,
    messages: [{ role: 'user', content: question }],
  }

  try {
    const url = `${process.env.GATEWAY_URL}/v1/chat/completions`
    const response = await axios.post(url, data, {
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
        'x-portkey-provider': provider,
      },
    })
    res.render('index', { response: `${response.data.choices[0].message.content}` })
  } catch (error) {
    res.status(500).json({ error: error.message })
  }
})

app.listen(port, () => {
  console.log(`Server is running on http://localhost:${port}`)
})
The modified code imports the axios library and defines a MODEL_MAP object, which stores the configurations for two LLMs. Each configuration includes the provider's name, the model name, and the API key needed for access.
Next, the code sets up a POST route handler for the /ask endpoint. When a request is received, it extracts the question and model from the request body. It then looks up the model's configuration in the MODEL_MAP object and returns an error if the model is not found.
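The lookup can be made slightly stricter. The sketch below uses a hypothetical validateAskInput helper, not part of the tutorial's code, that rejects empty questions and only accepts keys defined directly on the map; a plain MODEL_MAP[model] lookup would also match inherited property names such as constructor.

```javascript
// Hypothetical validation helper for the /ask form data.
function validateAskInput(body, modelMap) {
  const question = typeof body.question === 'string' ? body.question.trim() : ''
  if (question === '') {
    return { ok: false, error: 'Question is required' }
  }
  // Object.hasOwn ignores inherited keys such as "constructor" or "toString".
  if (typeof body.model !== 'string' || !Object.hasOwn(modelMap, body.model)) {
    return { ok: false, error: 'Model not found' }
  }
  return { ok: true, question, modelInfo: modelMap[body.model] }
}

// Placeholder map for illustration; the app would pass MODEL_MAP.
const modelMap = { groq: { providerSlug: 'groq' } }
console.log(validateAskInput({ question: 'Hi', model: 'groq' }, modelMap))
console.log(validateAskInput({ question: 'Hi', model: 'constructor' }, modelMap))
```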
Afterwards, it creates the request payload for the AI Gateway and sends the request, including the API key in the Authorization header and the provider name in the x-portkey-provider header.
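To make the shape of that request easier to see, the sketch below assembles it as plain data. The buildGatewayRequest helper and the example values are hypothetical; the URL path, headers, and payload fields are the ones used in the route handler above.

```javascript
// Sketch: assemble the gateway request as plain data so it can be inspected or tested.
function buildGatewayRequest(gatewayUrl, modelInfo, question) {
  return {
    url: `${gatewayUrl}/v1/chat/completions`,
    headers: {
      // The provider's own API key is passed through as a bearer token.
      Authorization: `Bearer ${modelInfo.apiKey}`,
      'Content-Type': 'application/json',
      // Tells the gateway which provider should receive the request.
      'x-portkey-provider': modelInfo.providerSlug,
    },
    body: {
      model: modelInfo.model,
      messages: [{ role: 'user', content: question }],
    },
  }
}

// Placeholder values for illustration only.
const request = buildGatewayRequest(
  'https://gateway.example.com',
  { providerSlug: 'groq', model: 'mixtral-8x7b-32768', apiKey: 'groq-key' },
  'What is Koyeb?'
)
console.log(JSON.stringify(request, null, 2))
```

The resulting object maps directly onto the axios call in the handler: axios.post(request.url, request.body, { headers: request.headers }).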
Finally, the response from the AI Gateway is returned to the client.
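The handler reads response.data.choices[0].message.content directly, which throws if the gateway returns no choices. A defensive variant could look like the sketch below; extractAnswer and its fallback text are hypothetical and not part of the tutorial's code.

```javascript
// Hypothetical helper: read the first choice's text, or return a fallback message.
function extractAnswer(data, fallback = 'No answer was returned.') {
  // Optional chaining yields undefined instead of throwing on missing fields.
  const content = data?.choices?.[0]?.message?.content
  return typeof content === 'string' && content.trim() !== '' ? content : fallback
}

console.log(extractAnswer({ choices: [{ message: { content: 'Hello!' } }] }))
console.log(extractAnswer({ choices: [] }))
```

In the route handler, this would replace the direct property access: res.render('index', { response: extractAnswer(response.data) }).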
To test the functionality, start the server, open the UI page in your browser, enter a question, choose your preferred LLM, and submit. The response should appear on the page.
In the next section, you will deploy the demo application online on Koyeb.
Deploy to Koyeb
The demo application is now complete and interacts with the deployed AI Gateway service to answer questions using two different LLMs. The final step is to deploy the demo application to the cloud on Koyeb.
To get started, update the scripts section in your package.json file with the code below:
...
"scripts": {
  "dev": "nodemon index.js",
  "start": "node index.js",
  "test": "echo \"Error: no test specified\" && exit 1"
}
...
The code above modifies the scripts section of the package.json file, adding a start script which runs the index.js file using node.
Next, create a GitHub repository for your code, then use the following commands to push your local code to the repository:
git add --all
git commit -m "Complete AI Gateway powered LLM query app."
git remote add origin git@github.com:<YOUR_GITHUB_USERNAME>/<YOUR_REPOSITORY_NAME>.git
git branch -M main
git push -u origin main
To deploy the code from the GitHub repository, go to the Koyeb control panel. Then, on the Overview page:
- Click Create Service in the left sidebar.
- Choose the GitHub deploy option.
- Search for and select your repository. Alternatively, you can use the public example repo for this article by pasting the following in the Public GitHub repository field: https://github.com/koyeb/example-portkey.
- Choose your preferred instance and deployment region.
- Under Environment variables, for each variable in your .env file:
  - Enter the variable name.
  - Select Secret as the type.
  - For the value, click Create secret, then specify the secret name and value, and click Create.
- In the Service name section, enter a name for the service or use the default.
- Click Deploy to start the deployment.
The Koyeb platform builds and deploys your code, then starts the application using the start script from the package.json file. You can track the deployment progress through the provided logs. Once the deployment is complete and health checks pass, your application will be up and running.
Click the provided public URL to access your live application.
Conclusion
In this tutorial, you built a simple application that queries two different LLMs using Portkey's AI Gateway. The AI Gateway offers more than just chat completion, with features like caching, fallbacks, and load balancing. For more details on these features, refer to the Portkey Gateway documentation.
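As an illustration of the fallback feature, the sketch below builds a config that tries Together AI first and Groq second. The field names and the x-portkey-config header are assumptions based on Portkey's published config format, and buildFallbackConfig is a hypothetical helper; verify the details against the Gateway documentation before relying on them.

```javascript
// Sketch of a fallback config. The field names (strategy, targets, provider,
// api_key, override_params) are assumptions; check the Gateway documentation.
function buildFallbackConfig(env) {
  return {
    strategy: { mode: 'fallback' },
    targets: [
      {
        provider: 'together-ai',
        api_key: env.TOGETHER_API_KEY,
        override_params: { model: 'meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo' },
      },
      {
        provider: 'groq',
        api_key: env.GROQ_API_KEY,
        override_params: { model: 'mixtral-8x7b-32768' },
      },
    ],
  }
}

// The config would travel as a JSON string in a request header
// (assumed here to be x-portkey-config). Keys below are placeholders.
const headers = {
  'Content-Type': 'application/json',
  'x-portkey-config': JSON.stringify(
    buildFallbackConfig({ TOGETHER_API_KEY: 'together-key', GROQ_API_KEY: 'groq-key' })
  ),
}
console.log(headers['x-portkey-config'])
```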
When your application is deployed from your own repository using the Git deployment option, any code push to the deployed branch will automatically trigger a new build. The changes will go live once the deployment succeeds. If the deployment fails, Koyeb will keep the last successful production deployment active, ensuring your application continues to run without interruption.