LLM tutorials

Discover how to build, deploy, and run LLM applications in production on Koyeb, the fastest way to deploy applications globally.

Latest

DeepSeek-R1’s Multi-Lingual and Agentic RAG Capabilities in Practice

Charly Poly

Explore real-world examples of DeepSeek-R1's powerful capabilities, including agentic reasoning, multilingual abilities, and large context windows, using Koyeb and Inngest orchestration.

Jan 31, 2025

13 min read

Deploy Portkey Gateway to Koyeb to Streamline Requests to 200+ LLMs

Chuks Opia

Learn how to deploy Portkey Gateway, a request and prompt router for LLMs with a unified API, and build an application that can query more than one LLM easily.

Sep 04, 2024

13 min read

Using OpenAI Whisper to Transcribe Podcasts on Koyeb

Rishi Raj Jain

Learn how to use OpenAI Whisper to build an app that generates transcriptions of podcast audio files in real time.

Jun 27, 2024

11 min read

Use Continue, Ollama, Codestral, and Koyeb GPUs to Build a Custom AI Code Assistant

Édouard Bonlieu

This guide shows how to use Continue with Ollama, a self-hosted AI solution, to run the Mistral Codestral model on Koyeb GPUs.

Jun 24, 2024

4 min read

Deploy the vLLM Inference Engine to Run Large Language Models (LLM) on Koyeb

Justin Ellingwood

Learn how to set up a vLLM Instance to run inference workloads and host your own OpenAI-compatible API on Koyeb.

Jun 12, 2024

12 min read

Using Groq to Build a Real-Time Language Translation App

Nuno Bispo

Learn how to use Groq, speech-to-text (STT), and text-to-speech (TTS) to build an app that automatically translates between languages in real time.

Apr 02, 2024

18 min read

Build a Multimodal Chat App using LLaVA, Chainlit, and Replicate

Nuno Bispo

This tutorial walks through how to build a multimodal vision chat app powered by LLaVA, Chainlit, and Replicate.

Mar 08, 2024

14 min read

Use MistralAI, FastAPI, and FastUI to Build a Conversational AI Chatbot

Nuno Bispo

This tutorial walks through how to build a chatbot powered by MistralAI, with FastAPI as the backend and FastUI as the front end.

Feb 07, 2024

18 min read

Use AutoGen, Chainlit, and OpenAI to Generate Dynamic AI Personas

Nuno Bispo

Learn step-by-step how to set up and utilize AutoGen within Chainlit. You'll discover how to create and interact with AI personas that are tailored to your specific needs, be it scriptwriting for YouTube content or ideating SaaS products.

Dec 20, 2023

21 min read

Use pgvector and Hugging Face to Build an Optimized FAQ Search with Sentence Similarity

Chuks Opia

In this tutorial, we showcase how to deploy a FAQ search service built with Hugging Face's Inference API, pgvector, and Koyeb's Managed Postgres. The optimized FAQ search uses sentence similarity to return the most relevant results for a user's search terms.

Nov 27, 2023

25 min read

Use LangChain, Deepgram, and Mistral 7B to Build a YouTube Video Summarization App

Nuno Bispo

This guide explains how to build a YouTube video summarization app using LangChain, Deepgram, and Mistral 7B. Deploy your AI workload on Koyeb to enjoy high-performance microVMs, seamless scaling, and fast global deployments.

Nov 16, 2023

22 min read
