Model one-click apps
Deploy model apps with a click and get started with Koyeb in seconds.
Featured
Ollama
Ollama is a self-hosted solution for running open-source large language models on your own infrastructure.
Flux.1 [dev]
Deploy Flux.1 [dev] behind a dedicated API endpoint on Koyeb GPU for high-performance, low-latency, and efficient inference.
Gemma 2 2B
Deploy Gemma 2 2B with vLLM on Koyeb GPU for high-performance, low-latency, and efficient inference.
Gemma 2 9B
Deploy Gemma 2 9B with vLLM on Koyeb GPU for high-performance, low-latency, and efficient inference.
Hermes 3 Llama-3.1 8B
Deploy NousResearch Hermes 3 on a high-performance Koyeb GPU.
Llama 3.1 8B Instruct
Deploy Llama 3.1 8B Instruct on a high-performance Koyeb GPU.
Phi-4
Deploy Phi-4 on a high-performance Koyeb GPU.
Mistral 7B Instruct v0.3
Deploy Mistral 7B Instruct v0.3 with vLLM on Koyeb GPU for high-performance, low-latency, and efficient inference.
Mistral Nemo Instruct
Deploy Mistral Nemo Instruct with vLLM on Koyeb GPU for high-performance, low-latency, and efficient inference.
Pixtral 12B
Deploy Pixtral 12B with vLLM on Koyeb GPU for high-performance, low-latency, and efficient inference.
Qwen 2.5 1.5B Instruct
Deploy Qwen 2.5 1.5B Instruct with vLLM on Koyeb GPU for high-performance, low-latency, and efficient inference.
Qwen 2.5 14B Instruct
Deploy Qwen 2.5 14B Instruct with vLLM on Koyeb GPU for high-performance, low-latency, and efficient inference.
Qwen 2.5 3B Instruct
Deploy Qwen 2.5 3B Instruct with vLLM on Koyeb GPU for high-performance, low-latency, and efficient inference.
Qwen 2.5 7B Instruct
Deploy Qwen 2.5 7B Instruct with vLLM on Koyeb GPU for high-performance, low-latency, and efficient inference.
Qwen 2.5 Coder 7B Instruct
Deploy Qwen 2.5 Coder 7B Instruct with vLLM on Koyeb GPU for high-performance, low-latency, and efficient inference.
Qwen 2 VL 7B Instruct
Deploy Qwen 2 VL 7B Instruct with vLLM on Koyeb GPU for high-performance, low-latency, and efficient inference.
SmolLM2 1.7B Instruct
Deploy SmolLM2 1.7B Instruct on a high-performance Koyeb GPU.
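Once one of the vLLM-based apps above is deployed, it serves an OpenAI-compatible REST API at your app's public URL. The sketch below shows a minimal chat-completions request against such an endpoint; the app URL, API token, and model identifier are placeholder assumptions, not real values, so adapt them to your own deployment.

```python
import json
import urllib.request

# Hypothetical values -- replace with your own deployment's URL and token.
KOYEB_ENDPOINT = "https://example-app.koyeb.app"
API_TOKEN = "YOUR_API_TOKEN"

def build_chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-compatible chat-completions request for a vLLM endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }
    return urllib.request.Request(
        f"{KOYEB_ENDPOINT}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_TOKEN}",
        },
        method="POST",
    )

req = build_chat_request("Qwen/Qwen2.5-7B-Instruct", "Say hello in one sentence.")
# Against a live deployment, send it and read the response:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Because vLLM follows the OpenAI API schema, the same request shape works for any of the vLLM-served models listed above; only the `model` field changes.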