Deploy AI Infrastructure in 2025: Serverless GPUs, Autoscaling, Scale to Zero, and More!
Discover how high-performance serverless GPUs, autoscaling, scale-to-zero, and other cutting-edge features simplify AI deployment in 2025. Learn how our next-gen platform empowers developers to build faster, smarter, and more cost-efficient AI solutions.
Top PostgreSQL Database Free Tiers in 2025
Top PostgreSQL providers with free hosting solutions. Discover free options ideal for building side projects and MVPs.
December Recap: Scale to Zero, Serverless GPU Price Drop, and more
Since we've released so much goodness in the past couple of weeks, we prepared a recap to make sure you don't miss out on any of our serverless news.
Pro and Scale Plans: Manage 1000s of Services, More Users, and Included Compute
Discover our new plans designed for developers and teams to provide more flexibility, clearer pricing, and effortless scalability as your needs grow.
Serverless GPUs: Slashing L4, L40S, A100 Prices and Increasing Efficiency
We are dropping prices across our range of Serverless GPUs including L4, L40S, and A100 GPUs. Build and deploy your AI applications for less with serverless using Scale to Zero and Autoscaling.
Scale to Zero: Optimize GPU and CPU Workloads
Starting today, your workloads running on GPU and CPU scale down to zero when idle, wake automatically on request, and scale out horizontally based on multiple scaling criteria.
Snapshots: Create a Point-in-Time Copy of your High-Performance Volumes
Back up your high-performance Volumes, simplify data management, and enable reproducibility!
Volumes: High IOPS and Low Latency NVMe SSDs Public Preview
Volumes on Koyeb are blazing-fast NVMe SSDs you can use to persist data across deployments. Offering high throughput and low latency, Volumes open the door to a wide range of new workloads and use cases on the platform.
AWS Regions Public Preview: Deploy on AWS in Minutes
Introducing the fastest way to deploy and scale your apps on AWS infrastructure. Today, we are announcing the public preview of AWS Regions on Koyeb for businesses.
Paris and Tokyo Regions are now Generally Available
Today, we are thrilled to announce two new regions are generally available to deploy your low-latency AI workloads, full stack applications, APIs, and databases: Paris and Tokyo!
70% Faster Deployments and High-Performance Private Network
We revamped our networking stack to provide your AI workloads, full stack applications, APIs, and databases faster deployments, more bandwidth, and reduced latency!
New Dashboard: Build, Run, and Scale Apps in Minutes with a Simple and Elegant Interface
We modernized our control panel to transform your infrastructure’s deployment experience. Under the hood of the control panel is high-performance infrastructure, advanced networking, and powerful features.
Koyeb Launch Week: Round 2
We are back for Koyeb's second launch week! Get the replay on all our exciting announcements here.
Beavr Chose Koyeb for a Next-Gen Heroku and Vercel for the Backend Experience
Discover how Beavr scaled their SaaS platform globally with Koyeb and why they chose Koyeb infrastructure over Vercel for scalability, autoscaling, and enhanced developer experience.
Koyeb for Startups: Accelerate with Credits for High-Performance Infrastructure
Startups build on Koyeb to bring their ideas to market faster and scale with ease. Apply for the Koyeb Startup Program today to power your applications with the best infrastructure for your business.
AWS Regions: Build, Run, Scale on AWS with Koyeb
Today, we are releasing the AWS ecosystem and regions on Koyeb for businesses. The fastest way to deploy and scale your apps on AWS infrastructure.
Volumes Technical Preview: Blazing-fast NVMe SSD for Your Data
Today, we are launching the technical preview of Volumes! If you are building applications that need persistent storage, you can now use Volumes to keep data on disk and intact between deployments, restarts, and even when services are paused.
Serverless GPUs Public Preview: Run AI workloads on H100, A100, L40S, and more
Today, we are announcing the public preview of our Serverless GPUs. Perfect for inference, fine-tuning, and all your AI workloads, our Serverless GPUs offer blazing-fast deployments and exceptional performance for your GPU-backed workloads.
Autoscaling GA: Scale Fast, Sleep Well, Don't Break the Bank
Today, we are announcing autoscaling in GA! Get dynamic, flexible, and responsive resource allocations for production. Automatically scale your AI and full stack applications.
Koyeb Launch Week
We are throwing Koyeb's very first launch week! Catch up on all our exciting announcements here!
Best LLM Inference Engines and Servers to Deploy LLMs in Production
Looking to boost the performance of your AI workloads using LLMs in production? Explore the best inference engines and servers like vLLM, RayLLM with RayServe, TensorRT-LLM, HuggingFace Text Generation Inference, and more to see which one you should be using when performing inference.
A Software Engineer's Tips and Tricks #4: Collaborating on Visual Studio Code with Live Share
Our fourth edition of tips and tricks introduces Live Share, the helpful collaboration tool for Visual Studio Code.
The engineering behind autoscaling with HashiCorp's Nomad on a global serverless platform
Wonder how we added the ability to automatically scale Services running on our platform? Learn about the engineering behind autoscaling Instances on a global and serverless platform.
A Software Engineer's Tips and Tricks #3: CPU Utilization Is Not Always What It Seems
Our third edition of tips and tricks takes a closer look at CPU utilization. High CPU activity does not necessarily mean your process is performing complex tasks.
Serverless GPUs in Private Preview: L4, L40S, V100, and more
Today, we're excited to share that GPU Instances designed to support AI inference workloads are available in private preview. These GPUs provide up to 48GB of vRAM, 733 TFLOPS and 900GB/s of memory bandwidth to support large models including LLMs and text-to-image models.
A Software Engineer's Tips and Tricks #2: Template Databases in PostgreSQL
Our second edition of tips and tricks covers template databases in PostgreSQL. Using template databases can be a quick and easy way to back up and restore databases during development.
What are LLMs? An intro into AI, models, tokens, parameters, weights, quantization and more
Trying to keep up with AI and all the buzzwords surrounding it? Learn some essential AI terminology like parameters, weights, tokens, quantization, sparsity and more with our intro to LLMs.
A Software Engineer's Tips and Tricks #1: Drizzle
Our first ever tips and tricks covers Drizzle, the lightweight ORM for TypeScript. The goal of this example is to demonstrate it is possible to have a typesafe ORM, and write queries in the best language that exists to query relational data: SQL!
Meet Paweł, Software Engineer Building the Koyeb Serverless Platform
The Koyeb team is growing! Get to know Paweł, software engineer building the Koyeb serverless platform.
toddl.co: Spain's Leading Platform for Extra-Curricular Activities Deploys 10x Faster with Koyeb
Learn how toddl.co reduced its build and deployment times, automated manual processes, and regained time to focus on its business.
Ollama and Friends' Local and Open Source AI Developer Meetup at KubeCon Paris
Last Thursday night, we co-organized a Local and Open Source AI developer meetup with Ollama and Dagger at Station F. Over 450 developers attended, both from the local and international scene.
KubeCon Paris Events to Attend 2024
KubeCon EU 2024 is upon us! We are gearing up for a week full of cloud native and open source fun! Check out our handcrafted list of this week's awesome events.
What is RAG? Retrieval-Augmented Generation for AI
Learn all about RAG, the AI framework addressing major LLM limitations by supplementing an LLM's knowledge source with external resources. Discover the benefits of RAG, its origins and ideal use cases, how to implement it and deploy RAG-powered AI applications on Koyeb.
Autoscaling Now In Public Preview: Build, Run, and Autoscale Apps Globally
Autoscaling is available in public preview to all users starting today. Easily handle unpredictable spikes and varying workloads to respond to demand dynamically.
Deploy Apps and Containers in Singapore on High-Performance Infrastructure GA
We are thrilled to announce that our Singapore location is generally available to deploy your full stack applications, low-latency AI workloads, APIs, and databases.
FoxSell's Journey to Time, Cost, and Performance Optimization with Koyeb
Discover how FoxSell optimized time, cost, and performance with Koyeb's innovative cloud solutions. From streamlined global deployments to exceptional performance on bare metal machines, explore their journey to efficiency in e-commerce.
University of British Columbia HART Accelerates Deployments with Koyeb
Discover how the University of British Columbia Housing Assessment Resource Tools (UBC HART) leverages Koyeb for rapid deployments of the housing assessment resource tools they are developing in the face of Canada's housing crisis.
Meet David, Product Manager orchestrating all the work on the Koyeb Serverless Engine
Our team is growing! Get to know David, Koyeb's first Product Manager, and find out how he is helping us make our vision for serverless a reality.
Meet Leo, Software Engineer building the Koyeb Serverless Engine
Our team is growing! Get to know Leo, Koyeb's newest Software Engineer, and find out how he is helping us make our vision for serverless a reality.
Kong Konnect the World: Seamless Global and Serverless Deployments Powered by Koyeb
Kong leverages Koyeb for rapid global deployments to highlight its API lifecycle management platform.
Koyeb Serverless Postgres Pricing
We’re introducing Small, Medium, and Large PostgreSQL databases with up to 4GB of RAM in private preview.
Qwigo Leverages Koyeb for Rapid Global Deployments and High-Performance MicroVMs
Discover how Qwigo leverages global serverless deployments on Koyeb to deploy their CPU-intensive services across the US and Europe.
Which Cloud Database Platform to Choose for Your Applications
Using a managed database saves time and energy when building applications that need persistent storage. While there are numerous benefits to using a managed database, there are serious, long-term considerations to weigh before choosing one for your app. Our analysis aims to equip you with insights that will help when selecting a managed database for your next project.
New Eco Instances: the most affordable way to deploy apps globally
Today, deploying globally is becoming more accessible with a new type of instance called Eco starting at $1.61 per month ($0.0000006 per second) and more resources for free.
Serverless Postgres Public Preview
Serverless Postgres is in public preview: fully-managed, fault-tolerant, and scalable serverless Postgres Database Service available directly inside of Koyeb.
We raised $7M to Simplify App Deployment with our Global Serverless Platform
We're excited to announce our $7M seed round led by Serena with the participation of ISAI, Samsung Next, MongoDB, and incredible angels. In this post, we dive into our mission, why we are uniquely positioned, and what's next.
Sustaining free compute in a hostile environment
Today, we reaffirm our commitment to maintaining a free tier. We explain how we intend to sustain it and why we are so committed to providing one. We provide a free tier so users can explore the platform and deploy a hobby project before deploying a production-grade application. This is a story about bare metal, abuse, credit cards, and scale-to-zero.
Building a global deployment platform is hard, here is why
Deep dive into how we built a global serverless engine with Nomad, Kuma/Envoy and Golang to make multi-region deployments easy.
The Global Container Runtime: Six Regions to Deploy Apps Anywhere and Everywhere
Deploy Node.js, Go, Python, Java, and anything with a Dockerfile in 6 regions across 3 continents on high-performance microVMs.
Koyeb Metrics: Built-in Observability to Monitor Your Apps Performances
With Koyeb Metrics, you get a high-level overview of what is happening in your services running on Koyeb. Discover how you can use Koyeb Metrics to understand your services and diagnose performance issues.
Accelerate Docker builds with cache
Build from Dockerfile is supported on Koyeb. Discover why you would want to use cache during your builds and what happens behind the scenes when you do.
Dockerfile Deployment on High-Performance MicroVMs is GA
Today, we are excited to announce the general availability of Dockerfile-based deployments. Building and deploying with Dockerfiles offers more flexibility and control over the build process, letting you deploy any kind of application, framework, or runtime.
Deploy and scale high-performance background jobs with Koyeb Workers
Today, we are thrilled to announce workers are generally available on Koyeb! You can now easily deploy workers to process any background jobs with high-end performance in all of our locations.
Inspect TLS encrypted traffic using mitmproxy and Wireshark
Take a journey into Leonardo's inferno and learn how to inspect TLS encrypted traffic using Wireshark and mitmproxy.
Koyeb CLI 3.0: Better flows, improved troubleshooting, and reworked foundations
The latest version of the Koyeb CLI is available and brings many helpful improvements! Discover the reworked error messages, smoother flow for creating and updating services, and how we improved the foundations of our CLI to continue building the developer experience we envision.
Enabling gRPC and HTTP/2 support at the edge with Kuma and Envoy
We recently added HTTP/2 and gRPC support to the platform. We are currently using Cloudflare to serve public traffic. This blog post explains the different steps we had to go through to add the support of gRPC and HTTP/2 on the platform.
Meet Justin, Technical Writer and Documentation Expert
The Koyeb team is growing! Get to know Justin, our technical writer and the driving force behind our documentation and technical guides.
Meet Sebastian, our Customer Success Engineer ensuring seamless deployments
Exciting news: We have a new teammate! Get to know Sebastian, our Customer Success Engineer helping Koyeb users deploy successfully!
What is a microVM?
Want to learn more about microVMs? Learn all about this lightweight virtualization technology and get all of your microVM questions answered with this short and sweet primer.
Meet Kamil, Product Designer optimizing DX
Get to know our new team member, Kamil! He is our product designer helping us build an optimal developer experience and make our vision for serverless a reality.
What is gRPC?
Learn all about gRPC, a high-performance remote procedure call framework, and how it improves communication between services. Discover the benefits of using gRPC, its use cases, how to implement it in various programming languages and deploy gRPC applications on Koyeb.
Meet Julia, Talent Partner building our dream team
We have a new team member! Get to know Julia, our talent partner helping us make Koyeb's vision for serverless a reality. Learn what she looks for in a candidate, her favorite part of the hiring process, and what she gets up to outside of work.
eBPF: The future of the service mesh and network innovation
eBPF lets you run sandboxed programs in an operating system's kernel. Catch up on the heated debate taking place in the service mesh world about how this technology will shape the future of the service mesh and network innovation.
Meet Julien, Software Engineer building the Koyeb Serverless Engine
We have a new team member! Get to know Julien, a Software Engineer helping us make Koyeb's vision for serverless a reality.
Introducing the Koyeb Terraform Provider
Today, we are happy to introduce you to the Koyeb Terraform Provider! The Koyeb Terraform Provider is a HashiCorp-recognized partner provider.
Announcing Koyeb Pulumi Provider
Pulumi is a modern infrastructure as code platform that allows you to define, deploy, and manage cloud infrastructure on any cloud using your favorite programming languages. Learn how to deploy a simple Golang application on Koyeb using Pulumi by writing infrastructure code in TypeScript, Golang, and Python.
US-East region is live: deploy your apps in Washington, DC
You can now deploy your full-stack applications and APIs in our newest US-East location near Washington, DC. Enjoy high-end performance and all of the platform's built-in features for your apps in the world's largest connectivity hub.
New Frankfurt location: deploy high-performance apps in Europe
Today, we are super excited to announce the grand opening of our Frankfurt core location in Germany! Frankfurt is the largest connectivity hub in Europe and is an amazing place to run high-performance and low latency applications.
What is continuous deployment?
Learn about continuous deployment, what goes into building a strong continuous deployment pipeline, the value it adds, how it differs from continuous delivery, and getting it built into your application development.
What is a service mesh?
Wondering what a service mesh is? Get all of your service mesh questions answered with this short and sweet primer. We take a look at how the two crucial parts of a service mesh, the control plane and the data plane, work together to handle interservice communication.
Distributed tracing with Envoy, Kuma, Grafana Agent, and Jaeger
Discover how we added end-to-end tracing to all requests for Koyeb Apps. We explain how we implemented end-to-end tracing, why we chose Jaeger and Grafana Agent to power our observability stack, and how we overcame the challenges we encountered along the way.
Heroku’s free tier legacy: The shoulders we stand on 15 years later
Heroku's free tier changed the way developers, hobbyists, students, and indie hackers deployed applications. Heroku's announcement to sunset their free tier marks the end of an era. We take the time to reflect on the rise and impact of Heroku’s legendary free tier.
What is an API Gateway?
Sitting between clients and backend services, API gateways have a number of uses and benefits. Get the lowdown with this short and simple post on API gateways.
Meet Nils, Full Stack Engineer building the Koyeb Web Console
Our team is growing! Get to know Nils, our first Full Stack Engineer, and find out how he is helping us make our vision for serverless a reality.
Meet Anthony, Senior Software Engineer building the Koyeb Serverless Engine
We have a new team member! Get to know Anthony, a Senior Software Engineer helping us make Koyeb's vision for serverless a reality.
Meet Diego, Software Engineering Intern building the Koyeb Observability Pipeline
Koyeb has a new team member! Get to know Diego, our Software Engineering Intern, and find out how he is helping us make our vision for serverless a reality.
Koyeb Serverless Platform Public Preview
Today, we are super excited to share that the Koyeb platform is available for everyone in public preview. Koyeb is the developer platform to build, deploy and scale full-stack applications where your users are. We've been working on the platform since early 2021. The private preview has been intense with over 10,000 developers joining the community and now over 3000 applications running on the platform.
The true cost of Kubernetes: People, Time and Productivity
While writing a comparison of Kubernetes and Koyeb, we tried to determine how much operating a Kubernetes cluster really costs. This section of our comparison took us hours to write and ended up being so long that we decided to write a dedicated post about it. Kubernetes is a proven technology, but the true cost is often underestimated: this post investigates the actual financial costs of using Kubernetes.
Blue-Green, Rolling, and Canary: Continuous Deployments Explained
If you're afraid to push to production on a Friday, rely on big-bang deployments, or find that recovering from an infrastructure failure is a painful and time-consuming incident, then it is seriously time to talk about continuous deployment best practices. Discover the different go-to continuous deployment strategies and how you can get a continuous deployment pipeline built into your application by deploying on the Koyeb Serverless Platform.
The Team: Meet Nicolas, Senior Backend Engineer building the Koyeb Serverless Engine
The Koyeb team is growing! Get to know our new team member, Nicolas, in this interview. He is a Senior Backend Engineer helping make Koyeb's vision for serverless a reality.
Building a Multi-Region Service Mesh with Kuma/Envoy, Anycast BGP, and mTLS
We recently wrote about how the Koyeb Serverless Engine runs microVMs to host your Services, but we skipped a big subject: Global Networking. This is a deep dive to understand the life of an end user's request for a service hosted on Koyeb. We explore the technology and components that make up our internal architecture by following the journey of a request from an end user, through Koyeb's Global Edge Network, and to the application running in one of our Core locations.
The Team: Meet Thomas, Senior Backend Engineer building the Koyeb Serverless Engine
The Koyeb team is growing! Get to know our new team member, Thomas, in this interview. He is a Senior Backend Engineer helping make Koyeb's vision for serverless a reality.
Why you need to build globally distributed applications
Users have certain expectations for modern web services and applications. Discover how building distributed and global architectures enables you to meet those standards and what it really means to deploy globally.
Understanding REST, gRPC, GraphQL, and OpenAPI to build your APIs
There are several different architecture designs for Web APIs. While REST and RPC remain two popular choices, the arrival of GraphQL and OpenAPI brings new possibilities for the performance, functionality, and productivity of your web APIs.
The Koyeb Serverless Engine: from Kubernetes to Nomad, Firecracker, and Kuma
We decided to build our own serverless engine, one that would not be limited by existing implementations. The first version of Koyeb was built on top of Kubernetes and allowed us to quickly build a working cloud platform. After a few months of operating with this version, we decided to move user workloads from Kubernetes to a custom stack based on Nomad, Firecracker, and Kuma.
API Gateways: Improving performance, security and management of microservices
An API gateway is an API management tool that provides several benefits in a microservice architecture. Learn more about how API gateways work, their typical use cases, and what you should consider before implementing one.
Using Cache-Control and CDNs to Improve Performance and Reduce Latency
Caching is an effective technique for improving performance and reducing latency for requests to your web services and apps. CDNs bring your content even closer to end users. Learn about cache control: what it is, how to configure it, and when to use it.
Service Mesh and Microservices: Improving Network Management and Observability
A service mesh is a dedicated layer of infrastructure that simplifies network management and increases visibility into typically complex microservice architectures. We explore this emerging technology by reviewing its history, purpose-built design, and implementations.
Lightweight Virtualization: the Container Ecosystem and Firecracker MicroVMs for Serverless
Virtualization optimizes the use of computing resources. Firecracker, a lightweight virtualization technology, is transforming the possibilities of serverless workloads.
RabbitMQ vs Apache Kafka: Comparing Message Brokers and Event Streaming Platforms
Event routers are the middlemen in an event-driven system. RabbitMQ and Apache Kafka are two popular event routers with very different implementations. Learn about their differences to make better decisions for powering modern apps.
Service Discovery: Solving the Communication Challenge in Microservice Architectures
Service discovery is the vital component in a microservice architecture that enables communication between services. Discover the influence of DNS on service discovery as well as learn about the different models of service discovery and their real-world implementations.
Introduction to Synchronous and Asynchronous Processing
Sync and async are two popular types of programming models when building event-based architectures, APIs, and handling long-running tasks. This blog post compares async and sync processing and covers when to use each.
Understanding Event-Driven Architecture and Serverless Opportunities
Event-driven architectures are a great model to align your business with the real-world. Pairing it with serverless technology is a dream come true for your developers and your business.
FaaS vs CaaS: Comparing Use Cases and Responsibilities
When considering a FaaS or CaaS deployment strategy, it is worthwhile to consider the difference in the managed responsibilities between FaaS and CaaS offerings. Learn more about ideal use cases and when to use FaaS or CaaS solutions.
10 Reasons Why We Love Firecracker MicroVMs
Firecracker is a virtualization technology that is setting the serverless world ablaze. While there are many perks with Firecracker, here are our top ten reasons why we love it.
Cloud Computing and Serverless Architectures: What are FaaS and CaaS?
FaaS and CaaS are two popular deployment strategies with their own unique advantages and ideal use cases. Knowing what distinguishes them can help when deciding how to build and deploy your web apps.
Firecracker MicroVMs: Lightweight Virtualization for Containers and Serverless Workloads
Virtualization technology is evolving. Firecracker is an emerging solution that combines the security and isolation of bare metal instances with the density and performance of containers.
Going Serverless: Implications, Benefits and Challenges
The serverless computing era is here. Learn about the implications of going serverless as well as the benefits and existing challenges to implementing this emerging technology.
From Cloud Computing to Serverless: The rise of new paradigms
The serverless computing era is here. Learn about the history and evolution of cloud computing to see why developers and businesses are excited about serverless technology.
Escaping GKE gVisor sandboxing using metadata
GKE is a Google Cloud service that offers managed Kubernetes clusters. The cluster nodes run on Google Cloud VM instances, while the control plane and network are fully managed by GKE.
Deploy Serverless Docker Containers and Functions with Koyeb CLI
The Koyeb CLI is now available and ready to let you manage all your Koyeb resources directly from your shell! The Koyeb CLI is a critical piece to improve the deployment experience and provide a fast way to interact with Koyeb when you develop your projects.
The Koyeb Serverless Engine: Docker Containers and Continuous Deployment of Functions
We are proud to announce the availability of the Koyeb Serverless Engine for everyone and the ability to deploy your own code on Koyeb. In addition to the ready-to-use integrations, you can now seamlessly deploy Docker Containers and Code Functions with built-in Continuous Deployment using Git.
Koyeb raises €1.4M pre-seed to support your serverless journey
Today, we are delighted to announce our €1.4M pre-seed round! This round confirms the need for a new generation of serverless, multi-cloud platforms with a strong focus on the developer experience.
Koyeb Serverless Data Processing Platform Early Access
Today, we are excited to share more about the technology we are building to help you with your cloud journey and to deploy all your platforms in the 2020s.