Your CDN as an API Gateway

Microservices, rate limiting, and rewriting at the edge

If you manage a microservices architecture, you probably have an API gateway sitting in front of it. And, most likely, you also have a CDN in front of that API gateway. Two layers, two deployments, two bills, and, above all, two network hops added to every single request. The logical question that any CTO might eventually ask is: do I really need both?

The short answer is easy: yes, you need both functions. But the long answer comes with a surprise: they do not have to live in two separate layers. If your CDN runs on VCL (Varnish Configuration Language), you might not need to duplicate your resources. In this post, we explain how the layer you already have in production can act as an API gateway (routing microservices, limiting traffic, rewriting URLs, and applying security policies) without adding a single extra component.

What is an application gateway and why can your CDN already be one?

Traditionally, CDNs are used as intermediary proxies to cache content, accelerate delivery, or add an extra layer of security. However, there are lesser-known use cases that are incredibly valuable in complex architectures. One of them is using a CDN as an application gateway or application-level proxy.

A next-generation CDN acts as a hub for all requests hitting your domain. This means any routing, filtering, service mesh, or validation decisions can be made right at that layer. If you have a language like VCL to define that logic, the possibilities for any DevOps team are massive: SSL offloading, end-to-end SSL, WAF, URL-based routing, multiple backends based on arbitrary criteria, A/B testing, canary deployments, feature flags via HTTP headers, and much more.

In practice, the CDN stops being just an accelerator and becomes an orchestrator.

Practical example: a microservices API under a single domain

Imagine an API based on microservices distributed across different hosts. With an application gateway, you can expose all services under a single domain—api.mysite.com—and internally route each path to its corresponding backend: /login, /analytics, /cart.

Assuming three backends are already configured in the CDN:

c0_login → login.mycompany.com:8080
c0_stats → 11.22.33.44:80
c0_shopcart → cart.thirdpartyprovider.com:443

The VCL configuration would look like this:

sub vcl_recv {
    if (req.url ~ "^/login") {
        set req.backend_hint = c0_login.backend();
    } else if (req.url ~ "^/analytics") {
        set req.backend_hint = c0_stats.backend();
    } else if (req.url ~ "^/cart") {
        set req.backend_hint = c0_shopcart.backend();
    }
}
# To each their own

Without touching DNS, without coordinating deployments across teams, and without adding an extra component to the network path, you have unified three services under a single public URL. The end client only sees api.mysite.com, while each backend team continues to deploy to their own infrastructure.
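Two details are worth tightening in a routing block like the one above: a loose prefix such as ^/login also matches /loginpage, and a path that matches none of the prefixes falls through to whatever backend is the default. The following is a minimal sketch of a stricter variant, not a definitive configuration. It reuses the c0_* backend names from the example and assumes your CDN's VCL follows open-source Varnish 4.x syntax, where return (synth(...)) is the standard way to answer from the edge; check that your provider exposes it.

```vcl
sub vcl_recv {
    # Match each prefix only when it is followed by "/", "?" or the
    # end of the URL, so /login matches but /loginpage does not.
    if (req.url ~ "^/login([/?]|$)") {
        set req.backend_hint = c0_login.backend();
    } else if (req.url ~ "^/analytics([/?]|$)") {
        set req.backend_hint = c0_stats.backend();
    } else if (req.url ~ "^/cart([/?]|$)") {
        set req.backend_hint = c0_shopcart.backend();
    } else {
        # Unknown route: answer at the edge instead of forwarding
        # the request to a default backend.
        return (synth(404, "Not Found"));
    }
}
```

Answering unknown routes at the edge keeps stray traffic away from the backends, at the cost of having to add a branch here whenever a new service is published.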

Going further: rate limiting, IP allow-listing, and URL rewriting

The previous example is just the foundation. From there, your VCL can grow in complexity to incorporate validations, throttling, redirects, URL rewriting, and more. Let’s look at some common patterns applied to the same scenario:

sub vcl_recv {
    if (req.url ~ "^/login") {
        set req.backend_hint = c0_login.backend();
        # Only allow access to these URLs from the office IP
        if (req.http.True-Client-Ip != "12.34.56.78") {
            return (synth(403, "The power of Christ compels you!"));
        }
    } else if (req.url ~ "^/analytics") {
        set req.backend_hint = c0_stats.backend();
        # There are too many analytics fans; let's rate-limit to 30 req/s
        set req.http.x-ratelimit = 30;
    } else if (req.url ~ "^/cart") {
        set req.backend_hint = c0_shopcart.backend();
        # The cart belongs to a third party and has a URL we want to hide
        set req.url = "/third-parties/aef5677c321bb761c/";
        # A bit of A/B testing: if the client's IP ends in 0, 1, or 2,
        # we send a header to the backend to serve version B
        set req.http.abtesting = 0;
        if (req.http.True-Client-IP ~ "[0-2]$") {
            set req.http.abtesting = 1;
        }
    }
}

What you just read covers, in about twenty lines, features that in a traditional architecture would require:

A managed API gateway for routing.
A separate plugin or service for rate limiting.
Firewall rules or ACLs for IP allow-listing.
An additional proxy for third-party URL rewriting.
A feature flag or split-testing service for A/B testing.

All of that is compressed into the exact same layer you are already paying for to serve content.
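Two of these patterns have common variations: allowing a list of addresses or ranges rather than a single IP, and rewriting only the URL prefix so that the rest of the path and the query string survive. The sketch below shows both under stated assumptions: the ACL name office is hypothetical, the std VMOD is the one bundled with open-source Varnish, and your CDN is assumed to pass the client address in True-Client-Ip as in the example above.

```vcl
import std;

# Hypothetical ACL; add more addresses or CIDR ranges as needed.
acl office {
    "12.34.56.78";
}

sub vcl_recv {
    if (req.url ~ "^/login") {
        # std.ip() parses the header into an IP value. The fallback
        # "0.0.0.0" is used when the header is missing or malformed,
        # so the check fails closed.
        if (std.ip(req.http.True-Client-Ip, "0.0.0.0") !~ office) {
            return (synth(403, "Forbidden"));
        }
    } else if (req.url ~ "^/cart") {
        # Replace only the public prefix: /cart/items?id=7 becomes
        # /third-parties/aef5677c321bb761c/items?id=7.
        set req.url = regsub(req.url, "^/cart",
            "/third-parties/aef5677c321bb761c");
    }
}
```

Whether to replace the whole URL, as in the original example, or only its prefix depends on whether the third party exposes one endpoint or a tree of them.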

When to (and when not to) use your CDN as an API gateway

Not everything fits this pattern. To make a solid architectural decision, it’s best to understand both sides of the coin:

It’s a great fit when:

Your microservices live on different cloud providers or are exposed on heterogeneous hosts.
You need to hide the actual backend topology from the client.
You want to apply global policies (rate limiting, allow-lists, security headers) without modifying every single service.
You are looking to reduce network hops and eliminate the bill of a dedicated API gateway.
Your team is comfortable managing configuration as code and finds it worthwhile to invest in VCL.
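As an illustration of the global-policy point in the list above, security headers can be set once on the way out instead of in every service. This is a minimal sketch using standard Varnish VCL; the header values are illustrative and should be tuned to your own policy.

```vcl
sub vcl_deliver {
    # Applied to every response, whichever backend produced it.
    set resp.http.Strict-Transport-Security = "max-age=31536000";
    set resp.http.X-Content-Type-Options = "nosniff";

    # Remove headers that reveal which stack each backend runs.
    unset resp.http.Server;
    unset resp.http.X-Powered-By;
}
```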

It’s a poor fit when:

You need complex payload transformations (e.g., gRPC ↔ REST, GraphQL federation with resolvers, etc.).
Your authentication logic requires maintaining user state beyond simply validating a token.
You already have an internal service mesh (like Istio or Linkerd) handling intra-cluster routing and only need external exposure.

In most cases, the answer isn’t about choosing one over the other, but rather deciding where to move each piece: routing, rate limiting, and rewriting go to the edge; deep business logic stays inside the cluster.

Metrics worth keeping an eye on

Before migrating rules to the CDN, define what you are going to measure to validate the change:

p95/p99 latency per route before and after the shift.
5xx error rates broken down by backend (VCL can tag each response with the backend that served it).
Cache hit ratio on the routes now passing through the gateway.
Monthly cloud costs saved from the components you deprecate (managed gateways, intermediate load balancers).
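For the per-backend breakdown in the list above, one option is to stamp each fetched response with the name of the backend that served it. This is a hedged sketch: beresp.backend.name is standard in open-source Varnish 4.x and later, while the header name X-Served-By is a hypothetical choice, not something your CDN necessarily uses.

```vcl
sub vcl_backend_response {
    # Record which backend produced this response so that logs and
    # dashboards can group status codes by backend.
    set beresp.http.X-Served-By = beresp.backend.name;
}
```

Because the header is stored with the cached object, a cache hit carries the tag of the backend that originally produced it. If you would rather not expose backend names to clients, unset the header in vcl_deliver once it has been logged.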

The possibilities of using a CDN as an application gateway are virtually limitless, and every architecture has its own nuances. If you want to explore a specific use case—from a simple pattern to a hybrid multi-cloud topology with feature flags and A/B testing at the edge—drop us a line or check out the complete Web Application Gateway guide in our documentation.