Part 3: Filters and Middleware — Intercepting Traffic Without Tangling Your Logic
Parts 1 and 2 gave us a working proxy: requests come in, we pick a backend, the response goes out. But every request gets the same treatment. No validation. No modification. No way to say “this request isn’t allowed” or “add this header before forwarding.”
Real proxies need to intercept traffic. Block requests that fail authentication. Rewrite headers. Rate-limit abusive clients. Hide internal details from responses. These aren’t afterthoughts — they’re half the reason you’d write a proxy instead of using a config file.
This part introduces the full request lifecycle and the filter methods that let you control it.
The Request Lifecycle
Here’s what happens inside Pingora for every request:
Client sends request
          │
          ▼
early_request_filter()      ← First chance to inspect/modify
          │
          ▼
request_filter()            ← Validate, rate-limit, or reject
          │
      ┌───┴───┐
      │  Ok?  │
      └───┬───┘
   no     │ yes
 ┌────────┤
 ▼        ▼
respond   upstream_peer()   ← Pick the backend
error     │
          ▼
Connect to upstream
          │
          ▼
upstream_request_filter()   ← Modify request before sending
          │
          ▼
Send request to upstream
          │
          ▼
Receive response from upstream
          │
          ▼
upstream_response_filter()  ← Modify response (before caching)
          │
          ▼
[Cache layer]               ← Cached here if caching is enabled
          │
          ▼
response_filter()           ← Modify response (after caching, before sending)
          │
          ▼
Send response to client
          │
          ▼
logging()                   ← Always runs, even on errors
You’ve already seen two of these: upstream_peer() (Part 1) and upstream_request_filter() (Parts 1–2). Now we’ll use the rest.
The key insight: each phase has a specific job. Don’t put authentication logic in upstream_request_filter. Put it in request_filter, where it runs before any upstream connection is made. Don’t modify response headers in upstream_response_filter if the change should apply only to what the client sees and not to what gets cached. Use response_filter for that. The phase tells you when; the method name tells you what.
Wait — why are there two response filter phases? upstream_response_filter runs before Pingora’s cache layer, so changes you make there affect what gets cached. response_filter runs after caching, so changes only affect what the client sees. For now, we’ll use response_filter — it’s what you want most of the time. If you’re working with Pingora’s cache, upstream_response_filter is where you’d normalize headers for cacheability.
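To make the ordering concrete, here is a toy model of the lifecycle in plain Rust. It is not Pingora code: the Phase enum and the phases_for function are hypothetical names invented for this sketch, and the rejected flag stands in for request_filter answering the client itself.

```rust
/// The phases from the diagram, in the order they run.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Phase {
    EarlyRequestFilter,
    RequestFilter,
    UpstreamPeer,
    UpstreamRequestFilter,
    UpstreamResponseFilter,
    ResponseFilter,
    Logging,
}

/// Toy model, not Pingora code: lists the phases one request passes through.
/// `rejected` stands in for `request_filter` answering the client itself.
pub fn phases_for(rejected: bool) -> Vec<Phase> {
    let mut ran = vec![Phase::EarlyRequestFilter, Phase::RequestFilter];
    if !rejected {
        // Only requests that pass the gate ever reach the upstream phases.
        ran.extend_from_slice(&[
            Phase::UpstreamPeer,
            Phase::UpstreamRequestFilter,
            Phase::UpstreamResponseFilter,
            Phase::ResponseFilter,
        ]);
    }
    // logging() closes out every request, rejected or not.
    ran.push(Phase::Logging);
    ran
}
```

A rejected request runs three phases in this model, an accepted one runs all seven, and both end in Logging.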
request_filter() — The Gatekeeper
This is where you decide whether a request should be allowed at all. The method returns Result<bool>:
- Ok(false): the request is fine, keep going
- Ok(true): I already sent a response to the client, stop here
- Err(...): something went wrong, turn this into an error response
Let’s add API key authentication. If a request doesn’t have the right X-API-Key header, it gets a 401:
async fn request_filter(
    &self,
    session: &mut Session,
    _ctx: &mut Self::CTX,
) -> Result<bool> {
    // Compare the raw header bytes against the expected key
    let has_valid_key = session
        .req_header()
        .headers
        .get("X-API-Key")
        .is_some_and(|v| v.as_bytes() == b"secret123");
    if !has_valid_key {
        let _ = session.respond_error(401).await;
        return Ok(true); // we handled it, so stop the pipeline
    }
    Ok(false) // request is OK, continue
}
The respond_error(401) call sends a 401 response to the client. The Ok(true) tells Pingora “I already sent a response, don’t try to proxy this request upstream.” Without Ok(true), Pingora would try to proxy the request even though you already sent a 401 — that’s a protocol violation.
This pattern — check a condition, send an error, return Ok(true) — is how you build gatekeepers. Rate limiting, IP blocking, authentication, authorization — they all use request_filter this way.
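Rate limiting follows the same shape: a yes/no decision that request_filter turns into either Ok(false) or an error response plus Ok(true). Below is a minimal std-only sketch of that decision, assuming a fixed-window counter per client. FixedWindowLimiter and its methods are hypothetical names for this example, not part of Pingora, and the caller passes in the current time so the logic stays easy to test.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical fixed-window limiter: at most `limit` requests
/// per client in each `window`.
pub struct FixedWindowLimiter {
    limit: u32,
    window: Duration,
    // client id -> (start of current window, requests seen in it)
    seen: HashMap<String, (Instant, u32)>,
}

impl FixedWindowLimiter {
    pub fn new(limit: u32, window: Duration) -> Self {
        Self { limit, window, seen: HashMap::new() }
    }

    /// Returns true if this request is within the client's budget.
    pub fn allow(&mut self, client: &str, now: Instant) -> bool {
        let entry = self.seen.entry(client.to_string()).or_insert((now, 0));
        // Start a fresh window once the old one has fully elapsed.
        if now.duration_since(entry.0) >= self.window {
            *entry = (now, 0);
        }
        if entry.1 < self.limit {
            entry.1 += 1;
            true
        } else {
            false
        }
    }
}
```

In a real filter, a false result would presumably map to respond_error(429) followed by Ok(true). Unlike per-request data, the counters here are shared across requests by design, so a real proxy would have to keep them behind some form of synchronization.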
CTX — Per-Request State
In Parts 1 and 2, we used type CTX = () — no per-request state. Now we need it.
Here’s the problem: request_filter parses the API key, but later phases might need to know who authenticated. Or: request_filter extracts the request path, and upstream_peer uses it for routing. These phases can’t talk to each other directly — they’re separate method calls on the trait.
Before we solve it: what wouldn’t work?
// ❌ Global state: one value shared by every request, so it cannot hold
//    data that belongs to a single request without extra keying and locking
static API_KEYS: LazyLock<HashMap<String, bool>> = LazyLock::new(|| load_keys());

// ❌ Passing Arc<Mutex<RequestState>> through every phase.
//    The trait fixes each method's signature, so there is no slot for it,
//    and one shared mutex makes concurrent requests contend for the same lock
async fn request_filter(
    &self,
    session: &mut Session,
    state: Arc<Mutex<RequestState>>,
) -> Result<bool>
Both of these are approaches you might reach for in other Rust code. Neither fits here. A global is shared by every request, so using it for per-request data means keying entries by request and synchronizing every access. A single Arc<Mutex> makes concurrent requests contend for one lock, so requests end up waiting on each other to read or write state that was never meant to be shared in the first place.
CTX is the answer. It’s a struct you define. Each request gets its own instance. Every phase can read and write it.
pub struct GatewayCtx {
    api_key: Option<String>,
    request_start: std::time::Instant,
}

#[async_trait]
impl ProxyHttp for Gateway {
    type CTX = GatewayCtx;

    fn new_ctx(&self) -> Self::CTX {
        GatewayCtx {
            api_key: None,
            request_start: std::time::Instant::now(),
        }
    }
    // ...
}
Now request_filter can store the API key, and logging can report how long the request took — both through the same ctx object.
Let’s rewrite the authentication to save the key:
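async fn request_filter(
    &self,
    session: &mut Session,
    ctx: &mut Self::CTX,
) -> Result<bool> {
    // to_str() fails on header bytes that are not visible ASCII;
    // treat that the same as a missing key instead of panicking
    let key = session
        .req_header()
        .headers
        .get("X-API-Key")
        .and_then(|v| v.to_str().ok())
        .map(str::to_owned);
    match key {
        Some(key) if key == "secret123" => {
            ctx.api_key = Some(key); // remember who authenticated
            Ok(false)
        }
        _ => {
            let _ = session.respond_error(401).await;
            Ok(true)
        }
    }
}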
The ctx is available in every phase. No global state, no thread-local hacks, no mutex for per-request data. It’s a struct that lives as long as the request — created when the request arrives, dropped when it completes.
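If the ownership story feels abstract, a toy version of the framework loop may help. This is a std-only sketch, not Pingora’s implementation: the Handler trait, Ctx struct, and handle function are hypothetical stand-ins for ProxyHttp, GatewayCtx, and the code that drives the phases.

```rust
/// Per-request state, loosely mirroring the shape of `GatewayCtx`.
#[derive(Debug, Default)]
pub struct Ctx {
    pub api_key: Option<String>,
}

/// Toy stand-in for the proxy trait: the handler is shared (`&self`),
/// the context is handed to each phase as `&mut`.
pub trait Handler {
    fn new_ctx(&self) -> Ctx;
    /// Returns true when the request was rejected.
    fn request_filter(&self, api_key_header: Option<&str>, ctx: &mut Ctx) -> bool;
    fn logging(&self, ctx: &Ctx) -> String;
}

pub struct Gateway;

impl Handler for Gateway {
    fn new_ctx(&self) -> Ctx {
        Ctx::default()
    }

    fn request_filter(&self, api_key_header: Option<&str>, ctx: &mut Ctx) -> bool {
        match api_key_header {
            Some(key) => {
                ctx.api_key = Some(key.to_string());
                false
            }
            None => true,
        }
    }

    fn logging(&self, ctx: &Ctx) -> String {
        format!("key={}", ctx.api_key.as_deref().unwrap_or("none"))
    }
}

/// What the framework does per request in this model:
/// build a fresh context, then pass it through the phases.
pub fn handle<H: Handler>(handler: &H, api_key_header: Option<&str>) -> String {
    let mut ctx = handler.new_ctx();
    let _rejected = handler.request_filter(api_key_header, &mut ctx);
    handler.logging(&ctx)
}
```

Because handle builds a new Ctx on every call, a key stored while handling one request cannot show up in the log line of the next, and no lock is involved.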
response_filter() — Cleaning Up Responses
Upstream servers leak information. Server: nginx/1.18.0. X-Powered-By: PHP/7.4.3. X-Request-Id: abc123-internal. These headers tell attackers what you’re running and how your infrastructure is wired.
response_filter lets you strip or replace headers before the client sees them:
async fn response_filter(
    &self,
    _session: &mut Session,
    upstream_response: &mut ResponseHeader,
    _ctx: &mut Self::CTX,
) -> Result<()> {
    // Replace the Server header so we don't advertise what the upstream runs
    upstream_response.insert_header("Server", "Gateway")?;

    // Remove headers that leak internal details
    upstream_response.remove_header("X-Powered-By");
    upstream_response.remove_header("X-Request-Id");

    // Add our own tracking header
    upstream_response.insert_header("X-Served-By", "pingora-tutorial")?;
    Ok(())
}
The upstream_response parameter is mutable — you can add, remove, or rewrite any header. This runs for every response, including error responses from the upstream.
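As the list of headers to hide grows, it can be easier to keep the names in one place. Here is a std-only sketch of that idea using a plain list of name/value pairs instead of Pingora’s ResponseHeader. The Headers alias, the STRIPPED list, and the scrub function are all assumptions made for this example.

```rust
/// Minimal header list for the sketch: (name, value) pairs.
pub type Headers = Vec<(String, String)>;

/// Hypothetical deny-list of headers that reveal upstream details.
const STRIPPED: [&str; 3] = ["server", "x-powered-by", "x-request-id"];

/// Drops every header on the deny-list, then adds our own Server header.
pub fn scrub(headers: &mut Headers) {
    // Header names are case-insensitive, so compare without regard to case.
    headers.retain(|(name, _)| !STRIPPED.iter().any(|h| h.eq_ignore_ascii_case(name)));
    headers.push(("Server".to_string(), "Gateway".to_string()));
}
```

The same loop-over-a-list shape should carry over to response_filter by calling remove_header once per name.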
logging() — The Phase That Always Runs
Every request ends up in logging(). Successful requests. Failed requests. Requests where request_filter rejected the client. All of them.
This makes it the right place for:
- Access logs
- Metrics
- Request timing
- Cleanup
async fn logging(
    &self,
    session: &mut Session,
    _e: Option<&pingora::Error>,
    ctx: &mut Self::CTX,
) {
    // 0 means no response was written to the client
    let status = session
        .response_written()
        .map_or(0, |resp| resp.status.as_u16());
    let elapsed = ctx.request_start.elapsed();
    println!(
        "status={} key={} elapsed={:?}",
        status,
        ctx.api_key.as_deref().unwrap_or("none"),
        elapsed,
    );
}
The error parameter (named _e in the example because it is unused there) is Some(error) if the request failed and None if it succeeded. You can use it to treat failures differently from successes: for example, sending errors to Sentry while writing only successes to your access log.
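Here is one way that split could look, sketched with std only. The error is modeled as an Option<&str> rather than Pingora’s error type, and LogTarget and log_line are hypothetical names for this example. Where each target actually sends its line (stdout, a file, an error tracker) is left to the caller.

```rust
use std::time::Duration;

/// Where a log line should go in this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum LogTarget {
    Access,
    ErrorReport,
}

/// Builds one log line and picks its destination based on
/// whether the request ended with an error.
pub fn log_line(status: u16, error: Option<&str>, elapsed: Duration) -> (LogTarget, String) {
    let ms = elapsed.as_millis();
    match error {
        // Failed requests carry the error text and go to the error channel.
        Some(e) => (
            LogTarget::ErrorReport,
            format!("status={} error={} elapsed_ms={}", status, e, ms),
        ),
        // Successful requests produce a plain access-log line.
        None => (
            LogTarget::Access,
            format!("status={} elapsed_ms={}", status, ms),
        ),
    }
}
```

Keeping the formatting in a plain function like this also makes it testable without running a proxy.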
The Complete Proxy
Here’s our gateway with authentication, response cleanup, and logging:
use async_trait::async_trait;
use pingora::prelude::*;
use pingora::proxy::{ProxyHttp, Session};
use pingora::upstreams::peer::HttpPeer;
use pingora::lb::{LoadBalancer, selection::RoundRobin, health_check::TcpHealthCheck};
use std::sync::Arc;

pub struct Gateway(Arc<LoadBalancer<RoundRobin>>);

pub struct GatewayCtx {
    api_key: Option<String>,
    request_start: std::time::Instant,
}

#[async_trait]
impl ProxyHttp for Gateway {
    type CTX = GatewayCtx;

    fn new_ctx(&self) -> Self::CTX {
        GatewayCtx {
            api_key: None,
            request_start: std::time::Instant::now(),
        }
    }

    async fn request_filter(
        &self,
        session: &mut Session,
        ctx: &mut Self::CTX,
    ) -> Result<bool> {
        // to_str() fails on header bytes that are not visible ASCII;
        // treat that the same as a missing key instead of panicking
        let key = session
            .req_header()
            .headers
            .get("X-API-Key")
            .and_then(|v| v.to_str().ok())
            .map(str::to_owned);
        match key {
            Some(key) if key == "secret123" => {
                ctx.api_key = Some(key);
                Ok(false)
            }
            _ => {
                let _ = session.respond_error(401).await;
                Ok(true)
            }
        }
    }

    async fn upstream_peer(
        &self,
        _session: &mut Session,
        _ctx: &mut Self::CTX,
    ) -> Result<Box<HttpPeer>> {
        let upstream = self.0
            .select(b"", 256)
            .ok_or_else(|| Error::new_str("no healthy upstream available"))?;
        let peer = Box::new(HttpPeer::new(
            upstream,
            true,
            "one.one.one.one".to_string(),
        ));
        Ok(peer)
    }

    async fn upstream_request_filter(
        &self,
        _session: &mut Session,
        upstream_request: &mut pingora::http::RequestHeader,
        _ctx: &mut Self::CTX,
    ) -> Result<()> {
        upstream_request.insert_header("Host", "one.one.one.one")?;
        Ok(())
    }

    async fn response_filter(
        &self,
        _session: &mut Session,
        upstream_response: &mut pingora::http::ResponseHeader,
        _ctx: &mut Self::CTX,
    ) -> Result<()> {
        upstream_response.insert_header("Server", "Gateway")?;
        upstream_response.remove_header("X-Powered-By");
        upstream_response.remove_header("X-Request-Id");
        upstream_response.insert_header("X-Served-By", "pingora-tutorial")?;
        Ok(())
    }

    async fn logging(
        &self,
        session: &mut Session,
        _e: Option<&pingora::Error>,
        ctx: &mut Self::CTX,
    ) {
        let status = session
            .response_written()
            .map_or(0, |resp| resp.status.as_u16());
        let elapsed = ctx.request_start.elapsed();
        println!(
            "status={} key={} elapsed={:?}",
            status,
            ctx.api_key.as_deref().unwrap_or("none"),
            elapsed,
        );
    }
}

fn main() {
    let mut server = Server::new(None).unwrap();
    server.bootstrap();

    let mut upstreams = LoadBalancer::try_from_iter([
        "1.1.1.1:443",
        "1.0.0.1:443",
    ]).unwrap();

    // Probe the upstreams every second so select() skips dead ones
    let hc = TcpHealthCheck::new();
    upstreams.set_health_check(hc);
    upstreams.health_check_frequency = Some(std::time::Duration::from_secs(1));

    let background = pingora::services::background::background_service(
        "health_check",
        upstreams,
    );
    let upstreams = background.task();

    let mut service = http_proxy_service(&server.configuration, Gateway(upstreams));
    service.add_tcp("0.0.0.0:6188");

    server.add_service(background);
    server.add_service(service);
    server.run_forever();
}
Running It
cargo run
Without an API key — expect a 401:
curl http://127.0.0.1:6188 -sv
# < HTTP/1.1 401 Unauthorized
With an API key — the request goes through:
curl http://127.0.0.1:6188 -H "X-API-Key: secret123" -sv
# < HTTP/1.1 200 OK
# < Server: Gateway
# < X-Served-By: pingora-tutorial
Check the terminal where the proxy is running — you’ll see the logging output:
status=200 key=secret123 elapsed=123.456ms
Phase Reference — for lookup while reading the chapter above
| Phase | When It Runs | What It’s For |
| --- | --- | --- |
| early_request_filter | Before everything else | Module configuration (advanced) |
| request_filter | After request header is read | Validation, auth, rate limiting, rejection |
| upstream_peer | After request passes filters | Route selection, load balancing |
| upstream_request_filter | After upstream connection, before sending | Modify request headers for upstream |
| upstream_response_filter | After upstream responds | Modify response before caching |
| response_filter | Before sending to client | Modify response visible to client |
| response_body_filter | For each response body chunk | Transform response body |
| logging | After request completes | Access logs, metrics, timing |

Rule of thumb: “should this request be allowed?” → request_filter. “what should the upstream see?” → upstream_request_filter. “what should the client see?” → response_filter.
What’s Next
This proxy can authenticate requests, clean up responses, and log everything. But all traffic is still unencrypted between the client and the proxy. In Part 4: TLS and Security, we’ll add TLS termination so clients can connect over HTTPS.